ABSTRACT
NP105-113-B*07:02-specific CD8+ T cell responses are considered among the most dominant in SARS-CoV-2-infected individuals. We found strong association of this response with mild disease. Analysis of NP105-113-B*07:02-specific T cell clones and single-cell sequencing were performed concurrently, with functional avidity and antiviral efficacy assessed using an in vitro SARS-CoV-2 infection system, and were correlated with T cell receptor usage, transcriptome signature and disease severity (acute n = 77, convalescent n = 52). We demonstrated a beneficial association of NP105-113-B*07:02-specific T cells in COVID-19 disease progression, linked with expansion of T cell precursors, high functional avidity and antiviral effector function. Broad immune memory pools were narrowed postinfection but NP105-113-B*07:02-specific T cells were maintained 6 months after infection with preserved antiviral efficacy to the SARS-CoV-2 Victoria strain, as well as Alpha, Beta, Gamma and Delta variants. Our data show that NP105-113-B*07:02-specific T cell responses associate with mild disease and high antiviral efficacy, supporting inclusion of this epitope in future vaccine design.
Subject(s)
HLA-B7 Antigen/immunology , Immunodominant Epitopes/immunology , Nucleocapsid Proteins/immunology , SARS-CoV-2/immunology , T-Lymphocytes, Cytotoxic/immunology , Aged , Amino Acid Sequence , Antibodies, Viral/immunology , Antibody Affinity/immunology , COVID-19/immunology , COVID-19/pathology , Cell Line, Transformed , Female , Gene Expression Profiling , Humans , Immunologic Memory/immunology , Male , Middle Aged , Receptors, Antigen, T-Cell/immunology , Severity of Illness Index , Vaccinia virus/genetics , Vaccinia virus/immunology , Vaccinia virus/metabolism
ABSTRACT
Recent advances in single-cell technologies have enabled high-throughput molecular profiling of cells across modalities and locations. Single-cell transcriptomics data can now be complemented by chromatin accessibility, surface protein expression, adaptive immune receptor repertoire profiling and spatial information. The increasing availability of single-cell data across modalities has motivated the development of novel computational methods to help analysts derive biological insights. As the field grows, it becomes increasingly difficult to navigate the vast landscape of tools and analysis steps. Here, we summarize independent benchmarking studies of unimodal and multimodal single-cell analysis across modalities to suggest comprehensive best-practice workflows for the most common analysis steps. Where independent benchmarks are not available, we review and contrast popular methods. Our article serves as an entry point for novices in the field of single-cell (multi-)omic analysis and guides advanced users to the most recent best practices.
Subject(s)
Gene Expression Profiling , Proteomics , Gene Expression Profiling/methods , Single-Cell Analysis/methods
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) is a widely used method for identifying cell types and trajectories in biologically heterogeneous samples, but it is limited in its detection and quantification of lowly expressed genes. This results in missing important biological signals, such as the expression of key transcription factors (TFs) driving cellular differentiation. We show that targeted sequencing of ~1000 TFs (scCapture-seq) in iPSC-derived neuronal cultures greatly improves the biological information garnered from scRNA-seq. Increased TF resolution enhanced cell type identification, developmental trajectories, and gene regulatory networks. This allowed us to resolve differences among neuronal populations, which were generated in two different laboratories using the same differentiation protocol. ScCapture-seq improved TF-gene regulatory network inference and thus identified divergent patterns of neurogenesis into either excitatory cortical neurons or inhibitory interneurons. Furthermore, scCapture-seq revealed a role for retinoic acid signaling in the developmental divergence between these different neuronal populations. Our results show that TF targeting improves the characterization of human cellular models and allows identification of the essential differences between cellular populations, which would otherwise be missed in traditional scRNA-seq. scCapture-seq TF targeting represents a cost-effective enhancement of scRNA-seq, which could be broadly applied to improve scRNA-seq resolution.
Subject(s)
Induced Pluripotent Stem Cells , Single-Cell Analysis , Gene Expression Profiling/methods , Gene Regulatory Networks , Humans , Induced Pluripotent Stem Cells/metabolism , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
RNA-seq is the standard method for profiling gene expression in many biological systems. Due to the wide dynamic range and complex nature of the transcriptome, RNA-seq provides an incomplete characterization, especially of lowly expressed genes and transcripts. Targeted RNA sequencing (RNA CaptureSeq) focuses sequencing on genes of interest, providing exquisite sensitivity for transcript detection and quantification. However, uses of CaptureSeq have focused on bulk samples and its performance on very small populations of cells is unknown. Here we show CaptureSeq greatly enhances transcriptomic profiling of target genes in ultra-low-input samples and provides equivalent performance to that on bulk samples. We validate the performance of CaptureSeq using multiple probe sets on samples of iPSC-derived cortical neurons. We demonstrate up to 275-fold enrichment for target genes, the detection of 10% additional genes and a greater than 5-fold increase in identified gene isoforms. Analysis of spike-in controls demonstrated CaptureSeq improved both detection sensitivity and expression quantification. Comparison to the CORTECON database of cerebral cortex development revealed CaptureSeq enhanced the identification of sample differentiation stage. CaptureSeq provides sensitive, reliable and quantitative expression measurements on hundreds-to-thousands of target genes from ultra-low-input samples and has the potential to greatly enhance transcriptomic profiling when samples are limiting.
Subject(s)
Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA , Transcriptome , Cell Differentiation/genetics , Computational Biology/methods , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Neurons/cytology , Neurons/metabolism , Sequence Analysis, RNA/methods , Transcription Factors/metabolism
ABSTRACT
The study of immunology, traditionally reliant on proteomics to evaluate individual immune cells, has been revolutionized by single-cell RNA sequencing. Computational immunologists play a crucial role in analysing these datasets, moving beyond traditional protein marker identification to encompass a more detailed view of cellular phenotypes and their functional roles. Recent technological advancements allow the simultaneous measurements of multiple cellular components (transcriptome, proteome, chromatin, epigenetic modifications and metabolites) within single cells, including in spatial contexts within tissues. This has led to the generation of complex multiscale datasets that can include multimodal measurements from the same cells or a mix of paired and unpaired modalities. Modern machine learning (ML) techniques allow for the integration of multiple "omics" data without the need for extensive independent modelling of each modality. This review focuses on recent advancements in ML integrative approaches applied to immunological studies. We highlight the importance of these methods in creating a unified representation of multiscale data collections, particularly for single-cell and spatial profiling technologies. Finally, we discuss the challenges of these holistic approaches and how they will be instrumental in the development of a common coordinate framework for multiscale studies, thereby accelerating research and enabling discoveries in the computational immunology field.
Subject(s)
Computational Biology , Machine Learning , Humans , Computational Biology/methods , Single-Cell Analysis/methods , Allergy and Immunology , Animals , Immunoinformatics
ABSTRACT
Single-cell multiomic analysis of the epigenome, transcriptome, and proteome allows for comprehensive characterization of the molecular circuitry that underpins cell identity and state. However, the holistic interpretation of such datasets presents a challenge given a paucity of approaches for systematic, joint evaluation of different modalities. Here, we present Panpipes, a set of computational workflows designed to automate multimodal single-cell and spatial transcriptomic analyses by incorporating widely-used Python-based tools to perform quality control, preprocessing, integration, clustering, and reference mapping at scale. Panpipes allows reliable and customizable analysis and evaluation of individual and integrated modalities, thereby empowering decision-making before downstream investigations.
Subject(s)
Single-Cell Analysis , Software , Transcriptome , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Humans , Workflow
ABSTRACT
Single-cell multiplexing techniques (cell hashing and genetic multiplexing) combine multiple samples, optimizing sample processing and reducing costs. Cell hashing conjugates antibody-tags or chemical-oligonucleotides to cell membranes, while genetic multiplexing allows genetically diverse samples to be mixed and relies on aggregation of RNA reads at known genomic coordinates. We develop hadge (hashing deconvolution combined with genotype information), a Nextflow pipeline that combines 12 methods to perform both hashing- and genotype-based deconvolution. We propose a joint deconvolution strategy combining best-performing methods and demonstrate how this approach leads to the recovery of previously discarded cells in a nuclei hashing of fresh-frozen brain tissue.
Subject(s)
Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Brain/metabolism , Brain/cytology , Software , Genotype
ABSTRACT
With progressive digitalization of healthcare systems worldwide, large-scale collection of electronic health records (EHRs) has become commonplace. However, an extensible framework for comprehensive exploratory analysis that accounts for data heterogeneity is missing. Here we introduce ehrapy, a modular open-source Python framework designed for exploratory analysis of heterogeneous epidemiology and EHR data. ehrapy incorporates a series of analytical steps, from data extraction and quality control to the generation of low-dimensional representations. Complemented by rich statistical modules, ehrapy facilitates associating patients with disease states, differential comparison between patient clusters, survival analysis, trajectory inference, causal inference and more. Leveraging ontologies, ehrapy further enables data sharing and training EHR deep learning models, paving the way for foundational models in biomedical research. We demonstrate ehrapy's features in six distinct examples. We applied ehrapy to stratify patients affected by unspecified pneumonia into finer-grained phenotypes. Furthermore, we reveal biomarkers for significant differences in survival among these groups. Additionally, we quantify medication-class effects of pneumonia medications on length of stay. We further leveraged ehrapy to analyze cardiovascular risks across different data modalities. We reconstructed disease state trajectories in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) based on imaging data. Finally, we conducted a case study to demonstrate how ehrapy can detect and mitigate biases in EHR data. ehrapy, thus, provides a framework that we envision will standardize analysis pipelines on EHR data and serve as a cornerstone for the community.
ABSTRACT
Current protocols for producing cerebellar neurons from human pluripotent stem cells (hPSCs) often rely on animal co-culture and mostly exist as monolayers, limiting their capability to recapitulate the complex processes in the developing cerebellum. Here, we employed a robust method, without the need for mouse co-culture, to generate three-dimensional cerebellar organoids from hPSCs that display hallmarks of in vivo cerebellar development. Single-cell profiling followed by comparison to human and mouse cerebellar atlases revealed the presence and maturity of transcriptionally distinct populations encompassing major cerebellar cell types. Encapsulation with Matrigel, aimed at providing more physiologically relevant conditions through recapitulation of basement-membrane signalling, influenced both growth dynamics and cellular composition of the organoids, altering developmentally relevant gene expression programmes. We identified enrichment of cerebellar disease genes in distinct cell populations in the hPSC-derived cerebellar organoids. These findings establish xeno-free human cerebellar organoids as a unique model to gain insight into cerebellar development and its associated disorders.
Subject(s)
Cell Differentiation , Cerebellum/cytology , Induced Pluripotent Stem Cells/metabolism , Organoids/cytology , Aged , Animals , Biomarkers , Cell Culture Techniques , Cell Line , Collagen , Computational Biology/methods , Drug Combinations , Female , Gene Expression Profiling , Humans , Induced Pluripotent Stem Cells/cytology , Laminin , Proteoglycans , Purkinje Cells/metabolism
ABSTRACT
The genome-wide activity of transcription factors (TFs) on multiple regulatory elements precludes their use as gene-specific regulators. Here we show that ectopic expression of a TF in a cell-specific context can be used to silence the expression of a specific gene as a therapeutic approach to regulate gene expression in human disease. We selected the TF Krüppel-like factor 15 (KLF15) based on its putative ability to recognize a specific DNA sequence motif present in the rhodopsin (RHO) promoter and its lack of expression in terminally differentiated rod photoreceptors (the RHO-expressing cells). Adeno-associated virus (AAV) vector-mediated ectopic expression of KLF15 in rod photoreceptors of pigs enables Rho silencing with limited genome-wide transcriptional perturbations. Suppression of a RHO mutant allele by KLF15 corrects the phenotype of a mouse model of retinitis pigmentosa with no observed toxicity. Cell-specific-context conditioning of TF activity may prove a novel mode for somatic gene-targeted manipulation.
Subject(s)
Gene Silencing , Gene Targeting/methods , Kruppel-Like Transcription Factors/genetics , Nuclear Proteins/genetics , Rhodopsin/genetics , Animals , Dependovirus/genetics , Ectopic Gene Expression , Female , Genetic Therapy/methods , Genetic Vectors , Kruppel-Like Transcription Factors/physiology , Mice, Transgenic , Mutation , Nuclear Proteins/physiology , Retinal Rod Photoreceptor Cells/metabolism , Retinitis Pigmentosa/genetics , Retinitis Pigmentosa/therapy , Rhodopsin/metabolism , Swine
ABSTRACT
Transcription factors (TFs) operate by the combined activity of their DNA-binding domains (DBDs) and effector domains (EDs), enabling the coordination of gene expression on a genomic scale. Here we show that in vivo delivery of an engineered DNA-binding protein uncoupled from the repressor domain can produce efficient and gene-specific transcriptional silencing. To interfere with RHODOPSIN (RHO) gain-of-function mutations, we engineered the ZF6-DNA-binding protein (ZF6-DB) that targets 20 base pairs (bp) of a RHO cis-regulatory element (CRE) and demonstrate Rho-specific transcriptional silencing upon adeno-associated viral (AAV) vector-mediated expression in photoreceptors. The data show that the 20 bp-long genomic DNA sequence is necessary for RHO expression and that photoreceptor delivery of the corresponding cognate synthetic trans-acting factor ZF6-DB, without the intrinsic transcriptional repression properties of the canonical ED, blocks Rho expression with negligible genome-wide transcript perturbations. The data support DNA-binding-mediated silencing as a novel mode to treat gain-of-function mutations.