1.
Mol Syst Biol ; 19(6): e11517, 2023 06 12.
Article in English | MEDLINE | ID: mdl-37154091

ABSTRACT

Recent advances in multiplexed single-cell transcriptomics experiments facilitate the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible. Therefore, computational methods are needed to predict, interpret, and prioritize perturbations. Here, we present the compositional perturbation autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA learns to in silico predict transcriptional perturbation response at the single-cell level for unseen dosages, cell types, time points, and species. Using newly generated single-cell drug combination data, we validate that CPA can predict unseen drug combinations while outperforming baseline models. Additionally, the architecture's modularity enables incorporating the chemical representation of the drugs, allowing the prediction of cellular response to completely unseen drugs. Furthermore, CPA is also applicable to genetic combinatorial screens. We demonstrate this by imputing in silico 5,329 missing combinations (97.6% of all possibilities) in a single-cell Perturb-seq experiment with diverse genetic interactions. We envision CPA will facilitate efficient experimental design and hypothesis generation by enabling in silico response prediction at the single-cell level and thus accelerate therapeutic applications using single-cell technologies.


Subject(s)
Computational Biology , Gene Expression Profiling , High-Throughput Screening Assays , Single-Cell Gene Expression Analysis
2.
Nat Methods ; 16(8): 715-721, 2019 08.
Article in English | MEDLINE | ID: mdl-31363220

ABSTRACT

Accurately modeling cellular response to perturbations is a central goal of computational biology. While such modeling has been based on statistical, mechanistic and machine learning models in specific settings, no generalization of predictions to phenomena absent from training data (out-of-sample) has yet been demonstrated. Here, we present scGen (https://github.com/theislab/scgen), a model combining variational autoencoders and latent space vector arithmetic for high-dimensional single-cell gene expression data. We show that scGen accurately models perturbation and infection response of cells across cell types, studies and species. In particular, we demonstrate that scGen learns cell-type and species-specific responses, implying that it captures features that distinguish responding from non-responding genes and cells. With the upcoming availability of large-scale atlases of organs in a healthy state, we envision scGen becoming a tool for experimental design through in silico screening of perturbation response in the context of disease and drug treatment.
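
The latent-space vector arithmetic mentioned in this abstract can be illustrated with a minimal sketch. It assumes the cells have already been encoded into latent vectors by some autoencoder (encoder and decoder are omitted here), and the function names and toy values are hypothetical. It is not the scGen API.

```python
from statistics import fmean


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length latent vectors."""
    return [fmean(component) for component in zip(*vectors)]


def predict_perturbed(z_control_train, z_perturbed_train, z_control_new):
    """Shift unseen control cells by the mean control-to-perturbed difference.

    All arguments are lists of latent vectors (lists of floats).
    """
    mu_ctrl = mean_vector(z_control_train)
    mu_pert = mean_vector(z_perturbed_train)
    # The perturbation is summarised as one difference vector in latent space.
    delta = [p - c for p, c in zip(mu_pert, mu_ctrl)]
    return [[z_i + d_i for z_i, d_i in zip(z, delta)] for z in z_control_new]


# Toy 2-D latent codes (made-up numbers, not real data).
z_ctrl = [[0.0, 0.0], [2.0, 0.0]]
z_pert = [[1.0, 3.0], [3.0, 3.0]]
z_new = [[5.0, 5.0]]
predicted = predict_perturbed(z_ctrl, z_pert, z_new)
```

In a full pipeline the shifted latent vectors would then be passed through the decoder to obtain predicted expression profiles.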


Subject(s)
Algorithms , Computational Biology/methods , Mononuclear Leukocytes/metabolism , Machine Learning , Phagocytes/metabolism , Single-Cell Analysis/methods , Transcriptome , Animals , Computer Simulation , Gene Expression Profiling , Humans , Mononuclear Leukocytes/cytology , Mice , Phagocytes/cytology , Species Specificity
3.
Nat Methods ; 16(1): 43-49, 2019 01.
Article in English | MEDLINE | ID: mdl-30573817

ABSTRACT

Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
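
The idea of a k-nearest-neighbor batch-effect test can be sketched as follows: for each cell, compare the batch composition of its neighborhood with the global batch composition using a chi-squared test, and report the fraction of rejected neighborhoods. The sketch below is an illustration under stated assumptions, not the kBET package: it handles exactly two batches (so the test has one degree of freedom and the p-value has a closed form), uses brute-force Euclidean neighbors, excludes the cell itself, and the function name and default `alpha` are hypothetical.

```python
import math
from collections import Counter


def kbet_rejection_rate(points, batches, k, alpha=0.05):
    """Fraction of cells whose k-NN batch mix deviates from the global mix.

    points  : list of coordinate lists, one per cell
    batches : list of batch labels, one per cell (exactly two distinct labels)
    """
    labels = sorted(set(batches))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two batches")
    n = len(points)
    global_freq = {b: batches.count(b) / n for b in labels}

    rejected = 0
    for i, p in enumerate(points):
        # Brute-force k nearest neighbours, excluding the cell itself.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        counts = Counter(batches[j] for j in others[:k])
        # Pearson chi-squared statistic: observed vs. expected batch counts.
        stat = sum((counts[b] - k * global_freq[b]) ** 2 / (k * global_freq[b])
                   for b in labels)
        # Survival function of the chi-squared distribution with 1 d.o.f.
        p_value = math.erfc(math.sqrt(stat / 2.0))
        if p_value < alpha:
            rejected += 1
    return rejected / n
```

A rejection rate near 0 suggests well-mixed batches, while a rate near 1 suggests that neighborhoods are dominated by a single batch.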


Subject(s)
RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis
4.
Bioinformatics ; 36(Suppl_2): i610-i617, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381839

ABSTRACT

MOTIVATION: While generative models have shown great success in sampling high-dimensional samples conditional on low-dimensional descriptors (stroke thickness in MNIST, hair color in CelebA, speaker identity in WaveNet), their generation out-of-distribution poses fundamental problems due to the difficulty of learning a compact joint distribution across conditions. The canonical example of the conditional variational autoencoder (CVAE), for instance, does not explicitly relate conditions during training and, hence, has no explicit incentive to learn such a compact representation. RESULTS: We overcome the limitation of the CVAE by matching distributions across conditions using maximum mean discrepancy in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. As this amounts to solving a style-transfer problem, we refer to the model as transfer VAE (trVAE). Benchmarking trVAE on high-dimensional image and single-cell RNA-seq data, we demonstrate higher robustness and higher accuracy than existing approaches. We also show qualitatively improved predictions by tackling previously problematic minority classes and multiple conditions in the context of cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively. We further demonstrate that trVAE learns cell-type-specific responses after perturbation and improves the prediction of most cell-type-specific genes by 65%. AVAILABILITY AND IMPLEMENTATION: The trVAE implementation is available via github.com/theislab/trvae. The results of this article can be reproduced via github.com/theislab/trvae_reproducibility.
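
The maximum mean discrepancy (MMD) term named in this abstract can be illustrated with a small sketch. It computes the biased estimate of squared MMD between two samples with a single Gaussian kernel. The kernel choice, the `bandwidth` default and the function names are assumptions made for illustration, and the code operates on plain lists rather than on decoder activations as in trVAE.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    squared_distance = sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

Used as a regularizer, a term of this kind is small when the representations of two conditions are distributed similarly and grows as they drift apart.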


Subject(s)
Reproducibility of Results , RNA-Seq , Exome Sequencing
5.
Nat Methods ; 13(10): 845-8, 2016 10.
Article in English | MEDLINE | ID: mdl-27571553

ABSTRACT

The temporal order of differentiating cells is intrinsically encoded in their single-cell expression profiles. We describe an efficient way to robustly estimate this order according to diffusion pseudotime (DPT), which measures transitions between cells using diffusion-like random walks. Our DPT software implementations make it possible to reconstruct the developmental progression of cells and identify transient or metastable states, branching decisions and differentiation endpoints.


Subject(s)
Cell Differentiation/genetics , Cell Lineage/genetics , High-Throughput Nucleotide Sequencing/methods , Genetic Models , Statistical Models , Single-Cell Analysis/methods , Algorithms , Animals , Cluster Analysis , Computer Simulation , Diffusion , Embryonic Stem Cells/cytology , Mice , Computer-Assisted Numerical Analysis
6.
Bioinformatics ; 33(20): 3211-3219, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28582478

ABSTRACT

MOTIVATION: The identification of heterogeneities in cell populations by utilizing single-cell technologies such as single-cell RNA-Seq enables inference of cellular development and lineage trees. Several methods have been proposed for such inference from high-dimensional single-cell data. They typically assign each cell to a branch in a differentiation trajectory. However, they commonly assume specific geometries such as tree-like developmental hierarchies and lack statistically sound methods to decide on the number of branching events. RESULTS: We present K-Branches, a solution to the above problem by locally fitting half-lines to single-cell data, introducing a clustering algorithm similar to K-Means. These half-lines are proxies for branches in the differentiation trajectory of cells. We propose a modified version of the GAP statistic for model selection, in order to decide on the number of lines that best describe the data locally. In this manner, we identify the location and number of subgroups of cells that are associated with branching events and full differentiation, respectively. We evaluate the performance of our method on single-cell RNA-Seq data describing the differentiation of myeloid progenitors during hematopoiesis, single-cell qPCR data of mouse blastocyst development, single-cell qPCR data of human myeloid monocytic leukemia and artificial data. AVAILABILITY AND IMPLEMENTATION: An R implementation of K-Branches is freely available at https://github.com/theislab/kbranches. CONTACT: fabian.theis@helmholtz-muenchen.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Cell Differentiation , Biological Models , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Software , Algorithms , Animals , Cluster Analysis , Gene Expression Profiling/methods , Humans , Mice
7.
Cell Syst ; 12(6): 522-537, 2021 06 16.
Article in English | MEDLINE | ID: mdl-34139164

ABSTRACT

Cell biology is fundamentally limited in its ability to collect complete data on cellular phenotypes and the wide range of responses to perturbation. Areas such as computer vision and speech recognition have addressed this problem of characterizing unseen or unlabeled conditions with the combined advances of big data, deep learning, and computing resources in the past 5 years. Similarly, recent advances in machine learning approaches enabled by single-cell data start to address prediction tasks in perturbation response modeling. We first define objectives in learning perturbation response in single-cell omics; survey existing approaches, resources, and datasets (https://github.com/theislab/sc-pert); and discuss how a perturbation atlas can enable deep learning models to construct an informative perturbation latent space. We then examine future avenues toward more powerful and explainable modeling using deep neural networks, which enable the integration of disparate information sources and an understanding of heterogeneous, complex, and unseen systems.


Subject(s)
Machine Learning , Computer Neural Networks
8.
Nat Biotechnol ; 38(12): 1408-1414, 2020 12.
Article in English | MEDLINE | ID: mdl-32747759

ABSTRACT

RNA velocity has opened up new ways of studying cellular differentiation in single-cell RNA-sequencing data. It describes the rate of gene expression change for an individual gene at a given time point based on the ratio of its spliced and unspliced messenger RNA (mRNA). However, errors in velocity estimates arise if the central assumptions of a common splicing rate and the observation of the full splicing dynamics with steady-state mRNA levels are violated. Here we present scVelo, a method that overcomes these limitations by solving the full transcriptional dynamics of splicing kinetics using a likelihood-based dynamical model. This generalizes RNA velocity to systems with transient cell states, which are common in development and in response to perturbations. We apply scVelo to disentangling subpopulation kinetics in neurogenesis and pancreatic endocrinogenesis. We infer gene-specific rates of transcription, splicing and degradation, recover each cell's position in the underlying differentiation processes and detect putative driver genes. scVelo will facilitate the study of lineage decisions and gene regulation.
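
As a point of reference for the spliced/unspliced ratio described in this abstract, the sketch below shows the simpler steady-state approximation of RNA velocity for a single gene. This is the kind of model the abstract says scVelo generalizes, not scVelo's likelihood-based dynamical model. It assumes the commonly used form velocity = u - gamma * s, with gamma fitted by least squares through the origin over all cells. The function names and toy counts are hypothetical.

```python
def steady_state_ratio(unspliced, spliced):
    """Least-squares slope through the origin of unspliced vs. spliced counts.

    Under the steady-state assumption this slope estimates the ratio of
    degradation rate to splicing rate for one gene.
    """
    numerator = sum(u * s for u, s in zip(unspliced, spliced))
    denominator = sum(s * s for s in spliced)
    return numerator / denominator


def velocity(u, s, gamma):
    """Velocity of one cell for one gene: excess of unspliced over steady state."""
    return u - gamma * s


# Toy counts for one gene across three cells (made-up numbers).
spliced = [1.0, 2.0, 4.0]
unspliced = [0.5, 1.0, 2.0]
gamma = steady_state_ratio(unspliced, spliced)

# A cell with more unspliced mRNA than the steady state predicts is
# interpreted as up-regulating the gene (positive velocity).
v_induced = velocity(3.0, 4.0, gamma)
```

A positive value indicates that expression of the gene is expected to increase, a negative value that it is expected to decrease, and zero that the cell sits on the steady-state line.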


Subject(s)
Genetic Models , RNA/genetics , Animals , Cell Cycle , Cell Lineage , Dentate Gyrus/metabolism , Endocrine System/metabolism , Humans , Kinetics , Mice , Neurogenesis/genetics , RNA Splicing/genetics , Single-Cell Analysis , Stem Cells/metabolism , Stochastic Processes , Genetic Transcription
9.
Genome Biol ; 20(1): 59, 2019 03 19.
Article in English | MEDLINE | ID: mdl-30890159

ABSTRACT

Single-cell RNA-seq quantifies biological heterogeneity across both discrete cell types and continuous cell transitions. Partition-based graph abstraction (PAGA) provides an interpretable graph-like map of the arising data manifold, based on estimating connectivity of manifold partitions (https://github.com/theislab/paga). PAGA maps preserve the global topology of data, allow analyzing data at different resolutions, and result in much higher computational efficiency of the typical exploratory data analysis workflow. We demonstrate the method by inferring structure-rich cell maps with consistent topology across four hematopoietic datasets, adult planaria and the zebrafish embryo, and benchmark computational performance on one million neurons.


Subject(s)
Computational Biology/methods , Computer Graphics , Developmental Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Algorithms , Animals , Nonmammalian Embryo/cytology , Nonmammalian Embryo/metabolism , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/metabolism , Humans , Planarians/cytology , Planarians/genetics , Reference Standards , Software , Zebrafish/growth & development , Zebrafish/metabolism
10.
Genome Biol ; 19(1): 15, 2018 02 06.
Article in English | MEDLINE | ID: mdl-29409532

ABSTRACT

SCANPY is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with SCANPY, we present ANNDATA, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).


Subject(s)
Gene Expression Profiling/methods , Software , Gene Regulatory Networks , Single-Cell Analysis
11.
Science ; 360(6391)2018 05 25.
Article in English | MEDLINE | ID: mdl-29674432

ABSTRACT

Flatworms of the species Schmidtea mediterranea are immortal: adult animals contain a large pool of pluripotent stem cells that continuously differentiate into all adult cell types. Therefore, single-cell transcriptome profiling of adult animals should reveal mature and progenitor cells. By combining perturbation experiments, gene expression analysis, a computational method that predicts future cell states from transcriptional changes, and a lineage reconstruction method, we placed all major cell types onto a single lineage tree that connects all cells to a single stem cell compartment. We characterized gene expression changes during differentiation and discovered cell types important for regeneration. Our results demonstrate the importance of single-cell transcriptome analysis for mapping and reconstructing fundamental processes of developmental and regenerative biology at high resolution.


Subject(s)
Atlases as Topic , Cell Lineage/genetics , Cells/classification , Gene Expression Profiling/methods , Planarians/cytology , Single-Cell Analysis/methods , Animals , Cell Differentiation/genetics , Cells/metabolism , Planarians/genetics , Planarians/metabolism , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Regeneration/genetics , Transcriptome
13.
Nat Commun ; 8(1): 463, 2017 09 06.
Article in English | MEDLINE | ID: mdl-28878212

ABSTRACT

We show that deep convolutional neural networks combined with nonlinear dimension reduction enable reconstructing biological processes based on raw image data. We demonstrate this by reconstructing the cell cycle of Jurkat cells and disease progression in diabetic retinopathy. In further analysis of Jurkat cells, we detect and separate a subpopulation of dead cells in an unsupervised manner and, in classifying discrete cell cycle stages, we reach a sixfold reduction in error rate compared to a recent approach based on boosting on image features. In contrast to previous methods, deep-learning-based predictions are fast enough for on-the-fly analysis in an imaging flow cytometer. The interpretation of information-rich, high-throughput single-cell data is a challenge requiring sophisticated computational tools. Here the authors demonstrate a deep convolutional neural network that can classify cell cycle status on-the-fly.


Subject(s)
Diabetic Retinopathy/pathology , Disease Progression , Machine Learning , Computer Neural Networks , Cell Cycle , Cell Division , Computer Simulation , DNA/analysis , Flow Cytometry , Humans , Jurkat Cells , Mitosis , Reproducibility of Results