1.
Proc Natl Acad Sci U S A ; 120(21): e2209124120, 2023 05 23.
Article in English | MEDLINE | ID: mdl-37192164

ABSTRACT

Detecting differentially expressed genes is important for characterizing subpopulations of cells. In scRNA-seq data, however, nuisance variation due to technical factors like sequencing depth and RNA capture efficiency obscures the underlying biological signal. Deep generative models have been extensively applied to scRNA-seq data, with a special focus on embedding cells into a low-dimensional latent space and correcting for batch effects. However, little attention has been paid to the problem of utilizing the uncertainty from the deep generative model for differential expression (DE). Furthermore, the existing approaches do not allow for controlling for effect size or the false discovery rate (FDR). Here, we present lvm-DE, a generic Bayesian approach for performing DE predictions from a fitted deep generative model, while controlling the FDR. We apply the lvm-DE framework to scVI and scSphere, two deep generative models. The resulting approaches outperform state-of-the-art methods at estimating the log fold change in gene expression levels as well as detecting differentially expressed genes between subpopulations of cells.


Subjects
RNA; Single-Cell Analysis; Bayes Theorem; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Gene Expression Profiling/methods
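
The abstract above describes calling differentially expressed genes from posterior uncertainty while controlling effect size and the FDR. The following is a minimal sketch of one common Bayesian decision rule of that kind, not the lvm-DE implementation itself. The function names, the effect-size threshold `delta`, and the assumption that posterior log-fold-change samples are already available per gene are all illustrative.

```python
from typing import Dict, List, Sequence


def posterior_de_probability(lfc_samples: Sequence[float], delta: float) -> float:
    """Fraction of posterior log-fold-change samples whose magnitude exceeds delta.

    Assumption: lfc_samples are draws from some fitted model's posterior.
    """
    if not lfc_samples:
        raise ValueError("need at least one posterior sample")
    return sum(abs(x) > delta for x in lfc_samples) / len(lfc_samples)


def select_genes_at_fdr(de_prob: Dict[str, float], alpha: float) -> List[str]:
    """Return the largest top-ranked gene set whose expected false discovery
    proportion (mean of 1 - p over the selected genes) stays at or below alpha."""
    ranked = sorted(de_prob.items(), key=lambda kv: kv[1], reverse=True)
    best_k = 0
    cumulative_error = 0.0
    for k, (_, p) in enumerate(ranked, start=1):
        cumulative_error += 1.0 - p  # expected number of false calls so far
        if cumulative_error / k <= alpha:
            best_k = k
    return [gene for gene, _ in ranked[:best_k]]
```

With hypothetical samples for three genes, one would first map each gene to `posterior_de_probability(samples, delta)` and then pass the resulting dictionary to `select_genes_at_fdr` with the desired `alpha`.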
2.
Nat Methods ; 18(3): 272-282, 2021 03.
Article in English | MEDLINE | ID: mdl-33589839

ABSTRACT

The paired measurement of RNA and surface proteins in single cells with cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) is a promising approach to connect transcriptional variation with cell phenotypes and functions. However, combining these paired views into a unified representation of cell state is made challenging by the unique technical characteristics of each measurement. Here we present Total Variational Inference (totalVI; https://scvi-tools.org), a framework for end-to-end joint analysis of CITE-seq data that probabilistically represents the data as a composite of biological and technical factors, including protein background and batch effects. To evaluate totalVI's performance, we profiled immune cells from murine spleen and lymph nodes with CITE-seq, measuring over 100 surface proteins. We demonstrate that totalVI provides a cohesive solution for common analysis tasks such as dimensionality reduction, the integration of datasets with different measured proteins, estimation of correlations between molecules and differential expression testing.


Subjects
Lymph Nodes/metabolism; Proteins/analysis; Single-Cell Analysis/methods; Spleen/metabolism; Transcriptome/genetics; Animals; Cells, Cultured; Data Analysis; Female; High-Throughput Screening Assays/methods; Lymph Nodes/cytology; Mice; Mice, Inbred C57BL; RNA/analysis; RNA/genetics; Spleen/cytology
3.
Mol Syst Biol ; 17(1): e9620, 2021 01.
Article in English | MEDLINE | ID: mdl-33491336

ABSTRACT

As the number of single-cell transcriptomics datasets grows, the natural next step is to integrate the accumulating data to achieve a common ontology of cell types and states. However, it is not straightforward to compare gene expression levels across datasets and to automatically assign cell type labels in a new dataset based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of scRNA-seq data, while accounting for uncertainty caused by biological and measurement noise. We also introduce single-cell ANnotation using Variational Inference (scANVI), a semi-supervised variant of scVI designed to leverage existing cell state annotations. We demonstrate that scVI and scANVI compare favorably to state-of-the-art methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings. In contrast to existing methods, scVI and scANVI integrate multiple datasets with a single generative model that can be directly used for downstream tasks, such as differential expression. Both methods are easily accessible through scvi-tools.


Subjects
Computational Biology/methods; Single-Cell Analysis/methods; Databases, Genetic; Gene Expression Profiling; Humans; Molecular Sequence Annotation; Sequence Analysis, RNA; Supervised Machine Learning
4.
Nat Methods ; 15(12): 1053-1058, 2018 12.
Article in English | MEDLINE | ID: mdl-30504886

ABSTRACT

Single-cell transcriptome measurements can reveal unexplored biological diversity, but they suffer from technical noise and bias that must be modeled to account for the resulting uncertainty in downstream analyses. Here we introduce single-cell variational inference (scVI), a ready-to-use scalable framework for the probabilistic representation and analysis of gene expression in single cells (https://github.com/YosefLab/scVI). scVI uses stochastic optimization and deep neural networks to aggregate information across similar cells and genes and to approximate the distributions that underlie observed expression values, while accounting for batch effects and limited sensitivity. We used scVI for a range of fundamental analysis tasks including batch correction, visualization, clustering, and differential expression, and achieved high accuracy for each task.


Subjects
Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Models, Biological; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Transcriptome; Algorithms; Animals; Brain/cytology; Brain/metabolism; Cluster Analysis; Genetic Variation; Hematopoietic Stem Cells/cytology; Hematopoietic Stem Cells/metabolism; Humans; Leukocytes, Mononuclear/cytology; Leukocytes, Mononuclear/metabolism; Mice
5.
Mol Syst Biol ; 16(9): e9198, 2020 09.
Article in English | MEDLINE | ID: mdl-32975352

ABSTRACT

Generative models provide a well-established statistical framework for evaluating uncertainty and deriving conclusions from large data sets especially in the presence of noise, sparsity, and bias. Initially developed for computer vision and natural language processing, these models have been shown to effectively summarize the complexity that underlies many types of data and enable a range of applications including supervised learning tasks, such as assigning labels to images; unsupervised learning tasks, such as dimensionality reduction; and out-of-sample generation, such as de novo image synthesis. With this early success, the power of generative models is now being increasingly leveraged in molecular biology, with applications ranging from designing new molecules with properties of interest to identifying deleterious mutations in our genomes and to dissecting transcriptional variability between single cells. In this review, we provide a brief overview of the technical notions behind generative models and their implementation with deep learning techniques. We then describe several different ways in which these models can be utilized in practice, using several recent applications in molecular biology as examples.


Subjects
Deep Learning; Models, Statistical; Molecular Biology; Biomedical Research; Decision Making; Neural Networks, Computer
6.
ArXiv ; 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38351930

ABSTRACT

Deep Generative Models (DGMs) are versatile tools for learning data representations while adequately incorporating domain knowledge such as the specification of conditional probability distributions. Recently proposed DGMs tackle the important task of comparing data sets from different sources. One such example is the setting of contrastive analysis that focuses on describing patterns that are enriched in a target data set compared to a background data set. The practical deployment of those models often assumes that DGMs naturally infer interpretable and modular latent representations, which is known to be an issue in practice. Consequently, existing methods often rely on ad-hoc regularization schemes, although without any theoretical grounding. Here, we propose a theory of identifiability for comparative DGMs by extending recent advances in the field of non-linear independent component analysis. We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piece-wise affine (e.g., parameterized by a ReLU neural network). We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance. Finally, we introduce a novel methodology for fitting comparative DGMs that improves the treatment of multiple data sources via multi-objective optimization and that helps adjust the hyperparameter for the regularization in an interpretable manner, using constrained optimization. We empirically validate our theory and new methodology using simulated data as well as a recent data set of genetic perturbations in cells profiled via single-cell RNA sequencing.

7.
Nat Biotechnol ; 40(9): 1360-1369, 2022 09.
Article in English | MEDLINE | ID: mdl-35449415

ABSTRACT

Most spatial transcriptomics technologies are limited by their resolution, with spot sizes larger than that of a single cell. Although joint analysis with single-cell RNA sequencing can alleviate this problem, current methods are limited to assessing discrete cell types, revealing the proportion of cell types inside each spot. To identify continuous variation of the transcriptome within cells of the same type, we developed Deconvolution of Spatial Transcriptomics profiles using Variational Inference (DestVI). Using simulations, we demonstrate that DestVI outperforms existing methods for estimating gene expression for every cell type inside every spot. Applied to a study of infected lymph nodes and of a mouse tumor model, DestVI provides high-resolution, accurate spatial characterization of the cellular organization of these tissues and identifies cell-type-specific changes in gene expression between different tissue regions or between conditions. DestVI is available as part of the open-source software package scvi-tools (https://scvi-tools.org).


Subjects
Neoplasms; Transcriptome; Animals; Gene Expression Profiling/methods; Mice; Neoplasms/genetics; Single-Cell Analysis/methods; Software; Transcriptome/genetics; Exome Sequencing
8.
Bone Marrow Transplant ; 56(6): 1422-1425, 2021 06.
Article in English | MEDLINE | ID: mdl-33454725

ABSTRACT

We included 255 patients from the L.E.A. French long-term follow-up cohort. All had received hematopoietic stem cell transplantation (HSCT) and/or testicular radiation for childhood acute leukemia and were older than 18 years at last L.E.A. evaluation. Total testosterone deficiency was defined as a testosterone level <12 nmol/l or by the use of substitutive therapy, partial deficiency as normal testosterone with elevated luteinizing hormone (>10 IU/l). After myeloablative total body irradiation (n = 178), 55.6% had total deficiency, 15.7% partial deficiency, and 28.7% were normal. A 4-6 Gy testicular boost and a younger age at HSCT significantly increased the risk. After a Busulfan-containing myeloablative conditioning regimen (n = 53), 28.3% had total deficiency, 15.1% partial deficiency, and 56.6% were normal (62.5% vs. 0% in patients without or with additional testicular radiation). A 24-Gy testicular radiation without HSCT induced total or partial deficiency in 71.4% and 28.6%, respectively (n = 21). Total testosterone deficiency increased the risk of metabolic syndrome: 25% vs. 12.1% in men with partial testosterone deficiency and 8.8% when Leydig cell function was normal (p = 0.031).


Subjects
Graft vs Host Disease; Hematopoietic Stem Cell Transplantation; Leukemia, Myeloid, Acute; Busulfan/adverse effects; Child; Hematopoietic Stem Cell Transplantation/adverse effects; Humans; Leukemia, Myeloid, Acute/therapy; Male; Testosterone; Transplantation Conditioning/adverse effects; Whole-Body Irradiation/adverse effects
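
The deficiency definitions stated in the abstract above can be written as a small decision rule. This is only an illustrative sketch: the function and parameter names are hypothetical, the thresholds are the ones quoted in the abstract (12 nmol/l for testosterone, 10 IU/l for luteinizing hormone), and any edge cases not covered by the abstract are handled by assumption.

```python
def classify_leydig_function(
    testosterone_nmol_l: float,
    lh_iu_l: float,
    on_substitutive_therapy: bool = False,
) -> str:
    """Classify testosterone status using the thresholds quoted in the abstract."""
    # Total deficiency: low testosterone, or already receiving substitution.
    if on_substitutive_therapy or testosterone_nmol_l < 12:
        return "total deficiency"
    # Partial deficiency: normal testosterone with elevated luteinizing hormone.
    if lh_iu_l > 10:
        return "partial deficiency"
    return "normal"
```

For instance, a hypothetical patient with testosterone of 15 nmol/l and luteinizing hormone of 14 IU/l would fall into the partial-deficiency group under this rule.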
9.
Cell Syst ; 8(4): 281-291.e9, 2019 04 24.
Article in English | MEDLINE | ID: mdl-30954476

ABSTRACT

Single-cell RNA-sequencing has become a widely used, powerful approach for studying cell populations. However, these methods often generate multiplet artifacts, where two or more cells receive the same barcode, resulting in a hybrid transcriptome. In most experiments, multiplets account for several percent of transcriptomes and can confound downstream data analysis. Here, we present Single-Cell Remover of Doublets (Scrublet), a framework for predicting the impact of multiplets in a given analysis and identifying problematic multiplets. Scrublet avoids the need for expert knowledge or cell clustering by simulating multiplets from the data and building a nearest neighbor classifier. To demonstrate the utility of this approach, we test Scrublet on several datasets that include independent knowledge of cell multiplets. Scrublet is freely available for download at github.com/AllonKleinLab/scrublet.


Subjects
RNA-Seq/methods; Single-Cell Analysis/methods; Software; Transcriptome; Animals; Artifacts; Humans; Mice; RNA-Seq/standards; Single-Cell Analysis/standards
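
The abstract above describes simulating multiplets from the data and building a nearest-neighbor classifier. A toy sketch of that general idea follows. It is not the Scrublet code: the function names are hypothetical, profiles are plain lists of numbers, and the normalization and dimensionality-reduction steps that a real pipeline would likely need are omitted by assumption.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def simulate_doublets(cells: List[Vector], n_doublets: int, rng: random.Random) -> List[List[float]]:
    """Create synthetic doublets by summing the profiles of two distinct random cells."""
    doublets = []
    for _ in range(n_doublets):
        a, b = rng.sample(range(len(cells)), 2)
        doublets.append([x + y for x, y in zip(cells[a], cells[b])])
    return doublets


def _squared_distance(u: Vector, v: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(u, v))


def doublet_scores(cells: List[Vector], doublets: List[Vector], k: int) -> List[float]:
    """For each observed cell, the fraction of its k nearest neighbours
    (among observed cells and simulated doublets) that are simulated doublets."""
    pool = [(c, False) for c in cells] + [(d, True) for d in doublets]
    scores = []
    for i, cell in enumerate(cells):
        neighbours = sorted(
            (j for j in range(len(pool)) if j != i),  # exclude the cell itself
            key=lambda j: _squared_distance(cell, pool[j][0]),
        )[:k]
        scores.append(sum(pool[j][1] for j in neighbours) / k)
    return scores
```

A cell whose neighbourhood is dominated by simulated doublets receives a score near 1 and would be a candidate multiplet; the cutoff on that score is left to the analyst in this sketch.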