Results 1 - 7 of 7
1.
Nat Methods ; 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38366243

ABSTRACT

Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, interspecies genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN can detect functionally related genes coexpressed across species, redefining differential expression for cross-species analysis. Applying SATURN to three species whole-organism atlases and frog and zebrafish embryogenesis datasets, we show that SATURN can effectively transfer annotations across species, even when they are evolutionarily remote. We also demonstrate that SATURN can be used to find potentially divergent gene functions between glaucoma-associated genes in humans and four other species.

2.
bioRxiv ; 2023 Sep 24.
Article in English | MEDLINE | ID: mdl-36778387

ABSTRACT

Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, inter-species genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here, we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN has a unique ability to detect functionally related genes co-expressed across species, redefining differential expression for cross-species analysis. We apply SATURN to three species whole-organism atlases and frog and zebrafish embryogenesis datasets. We show that cell embeddings learnt in SATURN can be effectively used to transfer annotations across species and identify both homologous and species-specific cell types, even across evolutionarily remote species. Finally, we use SATURN to reannotate the five-species Cell Atlas of Human Trabecular Meshwork and Aqueous Outflow Structures and find evidence of potentially divergent functions between glaucoma-associated genes in humans and other species.

3.
Cell Rep Methods ; 2(4): 100200, 2022 04 25.
Article in English | MEDLINE | ID: mdl-35497495

ABSTRACT

Recent advances in CRISPR-Cas9 engineering and single-cell assays have enabled the simultaneous measurement of single-cell transcriptomic and phylogenetic profiles. However, there are few computational tools enabling users to integrate and derive insight from a joint analysis of these two modalities. Here, we describe "PhyloVision": an open-source software for interactively exploring data from both modalities and for identifying and interpreting heritable gene modules whose concerted expression is associated with phylogenetic relationships. PhyloVision provides a feature-rich, interactive, and shareable web-based report for investigating these modules while also supporting several other data and meta-data exploration capabilities. We demonstrate the utility of PhyloVision using a published dataset of metastatic lung adenocarcinoma cells, whose phylogeny was resolved using a CRISPR-Cas9-based lineage-tracing system. Together, we anticipate that PhyloVision and the methods it implements will be a useful resource for scalable and intuitive data exploration for any assay that simultaneously measures cell state and lineage.


Subject(s)
Computational Biology , Transcriptome , Transcriptome/genetics , Phylogeny , Computational Biology/methods , Software , Gene Expression Profiling
4.
Nat Methods ; 17(8): 793-798, 2020 08.
Article in English | MEDLINE | ID: mdl-32719530

ABSTRACT

Massively parallel single-cell and single-nucleus RNA sequencing has opened the way to systematic tissue atlases in health and disease, but as the scale of data generation is growing, so is the need for computational pipelines for scaled analysis. Here we developed Cumulus, a cloud-based framework for analyzing large-scale single-cell and single-nucleus RNA sequencing datasets. Cumulus combines the power of cloud computing with improvements in algorithm and implementation to achieve high scalability, low cost, user-friendliness and integrated support for a comprehensive set of features. We benchmark Cumulus on the Human Cell Atlas Census of Immune Cells dataset of bone marrow cells and show that it substantially improves efficiency over conventional frameworks, while maintaining or improving the quality of results, enabling large-scale studies.


Subject(s)
Cloud Computing/economics , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Computational Biology/economics , High-Throughput Nucleotide Sequencing/economics , Sequence Analysis, RNA/economics
6.
Nat Med ; 26(5): 792-802, 2020 05.
Article in English | MEDLINE | ID: mdl-32405060

ABSTRACT

Single-cell genomics is essential to chart tumor ecosystems. Although single-cell RNA-Seq (scRNA-Seq) profiles RNA from cells dissociated from fresh tumors, single-nucleus RNA-Seq (snRNA-Seq) is needed to profile frozen or hard-to-dissociate tumors. Each requires customization to different tissue and tumor types, posing a barrier to adoption. Here, we have developed a systematic toolbox for profiling fresh and frozen clinical tumor samples using scRNA-Seq and snRNA-Seq, respectively. We analyzed 216,490 cells and nuclei from 40 samples across 23 specimens spanning eight tumor types of varying tissue and sample characteristics. We evaluated protocols by cell and nucleus quality, recovery rate and cellular composition. scRNA-Seq and snRNA-Seq from matched samples recovered the same cell types, but at different proportions. Our work provides guidance for studies in a broad range of tumors, including criteria for testing and selecting methods from the toolbox for other tumors, thus paving the way for charting tumor atlases.


Subject(s)
Algorithms , Cell Nucleus/genetics , Genomics/methods , Neoplasms/genetics , RNA-Seq/methods , Single-Cell Analysis/methods , Adult , Animals , Cell Nucleus/chemistry , Cell Nucleus/metabolism , Child , Computational Biology/methods , Female , Freezing , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Humans , Mice , Mice, Knockout , Mice, Nude , Neoplasms/metabolism , Neoplasms/pathology , Sequence Analysis, RNA/methods , Tumor Cells, Cultured , Exome Sequencing/methods
7.
Nature ; 569(7757): 503-508, 2019 05.
Article in English | MEDLINE | ID: mdl-31068700

ABSTRACT

Large panels of comprehensively characterized human cancer models, including the Cancer Cell Line Encyclopedia (CCLE), have provided a rigorous framework with which to study genetic variants, candidate targets, and small-molecule and biological therapeutics and to identify new marker-driven cancer dependencies. To improve our understanding of the molecular features that contribute to cancer phenotypes, including drug responses, here we have expanded the characterizations of cancer cell lines to include genetic, RNA splicing, DNA methylation, histone H3 modification, microRNA expression and reverse-phase protein array data for 1,072 cell lines from individuals of various lineages and ethnicities. Integration of these data with functional characterizations such as drug-sensitivity, short hairpin RNA knockdown and CRISPR-Cas9 knockout data reveals potential targets for cancer drugs and associated biomarkers. Together, this dataset and an accompanying public data portal provide a resource for the acceleration of cancer research using model cancer cell lines.


Subject(s)
Cell Line, Tumor , Neoplasms/genetics , Neoplasms/pathology , Antineoplastic Agents/pharmacology , Biomarkers, Tumor , DNA Methylation , Drug Resistance, Neoplasm , Ethnicity/genetics , Gene Editing , Histones/metabolism , Humans , MicroRNAs/genetics , Molecular Targeted Therapy , Neoplasms/metabolism , Protein Array Analysis , RNA Splicing