Results 1 - 20 of 30
1.
Cell ; 179(7): 1455-1467, 2019 12 12.
Article in English | MEDLINE | ID: mdl-31835027

ABSTRACT

Understanding the genetic and molecular drivers of phenotypic heterogeneity across individuals is central to biology. As new technologies enable fine-grained and spatially resolved molecular profiling, we need new computational approaches to integrate data from the same organ across different individuals into a consistent reference and to construct maps of molecular and cellular organization at histological and anatomical scales. Here, we review previous efforts and discuss challenges involved in establishing such a common coordinate framework, the underlying map of tissues and organs. We focus on strategies to handle anatomical variation across individuals and highlight the need for new technologies and analytical methods spanning multiple hierarchical scales of spatial resolution.


Subject(s)
Anatomic Variation , Diagnostic Imaging/standards , Physical Examination/standards , Diagnostic Imaging/methods , Humans , Physical Examination/methods , Reference Standards
2.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37096588

ABSTRACT

Advances in single-cell transcriptomic technologies have led to increasing use of single-cell RNA sequencing (scRNA-seq) data in large-scale patient cohort studies. The resulting high-dimensional data can be summarized and incorporated into patient outcome prediction models in several ways; however, there is a pressing need to understand the impact of analytical decisions on such model quality. In this study, we evaluate the impact of analytical choices on model choices, ensemble learning strategies and integration approaches on patient outcome prediction using five scRNA-seq COVID-19 datasets. First, we examine the difference in performance between using single-view feature space versus multi-view feature space. Next, we survey multiple learning platforms from classical machine learning to modern deep learning methods. Lastly, we compare different integration approaches when combining datasets is necessary. Through benchmarking such analytical combinations, our study highlights the power of ensemble learning, consistency among different learning methods and robustness to dataset normalization when using multiple datasets as the model input.


Subject(s)
Benchmarking , COVID-19 , Humans , Gene Expression Profiling , Machine Learning , Sequence Analysis, RNA/methods
3.
Bioinformatics ; 39(9)2023 09 02.
Article in English | MEDLINE | ID: mdl-37698995

ABSTRACT

MOTIVATION: Imaging-based spatial transcriptomics (ST) technologies have achieved subcellular resolution, enabling detection of individual molecules in their native tissue context. Data associated with these technologies promise unprecedented opportunity toward understanding cellular and subcellular biology. However, in R/Bioconductor, there is a scarcity of existing computational infrastructure to represent such data, and particularly to summarize and transform it for existing widely adopted computational tools in single-cell transcriptomics analysis, including SingleCellExperiment and SpatialExperiment (SPE) classes. With the emergence of several commercial offerings of imaging-based ST, there is a pressing need to develop consistent data structure standards for these technologies at the individual molecule-level. RESULTS: To this end, we have developed MoleculeExperiment, an R/Bioconductor package, which (i) stores molecule and cell segmentation boundary information at the molecule-level, (ii) standardizes this molecule-level information across different imaging-based ST technologies, including 10× Genomics' Xenium, and (iii) streamlines transition from a MoleculeExperiment object to a SpatialExperiment object. Overall, MoleculeExperiment is generally applicable as a data infrastructure class for consistent analysis of molecule-resolved spatial omics data. AVAILABILITY AND IMPLEMENTATION: The MoleculeExperiment package is publicly available on Bioconductor at https://bioconductor.org/packages/release/bioc/html/MoleculeExperiment.html. Source code is available on Github at: https://github.com/SydneyBioX/MoleculeExperiment. The vignette for MoleculeExperiment can be found at https://bioconductor.org/packages/release/bioc/html/MoleculeExperiment.html.


Subject(s)
Gene Expression Profiling , Genomics , Software
4.
Nat Methods ; 17(8): 799-806, 2020 08.
Article in English | MEDLINE | ID: mdl-32661426

ABSTRACT

Single-cell genomics has transformed our ability to examine cell fate choice. Examining cells along a computationally ordered 'pseudotime' offers the potential to unpick subtle changes in variability and covariation among key genes. We describe an approach, scHOT (single-cell higher-order testing), which provides a flexible and statistically robust framework for identifying changes in higher-order interactions among genes. scHOT can be applied for cells along a continuous trajectory or across space and accommodates various higher-order measurements including variability or correlation. We demonstrate the use of scHOT by studying coordinated changes in higher-order interactions during embryonic development of the mouse liver. Additionally, scHOT identifies subtle changes in gene-gene correlations across space using spatially resolved transcriptomics data from the mouse olfactory bulb. scHOT meaningfully adds to first-order differential expression testing and provides a framework for interrogating higher-order interactions using single-cell data.
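
The idea of a higher-order measurement that changes along a trajectory can be illustrated with a locally weighted correlation. The sketch below is an assumption-laden illustration, not the scHOT implementation: the function names, the triangular weighting scheme and the `span` parameter are all hypothetical choices made for this example. Cells are assumed to be already ordered along pseudotime, and the correlation between two genes is recomputed around each position with weights that fall off with distance.

```python
import math


def weighted_correlation(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w.

    Returns NaN when either variable has zero weighted variance.
    """
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    if vx * vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def triangular_weights(n, centre, span):
    """Weights that peak at `centre` and fall linearly to zero `span` positions away."""
    return [max(0.0, 1.0 - abs(i - centre) / span) for i in range(n)]


def local_correlations(x, y, span):
    """Local correlation of two genes at every position along the ordering."""
    n = len(x)
    return [
        weighted_correlation(x, y, triangular_weights(n, centre, span))
        for centre in range(n)
    ]


# Hypothetical expression values for two genes in eight cells ordered by pseudotime:
# the genes move together early in the ordering and in opposite directions later.
gene_a = [1, 2, 3, 4, 5, 6, 7, 8]
gene_b = [1, 2, 3, 4, 4, 3, 2, 1]
local = local_correlations(gene_a, gene_b, span=3)
```

A global correlation over all eight cells would average the two regimes together; the local values separate them, which is the kind of change a test for higher-order interactions is meant to detect.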


Subject(s)
Liver/embryology , Single-Cell Analysis/methods , Animals , Computational Biology , Databases, Nucleic Acid , Hepatocytes/physiology , Liver/cytology , Mice , Oligonucleotide Array Sequence Analysis , Sequence Analysis, RNA , Software
5.
Bioinformatics ; 38(11): 3128-3131, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35482478

ABSTRACT

SUMMARY: SpatialExperiment is a new data infrastructure for storing and accessing spatially-resolved transcriptomics data, implemented within the R/Bioconductor framework, which provides advantages of modularity, interoperability, standardized operations and comprehensive documentation. Here, we demonstrate the structure and user interface with examples from the 10x Genomics Visium and seqFISH platforms, and provide access to example datasets and visualization tools in the STexampleData, TENxVisiumData and ggspavis packages. AVAILABILITY AND IMPLEMENTATION: The SpatialExperiment, STexampleData, TENxVisiumData and ggspavis packages are available from Bioconductor. The package versions described in this manuscript are available in Bioconductor version 3.15 onwards. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Transcriptome , Genomics
6.
Proc Natl Acad Sci U S A ; 116(20): 9775-9784, 2019 05 14.
Article in English | MEDLINE | ID: mdl-31028141

ABSTRACT

Concerted examination of multiple collections of single-cell RNA sequencing (RNA-seq) data promises further biological insights that cannot be uncovered with individual datasets. Here we present scMerge, an algorithm that integrates multiple single-cell RNA-seq datasets using factor analysis of stably expressed genes and pseudoreplicates across datasets. Using a large collection of public datasets, we benchmark scMerge against published methods and demonstrate that it consistently provides improved cell type separation by removing unwanted factors; scMerge can also enhance biological discovery through robust data integration, which we show through the inference of development trajectory in a liver dataset collection.


Subject(s)
Meta-Analysis as Topic , Sequence Analysis, RNA , Single-Cell Analysis , Software , Algorithms , Animals , Embryonic Development , Factor Analysis, Statistical , Gene Expression , Humans , Mice
8.
Prostate ; 80(6): 508-517, 2020 05.
Article in English | MEDLINE | ID: mdl-32119131

ABSTRACT

BACKGROUND: As a rare subtype of prostate carcinoma, basal cell carcinoma (BCC) has not been studied extensively and thus lacks systematic molecular characterization. METHODS: Here, we applied single-cell genomic amplification and RNA-Seq to a specimen of human prostate BCC (CK34βE12+/P63+/PAP-/PSA-). The mutational landscape was obtained via whole exome sequencing of the amplification mixture of 49 single cells, and the transcriptomes of 69 single cells were also obtained. RESULTS: The five putative driver genes mutated in BCC are CASC5, NUTM1, PTPRC, KMT2C, and TBX3, and the top three nucleotide substitutions are C>T, T>C, and C>A, similar to common prostate cancer. The distribution of the variant allele frequency values indicated that these single cells are from the same tumor clone. The 69 single cells were clustered into tumor, stromal, and immune cells based on their global transcriptomic profiles. The tumor cells specifically express basal cell markers like KRT5, KRT14, and KRT23 and epithelial markers EPCAM, CDH1, and CD24. The transcription factor covariance network analysis showed that the BCC tumor cells have distinct regulatory networks. By comparison with current prostate cancer datasets, we found that some of the bulk samples exhibit basal cell signatures. Interestingly, at single-cell resolution the gene expression patterns of prostate BCC tumor cells show uniqueness compared with that of common prostate cancer-derived circulating tumor cells. CONCLUSIONS: This study, for the first time, discloses the comprehensive mutational and transcriptomic landscapes of prostate BCC, which lays a foundation for the understanding of its tumorigenesis mechanism and provides new insights into prostate cancers in general.


Subject(s)
Carcinoma, Basal Cell/genetics , Prostatic Neoplasms/genetics , Biopsy, Needle , Carcinoma, Basal Cell/pathology , Gene Amplification , Gene Expression Profiling/methods , Gene Frequency , Humans , Immunohistochemistry , Male , Middle Aged , Mutation , Prostatic Neoplasms/pathology , Single-Cell Analysis/methods , Stromal Cells/pathology , Transcriptome , Exome Sequencing
9.
Bioinformatics ; 35(5): 823-829, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30102408

ABSTRACT

MOTIVATION: Genes act as a system and not in isolation. Thus, it is important to consider coordinated changes of gene expression rather than single genes when investigating biological phenomena such as the aetiology of cancer. We have developed an approach, called Differential Correlation across Ranked Samples (DCARS), for quantifying how changes in the association between pairs of genes may inform the outcome of interest. Modelling gene correlation across a continuous sample ranking does not require the dichotomisation of samples into two distinct classes and can identify differences in gene correlation across early, mid or late stages of the outcome of interest. RESULTS: When we evaluated DCARS against the typical Fisher Z-transformation test for differential correlation, as well as a typical approach testing for interaction within a linear model, on real TCGA data, DCARS significantly ranked gene pairs containing known cancer genes more highly across several cancers. Similar results are found with our simulation study. DCARS was applied to 13 cancer datasets in TCGA, revealing several distinct relationships for which survival ranking was found to be associated with a change in correlation between genes. Furthermore, we demonstrated that DCARS can be used in conjunction with network analysis techniques to extract biological meaning from multi-layered and complex data. AVAILABILITY AND IMPLEMENTATION: DCARS R package and sample data are available at https://github.com/shazanfar/DCARS. Publicly available data from The Cancer Genome Atlas (TCGA) were accessed using the TCGABiolinks R package. Supplementary Files and DCARS R package are available at https://github.com/shazanfar/DCARS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
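
For context on the comparator named in this abstract, the Fisher Z-transformation test for differential correlation can be sketched as follows. This is a minimal illustration of the standard textbook test, not code from the DCARS package; the function name and the example numbers are hypothetical. It compares the correlation of one gene pair in two separate groups of samples, which is why it needs the dichotomisation that a ranking-based approach avoids.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided Fisher Z test for a difference between two Pearson correlations.

    r1, r2: correlations of the same gene pair in group 1 and group 2 (|r| < 1).
    n1, n2: number of samples in each group (each must exceed 3).
    Returns the z statistic and its two-sided p-value under a normal reference.
    """
    z1 = math.atanh(r1)  # Fisher transformation of each correlation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical example: a gene pair correlated at 0.8 in one group of 40 samples
# and at 0.1 in another group of 40 samples.
z_stat, p_val = fisher_z_test(0.8, 40, 0.1, 40)
```

The test treats every sample within a group as equivalent, so a gradual shift in correlation across a continuous ranking is only visible after the ranking has been cut into two classes.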


Subject(s)
Neoplasms , Genome , Humans , Software
10.
Proteomics ; 16(13): 1868-71, 2016 07.
Article in English | MEDLINE | ID: mdl-27145998

ABSTRACT

Mass spectrometry (MS)-based quantitative phosphoproteomics has become a key approach for proteome-wide profiling of phosphorylation in tissues and cells. Traditional experimental design often compares a single treatment with a control, whereas increasingly more experiments are designed to compare multiple treatments with respect to a control. To this end, the development of bioinformatic tools that can integrate multiple treatments and visualise kinases and substrates under combinatorial perturbations is vital for dissecting concordant and/or independent effects of each treatment. Here, we propose a hypothesis-driven kinase perturbation analysis (KinasePA) to annotate and visualise kinases and their substrates that are perturbed by various combinatorial effects of treatments in phosphoproteomics experiments. We demonstrate the utility of KinasePA through its application to two large-scale phosphoproteomics datasets and show its effectiveness in dissecting kinases and substrates within signalling pathways driven by unique combinations of cellular stimuli and inhibitors. We implemented and incorporated KinasePA as part of the "directPA" R package available from the Comprehensive R Archive Network (CRAN). Furthermore, KinasePA also has an interactive web interface that can be readily applied to annotate user-provided phosphoproteomics data (http://kinasepa.pengyiyang.org).


Subject(s)
Protein Kinases/metabolism , Proteomics/methods , Cell Line , Chromones/pharmacology , Databases, Protein , Heterocyclic Compounds, 3-Ring/pharmacology , Humans , Insulin/metabolism , Morpholines/pharmacology , Naphthyridines/pharmacology , Phosphorylation , Protein Kinase Inhibitors/pharmacology , Signal Transduction/drug effects , Sirolimus/pharmacology , TOR Serine-Threonine Kinases/antagonists & inhibitors , TOR Serine-Threonine Kinases/metabolism
11.
Nat Biotechnol ; 42(2): 284-292, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37231260

ABSTRACT

Currently available single-cell omics technologies capture many unique features with different biological information content. Data integration aims to place cells, captured with different technologies, onto a common embedding to facilitate downstream analytical tasks. Current horizontal data integration techniques use a set of common features, thereby ignoring non-overlapping features and losing information. Here we introduce StabMap, a mosaic data integration technique that stabilizes mapping of single-cell data by exploiting the non-overlapping features. StabMap first infers a mosaic data topology based on shared features, then projects all cells onto supervised or unsupervised reference coordinates by traversing shortest paths along the topology. We show that StabMap performs well in various simulation contexts, facilitates 'multi-hop' mosaic data integration where some datasets do not share any features and enables the use of spatial gene expression features for mapping dissociated single-cell data onto a spatial transcriptomic reference.
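
The 'multi-hop' idea in this abstract, where two datasets with no features in common are linked through an intermediate dataset, can be illustrated with a small graph search. The sketch below is only an assumption-based illustration and not the StabMap implementation: the dataset names, the feature sets, the function names and the use of an unweighted breadth-first search are all hypothetical. Datasets are nodes, two datasets are connected when they share at least one feature, and the path from a query dataset to the reference is the shortest chain of such connections.

```python
from collections import deque


def build_topology(datasets):
    """Build an undirected graph linking datasets that share at least one feature.

    datasets: dict mapping dataset name -> set of feature names.
    Returns a dict mapping each dataset name -> set of neighbouring dataset names.
    """
    names = list(datasets)
    graph = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if datasets[first] & datasets[second]:
                graph[first].add(second)
                graph[second].add(first)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of datasets from start to goal, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical mosaic: "atac" and "rna" share no features,
# but both overlap with "multiome".
datasets = {
    "rna": {"gene1", "gene2", "gene3"},
    "multiome": {"gene3", "peak1"},
    "atac": {"peak1", "peak2"},
}
topology = build_topology(datasets)
path = shortest_path(topology, "atac", "rna")
```

Here `path` runs from "atac" through "multiome" to "rna", so cells in the query dataset could in principle be projected step by step along that chain even though the first and last datasets have no feature in common.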


Subject(s)
Gene Expression Profiling , Software , Computer Simulation , Technology , Transcriptome
12.
Nat Commun ; 15(1): 509, 2024 Jan 13.
Article in English | MEDLINE | ID: mdl-38218939

ABSTRACT

Recent advances in subcellular imaging transcriptomics platforms have enabled high-resolution spatial mapping of gene expression, while also introducing significant analytical challenges in accurately identifying cells and assigning transcripts. Existing methods grapple with cell segmentation, frequently leading to fragmented cells or oversized cells that capture contaminated expression. To this end, we present BIDCell, a self-supervised deep learning-based framework with biologically-informed loss functions that learn relationships between spatially resolved gene expression and cell morphology. BIDCell incorporates cell-type data, including single-cell transcriptomics data from public repositories, with cell morphology information. Using a comprehensive evaluation framework consisting of metrics in five complementary categories for cell segmentation performance, we demonstrate that BIDCell outperforms other state-of-the-art methods according to many metrics across a variety of tissue types and technology platforms. Our findings underscore the potential of BIDCell to significantly enhance single-cell spatial expression analyses, enabling great potential in biological discovery.


Subject(s)
Benchmarking , Gene Expression Profiling , Erythrocytes, Abnormal , Histocompatibility Testing , Supervised Machine Learning
13.
Sci Adv ; 10(25): eadk8501, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38905342

ABSTRACT

Single-cell technology has allowed researchers to probe tissue complexity and dynamics at unprecedented depth in health and disease. However, the generation of high-dimensionality single-cell atlases and virtual three-dimensional tissues requires integrated reference maps that harmonize disparate experimental designs, analytical pipelines, and taxonomies. Here, we present a comprehensive single-cell transcriptome integration map of cardiac fibrosis, which underpins pathophysiology in most cardiovascular diseases. Our findings reveal similarity between cardiac fibroblast (CF) identities and dynamics in ischemic versus pressure overload models of cardiomyopathy. We also describe timelines for commitment of activated CFs to proliferation and myofibrogenesis, profibrotic and antifibrotic polarization of myofibroblasts and matrifibrocytes, and CF conservation across mouse and human healthy and diseased hearts. These insights have the potential to inform knowledge-based therapies.


Subject(s)
Fibroblasts , Fibrosis , Single-Cell Analysis , Transcriptome , Animals , Single-Cell Analysis/methods , Humans , Fibroblasts/metabolism , Mice , Myocardium/metabolism , Myocardium/pathology , Myofibroblasts/metabolism , Myofibroblasts/pathology , Gene Expression Profiling
14.
F1000Res ; 12: 261, 2023.
Article in English | MEDLINE | ID: mdl-38434622

ABSTRACT

Background: Globally, scientists now have the ability to generate a vast amount of high throughput biomedical data that carry critical information for important clinical and public health applications. This data revolution in biology is now creating a plethora of new single-cell datasets. Concurrently, there have been significant methodological advances in single-cell research. Integrating these two resources, creating tailor-made, efficient, and purpose-specific data analysis approaches can assist in accelerating scientific discovery. Methods: We developed a series of living workshops for building data stories, using Single-cell data integrative analysis (scdney). scdney is a wrapper package with a collection of single-cell analysis R packages incorporating data integration, cell type annotation, higher order testing and more. Results: Here, we illustrate two specific workshops. The first workshop examines how to characterise the identity and/or state of cells and the relationship between them, known as phenotyping. The second workshop focuses on extracting higher-order features from cells to predict disease progression. Conclusions: Through these workshops, we not only showcase current solutions, but also highlight critical thinking points. In particular, we highlight the Thinking Process Template that provides a structured framework for the decision-making process behind such single-cell analyses. Furthermore, our workshop will incorporate dynamic contributions from the community in a collaborative learning approach, thus the term 'living'.

15.
Nat Cell Biol ; 24(7): 1114-1128, 2022 07.
Article in English | MEDLINE | ID: mdl-35817961

ABSTRACT

The mammalian heart arises from various populations of Mesp1-expressing cardiovascular progenitors (CPs) that are specified during the early stages of gastrulation. Mesp1 is a transcription factor that acts as a master regulator of CP specification and differentiation. However, how Mesp1 regulates the chromatin landscape of nascent mesodermal cells to define the temporal and spatial patterning of the distinct populations of CPs remains unknown. Here, by combining ChIP-seq, RNA-seq and ATAC-seq during mouse pluripotent stem cell differentiation, we defined the dynamic remodelling of the chromatin landscape mediated by Mesp1. We identified different enhancers that are temporally regulated to erase the pluripotent state and specify the pools of CPs that mediate heart development. We identified Zic2 and Zic3 as essential cofactors that act with Mesp1 to regulate its transcription-factor activity at key mesodermal enhancers, thereby regulating the chromatin remodelling and gene expression associated with the specification of the different populations of CPs in vivo. Our study identifies the dynamics of the chromatin landscape and enhancer remodelling associated with temporal patterning of early mesodermal cells into the distinct populations of CPs that mediate heart development.


Subject(s)
Basic Helix-Loop-Helix Transcription Factors , Chromatin , Animals , Basic Helix-Loop-Helix Transcription Factors/metabolism , Cell Differentiation/genetics , Chromatin/genetics , Chromatin/metabolism , Enhancer Elements, Genetic/genetics , Gene Expression Regulation, Developmental , Heart , Homeodomain Proteins/metabolism , Mammals/metabolism , Mesoderm , Mice , Transcription Factors/genetics , Transcription Factors/metabolism
16.
Genome Biol ; 22(1): 197, 2021 07 05.
Article in English | MEDLINE | ID: mdl-34225769

ABSTRACT

BACKGROUND: Single-cell technologies are transforming biomedical research, including the recent demonstration that unspliced pre-mRNA present in single-cell RNA-Seq permits prediction of future expression states. Here we apply this RNA velocity concept to an extended timecourse dataset covering mouse gastrulation and early organogenesis. RESULTS: Intriguingly, RNA velocity correctly identifies epiblast cells as the starting point, but several trajectory predictions at later stages are inconsistent with both real-time ordering and existing knowledge. The most striking discrepancy concerns red blood cell maturation, with velocity-inferred trajectories opposing the true differentiation path. Investigating the underlying causes reveals a group of genes with a coordinated step-change in transcription, thus violating the assumptions behind current velocity analysis suites, which do not accommodate time-dependent changes in expression dynamics. Using scRNA-Seq analysis of chimeric mouse embryos lacking the major erythroid regulator Gata1, we show that genes with the step-changes in expression dynamics during erythroid differentiation fail to be upregulated in the mutant cells, thus underscoring the coordination of modulating transcription rate along a differentiation trajectory. In addition to the expected block in erythroid maturation, the Gata1-chimera dataset reveals induction of PU.1 and expansion of megakaryocyte progenitors. Finally, we show that erythropoiesis in human fetal liver is similarly characterized by a coordinated step-change in gene expression. CONCLUSIONS: By identifying a limitation of the current velocity framework coupled with in vivo analysis of mutant cells, we reveal a coordinated step-change in gene expression kinetics during erythropoiesis, with likely implications for many other differentiation processes.


Subject(s)
Erythroid Cells/metabolism , Erythropoiesis/genetics , GATA1 Transcription Factor/genetics , Gene Expression Regulation, Developmental , Organogenesis/genetics , Animals , Cell Differentiation , Datasets as Topic , Embryo, Mammalian , Erythroid Cells/cytology , Fetus , GATA1 Transcription Factor/deficiency , Gastrula/growth & development , Gastrula/metabolism , Humans , Kinetics , Liver/cytology , Liver/growth & development , Liver/metabolism , Mice , Proto-Oncogene Proteins/genetics , Proto-Oncogene Proteins/metabolism , Single-Cell Analysis , Trans-Activators/genetics , Trans-Activators/metabolism , Transcriptional Activation
17.
Dev Cell ; 56(1): 141-153.e6, 2021 01 11.
Article in English | MEDLINE | ID: mdl-33308481

ABSTRACT

Somite formation is foundational to creating the vertebrate segmental body plan. Here, we describe three transcriptional trajectories toward somite formation in the early mouse embryo. Precursors of the anterior-most somites ingress through the primitive streak before E7 and migrate anteriorly by E7.5, while a second wave of more posterior somites develops in the vicinity of the streak. Finally, neuromesodermal progenitors (NMPs) are set aside for subsequent trunk somitogenesis. Single-cell profiling of T-/- chimeric embryos shows that the anterior somites develop in the absence of T and suggests a cell-autonomous function of T as a gatekeeper between paraxial mesoderm production and the building of the NMP pool. Moreover, we identify putative regulators of early T-independent somites and challenge the T-Sox2 cross-antagonism model in early NMPs. Our study highlights the concept of molecular flexibility during early cell-type specification, with broad relevance for pluripotent stem cell differentiation and disease modeling.


Subject(s)
Body Patterning/genetics , Chimera/metabolism , Fetal Proteins/metabolism , Gene Expression Regulation, Developmental/genetics , Mesoderm/cytology , SOXB1 Transcription Factors/metabolism , Somites/cytology , T-Box Domain Proteins/metabolism , Animals , Cell Differentiation/genetics , Cell Differentiation/physiology , Cell Line , Chimera/embryology , Chimera/genetics , Embryo, Mammalian , Female , Fetal Proteins/genetics , Gene Expression Profiling , Germ Cells/cytology , Germ Cells/metabolism , Heterozygote , Male , Mesoderm/metabolism , Mice , Mice, Inbred C57BL , Single-Cell Analysis , Somites/metabolism , T-Box Domain Proteins/genetics , Transcriptome/genetics
18.
G3 (Bethesda) ; 11(10)2021 09 27.
Article in English | MEDLINE | ID: mdl-34568906

ABSTRACT

Genetic and environmental factors play a major role in metabolic health. However, they do not act in isolation, as a change in an environmental factor such as diet may exert different effects based on an individual's genotype. Here, we sought to understand how such gene-diet interactions influenced nutrient storage and utilization, a major determinant of metabolic disease. We subjected 178 inbred strains from the Drosophila genetic reference panel (DGRP) to diets varying in sugar, fat, and protein. We assessed starvation resistance, a holistic phenotype of nutrient storage and utilization that can be robustly measured. Diet influenced the starvation resistance of most strains, but the effect varied markedly between strains such that some displayed better survival on a high carbohydrate diet (HCD) compared to a high-fat diet while others had opposing responses, illustrating a considerable gene × diet interaction. This demonstrates that genetics plays a major role in diet responses. Furthermore, heritability analysis revealed that the greatest genetic variability arose from diets either high in sugar or high in protein. To uncover the genetic variants that contribute to the heterogeneity in starvation resistance, we mapped 566 diet-responsive SNPs in 293 genes, 174 of which have human orthologs. Using whole-body knockdown, we identified two genes that were required for glucose tolerance, storage, and utilization. Strikingly, flies in which the expression of one of these genes, CG4607 a putative homolog of a mammalian glucose transporter, was reduced at the whole-body level, displayed lethality on a HCD. This study provides evidence that there is a strong interplay between diet and genetics in governing survival in response to starvation, a surrogate measure of nutrient storage efficiency and obesity. It is likely that a similar principle applies to higher organisms thus supporting the case for nutrigenomics as an important health strategy.


Subject(s)
Drosophila Proteins , Drosophila , Animals , Diet, High-Fat , Drosophila/genetics , Drosophila Proteins/genetics , Drosophila melanogaster , Genotype , Humans , Phenotype
19.
Genome Biol ; 22(1): 333, 2021 12 06.
Article in English | MEDLINE | ID: mdl-34872616

ABSTRACT

scRNA-seq datasets are increasingly used to identify gene panels that can be probed using alternative technologies, such as spatial transcriptomics, where choosing the best subset of genes is vital. Existing methods are limited by a reliance on pre-existing cell type labels or by difficulties in identifying markers of rare cells. We introduce an iterative approach, geneBasis, for selecting an optimal gene panel, where each newly added gene captures the maximum distance between the true manifold and the manifold constructed using the currently selected gene panel. Our approach outperforms existing strategies and can resolve cell types and subtle cell state differences.


Subject(s)
RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis , Gene Expression Profiling , Humans , Transcriptome , Exome Sequencing
20.
J Hematol Oncol ; 14(1): 22, 2021 02 02.
Article in English | MEDLINE | ID: mdl-33531041

ABSTRACT

Genetic heterogeneity of tumor is closely related to its clonal evolution, phenotypic diversity and treatment resistance, and such heterogeneity has only been characterized at single-cell sub-chromosomal scale in liver cancer. Here we reconstructed the single-variant resolution clonal evolution in human liver cancer based on single-cell mutational profiles. The results indicated that key genetic events occurred early during tumorigenesis, and an early metastasis followed by independent evolution was observed in primary liver tumor and intrahepatic metastatic portal vein tumor thrombus. By parallel single-cell RNA-Seq, the transcriptomic phenotype of HCC was found to be related with genetic heterogeneity. For the first time we reconstructed the single-cell and single-variant clonal evolution in human liver cancer, and dissection of both genetic and phenotypic heterogeneity will facilitate better understanding of their relationship.


Subject(s)
Carcinoma, Hepatocellular/genetics , Clonal Evolution , Liver Neoplasms/genetics , Humans , Mutation , Single-Cell Analysis , Tumor Cells, Cultured