Visualization, benchmarking and characterization of nested single-cell heterogeneity as dynamic forest mixtures.
Anchang, Benedict; Mendez-Giraldez, Raul; Xu, Xiaojiang; Archer, Trevor K; Chen, Qing; Hu, Guang; Plevritis, Sylvia K; Motsinger-Reif, Alison Anne; Li, Jian-Liang.
Affiliation
  • Anchang B; Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Mendez-Giraldez R; Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Xu X; Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Archer TK; Epigenetics & Stem Cell Biology Laboratory/Chromatin & Gene Expression Group, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Chen Q; Epigenetics & Stem Cell Biology Laboratory/Chromatin & Gene Expression Group, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Hu G; Epigenetics & Stem Cell Biology Laboratory/Chromatin & Gene Expression Group, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Plevritis SK; Department of Biomedical Data Science, Center for Cancer Systems Biology, Stanford University, Stanford, California, USA.
  • Motsinger-Reif AA; Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Stanford, California, USA.
  • Li JL; Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, Stanford, California, USA.
Brief Bioinform; 23(2), 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-35192692
ABSTRACT
A major topic of debate in developmental biology centers on whether development is continuous, discontinuous, or a mixture of both. Pseudo-time trajectory models, optimal for visualizing cellular progression, model cell transitions as continuous state manifolds; they do not explicitly model real-time, complex, heterogeneous systems and are challenging to benchmark against temporal models. We present a data-driven framework that addresses these limitations, taking temporal single-cell data collected at discrete time points as input and producing a mixture of dependent minimum spanning trees (MSTs), denoted dynamic spanning forest mixtures (DSFMix), as output. DSFMix uses decision-tree models to select genes that account for variations in multimodality, skewness and time. The genes are subsequently used to build the forest using tree agglomerative hierarchical clustering and dynamic branch cutting. We first motivate the use of forest-based algorithms over single-tree approaches for visualizing and characterizing developmental processes. We next benchmark DSFMix against pseudo-time and temporal approaches in terms of feature selection, time correlation, and network similarity. Finally, we demonstrate how DSFMix can be used to visualize, compare and characterize complex relationships during biological processes such as epithelial-mesenchymal transition, spermatogenesis, stem cell pluripotency, early transcriptional response to hormones and immune response to coronavirus disease. Our results indicate that gene expression during normal development exhibits a high proportion of non-uniformly distributed profiles that are mostly right-skewed and multimodal, the latter being a characteristic of major steady states during development. Our study also identifies and validates gene signatures driving complex dynamic processes during somatic or germline differentiation.
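The idea of one minimum spanning tree per discrete time point can be illustrated with a minimal sketch. This is not the DSFMix implementation: the function names `select_skewed_genes` and `spanning_forest` are hypothetical, the skewness ranking is a simple stand-in for the decision-tree gene selection the abstract describes, the trees are built independently rather than as a dependent mixture, and the data are synthetic.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform
from scipy.stats import skew


def select_skewed_genes(expr, n_genes):
    """Return indices of the n_genes columns with the largest absolute skewness.

    Assumption: a plain skewness ranking standing in for the decision-tree
    selection described in the abstract.
    """
    scores = np.abs(skew(expr, axis=0))
    return np.argsort(scores)[::-1][:n_genes]


def spanning_forest(expr, time_labels, gene_idx):
    """Build one Euclidean MST per time point over the selected genes.

    Returns a dict mapping each time label to a list of
    (cell_i, cell_j, distance) edges, with cells indexed as rows of expr.
    """
    forest = {}
    for t in np.unique(time_labels):
        cells = np.flatnonzero(time_labels == t)
        # Dense pairwise distances between the cells sampled at time t.
        # Note: SciPy treats zero entries as missing edges, so exactly
        # duplicated cells would need a small offset before this step.
        dist = squareform(pdist(expr[np.ix_(cells, gene_idx)]))
        mst = minimum_spanning_tree(dist).tocoo()
        forest[t] = [
            (int(cells[i]), int(cells[j]), float(w))
            for i, j, w in zip(mst.row, mst.col, mst.data)
        ]
    return forest


# Synthetic example: 30 cells, 8 genes, 3 time points with 10 cells each.
rng = np.random.default_rng(0)
expr = rng.gamma(shape=2.0, size=(30, 8))  # right-skewed toy expression values
time_labels = np.repeat([0, 1, 2], 10)

gene_idx = select_skewed_genes(expr, n_genes=4)
forest = spanning_forest(expr, time_labels, gene_idx)
edge_counts = [len(edges) for edges in forest.values()]
```

Each tree over 10 distinct cells has 9 edges, so the forest here holds three trees of 9 edges each. Linking the trees across time points and cutting branches dynamically, as the abstract describes, is outside the scope of this sketch.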

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Benchmarking / Single-Cell Analysis / Models, Theoretical Type of study: Prognostic_studies Limits: Animals / Humans Language: En Journal: Brief Bioinform Journal subject: Biology / Medical Informatics Year: 2022 Document type: Article Affiliation country: United States of America
