A machine learning method for the discovery of minimum marker gene combinations for cell type identification from single-cell RNA sequencing.
Aevermann, Brian; Zhang, Yun; Novotny, Mark; Keshk, Mohamed; Bakken, Trygve; Miller, Jeremy; Hodge, Rebecca; Lelieveldt, Boudewijn; Lein, Ed; Scheuermann, Richard H.
Affiliations
  • Aevermann B; J. Craig Venter Institute, La Jolla, California 92037, USA.
  • Zhang Y; J. Craig Venter Institute, La Jolla, California 92037, USA.
  • Novotny M; J. Craig Venter Institute, La Jolla, California 92037, USA.
  • Keshk M; J. Craig Venter Institute, La Jolla, California 92037, USA.
  • Bakken T; Allen Institute for Brain Science, Seattle, Washington 98109, USA.
  • Miller J; Allen Institute for Brain Science, Seattle, Washington 98109, USA.
  • Hodge R; Allen Institute for Brain Science, Seattle, Washington 98109, USA.
  • Lelieveldt B; Department of Radiology, Leiden University Medical Center, 2300 Leiden, The Netherlands; Department of Intelligent Systems, Delft University of Technology, 2628 Delft, The Netherlands.
  • Lein E; Allen Institute for Brain Science, Seattle, Washington 98109, USA.
  • Scheuermann RH; J. Craig Venter Institute, La Jolla, California 92037, USA.
Genome Res; 31(10): 1767-1780, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34088715
ABSTRACT
Single-cell genomics is rapidly advancing our knowledge of the diversity of cell phenotypes, including both cell types and cell states. Driven by single-cell/-nucleus RNA sequencing (scRNA-seq), comprehensive cell atlas projects characterizing a wide range of organisms and tissues are currently underway. As a result, it is critical that the transcriptional phenotypes discovered are defined and disseminated in a consistent and concise manner. Molecular biomarkers have historically played an important role in biological research, from defining immune cell types by surface protein expression to defining diseases by their molecular drivers. Here, we describe a machine learning-based marker gene selection algorithm, NS-Forest version 2.0, which leverages the nonlinear attributes of random forest feature selection and a binary expression scoring approach to discover the minimal marker gene expression combinations that optimally capture the cell type identity represented in complete scRNA-seq transcriptional profiles. The marker genes selected provide an expression barcode that serves as both a useful tool for downstream biological investigation and the necessary and sufficient characteristics for semantic cell type definition. The use of NS-Forest to identify marker genes for human brain middle temporal gyrus cell types reveals the importance of cell signaling and noncoding RNAs in neuronal cell type identity.
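As a rough illustration of the "minimal marker gene combination" idea in the abstract, the sketch below treats each gene as on or off at a fixed threshold and searches small gene combinations for the one that best identifies a target cell type by F-beta score. It is not the NS-Forest implementation: the random forest ranking step is omitted, and the toy data, the gene and cluster names, the threshold of 1.0, the rule that all genes in a combination must be on, and beta = 0.5 are all assumptions made for illustration.

```python
from itertools import combinations


def f_beta(tp, fp, fn, beta=0.5):
    """F-beta score; beta < 1 weights precision above recall."""
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


def best_marker_combo(cells, labels, target, genes,
                      threshold=1.0, max_size=2, beta=0.5):
    """Return (score, combo) for the gene set whose joint 'on' state
    best marks the target cluster. Hypothetical helper, not NS-Forest."""
    best = (0.0, ())
    # Smaller combinations are tried first and only a strictly better
    # score replaces the current best, so ties favor fewer genes.
    for size in range(1, max_size + 1):
        for combo in combinations(genes, size):
            tp = fp = fn = 0
            for expr, label in zip(cells, labels):
                called = all(expr[g] > threshold for g in combo)
                if called and label == target:
                    tp += 1
                elif called:
                    fp += 1
                elif label == target:
                    fn += 1
            score = f_beta(tp, fp, fn, beta)
            if score > best[0]:
                best = (score, combo)
    return best


# Toy expression values (made up): one dict per cell.
cells = [
    {"GENE_A": 5.0, "GENE_B": 4.0, "GENE_C": 0.0},
    {"GENE_A": 6.0, "GENE_B": 3.0, "GENE_C": 0.2},
    {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 3.0},
    {"GENE_A": 0.0, "GENE_B": 5.0, "GENE_C": 2.0},
    {"GENE_A": 0.1, "GENE_B": 0.0, "GENE_C": 4.0},
]
labels = ["T1", "T1", "T2", "T3", "T3"]

score, combo = best_marker_combo(
    cells, labels, "T1", ["GENE_A", "GENE_B", "GENE_C"]
)
print(combo, score)
```

In this toy data neither GENE_A nor GENE_B alone separates cluster T1 from the others, but the pair does, which is the sense in which a combination of markers can act as an expression barcode.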
Subjects

Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Gene Expression Profiling / Single-Cell Analysis
Study type: Diagnostic_studies / Prognostic_studies
Language: English
Journal: Genome Res
Journal subject: MOLECULAR BIOLOGY / GENETICS
Publication year: 2021
Document type: Article
Affiliation country: United States