Identifying gene expression programs in single-cell RNA-seq data using linear correlation explanation.
Nussbaum, Yulia I; Hossain, K S M Tozammel; Kaifi, Jussuf; Warren, Wesley C; Shyu, Chi-Ren; Mitchem, Jonathan B.
Affiliation
  • Nussbaum YI; Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65201, USA.
  • Hossain KSMT; Department of Information Science, University of North Texas, 3940 N Elm St, Denton, TX 76203, USA.
  • Kaifi J; Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65201, USA; Department of Surgery, University of Missouri Hospital, 1 Hospital Dr., Columbia, MO 65212, USA; Harry S. Truman Memorial Veterans' Hospital, 800 Hospital Dr., Columbia, MO 65201, USA; Siteman Cancer Center,
  • Warren WC; Department of Surgery, University of Missouri Hospital, 1 Hospital Dr., Columbia, MO 65212, USA; Bond Life Sciences Center, University of Missouri, 1201 Rollin St., Columbia, MO 65211, USA.
  • Shyu CR; Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65201, USA.
  • Mitchem JB; VA Northeast Ohio Healthcare System, 10701 East Boulevard, Cleveland, OH 44106, USA; Department of Colon and Rectal Surgery, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH, 44195, USA; Department of Inflammation and Immunity, Lerner Research Institute, 9500 Euclid Avenue, Cleveland, OH 44195, U
J Biomed Inform ; 154: 104644, 2024 Jun.
Article in En | MEDLINE | ID: mdl-38631462
ABSTRACT

OBJECTIVE:

Gene expression analysis through single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of gene regulation in diverse cell types, tissues, and organisms. While existing methods primarily focus on identifying cell type-specific gene expression programs (GEPs), the characterization of GEPs associated with biological processes and stimuli responses remains limited. In this study, we aim to infer biologically meaningful GEPs that are associated with both cellular phenotypes and activity programs directly from scRNA-seq data.

METHODS:

We applied linear CorEx, a machine-learning-based approach, to simulated and real-world scRNA-seq datasets, inferring GEPs by grouping genes through optimization of a total correlation objective. Additionally, we used a transfer learning approach to project CorEx-inferred GEPs onto other scRNA-seq datasets.
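
For readers unfamiliar with the quantity named above, total correlation measures how much a set of variables share, TC(X) = sum_i H(X_i) - H(X). Under a Gaussian assumption it reduces to -0.5 * log det(R), where R is the correlation matrix. The sketch below only illustrates that quantity on hypothetical toy data. It is not the authors' implementation and does not perform the latent-factor optimization that linear CorEx carries out. The function name, the toy "program" and "noise" genes, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def gaussian_total_correlation(X):
    """Total correlation (in nats) of the columns of X, assuming joint Gaussianity.

    TC = sum_i H(X_i) - H(X) = -0.5 * log det(R), with R the correlation matrix.
    """
    R = np.corrcoef(X, rowvar=False)      # genes are columns, cells are rows
    _, logdet = np.linalg.slogdet(R)      # stable log-determinant
    return -0.5 * logdet

rng = np.random.default_rng(0)
n_cells = 2000

# Hypothetical toy data: one shared latent signal drives three "program" genes,
# while three "noise" genes are independent of each other.
z = rng.normal(size=(n_cells, 1))
program_genes = z + 0.5 * rng.normal(size=(n_cells, 3))
noise_genes = rng.normal(size=(n_cells, 3))

tc_program = gaussian_total_correlation(program_genes)
tc_noise = gaussian_total_correlation(noise_genes)

print(f"TC of co-regulated genes: {tc_program:.3f} nats")
print(f"TC of independent genes:  {tc_noise:.3f} nats")
```

Genes driven by a common signal carry high total correlation, while independent genes carry almost none, which is the intuition behind grouping genes into programs by this criterion.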

RESULTS:

By leveraging total correlation optimization, linear CorEx groups genes into GEPs and, on simulated data, outperforms similar methods in identifying cell types and activity programs. Furthermore, we applied the same approach to real-world scRNA-seq data from the mouse dentate gyrus and embryonic colon development, uncovering biologically relevant GEPs related to cell types, developmental ages, and cell cycle programs. We also demonstrate the potential for transfer learning by evaluating similar datasets, showcasing the cross-species sensitivity of linear CorEx.

CONCLUSION:

Our findings validate linear CorEx as a valuable tool for comprehensively analyzing complex signals in scRNA-seq data, leading to deeper insights into gene expression dynamics, cellular heterogeneity, and regulatory mechanisms.

Full text: 1 Database: MEDLINE Main subject: Machine Learning / RNA-Seq / Single-Cell Gene Expression Analysis Limits: Animals / Humans Language: En Publication year: 2024 Document type: Article