DeepGSEA: explainable deep gene set enrichment analysis for single-cell transcriptomic data.

Xiong, Guangzhi; LeRoy, Nathan J; Bekiranov, Stefan; Sheffield, Nathan C; Zhang, Aidong

Affiliation

Xiong G; Department of Computer Science, University of Virginia, Charlottesville, VA, 22904, United States.
LeRoy NJ; Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22904, United States.
Bekiranov S; Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, 22908, United States.
Sheffield NC; Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22904, United States.
Zhang A; Department of Computer Science, University of Virginia, Charlottesville, VA, 22904, United States.

Bioinformatics; 40(7), 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38950178

ABSTRACT

MOTIVATION:

Gene set enrichment (GSE) analysis allows for an interpretation of gene expression through pre-defined gene set databases and is a critical step in understanding different phenotypes. With the rapid development of single-cell RNA sequencing (scRNA-seq) technology, GSE analysis can be performed on fine-grained gene expression data to gain a nuanced understanding of phenotypes of interest. However, with the cellular heterogeneity in single-cell gene profiles, current statistical GSE analysis methods sometimes fail to identify enriched gene sets. Meanwhile, deep learning has gained traction in applications like clustering and trajectory inference in single-cell studies due to its prowess in capturing complex data patterns. However, its use in GSE analysis remains limited, due to interpretability challenges.

RESULTS:

In this paper, we present DeepGSEA, an explainable deep gene set enrichment analysis approach which leverages the expressiveness of interpretable, prototype-based neural networks to provide an in-depth analysis of GSE. DeepGSEA learns the ability to capture GSE information through our designed classification tasks, and significance tests can be performed on each gene set, enabling the identification of enriched sets. The underlying distribution of a gene set learned by DeepGSEA can be explicitly visualized using the encoded cell and cellular prototype embeddings. We demonstrate the performance of DeepGSEA over commonly used GSE analysis methods by examining their sensitivity and specificity with four simulation studies. In addition, we test our model on three real scRNA-seq datasets and illustrate the interpretability of DeepGSEA by showing how its results can be explained.

AVAILABILITY AND IMPLEMENTATION:

https://github.com/Teddy-XiongGZ/DeepGSEA.
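The general idea described in the Results (score a gene set by how well a prototype-based classifier separates phenotypes using only that set's genes, then test the score for significance) can be illustrated with a toy sketch. This is not the DeepGSEA implementation: it swaps the neural network for a plain nearest-class-mean classifier and uses a label-permutation test, and the function names (`simulate`, `nearest_prototype_accuracy`, `permutation_p_value`), the simulated data, and all parameters are assumptions made for illustration only.

```python
import random


def simulate(n_per_class=30, shift=3.0, seed=1):
    """Toy data (an assumption, not real scRNA-seq): 6 genes per cell.

    Genes 0-2 are shifted upward in phenotype 1; genes 3-5 are pure noise.
    """
    rng = random.Random(seed)
    cells, labels = [], []
    for y in (0, 1):
        for _ in range(n_per_class):
            shifted = [rng.gauss(shift * y, 1.0) for _ in range(3)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
            cells.append(shifted + noise)
            labels.append(y)
    return cells, labels


def nearest_prototype_accuracy(cells, labels, gene_idx):
    """In-sample accuracy of a nearest-class-mean classifier on one gene set."""
    classes = sorted(set(labels))
    # One prototype per class: the mean expression over the gene set's genes.
    prototypes = {}
    for c in classes:
        members = [cell for cell, y in zip(cells, labels) if y == c]
        prototypes[c] = [
            sum(m[g] for m in members) / len(members) for g in gene_idx
        ]
    correct = 0
    for cell, y in zip(cells, labels):
        x = [cell[g] for g in gene_idx]
        # Assign the cell to the class whose prototype is closest.
        pred = min(
            classes,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, prototypes[c])),
        )
        correct += pred == y
    return correct / len(cells)


def permutation_p_value(cells, labels, gene_idx, n_perm=200, seed=0):
    """Compare the observed accuracy with accuracies under shuffled labels."""
    rng = random.Random(seed)
    observed = nearest_prototype_accuracy(cells, labels, gene_idx)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if nearest_prototype_accuracy(cells, shuffled, gene_idx) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_good + 1) / (n_perm + 1)


if __name__ == "__main__":
    cells, labels = simulate()
    for name, genes in (("shifted set", [0, 1, 2]), ("noise set", [3, 4, 5])):
        acc, p = permutation_p_value(cells, labels, genes)
        print(f"{name}: accuracy={acc:.2f}, permutation p={p:.3f}")
```

Because the prototypes are refit on every shuffled labelling, the in-sample optimism of the classifier appears in both the observed and the permuted scores, which is why the comparison remains meaningful in this simplified setting.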

Subjects

Deep Learning; Single-Cell Analysis; Transcriptome; Single-Cell Analysis/methods; Transcriptome/genetics; Humans; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Computational Biology/methods; Neural Networks, Computer; Software

Full text: 1 | Database: MEDLINE | Main subject: Single-Cell Analysis / Transcriptome / Deep Learning | Limit: Humans | Language: English | Year of publication: 2024 | Document type: Article
