Learning biologically-interpretable latent representations for gene expression data: Pathway Activity Score Learning Algorithm.
Karagiannaki, Ioulia; Gourlia, Krystallia; Lagani, Vincenzo; Pantazis, Yannis; Tsamardinos, Ioannis.
Affiliation
  • Karagiannaki I; Institute of Electronic Structure and Laser, Foundation for Research and Technology-Hellas (IESL-FORTH), Heraklion, Greece.
  • Gourlia K; Department of Computer Science, University of Crete, Heraklion, Greece.
  • Lagani V; Institute of Chemical Biology, Ilia State University, Tbilisi, 0162 Georgia.
  • Pantazis Y; JADBio, Gnosis Data Analysis PC, Heraklion, Crete, Greece.
  • Tsamardinos I; Institute of Applied and Computational Mathematics, Foundation for Research and Technology - Hellas, Heraklion, Greece.
Mach Learn ; 112(11): 4257-4287, 2023.
Article in English | MEDLINE | ID: mdl-37900054
ABSTRACT
Molecular gene-expression datasets consist of samples with tens of thousands of measured quantities (i.e., high-dimensional data). However, lower-dimensional representations that retain the useful biological information do exist. We present a novel algorithm for such dimensionality reduction called Pathway Activity Score Learning (PASL). The major novelty of PASL is that the constructed features directly correspond to known molecular pathways (genesets in general) and can be interpreted as pathway activity scores. Hence, unlike PCA and similar methods, PASL's latent space has a fairly straightforward biological interpretation. PASL is shown to outperform the state-of-the-art method (PLIER) in predictive performance on two collections of breast cancer and leukemia gene expression datasets. PASL is also trained on a large corpus of 50000 gene expression samples to construct a universal dictionary of features across different tissues and pathologies. The dictionary is validated on 35643 held-out samples for reconstruction error. It is then applied to 165 held-out datasets spanning a diverse range of diseases. The AutoML tool JADBio is employed to show that the predictive information in the PASL-created feature space is retained after the transformation. The code is available at https://github.com/mensxmachina/PASL.
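To make the idea of geneset-constrained latent features concrete, the following is a minimal sketch and not the PASL algorithm itself (the authors' implementation is in the repository cited in the abstract). As a simplified stand-in, it assumes each feature is the first principal direction of the expression sub-matrix restricted to one geneset, so every dictionary atom is zero outside its geneset and the projection can be read as a pathway activity score. The function name `pathway_scores`, the toy data, and the geneset names are all hypothetical.

```python
import numpy as np

def pathway_scores(X, genesets):
    """Illustrative only: per-geneset PCA, not the published PASL method.

    X        : array of shape (n_samples, n_genes)
    genesets : dict mapping a geneset name to a list of gene column indices
    Returns (scores, atoms): scores has shape (n_samples, n_genesets) and
    atoms has shape (n_genesets, n_genes), each row zero outside its geneset.
    """
    Xc = X - X.mean(axis=0)                      # center each gene
    atoms = np.zeros((len(genesets), X.shape[1]))
    for k, idx in enumerate(genesets.values()):
        # First right singular vector of the geneset-restricted sub-matrix
        _, _, vt = np.linalg.svd(Xc[:, idx], full_matrices=False)
        atoms[k, idx] = vt[0]
    scores = Xc @ atoms.T                        # one activity score per geneset
    return scores, atoms

# Toy data (hypothetical): 20 samples, 6 genes, two disjoint genesets
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
genesets = {"pathway_A": [0, 1, 2], "pathway_B": [3, 4, 5]}
scores, atoms = pathway_scores(X, genesets)
```

Because each atom is supported only on the genes of one geneset, a column of `scores` depends solely on that geneset's genes, which is the property that makes such a latent space easier to interpret than unconstrained PCA components.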
Full text: 1 Database: MEDLINE Language: En Publication year: 2023 Document type: Article
