Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition.
Trofimov, Assya; Cohen, Joseph Paul; Bengio, Yoshua; Perreault, Claude; Lemieux, Sébastien.
Affiliation
  • Trofimov A; Department of Computer Science, University of Montreal, Québec, Canada.
  • Cohen JP; Institute for Research in Immunology and Cancer, University of Montreal, Québec, Canada.
  • Bengio Y; Mila, University of Montreal, Québec, Canada.
  • Perreault C; Department of Computer Science, University of Montreal, Québec, Canada.
  • Lemieux S; Mila, University of Montreal, Québec, Canada.
Bioinformatics; 36(Suppl_1): i417-i426, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657403
MOTIVATION: The recent development of sequencing technologies revolutionized our understanding of the inner workings of the cell as well as the way disease is treated. A single RNA sequencing (RNA-Seq) experiment, however, measures tens of thousands of parameters simultaneously. While the results are information rich, data analysis provides a challenge. Dimensionality reduction methods help with this task by extracting patterns from the data by compressing it into compact vector representations.

RESULTS: We present the factorized embeddings (FE) model, a self-supervised deep learning algorithm that learns simultaneously, by tensor factorization, gene and sample representation spaces. We ran the model on RNA-Seq data from two large-scale cohorts and observed that the sample representation captures information on single gene and global gene expression patterns. Moreover, we found that the gene representation space was organized such that tissue-specific genes, highly correlated genes as well as genes participating in the same GO terms were grouped. Finally, we compared the vector representation of samples learned by the FE model to other similar models on 49 regression tasks. We report that the representations trained with FE rank first or second in all of the tasks, surpassing, sometimes by a considerable margin, other representations.

AVAILABILITY AND IMPLEMENTATION: A toy example in the form of a Jupyter Notebook as well as the code and trained embeddings for this project can be found at: https://github.com/TrofimovAssya/FactorizedEmbeddings.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
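
The abstract does not spell out the model architecture. As a rough illustration of the general idea of learning one vector per sample and one vector per gene jointly, by reconstructing expression values from the pair, here is a minimal sketch. It swaps in a plain dot-product matrix factorization trained with stochastic gradient descent, which is an assumption made for brevity and should not be read as the authors' FE model. The function name, the toy expression matrix and the hyperparameters are all hypothetical.

```python
import random


def train_factorized_embeddings(expr, dim=2, lr=0.02, epochs=300, seed=0):
    """Fit one embedding per sample and one per gene so that their dot
    product approximates expr[sample][gene].

    Simplified stand-in (dot-product factorization), NOT the published model.
    """
    rng = random.Random(seed)
    n_samples, n_genes = len(expr), len(expr[0])
    # Small random initial embeddings for samples and genes.
    sample_emb = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_samples)]
    gene_emb = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_genes)]
    pairs = [(s, g) for s in range(n_samples) for g in range(n_genes)]
    losses = []
    for _ in range(epochs):
        rng.shuffle(pairs)
        total = 0.0
        for s, g in pairs:
            u, v = sample_emb[s], gene_emb[g]
            pred = sum(a * b for a, b in zip(u, v))
            err = pred - expr[s][g]
            total += err * err
            # Gradient step on the squared error for both embeddings.
            for k in range(dim):
                uk, vk = u[k], v[k]
                u[k] -= lr * err * vk
                v[k] -= lr * err * uk
        losses.append(total / len(pairs))  # mean squared error this epoch
    return sample_emb, gene_emb, losses


# Hypothetical toy data: 4 samples x 3 genes, made up for illustration only.
toy_expr = [
    [1.0, 2.0, 0.5],
    [2.0, 4.0, 1.0],
    [0.5, 1.0, 0.25],
    [1.5, 3.0, 0.75],
]
sample_emb, gene_emb, losses = train_factorized_embeddings(toy_expr)
```

After training, `sample_emb` and `gene_emb` play the role of the two representation spaces: each row is a compact vector that could be passed to a downstream regression task or inspected for clustering. For the actual model and trained embeddings, refer to the repository linked in the abstract.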

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / RNA Type of study: Prognostic_studies Language: En Publication year: 2020 Document type: Article
