Variational autoencoders learn transferrable representations of metabolomics data.

Gomari, Daniel P; Schweickart, Annalise; Cerchietti, Leandro; Paietta, Elisabeth; Fernandez, Hugo; Al-Amin, Hassen; Suhre, Karsten; Krumsiek, Jan

Affiliations

Gomari DP; Institute of Computational Biology, Helmholtz Center Munich-German Research Center for Environmental Health, 85764, Neuherberg, Germany.
Schweickart A; Technical University of Munich-School of Life Sciences, 85354, Freising, Germany.
Cerchietti L; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
Paietta E; Department of Physiology and Biophysics, Weill Cornell Medicine, Institute for Computational Biomedicine, Englander Institute for Precision Medicine, New York, NY, 10021, USA.
Fernandez H; Department of Medicine, Hematology and Oncology Division, Weill Cornell Medicine, New York, 10065, NY, USA.
Al-Amin H; Albert Einstein College of Medicine-Montefiore Medical Center, Bronx, NY, USA.
Suhre K; Moffitt Malignant Hematology & Cellular Therapy at Memorial Healthcare System, Pembroke Pines, FL, USA.
Krumsiek J; Department of Psychiatry, Weill Cornell Medicine-Qatar, Education City, P.O. Box 24144, Doha, Qatar.

Commun Biol; 5(1): 645, 2022 06 30.

Article in English | MEDLINE | ID: mdl-35773471

ABSTRACT

Dimensionality reduction approaches are commonly used for the deconvolution of high-dimensional metabolomics datasets into underlying core metabolic processes. However, current state-of-the-art methods are widely incapable of detecting nonlinearities in metabolomics data. Variational Autoencoders (VAEs) are a deep learning method designed to learn nonlinear latent representations which generalize to unseen data. Here, we trained a VAE on a large-scale metabolomics population cohort of human blood samples consisting of over 4500 individuals. We analyzed the pathway composition of the latent space using a global feature importance score, which demonstrated that latent dimensions represent distinct cellular processes. To demonstrate model generalizability, we generated latent representations of unseen metabolomics datasets on type 2 diabetes, acute myeloid leukemia, and schizophrenia and found significant correlations with clinical patient groups. Notably, the VAE representations showed stronger effects than latent dimensions derived by linear and non-linear principal component analysis. Taken together, we demonstrate that the VAE is a powerful method that learns biologically meaningful, nonlinear, and transferrable latent representations of metabolomics data.
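The abstract does not state the model's architecture, loss weighting, or hyperparameters, so the following is only a minimal sketch of the generic VAE objective for a single sample, not the authors' implementation. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term. All function names and the `beta` weight are hypothetical and introduced here for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Reconstruction term (assumed here): sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL term (beta is an assumption)."""
    return squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


# Usage: a posterior that equals the prior contributes zero KL penalty.
rng = random.Random(0)
z = reparameterize([0.0, 0.0], [0.0, 0.0], rng)  # one latent sample
loss = vae_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], [0.0, 0.0])
```

In a setting like the one described, transferring the model to an unseen dataset would amount to passing new samples through the trained encoder and keeping the resulting `mu` vectors as latent representations. The encoder itself is omitted here because its form is not given in this record.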

Subjects

Diabetes Mellitus, Type 2; Humans; Metabolomics; Principal Component Analysis

Full text: 1 Databases: MEDLINE Main subject: Diabetes Mellitus, Type 2 Type of study: Prognostic_studies Limits: Humans Language: En Journal: Commun Biol Year of publication: 2022 Document type: Article Country of affiliation: Germany
