Machine Learning Approach to Metabolomic Data Predicts Type 2 Diabetes Mellitus Incidence.

Leiherer, Andreas; Muendlein, Axel; Mink, Sylvia; Mader, Arthur; Saely, Christoph H; Festa, Andreas; Fraunberger, Peter; Drexel, Heinz

Affiliation

Leiherer A; Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria.
Muendlein A; Central Medical Laboratories, A-6800 Feldkirch, Austria.
Mink S; Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein.
Mader A; Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria.
Saely CH; Central Medical Laboratories, A-6800 Feldkirch, Austria.
Festa A; Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein.
Fraunberger P; Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria.
Drexel H; Department of Internal Medicine III, Academic Teaching Hospital Feldkirch, A-6800 Feldkirch, Austria.

Int J Mol Sci; 25(10), 2024 May 14.

Article in English | MEDLINE | ID: mdl-38791370

ABSTRACT

Metabolomics, with its wealth of data, offers a valuable avenue for enhancing predictions and decision-making in diabetes. This observational study aimed to leverage machine learning (ML) algorithms to predict the 4-year risk of developing type 2 diabetes mellitus (T2DM) using targeted quantitative metabolomics data. A cohort of 279 cardiovascular risk patients who underwent coronary angiography and who were initially free of T2DM according to American Diabetes Association (ADA) criteria was analyzed at baseline, including anthropometric data and targeted metabolomics measured by liquid chromatography (LC)-mass spectrometry (MS) and flow injection analysis (FIA)-MS. All patients were followed for four years. During this time, 11.5% of the patients developed T2DM. After data preprocessing, 362 variables were used for ML, employing the caret package in R. The dataset was divided into training and test sets (75:25 ratio), and an oversampling approach was used to address the class imbalance of T2DM incidence. An additional recursive feature elimination step identified a set of 77 variables that were the most valuable for model generation. With these variables, a Support Vector Machine (SVM) model with a linear kernel demonstrated the most promising predictive capabilities, exhibiting an F1 score of 50%, a specificity of 93%, and balanced and unbalanced accuracies of 72% and 88%, respectively. The top-ranked features were bile acids, ceramides, amino acids, and hexoses, whereas anthropometric features such as age, sex, waist circumference, or body mass index had no contribution. In conclusion, ML analysis of metabolomics data is a promising tool for identifying individuals at risk of developing T2DM and opens avenues for personalized and early intervention strategies.
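To make the reported metrics easier to interpret, the following minimal Python sketch shows how F1 score, specificity, accuracy, and balanced accuracy are derived from a binary confusion matrix. The abstract does not report a confusion matrix, so the counts below are hypothetical, chosen only to be roughly compatible with a test set of about 25% of 279 patients and an incidence near 11.5%; the function name `classification_metrics` is likewise an illustrative assumption and is not part of the study's caret workflow in R.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on incident T2DM cases
    specificity = tn / (tn + fp)   # recall on patients who stayed free of T2DM
    precision = tp / (tp + fp)     # share of predicted cases that are true cases
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + fp + tn)          # "unbalanced" accuracy
    balanced_accuracy = (sensitivity + specificity) / 2  # mean of per-class recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts (NOT taken from the paper): 8 incident cases and
# 61 non-cases in an assumed test set of 69 patients.
metrics = classification_metrics(tp=4, fn=4, fp=4, tn=57)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With an imbalanced outcome, plain accuracy is dominated by the majority class, which is one reason balanced accuracy and F1 are usually reported alongside it.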

Subjects

Diabetes Mellitus, Type 2; Machine Learning; Metabolomics; Humans; Diabetes Mellitus, Type 2/metabolism; Diabetes Mellitus, Type 2/epidemiology; Male; Metabolomics/methods; Female; Middle Aged; Incidence; Aged; Support Vector Machine; Biomarkers/metabolism

Keywords

ML; accuracy; artificial intelligence; diabetes; incidence; machine learning; metabolomics; support vector machine

Full text: 1 | Databases: MEDLINE | Main subject: Diabetes Mellitus, Type 2 / Metabolomics / Machine Learning | Limits: Aged / Female / Humans / Male / Middle Aged | Language: English | Journal: Int J Mol Sci | Publication year: 2024 | Document type: Article | Country of affiliation: Austria
