Machine learning-based reproducible prediction of type 2 diabetes subtypes.

Tanabe, Hayato; Sato, Masahiro; Miyake, Akimitsu; Shimajiri, Yoshinori; Ojima, Takafumi; Narita, Akira; Saito, Haruka; Tanaka, Kenichi; Masuzaki, Hiroaki; Kazama, Junichiro J; Katagiri, Hideki; Tamiya, Gen; Kawakami, Eiryo; Shimabukuro, Michio

Affiliation

Tanabe H; Department of Diabetes, Endocrinology, and Metabolism, Fukushima Medical University School of Medicine, Fukushima, Japan.
Sato M; Department of Diabetes, Metabolism and Endocrinology, Tohoku University Graduate School of Medicine, Miyagi, Japan.
Miyake A; Department of Diabetes, Endocrinology, and Metabolism, Fukushima Medical University School of Medicine, Fukushima, Japan.
Shimajiri Y; Department of AI and Innovative Medicine, Tohoku University School of Medicine, Miyagi, Japan.
Ojima T; Shimajiri Kinsermae Diabetes Care Clinic, Okinawa, Japan.
Narita A; Department of AI and Innovative Medicine, Tohoku University School of Medicine, Miyagi, Japan.
Saito H; Department of Statistical Genetics, Osaka University Graduate School of Medicine, Osaka, Japan.
Tanaka K; Tohoku Medical Megabank Organization, Tohoku University, Miyagi, Japan.
Masuzaki H; Department of Diabetes, Endocrinology, and Metabolism, Fukushima Medical University School of Medicine, Fukushima, Japan.
Kazama JJ; Department of Nephrology and Hypertension, Fukushima Medical University School of Medicine, Fukushima, Japan.
Katagiri H; Division of Endocrinology and Metabolism, Second Department of Internal Medicine, University of the Ryukyus Graduate School of Medicine, Okinawa, Japan.
Tamiya G; Department of Nephrology and Hypertension, Fukushima Medical University School of Medicine, Fukushima, Japan.
Kawakami E; Department of Diabetes, Metabolism and Endocrinology, Tohoku University Graduate School of Medicine, Miyagi, Japan.
Shimabukuro M; Department of AI and Innovative Medicine, Tohoku University School of Medicine, Miyagi, Japan.

Diabetologia; 2024 Aug 21.

Article in En | MEDLINE | ID: mdl-39168869

ABSTRACT

AIMS/HYPOTHESIS:

Clustering-based subclassification of type 2 diabetes, which reflects pathophysiology and genetic predisposition, is a promising approach for providing personalised and effective therapeutic strategies. Ahlqvist's classification is currently the most vigorously validated method because of its superior ability to predict diabetes complications, but it lacks strong consistency over time and requires HOMA2 indices, which are not routinely available in clinical practice or standard cohort studies. We developed a machine learning (ML) model to classify individuals with type 2 diabetes into Ahlqvist's subtypes consistently over time.

METHODS:

The Cohort 1 dataset comprised 619 Japanese individuals with type 2 diabetes, who were divided into training and test sets for ML models in a 7:3 ratio. The Cohort 2 dataset, comprising 597 individuals with type 2 diabetes, was used for external validation. Participants were pre-labelled (T2Dkmeans) by unsupervised k-means clustering based on Ahlqvist's variables (age at diagnosis, BMI, HbA1c, HOMA2-B and HOMA2-IR) into four subtypes: severe insulin-deficient diabetes (SIDD), severe insulin-resistant diabetes (SIRD), mild obesity-related diabetes (MOD) and mild age-related diabetes (MARD). We adopted 15 variables for a multiclass classification random forest (RF) algorithm to predict type 2 diabetes subtypes (T2DRF15). The proximity matrix computed by RF was visualised using uniform manifold approximation and projection. Finally, we used a putative subset with missing insulin-related variables to test the predictive performance in the validation cohort, the consistency of subtypes over time and the ability to predict diabetes complications.
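The predictive performance described above is assessed by comparing the subtypes predicted by the classifier against the k-means pre-labels. The following is a minimal sketch, using only the Python standard library, of how accuracy and a per-class F1 score (the harmonic mean of precision and recall) could be computed for the four subtypes. The function names and the toy labels are assumptions made for illustration; they are not taken from the study and do not reproduce its data or code.

```python
# Illustrative sketch only: toy labels, not study data.
SUBTYPES = ("SIDD", "SIRD", "MOD", "MARD")


def accuracy(true_labels, predicted_labels):
    """Fraction of individuals whose predicted subtype equals the pre-label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def f1_per_class(true_labels, predicted_labels, classes=SUBTYPES):
    """Per-class F1 score, i.e. the harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            scores[c] = 2 * precision * recall / (precision + recall)
        else:
            scores[c] = 0.0
    return scores


# Hypothetical pre-labels (k-means) and predictions (random forest).
kmeans_labels = ["SIDD", "SIDD", "SIRD", "SIRD", "MOD", "MOD", "MARD", "MARD"]
rf_labels = ["SIDD", "SIDD", "SIRD", "MARD", "MOD", "MOD", "MARD", "MARD"]

acc = accuracy(kmeans_labels, rf_labels)
f1 = f1_per_class(kmeans_labels, rf_labels)
print(acc, f1)
```

In this toy example one SIRD individual is predicted as MARD, which lowers recall for SIRD and precision for MARD while leaving the other two classes unaffected.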

RESULTS:

T2DRF15 demonstrated a 94% accuracy for predicting T2Dkmeans type 2 diabetes subtypes (AUCs ≥0.99 and F1 score [an indicator calculated by harmonic mean from precision and recall] ≥0.9) and retained the predictive performance in the external validation cohort (86.3%). T2DRF15 showed an accuracy of 82.9% for detecting T2Dkmeans, also in a putative subset with missing insulin-related variables, when used with an imputation algorithm. In Kaplan-Meier analysis, the diabetes clusters of T2DRF15 demonstrated distinct accumulation risks of diabetic retinopathy in SIDD and that of chronic kidney disease in SIRD during a median observation period of 11.6 (4.5-18.3) years, similarly to the subtypes using T2Dkmeans. The predictive accuracy was improved after excluding individuals with low predictive probability, who were categorised as an 'undecidable' cluster. T2DRF15, after excluding undecidable individuals, showed higher consistency (100% for SIDD, 68.6% for SIRD, 94.4% for MOD and 97.9% for MARD) than T2Dkmeans.

CONCLUSIONS/INTERPRETATION:

The new ML model for predicting Ahlqvist's subtypes of type 2 diabetes has great potential for application in clinical practice and cohort studies because it can classify individuals with missing HOMA2 indices and predict glycaemic control, diabetic complications and treatment outcomes with long-term consistency by using readily available variables. Future studies are needed to assess whether our approach is applicable to research and/or clinical practice in multiethnic populations.
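The long-term consistency reported in the Results was computed after individuals with low predictive probability were set aside as an 'undecidable' cluster. A minimal sketch of that idea, using only the Python standard library, is given here. The probability threshold of 0.5, the function names, the definition of consistency as the share of individuals keeping the same subtype, and the toy probabilities are all assumptions for illustration; the abstract does not state the cut-off or the exact calculation used.

```python
# Illustrative sketch only: threshold and probabilities are hypothetical.
def assign_subtype(probabilities, threshold=0.5):
    """Return the most probable subtype, or 'undecidable' if it is below threshold."""
    best = max(probabilities, key=probabilities.get)
    return best if probabilities[best] >= threshold else "undecidable"


def consistency(baseline, follow_up):
    """Share of individuals with the same subtype at both time points.

    Individuals labelled 'undecidable' at either time point are excluded.
    Returns None when no individual remains after exclusion.
    """
    pairs = [(b, f) for b, f in zip(baseline, follow_up)
             if "undecidable" not in (b, f)]
    if not pairs:
        return None
    return sum(b == f for b, f in pairs) / len(pairs)


# Hypothetical class probabilities for three individuals at two time points.
baseline_probs = [
    {"SIDD": 0.90, "SIRD": 0.05, "MOD": 0.03, "MARD": 0.02},
    {"SIDD": 0.30, "SIRD": 0.30, "MOD": 0.20, "MARD": 0.20},
    {"SIDD": 0.10, "SIRD": 0.10, "MOD": 0.70, "MARD": 0.10},
]
follow_up_probs = [
    {"SIDD": 0.80, "SIRD": 0.10, "MOD": 0.05, "MARD": 0.05},
    {"SIDD": 0.10, "SIRD": 0.20, "MOD": 0.10, "MARD": 0.60},
    {"SIDD": 0.10, "SIRD": 0.10, "MOD": 0.20, "MARD": 0.60},
]

baseline = [assign_subtype(p) for p in baseline_probs]
follow_up = [assign_subtype(p) for p in follow_up_probs]
print(baseline, follow_up, consistency(baseline, follow_up))
```

Raising the threshold moves more individuals into the 'undecidable' group, which trades coverage for confidence in the remaining assignments.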

Keywords

Clustering; Diabetes subtypes; Machine learning; Random forest; Type 2 diabetes

Full text: 1 Database: MEDLINE Language: En Year of publication: 2024 Document type: Article