Explainable machine-learning algorithms to differentiate bipolar disorder from major depressive disorder using self-reported symptoms, vital signs, and blood-based markers.

Zhu, Ting; Liu, Xiaofei; Wang, Junren; Kou, Ran; Hu, Yao; Yuan, Minlan; Yuan, Cui; Luo, Li; Zhang, Wei

Affiliation

Zhu T; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China.
Liu X; Business School, Sichuan University, Chengdu, China.
Wang J; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China.
Kou R; Business School, Sichuan University, Chengdu, China.
Hu Y; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China.
Yuan M; Mental Health Center of West China Hospital, Sichuan University, Chengdu, China.
Yuan C; Sichuan Provincial Center for Mental Health, The Center of Psychosomatic Medicine of Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China.
Luo L; Business School, Sichuan University, Chengdu, China.
Zhang W; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Med-X Center for Informatics, Sichuan University, Chengdu, China; Mental Health Center of West China Hospital, Sichuan University, Chengdu, China. Electronic address: weizhanghx@163.com.

Comput Methods Programs Biomed; 240: 107723, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37480646

ABSTRACT

BACKGROUND AND OBJECTIVE: Because bipolar disorder (BD) and major depressive disorder (MDD) share genetic risk factors and have similar neuropsychological symptoms, the two are at high risk of misdiagnosis, which is associated with ineffective treatment and worse outcomes. We aimed to develop a machine learning (ML)-based diagnostic system, based on electronic medical records (EMR) data, that mimics the clinical reasoning of physicians in differentiating MDD from BD (especially BD depressive episodes) in patients about to be admitted to a hospital, and thereby reduces the misdiagnosis of BD as MDD on admission. We also examined to what extent the ML model could be made interpretable by quantifying and visualizing the features that drive its predictions.

METHODS: We identified 16,311 patients admitted to a hospital in western China between 2009 and 2018 with a recorded main diagnosis of MDD or BD, and established three sub-cohorts with different combinations of features for each of the MDD-BD cohort and the MDD-BD depressive episodes cohort. Four ML algorithms (logistic regression, extreme gradient boosting (XGBoost), random forest, and support vector machine) and four train-test splits were used to train and validate the diagnostic models. Explainable methods (SHAP and Break Down) were used to analyze the contribution of each feature at both the population level and the individual level, including feature importance, feature interaction, and feature effect on the prediction for a specific subject.

RESULTS: The XGBoost algorithm provided the best test performance (AUC: 0.838 (0.810-0.867), PPV: 0.810, and NPV: 0.834) for separating patients with BD from those with MDD. Core predictors included symptoms (mood-up, exciting, bad sleep, loss of interest, talking, mood-down, provoke), along with age, job, myocardial enzyme markers (creatine kinase, hydroxybutyrate dehydrogenase), a diabetes-associated marker (glucose), a bone function marker (alkaline phosphatase), a non-enzymatic antioxidant (uric acid), immune/inflammation markers (white blood cell count, lymphocyte count, basophil percentage, monocyte count), a cardiovascular function marker (low density lipoprotein), a renal marker (total protein), a liver biochemistry marker (indirect bilirubin), and vital signs such as pulse. For separating patients with BD depressive episodes from those with MDD, the test AUC was 0.777 (0.732-0.822), with PPV 0.576 and NPV 0.899. Additional validation in models built with self-reported symptoms removed from the feature set showed a test AUC of 0.701 (0.666-0.736) for differentiating BD and MDD, and an AUC of 0.564 (0.515-0.614) for distinguishing patients in BD depressive episodes from MDD patients. Validation in the datasets without removing the patients with comorbidity showed an AUC of 0.826 (0.806-0.846).

CONCLUSION: The diagnostic system accurately identified patients with BD in various clinical scenarios, and differences in patterns of peripheral markers between BD and MDD could enrich our understanding of the potential underlying pathophysiological mechanisms of the two disorders.
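As a reading aid for the metrics reported above, the following minimal sketch shows one common way to compute AUC, PPV, and NPV from predicted probabilities, together with an interval around the AUC. Everything in it is an assumption for illustration: the toy `labels` and `scores`, the 0.5 decision threshold, and the percentile bootstrap are not taken from the study, and the abstract does not state how its intervals were obtained.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ppv_npv(labels, scores, threshold=0.5):
    """Positive and negative predictive value at a fixed threshold (0.5 is an assumption)."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return ppv, npv

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample needs both classes for the AUC to be defined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data: 1 = BD, 0 = MDD; scores are model-predicted probabilities of BD.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

point_auc = auc(labels, scores)
ppv, npv = ppv_npv(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

Because PPV and NPV depend on both the chosen threshold and the proportion of BD cases in the cohort, values such as the lower PPV reported for the BD depressive episodes cohort are best read alongside that cohort's class balance.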

Subject(s)

Bipolar Disorder; Depressive Disorder, Major; Humans; Depressive Disorder, Major/diagnosis; Bipolar Disorder/diagnosis; Self Report; Algorithms; Heart Rate

Keywords

Bipolar disorder; Electronic medical records; Machine learning; Major depressive disorder; SHAP and Break Down; diagnostic markers

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Health context: 1_ASSA2030 | Health problem: 1_sistemas_informacao_saude | Main subject: Bipolar Disorder / Depressive Disorder, Major | Type of study: Diagnostic_studies / Prognostic_studies / Risk_factors_studies | Limits: Humans | Language: En | Journal: Comput Methods Programs Biomed | Journal subject: MEDICAL INFORMATICS | Year: 2023 | Document type: Article | Country of affiliation: China
