Using Data-Driven Algorithms with Large-Scale Plasma Proteomic Data to Discover Novel Biomarkers for Diagnosing Depression.

Ma, Simeng; Li, Ruiling; Gong, Qian; Lv, Honggang; Deng, Zipeng; Wang, Beibei; Yao, Lihua; Kang, Lijun; Xiang, Dan; Yang, Jun; Liu, Zhongchun

Affiliation

Ma S; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Li R; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Gong Q; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Lv H; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Deng Z; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Wang B; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Yao L; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Kang L; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Xiang D; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.
Yang J; School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China.
Liu Z; Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China.

J Proteome Res; 23(9): 4043-4054, 2024 Sep 06.

Article in English | MEDLINE | ID: mdl-39150755

ABSTRACT

Recent technological advances in proteomics make it possible to quantify plasma proteomes in large patient cohorts, both to screen for biomarkers and to guide the early diagnosis and treatment of depression. Here we used CatBoost machine learning to model and discover biomarkers of depression in UK Biobank data sets (depression n = 4,479, healthy control n = 19,821). Models were built with CatBoost and interpreted with Shapley Additive Explanations (SHAP). Model performance was assessed by 5-fold cross-validation, and diagnostic efficacy was evaluated by the area under the receiver operating characteristic curve (AUC). A total of 45 depression-related proteins were screened from the top 20 most important features output by the CatBoost model across six data sets. Among the nine diagnostic models for depression, adding proteomic data improved on the traditional risk factor model, with the best model achieving an average AUC of 0.764 in the test sets. KEGG pathway analysis of the 45 screened proteins showed that the most significant pathway involved was cytokine-cytokine receptor interaction. Exploring diagnostic biomarkers of depression with data-driven machine learning methods and large-scale data sets is feasible, although the results require validation.
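The evaluation scheme named in the abstract (5-fold cross-validation scored by AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, and the CatBoost model is replaced by a hypothetical stand-in `fit` callable that returns a scoring function. The names `roc_auc`, `stratified_folds`, and `cross_validated_auc` are assumptions introduced here for illustration.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Scorer = Callable[[Row], float]
Fit = Callable[[List[Row], List[int]], Scorer]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_folds(labels: Sequence[int], k: int = 5) -> List[List[int]]:
    """Deal the indices of each class round-robin into k folds.

    This keeps the case/control ratio roughly equal across folds.
    """
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        for j, idx in enumerate(members):
            folds[j % k].append(idx)
    return folds


def cross_validated_auc(X: List[Row], y: List[int], fit: Fit, k: int = 5) -> float:
    """Train on k-1 folds, score the held-out fold, and average the k AUCs."""
    aucs = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = [model(X[i]) for i in test_idx]
        aucs.append(roc_auc([y[i] for i in test_idx], scores))
    return sum(aucs) / len(aucs)


def fit_first_feature(X_train: List[Row], y_train: List[int]) -> Scorer:
    """Hypothetical stand-in model: ignores training data, scores by feature 0."""
    return lambda row: row[0]
```

In a real analysis the `fit` callable would wrap an actual classifier (for example a gradient-boosted model) and return its predicted probability for the positive class; the averaging over held-out folds stays the same.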

Subject(s)

Biomarkers; Depression; Machine Learning; Proteomics; Humans; Biomarkers/blood; Proteomics/methods; Depression/blood; Depression/diagnosis; Algorithms; ROC Curve; Area Under Curve; Proteome/analysis; Proteome/metabolism; Blood Proteins/analysis; Blood Proteins/metabolism; Male; Female

Keywords

Biomarkers; CatBoost; Depression; Proteomic

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Biomarkers / Proteomics / Depression / Machine Learning Limits: Female / Humans / Male Language: En Journal: J Proteome Res Journal subject: BIOCHEMISTRY Year: 2024 Document type: Article Country of affiliation: China Country of publication: United States
