Machine-learning model for predicting depression in second-hand smokers in cross-sectional data using the Korea National Health and Nutrition Examination Survey.

Kim, Na Hyun; Kim, Myeongju; Han, Jong Soo; Sohn, Hyoju; Oh, Bumjo; Lee, Ji Won; Ahn, Sumin

Affiliation

Kim NH; Health Promotion Center, Seoul National University Bundang Hospital, Seongnam, South Korea.
Kim M; Center for Artificial Intelligence in Healthcare, Seoul National University Bundang Hospital Healthcare Innovation Park, Seongnam, South Korea.
Han JS; Health Promotion Center, Seoul National University Bundang Hospital, Seongnam, South Korea.
Sohn H; Center for Artificial Intelligence in Healthcare, Seoul National University Bundang Hospital Healthcare Innovation Park, Seongnam, South Korea.
Oh B; Department of Family Medicine, SMG-SNU Boramae Medical Center, Seoul, Republic of Korea.
Lee JW; Department of Urology, Seoul National University Bundang Hospital, Seongnam, South Korea.
Ahn S; Department of Digital Healthcare, Seoul National University Bundang Hospital, Seongnam, South Korea.

Digit Health; 10: 20552076241257046, 2024.

Article in English | MEDLINE | ID: mdl-38784054

ABSTRACT

Objective:

Depression among non-smokers at risk of second-hand smoke (SHS) exposure has been a neglected public health concern despite their vulnerability. The objective of this study was to develop high-performance machine-learning (ML) models for the prediction of depression in non-smokers and to identify important predictors of depression for second-hand smokers.

Methods:

ML models were developed using demographic and clinical data from Korea National Health and Nutrition Examination Survey (KNHANES) participants from 2014, 2016, and 2018 (N = 11,463). Depression was defined as a total score of 10 or higher on the Patient Health Questionnaire. The final model was selected according to the area under the curve (AUC) or sensitivity. Shapley additive explanations (SHAP) were used to identify influential features.
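The outcome definition and the evaluation metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy scores are assumptions introduced here for illustration only. The cutoff of 10 comes from the abstract.

```python
from typing import List, Sequence, Tuple

PHQ_CUTOFF = 10  # from the abstract: a total score of 10 or higher


def label_depression(phq_totals: Sequence[int]) -> List[int]:
    """Return 1 where the questionnaire total meets the cutoff, else 0."""
    return [1 if total >= PHQ_CUTOFF else 0 for total in phq_totals]


def sensitivity_and_ppv(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, ppv


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
phq_totals = [3, 12, 10, 7, 15, 9]
model_scores = [0.2, 0.8, 0.4, 0.6, 0.9, 0.1]

y_true = label_depression(phq_totals)
y_pred = [1 if s >= 0.5 else 0 for s in model_scores]  # assumed threshold
print(sensitivity_and_ppv(y_true, y_pred), auc(y_true, model_scores))
```

Sensitivity and PPV depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason different models can lead on different metrics.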

Results:

The light gradient boosting machine (LGBM), which had the highest positive predictive value (PPV; 0.646), was selected as the best model among the ML algorithms, whereas the support vector machine (SVM) had the highest AUC (0.900). The most influential factor identified using the LGBM was stress perception, followed by subjective health status and quality of life. Among the smoking-related features, urine cotinine level was the most important, and no linear relationship existed between the smoking-related features and their SHAP values.
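The feature ranking above rests on Shapley values. The sketch below shows the underlying idea by brute force on a toy function. It is an illustration under stated assumptions, not the study's method: the SHAP library uses faster, model-specific algorithms for tree ensembles, and the toy model, baseline, and feature names here are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Tuple


def shapley_values(predict: Callable[[List[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values for one instance. Features outside a
    coalition are replaced by their baseline value."""
    n = len(x)

    def value(coalition: frozenset) -> float:
        return predict([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def rank_features(names: Sequence[str],
                  rows: Sequence[Sequence[float]],
                  predict: Callable[[List[float]], float],
                  baseline: Sequence[float]) -> List[Tuple[str, float]]:
    """Rank features by mean absolute Shapley value across instances."""
    totals = [0.0] * len(names)
    for row in rows:
        for i, phi_i in enumerate(shapley_values(predict, row, baseline)):
            totals[i] += abs(phi_i)
    means = [t / len(rows) for t in totals]
    return sorted(zip(names, means), key=lambda pair: pair[1], reverse=True)


def toy_model(z: List[float]) -> float:
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * z[0] + z[1] * z[2]


# Hypothetical feature names and data, for illustration only.
names = ["stress", "health_status", "cotinine"]
rows = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0]]
print(rank_features(names, rows, toy_model, [0.0, 0.0, 0.0]))
```

Because the interaction term makes a feature's contribution depend on the other features, the attribution for a single feature need not vary linearly with that feature's value.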

Conclusions:

Compared with the previously developed ML models, our LGBM models achieved excellent and even superior performance in predicting depression among non-smokers at risk of SHS exposure, suggesting potential goals for depression-preventive interventions for non-smokers during public health crises.

Keywords

Depression; machine learning; risk factor; second-hand smoke

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Language: English | Publication year: 2024 | Document type: Article