An explainable artificial intelligence framework for risk prediction of COPD in smokers.
Wang, Xuchun; Qiao, Yuchao; Cui, Yu; Ren, Hao; Zhao, Ying; Linghu, Liqin; Ren, Jiahui; Zhao, Zhiyang; Chen, Limin; Qiu, Lixia.
Affiliations
  • Wang X; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Qiao Y; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Cui Y; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Ren H; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Zhao Y; Shanxi Centre for Disease Control and Prevention, Taiyuan, Shanxi, 030012, China.
  • Linghu L; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Ren J; Shanxi Centre for Disease Control and Prevention, Taiyuan, Shanxi, 030012, China.
  • Zhao Z; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Chen L; Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
  • Qiu L; The Fifth Hospital (Shanxi People's Hospital) of Shanxi Medical University, Taiyuan, Shanxi, 030012, P.R. China. sxchenlimin@163.com.
BMC Public Health; 23(1): 2164, 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-37932692
ABSTRACT

BACKGROUND:

Because the early signs of Chronic Obstructive Pulmonary Disease (COPD) are inconspicuous, affected individuals often remain unidentified and miss the opportunity for timely prevention and treatment. The purpose of this study was to create an explainable artificial intelligence framework that combines data preprocessing, machine learning, and model interpretability methods to identify smokers at high risk of COPD and to provide a reasonable interpretation of the model's predictions.

METHODS:

The data comprised questionnaire information, physical examination data, and the results of pulmonary function tests before and after bronchodilation. First, factorial analysis for mixed data (FAMD), Boruta, and NRSBoundary-SMOTE resampling were used to address missing data, high dimensionality, and class imbalance, respectively. Then, seven classification models (CatBoost, NGBoost, XGBoost, LightGBM, random forest, SVM, and logistic regression) were applied to model the risk level, and the decisions of the best machine learning (ML) model were explained using the Shapley additive explanations (SHAP) method and partial dependence plots (PDP).
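To make the resampling step concrete, the following is a minimal sketch of plain SMOTE-style oversampling, not the NRSBoundary-SMOTE variant used in the study (which additionally restricts where synthetic samples are generated). The function name `smote`, its parameters, and the toy (age, BMI) values are assumptions for illustration only and do not come from the paper.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by linear interpolation.

    Plain SMOTE idea only: pick a minority point, pick one of its k nearest
    minority neighbours, and place a new point on the segment between them.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Hypothetical toy data: (age, BMI) for a few minority-class (COPD) cases.
minority = [(62.0, 21.5), (68.0, 20.1), (71.0, 22.8), (65.0, 19.7)]
new_points = smote(minority, n_new=6)
```

Each synthetic point lies between two existing minority cases, so the minority class is enlarged without exact duplication. In practice such resampling would be applied to the training split only, so that evaluation data remain untouched.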

RESULTS:

Among smokers, age and 14 other variables were significant factors for predicting COPD. The CatBoost, random forest, and logistic regression models performed reasonably well on the imbalanced datasets. On the balanced datasets, CatBoost with NRSBoundary-SMOTE had the best classification performance when composite indicators (AUC, F1-score, and G-mean) were used as the model comparison criteria. Age, COPD Assessment Test (CAT) score, gross annual income, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), anhelation, respiratory disease, central obesity, use of polluting fuel for household heating, region, use of polluting fuel for household cooking, and wheezing were important factors for predicting COPD in the smoking population.
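The three composite indicators named above can be computed as in the sketch below. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity, AUC as the probability that a positive case is scored above a negative one); the abstract does not spell these out. The labels, scores, and the 0.5 threshold are hypothetical toy values, not study data.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(y_true, y_pred):
    tp, _, fp, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy example: 3 positives, 7 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

results = {
    "F1": f1_score(y_true, y_pred),
    "G-mean": g_mean(y_true, y_pred),
    "AUC": auc(y_true, scores),
}
```

Unlike plain accuracy, F1 and G-mean both penalise a model that ignores the minority class, which is why such indicators are commonly preferred for imbalanced problems.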

CONCLUSION:

This study combined feature screening methods, imbalanced data processing methods, and advanced machine learning methods to enable early identification of groups at risk of COPD in the smoking population. COPD risk factors in the smoking population were identified using SHAP and PDP, with the goal of providing theoretical support for targeted screening strategies and for self-management strategies among smokers.
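As an illustration of the idea behind SHAP, the sketch below computes exact Shapley values for a toy three-feature risk score by averaging each feature's marginal contribution over all feature orderings. It is a simplification under stated assumptions: a single baseline profile stands in for the background data that SHAP averages over, and `toy_risk`, its coefficients, and the example values are hypothetical and not taken from the study's model.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline profile."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]      # switch feature i from baseline to its actual value
            new = predict(current)
            phi[i] += new - prev   # marginal contribution of feature i in this ordering
            prev = new
    return [v / len(orders) for v in phi]


def toy_risk(features):
    """Hypothetical risk score with one interaction term (illustration only)."""
    age, cat_score, polluting_fuel = features
    return (0.02 * age + 0.05 * cat_score + 0.3 * polluting_fuel
            + 0.01 * cat_score * polluting_fuel)


x = [70, 12, 1]          # hypothetical smoker: age, CAT score, polluting fuel use
baseline = [50, 5, 0]    # hypothetical reference profile
phi = shapley_values(toy_risk, x, baseline)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that makes per-feature attributions of this kind straightforward to read.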

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Pulmonary Disease, Chronic Obstructive / Smokers Limits: Adolescent / Humans Language: En Journal: BMC Public Health Publication year: 2023 Document type: Article
