A data-driven interpretable ensemble framework based on tree models for forecasting the occurrence of COVID-19 in the USA.
Zheng, Hu-Li; An, Shu-Yi; Qiao, Bao-Jun; Guan, Peng; Huang, De-Sheng; Wu, Wei.
Affiliation
  • Zheng HL; Department of Epidemiology, School of Public Health, China Medical University, No. 77 Puhe Road, Shenyang, Liaoning Province, China.
  • An SY; Liaoning Provincial Center for Disease Control and Prevention, Shenyang, Liaoning, China.
  • Qiao BJ; Liaoning Provincial Center for Disease Control and Prevention, Shenyang, Liaoning, China.
  • Guan P; Department of Epidemiology, School of Public Health, China Medical University, No. 77 Puhe Road, Shenyang, Liaoning Province, China.
  • Huang DS; Department of Mathematics, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China.
  • Wu W; Department of Epidemiology, School of Public Health, China Medical University, No. 77 Puhe Road, Shenyang, Liaoning Province, China. wuwei@cmu.edu.cn.
Environ Sci Pollut Res Int ; 30(5): 13648-13659, 2023 Jan.
Article in En | MEDLINE | ID: mdl-36131178
The prevalence of coronavirus disease 2019 (COVID-19) has become one of the most serious public health crises. Tree-based machine learning methods, which offer high efficiency and strong interpretability, have been widely used in disease prediction. A data-driven interpretable ensemble framework based on tree models was designed to forecast daily new cases of COVID-19 in the USA and to identify the important factors related to COVID-19. Using hyperparameter optimization, we developed three decision-tree-based machine learning models, namely random forest (RF), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), and used three linear ensemble models to combine their outputs for better prediction accuracy. Finally, SHapley Additive exPlanations (SHAP) values were used to rank feature importance. Our results showed that, among the three base learners, prediction accuracy ranked in descending order as LightGBM, XGBoost, and RF. The optimized LAD ensemble was the most accurate model, reducing the prediction error of the best base learner (LightGBM) by approximately 3.111%. Vaccination, mask wearing, reduced mobility, and government interventions had positive effects on the control and prevention of COVID-19.
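The abstract does not state how the linear ensemble weights were obtained. As a minimal sketch, assuming LAD stands for least absolute deviation and that the weights are non-negative and sum to one, the idea of linearly combining three base learners could look like the following. The function name `lad_ensemble_weights`, the grid-search optimizer, and all numbers are hypothetical illustrations, not the authors' implementation or data.

```python
from itertools import product


def lad_ensemble_weights(preds, y, step=0.05):
    """Grid-search convex weights that minimise the sum of absolute errors.

    preds: list of prediction lists, one inner list per base learner.
    y:     list of observed values, same length as each inner list.
    Returns (weights, loss). Assumption: weights are >= 0 and sum to 1.
    """
    n_models = len(preds)
    steps = round(1 / step)
    best_w, best_loss = None, float("inf")
    # Enumerate integer tuples that sum to `steps`, i.e. weights summing to 1.
    for combo in product(range(steps + 1), repeat=n_models - 1):
        last = steps - sum(combo)
        if last < 0:
            continue  # outside the simplex
        w = [c / steps for c in combo] + [last / steps]
        # Least-absolute-deviation loss of the weighted forecast.
        loss = sum(
            abs(obs - sum(wi * p[t] for wi, p in zip(w, preds)))
            for t, obs in enumerate(y)
        )
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Synthetic daily case counts and base-learner forecasts (made-up numbers).
y = [100, 120, 90, 150]
rf = [112, 131, 99, 163]
xgb = [93, 112, 84, 141]
lgbm = [103, 117, 93, 148]

weights, ensemble_loss = lad_ensemble_weights([rf, xgb, lgbm], y)
single_losses = [sum(abs(a - b) for a, b in zip(p, y)) for p in (rf, xgb, lgbm)]
```

Because each single-model weighting (for example all weight on LightGBM) is itself a point on the search grid, the ensemble's absolute error on the fitting data can never exceed that of the best base learner. In practice a linear-programming solver and held-out validation data would be more appropriate than a coarse grid.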

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: COVID-19 Type of study: Prognostic_studies / Risk_factors_studies Limits: Humans Country/Region as subject: North America Language: En Journal: Environ Sci Pollut Res Int Journal subject: Environmental Health / Toxicology Year: 2023 Type: Article Affiliation country: China
