A multivariate multi-step LSTM forecasting model for tuberculosis incidence with model explanation in Liaoning Province, China.
Yang, Enbin; Zhang, Hao; Guo, Xinsheng; Zang, Zinan; Liu, Zhen; Liu, Yuanning.
Affiliation
  • Yang E; College of Computer Science and Technology, Jilin University, Changchun, 130012, China.
  • Zhang H; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China.
  • Guo X; College of Computer Science and Technology, Jilin University, Changchun, 130012, China.
  • Zang Z; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China.
  • Liu Z; College of Software, Jilin University, Changchun, 130012, China.
  • Liu Y; College of Computer Science and Technology, Jilin University, Changchun, 130012, China.
BMC Infect Dis; 22(1): 490, 2022 May 23.
Article in English | MEDLINE | ID: mdl-35606725
ABSTRACT

BACKGROUND:

Tuberculosis (TB) is the respiratory infectious disease with the highest incidence in China. We aim to design a series of forecasting models and find the factors that affect the incidence of TB, thereby improving the accuracy of the incidence prediction.

RESULTS:

In this paper, we developed a new interpretable prediction system based on the multivariate multi-step Long Short-Term Memory (LSTM) model and the SHapley Additive exPlanation (SHAP) method. Four accuracy measures are introduced into the system: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. An Autoregressive Integrated Moving Average (ARIMA) model and a seasonal ARIMA model are also established. A multi-step ARIMA-LSTM model is proposed for the first time, and the performance of each model is examined in the short, medium, and long term. Compared with the ARIMA model, the multivariate 2-step LSTM model reduces the four errors by 12.92%, 15.94%, 15.97%, and 14.81%, respectively, in the short term. The 3-step ARIMA-LSTM model achieved excellent performance, with each error decreased to 15.19%, 33.14%, 36.79%, and 29.76% in the medium and long term. We also provide, for the first time in the field of incidence prediction, local and global explanations of the multivariate single-step LSTM model.
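The four accuracy measures named above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the abstract does not state the exact formulas, so the sMAPE variant here (denominator equal to the mean of the absolute actual and predicted values) is an assumption, and the function names and sample numbers are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean Absolute Error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def smape(actual, predicted):
    """Symmetric MAPE, in percent (assumed variant: 2|a - p| / (|a| + |p|))."""
    n = len(actual)
    return 100.0 * sum(
        2.0 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)
    ) / n

# Illustrative values only; these are not data from the study.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
for name, fn in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("sMAPE", smape)]:
    print(f"{name}: {fn(observed, forecast):.3f}")
```

RMSE and MAE are expressed in the units of the series, while MAPE and sMAPE are relative measures, which is why reporting all four together gives a fuller picture of forecast error.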

CONCLUSIONS:

The multivariate 2-step LSTM model is suitable for short-term prediction and achieves performance similar to that of previous studies. The 3-step ARIMA-LSTM model is appropriate for medium-to-long-term prediction and outperforms these models. The SHAP results indicate that the five most crucial features are maximum temperature, average relative humidity, local financial budget, monthly sunshine percentage, and sunshine hours.
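To illustrate how a Shapley-based feature ranking arises, the sketch below computes exact Shapley values by enumerating every feature subset for a toy linear model. It rests on several assumptions: the model, its coefficients, the baseline, and the sample values are all hypothetical, "absent" features are replaced by a fixed baseline value, and brute-force enumeration is only feasible for a handful of features. The SHAP method applied to an LSTM relies on approximations instead, so this is not the authors' procedure.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample (a local explanation).

    Features outside the coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def global_importance(model, samples, baseline):
    """Mean absolute Shapley value per feature (a global explanation)."""
    n = len(baseline)
    sums = [0.0] * n
    for x in samples:
        for i, v in enumerate(shapley_values(model, x, baseline)):
            sums[i] += abs(v)
    return [s / len(samples) for s in sums]

# Hypothetical toy model with three features; coefficients are illustrative only.
def toy_model(z):
    return 2.0 * z[0] - 3.0 * z[1] + 0.5 * z[2] + 1.0

baseline = [0.0, 0.0, 0.0]
samples = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.0]]
print(shapley_values(toy_model, samples[0], baseline))
print(global_importance(toy_model, samples, baseline))
```

The per-sample values sum to the difference between the model output for that sample and the output at the baseline, and averaging their absolute values across samples gives the kind of global ranking that a "most crucial features" list reflects.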

Full text: 1 Database: MEDLINE Main subject: Tuberculosis Type of study: Incidence_studies / Prognostic_studies / Risk_factors_studies Limit: Humans Country as subject: Asia Language: En Publication year: 2022 Document type: Article
