A multi-resolution ensemble model of three decision-tree-based algorithms to predict daily NO2 concentration in France 2005-2022.
Environ Res; 257: 119241, 2024 Sep 15.
Article | En | MEDLINE | ID: mdl-38810827
ABSTRACT
Understanding and managing the health effects of nitrogen dioxide (NO2) requires high-resolution spatiotemporal exposure maps. Here, we developed a multi-stage, multi-resolution ensemble model that predicts daily NO2 concentrations across continental France from 2005 to 2022. Innovations of this work include the computation of daily predictions at a 200 m resolution in large urban areas and the use of a spatiotemporal blocking procedure to avoid data leakage and ensure fair performance estimation. Predictions were obtained after three cascading stages of modeling: (1) predicting NO2 total column density from the Ozone Monitoring Instrument satellite; (2) predicting daily NO2 concentrations at a 1 km spatial resolution using a large set of potential predictors, such as the predictions obtained from stage 1, land-cover data and road traffic data; and (3) predicting residuals from the stage 2 models at a 200 m resolution in large urban areas. The latter two stages used a generalized additive model to ensemble the predictions of three decision-tree-based algorithms (random forest, extreme gradient boosting and categorical boosting). The ensemble models performed well, with a ten-fold cross-validated R2 of 0.83 for the 1 km model and 0.69 for the 200 m model. All three base learners contributed to the ensemble predictions, to degrees that varied over time and space. In sum, our multi-stage approach predicted daily NO2 concentrations with relatively low error. Ensembling the predictions increases the chance of obtaining accurate values when one base learner fails in a specific area or at a particular time, because the ensemble can rely on the other learners. To the best of our knowledge, this is the first study to predict NO2 concentrations in France with such a high spatiotemporal resolution, large spatial extent, and long temporal coverage. The exposure estimates are available for investigating NO2 health effects in epidemiological studies.
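The spatiotemporal blocking idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the block dimensions or the fold-assignment rule, so everything below is an assumption made for illustration: the function names `block_id` and `blocked_folds`, the 50 km block size, the 30-day block length, and the round-robin assignment of shuffled blocks to folds are all hypothetical and are not the authors' procedure. The point is only that whole blocks, rather than individual station-days, are assigned to folds, so that nearby observations in space and time never straddle the train/test split.

```python
import random


def block_id(x_m, y_m, day, block_size_m=50_000, block_days=30):
    """Map one observation to a spatiotemporal block (hypothetical sizes).

    x_m, y_m: projected coordinates in metres; day: day index since start.
    """
    return (int(x_m // block_size_m), int(y_m // block_size_m), int(day // block_days))


def blocked_folds(samples, n_folds=10, seed=0):
    """Return one fold index per sample, assigning whole blocks to folds.

    samples: iterable of (x_m, y_m, day) tuples.
    """
    samples = list(samples)
    # Collect the distinct blocks, then shuffle them reproducibly.
    blocks = sorted({block_id(*s) for s in samples})
    random.Random(seed).shuffle(blocks)
    # Round-robin assignment keeps the number of blocks per fold balanced.
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    return [fold_of_block[block_id(*s)] for s in samples]


if __name__ == "__main__":
    # Synthetic station-days on a regular grid (illustrative data only).
    demo = [(x * 10_000.0, y * 10_000.0, d)
            for x in range(20) for y in range(20) for d in range(0, 120, 15)]
    folds = blocked_folds(demo, n_folds=10, seed=1)
    print(len(demo), "samples assigned to", len(set(folds)), "folds")
```

In a cross-validation loop, fold k would serve as the held-out set while the remaining folds are used for fitting; because all samples in a block share a fold, a model cannot be scored on a station-day whose immediate neighbours in space and time were seen during training.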
Main subject: Algorithms / Decision Trees / Air Pollutants / Nitrogen Dioxide
Country/Region as subject: Europe