Performance of Prediction Algorithms for Modeling Outdoor Air Pollution Spatial Surfaces.

Kerckhoffs, Jules; Hoek, Gerard; Portengen, Lützen; Brunekreef, Bert; Vermeulen, Roel C H

Affiliation

Kerckhoffs J; Institute for Risk Assessment Sciences (IRAS), Division of Environmental Epidemiology, Utrecht University, 3584 CK Utrecht, The Netherlands.
Hoek G; Institute for Risk Assessment Sciences (IRAS), Division of Environmental Epidemiology, Utrecht University, 3584 CK Utrecht, The Netherlands.
Portengen L; Institute for Risk Assessment Sciences (IRAS), Division of Environmental Epidemiology, Utrecht University, 3584 CK Utrecht, The Netherlands.
Brunekreef B; Institute for Risk Assessment Sciences (IRAS), Division of Environmental Epidemiology, Utrecht University, 3584 CK Utrecht, The Netherlands.
Vermeulen RCH; Julius Center for Health Sciences and Primary Care, University Medical Center, University of Utrecht, 3584 CK Utrecht, The Netherlands.

Environ Sci Technol; 53(3): 1413-1421, 2019 02 05.

Article in English | MEDLINE | ID: mdl-30609353

ABSTRACT

Land use regression (LUR) models for air pollutants are often developed using multiple linear regression techniques. However, in the past decade linear (stepwise) regression methods have been criticized for their lack of flexibility, their neglect of potential interactions between predictors, and their limited ability to incorporate highly correlated predictors. We used two training sets of ultrafine particle (UFP) data, mobile measurements (8200 segments, 25 s of monitoring per segment) and short-term stationary measurements (368 sites, 3 × 30 min per site), to evaluate different modeling approaches for estimating long-term UFP concentrations. Precision and bias were assessed against an independent external data set (42 sites, average of three 24-h measurements). Higher training-data R² did not equate to higher test R² for the external long-term average exposure estimates, which shows that external validation data are critical for comparing model performance. Machine learning algorithms trained on mobile measurements explained only 38-47% of the variability in external UFP concentrations, whereas multivariable methods such as stepwise regression and elastic net explained 56-62%. Some machine learning algorithms (bagging, random forest) trained on short-term measurements explained modestly more variability in external UFP concentrations than multiple linear regression and regularized regression techniques. In conclusion, differences in the predictive ability of algorithms depend on the type of training data and are generally modest.
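The point that a higher training R² need not carry over to external validation can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the single "traffic" predictor is hypothetical, and a 1-nearest-neighbour model stands in for a very flexible learner. The only feature borrowed from the abstract is the design, in which models are fitted on many noisy short-term measurements and scored against a small, less noisy external set.

```python
import random

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return lambda v: intercept + slope * v

def fit_nearest_neighbour(x, y):
    """1-nearest-neighbour: memorises the training data (deliberately flexible)."""
    pairs = list(zip(x, y))
    return lambda v: min(pairs, key=lambda p: abs(p[0] - v))[1]

def simulate(n, noise_sd, rng):
    """Synthetic data: concentration rises linearly with a hypothetical predictor."""
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [5.0 + 2.0 * v + rng.gauss(0.0, noise_sd) for v in x]
    return x, y

rng = random.Random(0)
x_train, y_train = simulate(300, 6.0, rng)  # assumed: noisy short-term training data
x_ext, y_ext = simulate(42, 1.0, rng)       # assumed: less noisy external averages

results = {}
for name, fit in [("linear", fit_linear), ("1-NN", fit_nearest_neighbour)]:
    model = fit(x_train, y_train)
    results[name] = (
        r_squared(y_train, [model(v) for v in x_train]),  # training R2
        r_squared(y_ext, [model(v) for v in x_ext]),      # external R2
    )
    print(f"{name}: train R2 = {results[name][0]:.2f}, "
          f"external R2 = {results[name][1]:.2f}")
```

In this toy setting the memorising model reproduces its training data perfectly yet predicts the external sites worse than the simple linear fit, because it has learned the measurement noise along with the signal. The actual size of any such gap depends on the data and algorithms used.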

Subjects

Air Pollutants; Air Pollution; Algorithms; Environmental Monitoring; Particulate Matter

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Air Pollutants / Air Pollution Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: Environ Sci Technol Publication year: 2019 Document type: Article Country of affiliation: Netherlands