Urban ozone variability using automated machine learning: inference from different feature importance schemes.

Nath, Sankar Jyoti; Girach, Imran A; Harithasree, S; Bhuyan, Kalyan; Ojha, Narendra; Kumar, Manish

Affiliation

Nath SJ; Centre for Environment and Energy Development, Ranchi, 834001, India.
Girach IA; Space Applications Centre, Indian Space Research Organisation, Ahmedabad, 380015, India. imran.girach@gmail.com.
Harithasree S; Physical Research Laboratory, Ahmedabad, 380009, India.
Bhuyan K; Indian Institute of Technology, Gandhinagar, 382055, Gujarat, India.
Ojha N; Centre for Atmospheric Studies, Dibrugarh University, Dibrugarh, 786004, India.
Kumar M; Physical Research Laboratory, Ahmedabad, 380009, India. ojha@prl.res.in.

Environ Monit Assess ; 196(4): 393, 2024 Mar 23.

Article in English | MEDLINE | ID: mdl-38520559

ABSTRACT

Tropospheric ozone is an air pollutant at ground level and a greenhouse gas that contributes significantly to global warming. Strong anthropogenic emissions in and around urban environments enhance surface ozone pollution, adversely affecting human health and vegetation. However, observations are often scarce, and the factors driving ozone variability remain uncertain in the developing regions of the world. We therefore conducted machine learning (ML) simulations of ozone variability and comprehensively examined the governing factors over a major urban environment (Ahmedabad) in western India. Ozone precursors (NO2, NO, CO, C5H8 and CH2O) from the CAMS (Copernicus Atmosphere Monitoring Service) reanalysis and meteorological parameters from ERA5 (the fifth-generation reanalysis of the European Centre for Medium-Range Weather Forecasts, ECMWF) were included as features in the ML models. Automated ML (AutoML) fitted the deep learning model optimally and simulated daily ozone with a root mean square error (RMSE) of ~2 ppbv, reproducing 84-88% of the variability. This performance is comparable to that of widely used ML models (RF, Random Forest, and XGBoost, eXtreme Gradient Boosting). Explainability of the models is discussed through different feature importance schemes, including SAGE (Shapley Additive Global importancE) and permutation importance. The leading features differ from one feature importance scheme to another. We show that urban ozone can be simulated well (RMSE = 2.5 ppbv and R² = 0.78) by considering the first four leading features, from different schemes, which are consistent with ozone photochemistry. Our study underscores the need to conduct science-informed analysis of feature importance from multiple schemes to infer the roles of input variables in ozone variability. AutoML-based studies that exploit the potential of long-term observations can strongly complement conventional chemistry-transport modelling and can also help in the accurate simulation and forecasting of urban ozone.
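
As an illustration of the permutation importance scheme named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes synthetic data and a plain least-squares linear model as a stand-in for the deep learning, Random Forest and XGBoost models used in the study; the three features, their coefficients and all function names are hypothetical. Each feature is shuffled in turn, and the resulting increase in RMSE is taken as its importance. For brevity the model is scored on the same data it was fitted to, whereas a real analysis would use held-out data.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination (fraction of variance reproduced)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Synthetic stand-in data (hypothetical): feature 0 dominates,
# feature 1 contributes weakly, feature 2 is irrelevant.
data_rng = np.random.default_rng(42)
n_samples = 500
X = data_rng.normal(size=(n_samples, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + data_rng.normal(scale=0.1, size=n_samples)

# Stand-in model: ordinary least squares with an intercept term.
design = np.column_stack([X, np.ones(n_samples)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

def predict(X_new):
    return np.column_stack([X_new, np.ones(len(X_new))]) @ coef

importances = permutation_importance(predict, X, y)
ranking = np.argsort(importances)[::-1]  # most important feature first
fit_r2 = r2(y, predict(X))
print("ranking:", ranking, "importances:", importances, "R2:", fit_r2)
```

Because the score depends on both the model and the metric, a different scheme (SAGE, for example) can rank the same features differently, which is the behaviour the abstract reports.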

Subjects

Air Pollutants; Air Pollution; Ozone; Humans; Ozone/analysis; Air Pollution/analysis; Environmental Monitoring; Air Pollutants/analysis; Machine Learning

Keywords

Air pollution; Air quality; Artificial intelligence; Atmospheric chemistry; AutoML; Machine learning; Meteorology; Modelling; Ozone; Precursors; Random Forest; XGBoost

Full text: 1 Database: MEDLINE Main subject: Ozone / Air Pollutants / Air Pollution Limit: Humans Language: En Publication year: 2024 Document type: Article