ABSTRACT
Air pollution in Chennai, a coastal metropolitan city in southern India, was examined in order to predict the Air Quality Index (AQI). Regular monitoring and prediction of the AQI are critical for combating air pollution. The current study developed machine learning models, namely XGBoost, Random Forest, BaggingRegressor, and LGBMRegressor, to predict the AQI using historical data from 2017 to 2022. According to the historical data, the AQI is highest in January, with a mean value of 104.6 g/gm, and lowest in August, with a mean value of 63.87 g/gm. Particulate matter, gaseous pollutants, and meteorological parameters were used to predict the AQI, and the generated heat map showed that, of all the parameters, PM2.5 has the greatest impact on the AQI, with a value of 0.91. The log transformation method was used to normalize the datasets and to determine skewness and kurtosis. The XGBoost model demonstrated strong performance, achieving an R² (coefficient of determination) of 0.9935, a mean absolute error (MAE) of 0.02, a mean square error (MSE) of 0.001, and a root mean square error (RMSE) of 0.04. In comparison, the LightGBM model was the least effective, attaining an R² of 0.9748. According to the study, the AQI in Chennai has been increasing over the last two years, and if the same conditions persist, the city's air pollution will worsen in the future. Furthermore, accurate predictions of future air quality levels can be made using historical data and advanced machine learning algorithms.
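The abstract mentions a log transformation together with skewness and kurtosis. The following is a minimal sketch of how a log transform can reduce right skew. It assumes population (biased) moment estimators and uses hypothetical readings, since the study's preprocessing code and data are not given here.

```python
import math


def skewness(values):
    """Third standardized moment (population form, no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Hypothetical right-skewed pollutant readings (NOT data from the study).
readings = [12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 30.0, 45.0, 80.0, 160.0]

# log1p(x) = log(1 + x); an assumed choice that also tolerates zero readings.
logged = [math.log1p(v) for v in readings]

print("skewness before:", round(skewness(readings), 3))
print("skewness after: ", round(skewness(logged), 3))
```

A skewness closer to zero after the transform indicates a more symmetric distribution, which is the usual motivation for applying it before fitting regression models.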
Subject(s)
Air Pollutants , Air Pollution , Climate Change , India , Machine Learning
ABSTRACT
Clean air is a critical component for the health and survival of humans and wildlife, as atmospheric pollution is associated with a number of significant diseases, including cancer. However, with rapid industrialization and population growth, activities such as transportation and household, agricultural, and industrial processes contribute to air pollution. As a result, air pollution has become a significant problem in many cities, especially in emerging countries like India. To maintain ambient air quality, regular monitoring and forecasting of air pollution are necessary. For that purpose, machine learning has emerged as a promising technique for predicting the Air Quality Index (AQI) compared to conventional methods. Here we apply AQI prediction to the city of Visakhapatnam, Andhra Pradesh, India, focusing on 12 contaminants and 10 meteorological parameters from July 2017 to September 2022. We employed several machine learning models, including LightGBM, Random Forest, Catboost, Adaboost, and XGBoost. The results show that the Catboost model outperformed the other models, with an R² (coefficient of determination) of 0.9998, a mean absolute error (MAE) of 0.60, a mean square error (MSE) of 0.58, and a root mean square error (RMSE) of 0.76. The Adaboost model had the least effective prediction, with an R² of 0.9753. In summary, machine learning is a promising technique for predicting the AQI, with Catboost being the best-performing model. Moreover, leveraging historical data and machine learning algorithms enables accurate predictions of future urban air quality levels on a global scale.
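The four scores quoted above are standard regression metrics. The sketch below shows how they are commonly defined, using small hypothetical values rather than the study's predictions. Note that RMSE is the square root of MSE, which is consistent with the reported pair (the square root of 0.58 is about 0.76).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical observed and predicted values (NOT data from the study).
observed = [3.0, -0.5, 2.0, 7.0]
predicted = [2.5, 0.0, 2.0, 8.0]

scores = regression_metrics(observed, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because MAE, MSE, and RMSE depend on the scale of the target, they are only comparable between models evaluated on the same, identically transformed target; R² is scale-free.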
Subject(s)
Air Pollutants , Air Pollution , Humans , Air Pollutants/analysis , Cities , Environmental Monitoring/methods , Air Pollution/analysis , Machine Learning , Particulate Matter/analysis
ABSTRACT
Coronavirus disease 2019 (COVID-19), which causes severe respiratory illness, has become a pandemic. The World Health Organization has declared it a public health crisis of international concern. We developed a susceptible-exposed-infected-recovered (SEIR) model for COVID-19 to show the importance of estimating the reproduction number (R0). This work focuses on predicting the COVID-19 outbreak in its early stage in India based on an estimate of R0. The developed model will help policymakers take active measures before COVID-19 spreads further. Data on daily newly infected cases in India from March 2, 2020 to April 2, 2020 were used to estimate R0 with the earlyR package. The maximum-likelihood approach was used to analyze the distribution of R0 values, and bootstrap resampling was applied to identify the most likely R0 value. We estimated the median value of R0 to be 1.471 (95% confidence interval [CI], 1.351 to 1.592) and predicted that the new case count may reach 39,382 (95% CI, 34,300 to 47,351) in 30 days.
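To illustrate how R0 enters an SEIR model, here is a minimal sketch using forward-Euler integration. Only the R0 value of 1.471 comes from the abstract; the incubation period, infectious period, population size, and initial case count are hypothetical placeholders, and this is not the authors' implementation.

```python
def simulate_seir(r0, population, initial_infected, days,
                  incubation_days=5.0, infectious_days=7.0, dt=0.1):
    """Integrate a basic SEIR model and return one (day, S, E, I, R) row per day.

    incubation_days and infectious_days are assumed values for illustration.
    """
    sigma = 1.0 / incubation_days   # rate of leaving the exposed compartment
    gamma = 1.0 / infectious_days   # recovery rate
    beta = r0 * gamma               # transmission rate implied by R0

    s = float(population - initial_infected)
    e, i, r = 0.0, float(initial_infected), 0.0
    history = [(0, s, e, i, r)]
    steps_per_day = round(1.0 / dt)

    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((day, s, e, i, r))
    return history


# Hypothetical population and seed cases; R0 taken from the abstract.
trajectory = simulate_seir(r0=1.471, population=1_000_000,
                           initial_infected=100, days=30)
day, s, e, i, r = trajectory[-1]
print(f"day {day}: S={s:.0f} E={e:.0f} I={i:.0f} R={r:.0f}")
```

In this formulation the outbreak grows only when R0 exceeds 1, which is why a reliable early estimate of R0 matters for planning interventions.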
Subject(s)
Basic Reproduction Number/statistics & numerical data , Coronavirus Infections/epidemiology , Disease Outbreaks , Viral Pneumonia/epidemiology , COVID-19 , Forecasting , Humans , India/epidemiology , Mathematical Computing , Pandemics
ABSTRACT
COVID-19 is an epidemic disease that causes respiratory infection. Forecasts of its spread can help policymakers take precautionary measures and control the epidemic. Two models were adopted for forecasting the daily newly registered cases of COVID-19: the 'earlyR' epidemic model and the ARIMA model. In the earlyR epidemic model, reported values of the COVID-19 serial interval, described by a gamma distribution, were used to estimate R0, and the 'projections' package was used to obtain epidemic trajectories from the existing COVID-19 data for India, the serial interval distribution, and the R0 value obtained for each state. The ARIMA model was developed using the 'auto.arima' function to select the values of (p, d, q), and the 'forecast' package was used to predict new infected cases. The evaluation shows that the ARIMA model gives better accuracy than the earlyR epidemic model.
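The trajectory step described above combines past incidence, a serial interval distribution, and R0. The sketch below shows the underlying renewal-equation idea in deterministic form: expected new cases equal R0 times past cases weighted by the serial interval. It is a simplification, not the 'projections' package itself, which simulates stochastic trajectories. The serial interval mean and standard deviation, the case counts, and the function names are hypothetical, and the gamma density is discretized by evaluating it at whole days.

```python
import math


def discretized_gamma_weights(mean, sd, max_days):
    """Approximate serial-interval weights for lags 1..max_days.

    Evaluates the (unnormalized) gamma density at whole days and rescales
    so the weights sum to 1; a crude stand-in for a proper discretization.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    raw = [day ** (shape - 1) * math.exp(-day / scale)
           for day in range(1, max_days + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def project_incidence(past_cases, r0, weights, horizon):
    """Project expected daily cases with the renewal equation."""
    series = [float(c) for c in past_cases]
    for _ in range(horizon):
        # Past incidence weighted by how likely each lag is to transmit now.
        infectiousness = sum(
            w * series[-lag]
            for lag, w in enumerate(weights, start=1)
            if lag <= len(series)
        )
        series.append(r0 * infectiousness)
    return series[len(past_cases):]


# Hypothetical inputs (NOT values from the study).
weights = discretized_gamma_weights(mean=5.0, sd=3.0, max_days=14)
recent_cases = [10] * 20
forecast = project_incidence(recent_cases, r0=1.471, weights=weights, horizon=7)
print([round(x, 1) for x in forecast])
```

An ARIMA model, by contrast, uses no epidemiological parameters at all: it fits autoregressive and moving-average terms to the differenced case series.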