ABSTRACT
Nitrogen dioxide (NO2) is a major air pollutant primarily emitted from traffic and industrial activities, posing health risks. However, current air pollution models often underestimate exposure risks by neglecting the bimodal pattern of NO2 levels throughout the day. This study aimed to address this gap by developing ensemble mixed spatial models (EMSM) using geo-artificial intelligence (Geo-AI) to examine the spatial and temporal variations of NO2 concentrations at a high resolution of 50 m. These EMSMs integrated spatial modelling methods, including kriging, land use regression, machine learning, and ensemble learning. The models utilized 26 years of observed NO2 measurements, meteorological parameters, geospatial layers, and social and season-dependent variables as representative of emission sources. Separate models were developed for daytime and nighttime periods, which achieved high reliability with adjusted R2 values of 0.92 and 0.93, respectively. The study revealed that mean NO2 concentrations were significantly higher at nighttime (9.60 ppb) compared to daytime (5.61 ppb). Additionally, winter exhibited the highest NO2 levels regardless of time period. The developed EMSMs were utilized to generate maps illustrating NO2 levels before and during COVID restrictions in Taiwan. These findings could aid epidemiological research on exposure risks and support policy-making and environmental planning initiatives.
Subject(s)
Air Pollutants , Air Pollution , Artificial Intelligence , Environmental Monitoring , Nitrogen Dioxide , Nitrogen Dioxide/analysis , Taiwan , Air Pollution/analysis , Air Pollutants/analysis , Environmental Monitoring/methods , Seasons
ABSTRACT
Dynamic gridded population data are crucial in fields such as disaster reduction, public health, urban planning, and global change studies. Despite the use of multi-source geospatial data and advanced machine learning models, current frameworks for population spatialization often struggle with spatial non-stationarity, temporal generalizability, and fine temporal resolution. To address these issues, we introduce a framework for dynamic gridded population mapping using open-source geospatial data and machine learning. The framework consists of (i) delineation of human footprint zones, (ii) construction of multi-scale population prediction models using an automated machine learning (AutoML) framework and a geographical ensemble learning strategy, and (iii) hierarchical population spatial disaggregation with pycnophylactic constraint-based corrections. Employing this framework, we generated hourly time-series gridded population maps for China in 2016 with a 1-km spatial resolution. The average accuracy evaluated by root mean square deviation (RMSD) is 325, surpassing datasets like LandScan, WorldPop, GPW, and GHSL. The generated seamless maps reveal the temporal dynamics of population distribution at fine spatial scales from hourly to monthly. This framework demonstrates the potential of integrating spatial statistics, machine learning, and geospatial big data in enhancing our understanding of spatio-temporal heterogeneity in population distribution, which is essential for urban planning, environmental management, and public health.
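As an illustration of step (iii) above, the following is a minimal sketch of a pycnophylactic (mass-preserving) correction together with an RMSD check. It is not the authors' implementation: the function names, the flat cell layout, and the toy zone totals are assumptions made for this example only.

```python
import numpy as np

def pycnophylactic_rescale(pred, zone_ids, zone_totals):
    """Rescale per-cell predictions so each zone sums to its known total.

    pred        : predicted population per grid cell (hypothetical model output)
    zone_ids    : zone label of each grid cell
    zone_totals : dict mapping zone label -> known (e.g. census) total
    """
    pred = np.asarray(pred, dtype=float)
    zone_ids = np.asarray(zone_ids)
    out = np.zeros_like(pred)
    for zone, total in zone_totals.items():
        mask = zone_ids == zone
        n_cells = int(mask.sum())
        if n_cells == 0:
            continue  # zone has no cells in this grid
        zone_sum = pred[mask].sum()
        if zone_sum > 0:
            # Keep the predicted spatial pattern, fix the zone total.
            out[mask] = pred[mask] * total / zone_sum
        else:
            # No predicted signal: spread the total evenly (an assumption).
            out[mask] = total / n_cells
    return out

def rmsd(estimate, reference):
    """Root mean square deviation between two equally sized arrays."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy example: six cells in three zones.
corrected = pycnophylactic_rescale(
    pred=[1, 3, 2, 2, 0, 0],
    zone_ids=[0, 0, 1, 1, 2, 2],
    zone_totals={0: 100, 1: 50, 2: 10},
)
```

After the correction, the cells of each zone add up exactly to that zone's total, which is the defining property of a pycnophylactic constraint.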
ABSTRACT
Recent advances in data science and urban environmental health research utilise large-scale databases (100s-1000s of cities) to explore the complex interplay of urban characteristics such as city form and size, climate, mobility, exposure, and environmental health impacts. Cities are still hotspots of air pollution and noise, suffer urban heat island effects and lack of green space, which leads to disease and mortality burdens preventable with better knowledge. Better understanding through harmonising and analysing data in large numbers of cities is essential to identifying the most effective means of disease prevention and understanding context dependencies important for policy.
ABSTRACT
Recent advances in unmanned aerial vehicles (UAV), mini and mobile sensors, and GeoAI (a blend of geospatial and artificial intelligence (AI) research) are the main highlights among agricultural innovations to improve crop productivity and thus secure vulnerable food systems. This study investigated the versatility of UAV-borne multisensory data fusion within a framework of multi-task deep learning for high-throughput phenotyping in maize. Data from UAVs equipped with a set of miniaturized sensors, including hyperspectral, thermal, and LiDAR, were collected over an experimental corn field in Urbana, IL, USA during the growing season. A full suite of eight phenotypes was measured in situ at the end of the season for ground truth data, specifically, dry stalk biomass, cob biomass, dry grain yield, harvest index, grain nitrogen utilization efficiency (Grain NutE), grain nitrogen content, total plant nitrogen content, and grain density. After being funneled through a series of radiometric calibrations and geo-corrections, the aerial data were analytically processed in three primary approaches. First, an extended version of the normalized difference spectral index (NDSI) served as a simple arithmetic combination of different data modalities to explore the degree of correlation with maize phenotypes. The extended NDSI analysis revealed that the NIR spectra (750-1000 nm) alone were strongly related to all eight maize traits. Second, a fusion of vegetation indices, structural indices, and a thermal index selectively handcrafted from each data modality was fed to classical machine learning regressors, Support Vector Machine (SVM) and Random Forest (RF). The prediction performance varied from phenotype to phenotype, ranging from R2 = 0.34 for grain density up to R2 = 0.85 for both grain nitrogen content and total plant nitrogen content.
Further, a fusion of hyperspectral and LiDAR data overcame the limitations of a single data modality, especially addressing the vegetation saturation effect occurring in optical remote sensing. Third, a multi-task deep convolutional neural network (CNN) was customized to take a raw imagery data fusion of hyperspectral, thermal, and LiDAR to predict multiple maize traits at once. The multi-task deep learning performed comparably to, and for some traits better than, the mono-task deep learning and machine learning regressors. Data augmentation used for the deep learning models boosted the prediction accuracy, which helps to alleviate the intrinsic limitation of a small sample size and unbalanced sample classes in remote sensing research. Theoretical and practical implications for plant breeders and crop growers were also made explicit in the discussion.
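The NDSI step described in this abstract can be sketched as follows. The normalized difference form (a - b) / (a + b) is the standard definition; the pairwise search over features, the function names, and the toy plot data are assumptions for illustration and do not reproduce the study's extended NDSI procedure.

```python
import numpy as np

def ndsi(x_i, x_j):
    """Normalized difference of two features: (x_i - x_j) / (x_i + x_j)."""
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    denom = x_i + x_j
    # Return NaN where the denominator is zero instead of dividing by zero.
    return np.divide(x_i - x_j, denom,
                     out=np.full_like(denom, np.nan), where=denom != 0)

def best_ndsi_pair(features, trait):
    """Find the feature pair whose NDSI correlates most strongly with a trait.

    features : array of shape (n_plots, n_features), e.g. band reflectances
    trait    : array of shape (n_plots,), e.g. a measured phenotype
    """
    features = np.asarray(features, dtype=float)
    trait = np.asarray(trait, dtype=float)
    best_pair, best_r = None, 0.0
    n_features = features.shape[1]
    for i in range(n_features):
        for j in range(i + 1, n_features):
            r = np.corrcoef(ndsi(features[:, i], features[:, j]), trait)[0, 1]
            if abs(r) > abs(best_r):
                best_pair, best_r = (i, j), float(r)
    return best_pair, best_r

# Toy data: four plots, three features; the trait is built from columns 0 and 2.
toy_features = np.array([[0.8, 0.3, 0.2],
                         [0.6, 0.5, 0.3],
                         [0.9, 0.2, 0.1],
                         [0.5, 0.4, 0.4]])
toy_trait = ndsi(toy_features[:, 0], toy_features[:, 2])
pair, r = best_ndsi_pair(toy_features, toy_trait)
```

In a real analysis the columns could come from different sensors (for example a hyperspectral band and a LiDAR height metric), provided they are scaled so that the ratio is meaningful.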
Subject(s)
Deep Learning , Zea mays , Artificial Intelligence , Unmanned Aerial Devices , Edible Grain , Nitrogen
ABSTRACT
BACKGROUND: Obesity is a serious public health problem. Existing research has shown a strong association between obesity and an individual's diet and physical activity. If we extend such an association to the neighborhood level, information about the diet and physical activity of the residents of a neighborhood may improve the estimate of neighborhood-level obesity prevalence and help identify the neighborhoods that are more likely to suffer from obesity. However, it is challenging to measure neighborhood-level diet and physical activity through surveys and interviews, especially for a large geographic area. METHODS: We propose a method for deriving neighborhood-level diet and physical activity measurements from anonymized mobile phone location data, and examine the extent to which the derived measurements can enhance obesity estimation, in addition to the socioeconomic and demographic variables typically used in the literature. We conduct case studies in three different U.S. cities, which are New York City, Los Angeles, and Buffalo, using anonymized mobile phone location data from the company SafeGraph. We employ five different statistical and machine learning models to test the potential enhancement brought by the derived measurements for obesity estimation. RESULTS: We find that it is feasible to derive neighborhood-level diet and physical activity measurements from anonymized mobile phone location data. The derived measurements provide only a small enhancement for obesity estimation, compared with using a comprehensive set of socioeconomic and demographic variables. However, using these derived measurements alone can achieve a moderate accuracy for obesity estimation, and they may provide a stronger enhancement when comprehensive socioeconomic and demographic data are not available (e.g., in some developing countries). 
From a methodological perspective, spatially explicit models overall perform better than non-spatial models for neighborhood-level obesity estimation. CONCLUSIONS: Our proposed method can be used for deriving neighborhood-level diet and physical activity measurements from anonymized mobile phone data. The derived measurements can enhance obesity estimation, and can be especially useful when comprehensive socioeconomic and demographic data are not available. In addition, these derived measurements can be used to study obesity-related health behaviors, such as visit frequency of neighborhood residents to fast-food restaurants, and to identify primary places contributing to obesity-related issues.
Subject(s)
Diet , Obesity , Humans , Diet/adverse effects , Obesity/diagnosis , Obesity/epidemiology , Health Behavior , Exercise , Surveys and Questionnaires , Residence Characteristics
ABSTRACT
Humans rely on clean water for their health, well-being, and various socio-economic activities. During the past few years, the COVID-19 pandemic has been a constant reminder of the importance of hygiene and sanitation for public health. The most common approach to securing clean water supplies for this purpose is via wastewater treatment. To date, an effective method of detecting wastewater treatment plants (WWTP) accurately and automatically via remote sensing is unavailable. In this paper, we provide a solution to this task by proposing a novel joint deep learning (JDL) method that consists of a fine-tuned object detection network and a multi-task residual attention network (RAN). By leveraging OpenStreetMap (OSM) and multimodal remote sensing (RS) data, our JDL method is able to simultaneously tackle two different tasks: land use land cover (LULC) and WWTP classification. Moreover, JDL exploits the complementary effects between these tasks for a performance gain. We train JDL using 4,187 WWTP features and 4,200 LULC samples and validate the performance of the proposed method over a selected area around Stuttgart with 723 WWTP features and 1,200 LULC samples to generate an LULC classification map and a WWTP detection map. Extensive experiments conducted with different comparative methods demonstrate the effectiveness and efficiency of our JDL method in automatic WWTP detection in comparison with single-modality/single-task or traditional survey methods. Moreover, lessons learned pave the way for future works to simultaneously and effectively address multiple large-scale mapping tasks (e.g., both mapping LULC and detecting WWTP) from multimodal RS data via deep learning.
ABSTRACT
The moulding together of artificial intelligence (AI) and the geographic/geographic information systems (GIS) dimension creates GeoAI. There is an emerging role for GeoAI in health and healthcare, as location is an integral part of both population and individual health. This article provides an overview of GeoAI technologies (methods, tools and software), and their current and potential applications in several disciplines within public health, precision medicine, and Internet of Things-powered smart healthy cities. The potential challenges currently facing GeoAI research and applications in health and healthcare are also briefly discussed.
Subject(s)
Artificial Intelligence/trends , Delivery of Health Care/trends , Geographic Information Systems/trends , Public Health/trends , Delivery of Health Care/methods , Humans , Precision Medicine/methods , Precision Medicine/trends , Public Health/methods
ABSTRACT
Geospatial artificial intelligence (geoAI) is an emerging scientific discipline that combines innovations in spatial science, artificial intelligence methods in machine learning (e.g., deep learning), data mining, and high-performance computing to extract knowledge from spatial big data. In environmental epidemiology, exposure modeling is a commonly used approach to conduct exposure assessment to determine the distribution of exposures in study populations. geoAI technologies provide important advantages for exposure modeling in environmental epidemiology, including the ability to incorporate large amounts of big spatial and temporal data in a variety of formats; computational efficiency; flexibility in algorithms and workflows to accommodate relevant characteristics of spatial (environmental) processes including spatial nonstationarity; and scalability to model other environmental exposures across different geographic areas. The objectives of this commentary are to provide an overview of key concepts surrounding the evolving and interdisciplinary field of geoAI including spatial data science, machine learning, deep learning, and data mining; recent geoAI applications in research; and potential future directions for geoAI in environmental epidemiology.
Subject(s)
Artificial Intelligence , Environmental Exposure , Environmental Health/methods , Environmental Monitoring/methods
ABSTRACT
Air pollution is considered one of the major environmental risks to health worldwide. Researchers are making significant efforts to study it, thanks to state-of-the-art technologies in data collection and processing, and to mitigate its effect. In this context, while a lot is known about the role of urbanization, industries, and transport, the impact of agricultural activities on the spatial distribution of pollution is less studied, although knowledge about emissions suggests it is not a secondary factor. Therefore, the aim of this study was to assess this impact, and to compare it with that of traditional polluting sources, harnessing the capabilities of GEOAI (Geomatics and Earth Observation Artificial Intelligence). The analysis targeted the highly polluted territory of Lombardy, Italy, considering fine particulate matter (PM2.5). PM2.5 data were obtained from the Copernicus-Atmosphere-Monitoring-Service and processed to infer time-invariant spatial parameters (frequency, intensity and exposure) of concentration across the whole period. An ensemble architecture was implemented, with three blocks: correlation-based feature selection, Multiscale-Geographically-Weighted-Regression for spatial enhancement, and a final random forest classifier. Finally, the SHapley Additive exPlanation algorithm was applied to compute the relevance of the different land-use classes on the model. The impact of land-use classes was found to be significantly higher than in other published models, showing that the insignificant correlations found in the literature are probably due to an unfit experimental setup. The impact of agricultural activities on the spatial distribution of PM2.5 concentration was comparable to the other considered sources, even when focusing only on the most densely inhabited urban areas. In particular, agriculture's contribution resulted in pollution spikes rather than in a baseline increase.
These results indicate that public policymakers should also consider agricultural activities in evidence-based decision-making about pollution mitigation.
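To make the feature-relevance idea behind SHapley Additive exPlanations concrete, here is a minimal sketch that computes exact Shapley values for a tiny model by averaging marginal contributions over all feature orderings. It is an illustration only: the toy model, the single all-zero baseline, and the function name are assumptions, and practical SHAP tooling relies on background datasets and faster approximations rather than this brute-force enumeration.

```python
from itertools import permutations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for one prediction.

    model    : callable taking a list of feature values, returning a number
    x        : feature values of the instance being explained
    baseline : reference values used when a feature is treated as "absent"
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)      # start with every feature absent
        prev = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i on
            new = model(current)
            phi[i] += new - prev      # marginal contribution in this ordering
            prev = new
    n_orderings = factorial(n)
    return [p / n_orderings for p in phi]

# Hypothetical toy model with one interaction term between features 0 and 1.
def toy_model(z):
    return 2 * z[0] + 3 * z[1] - z[2] + z[0] * z[1]

phi = shapley_values(toy_model, x=[1, 2, 3], baseline=[0, 0, 0])
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved. The cost grows factorially with the number of features, so this form is only practical for very small examples.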
Subject(s)
Air Pollutants , Air Pollution , Air Pollutants/analysis , Artificial Intelligence , Environmental Monitoring/methods , Air Pollution/analysis , Particulate Matter/analysis , Agriculture
ABSTRACT
PM2.5 concentrations are higher during rush hours at background stations compared to the average concentration across these stations. Few studies have investigated PM2.5 concentration and its spatial distribution during rush hours using machine learning models. This study employs a geospatial-artificial intelligence (Geo-AI) prediction model to estimate the spatial and temporal variations of PM2.5 concentrations during morning and dusk rush hours in Taiwan. Mean hourly PM2.5 measurements were collected from 2006 to 2020, and aggregated into morning (7 a.m.-9 a.m.) and dusk (4 p.m.-6 p.m.) rush-hour mean concentrations. The Geo-AI prediction model was generated by integrating kriging interpolation, land-use regression, machine learning, and a stacking ensemble approach. A forward stepwise variable selection method based on the SHapley Additive exPlanations (SHAP) index was used to identify the most influential variables. The Geo-AI models for morning and dusk rush hours had accuracy scores of 0.95 and 0.93, respectively, and these results were validated, indicating robust model performance. Spatially, PM2.5 concentrations were higher in southwestern Taiwan for morning rush hours, and in suburban areas for dusk rush hours. Key predictors included kriged PM2.5 values, SO2 concentrations, forest density, and the distance to incinerators for both morning and dusk rush hours. These PM2.5 estimates for morning and dusk rush hours can support the development of alternative commuting routes with lower concentrations.
Subject(s)
Air Pollutants , Air Pollution , Artificial Intelligence , Environmental Monitoring , Particulate Matter , Taiwan , Particulate Matter/analysis , Air Pollutants/analysis , Environmental Monitoring/methods , Air Pollution/statistics & numerical data , Transportation
ABSTRACT
The presence of fine particulate matter (PM2.5) indoors constitutes a significant component of overall PM2.5 exposure, as individuals spend 90% of their time indoors; however, personal monitoring for large cohorts is often impractical. In light of this, this study seeks to employ a novel geospatial artificial intelligence (Geo-AI) coupled with machine learning (ML) approaches to develop indoor PM2.5 models. Multiple predictor variables were collected from 102 residential households, including meteorological data; elevation; land use; indoor environmental factors including human activities, building characteristics, infiltration factors, and real-time measurements; and various other factors. Geo-AI, which integrates land use regression, inverse distance weighting, and ML algorithms, was utilized to construct outdoor PM2.5 and PM10 estimates for residential households. The most influential variables were identified via correlation analysis and stepwise regression. Three ML methods, namely support vector machine, multiple linear regression, and multilayer perceptron (MLP), were used to estimate indoor PM2.5 concentration. Then, MLP was employed to blend the three ML algorithms. The resulting model demonstrated commendable performance, achieving a 10-fold cross-validation R2 of 0.92 and a root mean square error of 2.3 µg/m3 for indoor PM2.5 estimations. Notably, the combination of Geo-AI and ensembled ML models in this study outperformed all other individual models. In addition, the present study found that the most influential factors in the indoor PM2.5 model were outdoor PM2.5, the PM2.5/PM10 ratio, sampling month, infiltration factor, location near a factory, cleaning frequency, the number of doors opening to the outdoors, and wall material. Further exploration of diverse ensemble model formats to integrate estimates from different models could enhance overall performance.
Consequently, the potential applications of this model extend to estimating real individual exposure to PM2.5 for further epidemiological research. Moreover, the model offers valuable insights for efficient indoor air quality management and control strategies.
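Inverse distance weighting, one of the Geo-AI components named in this abstract, can be sketched as below. This is a generic textbook form, not the study's configuration: the power parameter of 2, the planar coordinates, and the toy station values are assumptions for illustration.

```python
import numpy as np

def idw(stations_xy, values, targets_xy, power=2.0):
    """Inverse distance weighted estimate at each target location.

    stations_xy : array of shape (n_stations, 2), monitoring site coordinates
    values      : array of shape (n_stations,), measured concentrations
    targets_xy  : array of shape (n_targets, 2), e.g. household coordinates
    power       : distance exponent; larger values favour nearby stations
    """
    s = np.asarray(stations_xy, dtype=float)
    v = np.asarray(values, dtype=float)
    t = np.asarray(targets_xy, dtype=float)
    # Pairwise Euclidean distances, shape (n_targets, n_stations).
    dist = np.linalg.norm(t[:, None, :] - s[None, :, :], axis=2)
    estimates = np.empty(len(t))
    for k, row in enumerate(dist):
        coincident = row == 0
        if coincident.any():
            # Target sits on a station: use the measured value directly.
            estimates[k] = v[coincident].mean()
        else:
            weights = 1.0 / row ** power
            estimates[k] = np.sum(weights * v) / np.sum(weights)
    return estimates

# Toy example: two stations, one target halfway between them and one on a station.
outdoor = idw(stations_xy=[[0, 0], [2, 0]], values=[10.0, 20.0],
              targets_xy=[[1, 0], [0, 0]])
```

Estimates of this kind could then serve as one predictor (outdoor concentration at the household) among the others listed in the abstract.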
ABSTRACT
Ambient ammonia (NH3) is an important compound in the formation of particulate matter (PM), and therefore, it is crucial to comprehend NH3's properties in order to better reduce PM. However, it is not easy to achieve this goal due to the limited range/real-time NH3 data monitored by the air quality stations. While other studies have predicted NH3 and its source apportionment, this manuscript provides a novel method (i.e., Geo-AI) to look into NH3 predictions and their contribution sources. This study represents a pioneering effort in the application of a novel geospatial-artificial intelligence (Geo-AI) base model with parcel tracking functions. This innovative approach seamlessly integrates various machine learning algorithms and geographic predictor variables to estimate NH3 concentrations, marking the first instance of such a comprehensive methodology. The Shapley additive explanation (SHAP) was used to further analyze the source contribution of NH3 with domain knowledge. From 2016 to 2018, Taichung's hourly average NH3 values were predicted with an explained variance of up to 96%. SHAP values revealed that waterbody, traffic, and agriculture emissions were the most significant factors affecting NH3 concentrations in Taichung among all the characteristics. Our methodology is a vital first step for shaping future policies and regulations and is adaptable to regions with limited monitoring sites.
Subject(s)
Air Pollutants , Air Pollution , Air Pollutants/analysis , Artificial Intelligence , Environmental Monitoring/methods , Air Pollution/analysis , Particulate Matter/analysis
ABSTRACT
Introduction: Since its emergence in late 2019, the SARS-CoV-2 virus has led to a global health crisis, affecting millions and reshaping societies and economies worldwide. Investigating the determinants of SARS-CoV-2 diffusion and their spatiotemporal dynamics at high spatial resolution is critical for public health and policymaking. Methods: This study analyses 194,682 georeferenced SARS-CoV-2 RT-PCR tests performed between March 2020 and April 2022 in the canton of Vaud, Switzerland. We characterized five distinct pandemic periods using metrics of spatial and temporal clustering like inverse Shannon entropy, the Hoover index, Lloyd's index of mean crowding, and the modified space-time DBSCAN algorithm. We assessed the demographic, socioeconomic, and environmental factors contributing to cluster persistence during each period using eXtreme Gradient Boosting (XGBoost) and SHapley Additive exPlanations (SHAP), to consider non-linear and spatial effects. Results: Our findings reveal important variations in the spatial and temporal clustering of cases. Notably, areas with flatter epidemics had a higher total attack rate. Air pollution emerged as a factor showing a consistent positive association with higher cluster persistence, substantiated by both immission models and, to a lesser extent, tropospheric NO2 estimations. Factors including population density, testing rates, and geographical coordinates also showed important positive associations with higher cluster persistence. The socioeconomic index showed no significant contribution to cluster persistence, suggesting its limited role in the observed dynamics, which warrants further research. Discussion: Overall, the determinants of cluster persistence remained across the study periods. These findings highlight the need for effective air quality management strategies to mitigate air pollution's adverse impacts on public health, particularly in the context of respiratory viral diseases like COVID-19.
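Two of the clustering metrics named in the Methods can be sketched in a few lines. These are the common textbook forms of the Hoover index and the Shannon entropy; how the study inverts or normalizes the entropy is not stated in the abstract, so the normalization below, the function names, and the toy counts are assumptions.

```python
import numpy as np

def hoover_index(cases, population):
    """Hoover index: half the summed absolute gap between case and population shares.

    0 means cases are spread exactly in proportion to population;
    values near 1 mean cases are concentrated in a small share of the population.
    """
    c = np.asarray(cases, dtype=float)
    p = np.asarray(population, dtype=float)
    return float(0.5 * np.abs(c / c.sum() - p / p.sum()).sum())

def shannon_entropy(cases, normalized=True):
    """Shannon entropy of the distribution of cases over areas (or time bins)."""
    c = np.asarray(cases, dtype=float)
    share = c / c.sum()
    share = share[share > 0]          # treat 0 * log(0) as 0
    h = -(share * np.log(share)).sum()
    if normalized and len(c) > 1:
        h /= np.log(len(c))           # scale to the range [0, 1]
    return float(h)

# Toy example: all cases in one of two equally populated areas.
concentrated = hoover_index(cases=[10, 0], population=[50, 50])
```

Low entropy and a high Hoover index both indicate clustering, which is why measures of this type are useful for comparing epidemic periods.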
Subject(s)
COVID-19 , SARS-CoV-2 , Spatio-Temporal Analysis , Humans , COVID-19/epidemiology , COVID-19/transmission , Switzerland/epidemiology , Air Pollution/statistics & numerical data , Pandemics , Socioeconomic Factors
ABSTRACT
It is generally established that PCDD/Fs are harmful to human health, and therefore extensive field research is necessary. This study is the first to use a novel geospatial-artificial intelligence (Geo-AI) based ensemble mixed spatial model (EMSM) that integrates multiple machine learning algorithms and geographic predictor variables selected using SHapley Additive exPlanations (SHAP) values to predict spatial-temporal fluctuations in PCDD/F concentrations across the entire island of Taiwan. Daily PCDD/F I-TEQ levels from 2006 to 2016 were used for model construction, while external data were used to validate model dependability. We utilized Geo-AI, incorporating kriging, five machine learning algorithms, and ensemble methods (combinations of the aforementioned five models) to develop EMSMs. The EMSMs were used to estimate long-term spatiotemporal variations in PCDD/F I-TEQ levels, considering in-situ measurements, meteorological factors, geospatial predictors, and social and seasonal influences over a 10-year period. The findings demonstrated that the EMSM was superior to all other models, with an increase in explanatory power reaching 87%. The spatial-temporal results show that the temporal fluctuation of PCDD/F concentrations can be a result of weather conditions, while geographical variance can be the result of urbanization and industrialization. These results provide accurate estimates that support pollution control measures and epidemiological studies.
Subject(s)
Air Pollutants , Benzofurans , Polychlorinated Dibenzodioxins , Humans , Polychlorinated Dibenzodioxins/analysis , Dibenzofurans , Artificial Intelligence , Taiwan , Polychlorinated Dibenzofurans/analysis , Benzofurans/analysis , Environmental Monitoring/methods , Air Pollutants/analysis
ABSTRACT
The COVID-19 pandemic has posed unprecedented threats to healthcare systems worldwide. Great efforts were made to fight the emergency, with the widespread use of cutting-edge technologies, especially big data analytics and AI. In this context, the present study proposes a novel combination of geographical filtering and machine learning (ML) for the development and optimization of a COVID-19 early alert system based on Emergency Medical Services (EMS) data, for the early identification of outbreaks with very high granularity, down to single municipalities. The model, implemented for the region of Lombardy, Italy, showed robust performance, with an overall 80% accuracy in identifying the active spread of the disease. Further post-processing of the output classified the territory into five risk classes, effectively anticipating the demand for EMS interventions. This model shows state-of-the-art potential for future applications in the early detection of the burden of COVID-19, or other similar epidemics, on the healthcare system.