ABSTRACT
BACKGROUND: Access to water and sanitation is a basic human right; however, in many parts of the world, communities experience water, sanitation, and hygiene (WaSH) insecurity. While WaSH insecurity is prevalent in many low- and middle-income countries, it is also a problem in high-income countries such as the United States, as is evident in vulnerable populations, including people experiencing homelessness. Limited knowledge exists about the coping strategies unhoused people use to access WaSH services. This study therefore examines WaSH access among unhoused communities in Los Angeles, California, the city with the second-highest count of unhoused people in the nation. METHODS: We conducted a cross-sectional study using a snowball sampling technique with 263 unhoused people living in Skid Row, Los Angeles. We calculated frequencies and used multivariable models to describe (1) how unhoused communities cope and gain access to WaSH services in different places, and (2) what individual-level factors contribute to unhoused people's ability to access WaSH services. RESULTS: Our findings reveal that access to WaSH services for unhoused communities in Los Angeles is most difficult at night. Reduced access to overnight sanitation resulted in 19% of the sample population using buckets inside their tents and 28% openly defecating in public spaces. Bottled water and public taps were the primary drinking water sources, but 6% of the sample reported obtaining water from fire hydrants, and 50% of the population stored water for night use. Unhoused people also had limited access to water and soap for hand hygiene throughout the day, with 17% of the sample relying on hand sanitizer to clean their hands. Shower and laundry access were among the most limited services available, reducing people's ability to maintain personal hygiene and limiting employment opportunities. Our regression models suggest that WaSH access is not homogeneous among the unhoused.
Community differences exist; the odds of having difficulty accessing sanitation services are more than two times greater for those living outside of Skid Row (Adj OR: 2.52; 95% CI: 1.08-6.37) and more than three times greater for people who have been unhoused for more than six years compared to people who have been unhoused for less than a year (Adj OR: 3.26; 95% CI: 1.36-8.07). CONCLUSION: Overall, this study suggests a need for more permanent, 24-h access to WaSH services for unhoused communities living in Skid Row, including toilets, drinking water, water and soap for hand hygiene, showers, and laundry services.
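As a numeric illustration of what an odds ratio like those reported above expresses, the following sketch computes an unadjusted odds ratio from a 2x2 table. The counts and function name are invented for illustration; the study's reported ORs are adjusted estimates from a multivariable model.

```python
# Hypothetical sketch: unadjusted odds ratio from a 2x2 table of
# outcome by exposure. Counts are made up for illustration only.
def odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no):
    """OR = (a/b) / (c/d): the odds of the outcome among the exposed
    divided by the odds among the unexposed."""
    return (exposed_yes / exposed_no) / (unexposed_yes / unexposed_no)

# e.g. 30 of 50 respondents outside Skid Row report difficulty,
# vs 40 of 120 inside: odds 1.5 vs 0.5
print(round(odds_ratio(30, 20, 40, 80), 2))  # → 3.0
```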
Subjects
Hygiene, Ill-Housed Persons, Sanitation, Water Insecurity, Los Angeles, Water Supply, Drinking Water, Humans, Cross-Sectional Studies, Urban Population, Male, Female, Adolescent, Adult, Middle Aged, Aged
ABSTRACT
Many approaches to time series classification rely on machine learning methods. However, there is growing interest in going beyond black box prediction models to understand discriminatory features of the time series and their associations with outcomes. One promising method is time-series shapelets (TSS), which identifies maximally discriminative subsequences of time series. For example, in environmental health applications TSS could be used to identify short-term patterns in exposure time series (shapelets) associated with adverse health outcomes. Identification of candidate shapelets in TSS is computationally intensive. The original TSS algorithm used exhaustive search. Subsequent algorithms introduced efficiencies by trimming/aggregating the set of candidates or training candidates from initialized values, but these approaches have limitations. In this paper, we introduce Wavelet-TSS (W-TSS), a novel intelligent method for identifying candidate shapelets in TSS using wavelet transformation. We tested W-TSS on two datasets: (1) a synthetic example used in previous TSS studies and (2) a panel study relating exposures from residential air pollution sensors to symptoms in participants with asthma. Compared to previous TSS algorithms, W-TSS was more computationally efficient, more accurate, and was able to discover more discriminative shapelets. W-TSS does not require pre-specification of shapelet length.
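The core primitive shared by the TSS variants described above is the distance between a candidate shapelet and a time series, defined as the minimum Euclidean distance over all sliding windows of the shapelet's length. A minimal generic sketch, with invented names and toy data (not the W-TSS implementation):

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Distance from a candidate shapelet to a series: the minimum
    Euclidean distance over all sliding windows of the shapelet's length."""
    s = np.asarray(series, dtype=float)
    p = np.asarray(shapelet, dtype=float)
    m = len(p)
    # slide the shapelet across the series and keep the best match
    dists = [np.linalg.norm(s[i:i + m] - p) for i in range(len(s) - m + 1)]
    return min(dists)

# a series that contains the pattern [0, 5, 0] matches it exactly
series = [1, 2, 0, 5, 0, 2, 1]
print(shapelet_distance(series, [0, 5, 0]))  # → 0.0
```

A shapelet is "maximally discriminative" when this distance separates the classes well, which is what the candidate search is trying to optimize.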
Subjects
Air Pollution, Algorithms, Humans, Machine Learning, Research Design
ABSTRACT
Geospatial artificial intelligence (geoAI) is an emerging scientific discipline that combines innovations in spatial science, artificial intelligence methods in machine learning (e.g., deep learning), data mining, and high-performance computing to extract knowledge from spatial big data. In environmental epidemiology, exposure modeling is a commonly used approach to conduct exposure assessment to determine the distribution of exposures in study populations. geoAI technologies provide important advantages for exposure modeling in environmental epidemiology, including the ability to incorporate large amounts of big spatial and temporal data in a variety of formats; computational efficiency; flexibility in algorithms and workflows to accommodate relevant characteristics of spatial (environmental) processes including spatial nonstationarity; and scalability to model other environmental exposures across different geographic areas. The objectives of this commentary are to provide an overview of key concepts surrounding the evolving and interdisciplinary field of geoAI including spatial data science, machine learning, deep learning, and data mining; recent geoAI applications in research; and potential future directions for geoAI in environmental epidemiology.
Subjects
Artificial Intelligence, Environmental Exposure, Environmental Health/methods, Environmental Monitoring/methods
ABSTRACT
Accessing realistic human movements (aka trajectories) is essential for many application domains, such as urban planning, transportation, and public health. However, due to privacy and commercial concerns, real-world trajectories are not readily available, giving rise to an important research area of generating synthetic but realistic trajectories. Inspired by the success of deep neural networks (DNN), data-driven methods learn the underlying human decision-making mechanisms and generate synthetic trajectories by directly fitting real-world data. However, these DNN-based approaches do not exploit people's moving behaviors (e.g., commuting to work, shopping), which significantly influence human decisions during the generation process. This paper proposes MBP-GAIL, a novel framework based on generative adversarial imitation learning that synthesizes realistic trajectories that preserve the moving behavior patterns in real data. MBP-GAIL models temporal dependencies with Recurrent Neural Networks (RNN) and combines the stochastic constraints from moving behavior patterns with spatial constraints in the learning process. Through comprehensive experiments, we demonstrate that MBP-GAIL outperforms state-of-the-art methods and can better support decision-making in trajectory simulations.
ABSTRACT
Transportation infrastructure, such as road or railroad networks, represents a fundamental component of our civilization. For sustainable planning and informed decision making, a thorough understanding of the long-term evolution of transportation infrastructure such as road networks is crucial. However, spatially explicit, multi-temporal road network data covering large spatial extents are scarce and rarely available prior to the 2000s. Herein, we propose a framework that employs increasingly available scanned and georeferenced historical map series to reconstruct past road networks, by integrating abundant, contemporary road network data and color information extracted from historical maps. Specifically, our method uses contemporary road segments as analytical units and extracts historical roads by inferring their existence in historical map series based on image processing and clustering techniques. We tested our method on over 300,000 road segments representing more than 50,000 km of the road network in the United States, extending across three study areas that cover 42 historical topographic map sheets dated between 1890 and 1950. We evaluated our approach by comparison to other historical datasets and against manually created reference data, achieving F-1 scores of up to 0.95, and showed that the extracted road network statistics are highly plausible over time, i.e., following general growth patterns. We demonstrated that contemporary geospatial data integrated with information extracted from historical map series open up new avenues for the quantitative analysis of long-term urbanization processes and landscape changes far beyond the era of operational remote sensing and digital cartography.
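The segment-level inference described above can be illustrated with a simplified sketch: sample a historical map's pixel values along a contemporary road segment and decide, from the fraction of ink-colored pixels, whether the road already existed on the map. The thresholds, names, and toy map below are illustrative assumptions, not the paper's actual image-processing and clustering pipeline.

```python
import numpy as np

# Hypothetical sketch: decide whether a contemporary road segment
# already existed on a historical map by sampling the (grayscale)
# map's pixel values along the segment and checking how many look
# like map ink (dark) rather than paper background (light).
# Thresholds are illustrative, not from the paper.
def segment_existed(gray_map, pixel_coords, ink_threshold=0.4, min_ink_frac=0.6):
    samples = np.array([gray_map[r, c] for r, c in pixel_coords])
    ink_frac = np.mean(samples < ink_threshold)  # fraction of dark pixels
    return ink_frac >= min_ink_frac

# toy 5x5 "map": a dark horizontal line drawn on light paper
gray = np.ones((5, 5))
gray[2, :] = 0.1                          # the drawn road
on_road = [(2, c) for c in range(5)]      # pixels along an existing road
off_road = [(0, c) for c in range(5)]     # pixels along a road not yet built
print(segment_existed(gray, on_road))     # → True
print(segment_existed(gray, off_road))    # → False
```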
ABSTRACT
Spatially explicit, fine-grained datasets describing historical urban extents are rarely available prior to the era of operational remote sensing. However, such data are necessary to better understand long-term urbanization and land development processes and for the assessment of coupled nature-human systems (e.g., the dynamics of the wildland-urban interface). Herein, we propose a framework that jointly uses remote-sensing-derived human settlement data (i.e., the Global Human Settlement Layer, GHSL) and scanned, georeferenced historical maps to automatically generate historical urban extents for the early 20th century. By applying unsupervised color space segmentation to the historical maps, spatially constrained to the urban extents derived from the GHSL, our approach generates historical settlement extents for seamless integration with the multitemporal GHSL. We apply our method to study areas in countries across four continents, and evaluate our approach against historical building density estimates from the Historical Settlement Data Compilation for the US (HISDAC-US), and against urban area estimates from the History Database of the Global Environment (HYDE). Our results achieve Area-under-the-Curve values > 0.9 when comparing to HISDAC-US and are largely in agreement with model-based urban areas from the HYDE database, demonstrating that the integration of remote-sensing-derived observations and historical cartographic data sources opens up new, promising avenues for assessing urbanization and long-term land cover change in countries where historical maps are available.
ABSTRACT
Advances in measurement technology are producing increasingly time-resolved environmental exposure data. We aim to gain new insights into exposures and their potential health impacts by moving beyond simple summary statistics (e.g., means, maxima) to characterize more detailed features of high-frequency time series data. This study proposes a novel variant of the Self-Organizing Map (SOM) algorithm called Dynamic Time Warping Self-Organizing Map (DTW-SOM) for unsupervised pattern discovery in time series. This algorithm uses DTW, a similarity measure that optimally aligns interior patterns of sequential data, both as the similarity measure and as the training guide of the neural network. We applied DTW-SOM to a panel study monitoring indoor and outdoor residential temperature and particulate matter air pollution (PM2.5) for 10 patients with asthma from 7 households near Salt Lake City, UT; the patients were followed for up to 373 days each. Compared to previous SOM algorithms using timestamp alignment on time series data, the DTW-SOM algorithm produced fewer quantization errors and more detailed diurnal patterns. DTW-SOM identified the expected typical diurnal patterns in outdoor temperature, which varied by season, as well as diurnal patterns in PM2.5 which may be related to daily asthma outcomes. In summary, DTW-SOM is an innovative feature engineering method that can be applied to highly time-resolved environmental exposures assessed by sensors to identify typical diurnal (or hourly or monthly) patterns and provide new insights into the health effects of environmental exposures.
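The DTW similarity measure at the heart of DTW-SOM can be sketched with the standard dynamic-programming recurrence. The example below is a generic textbook implementation, not the paper's code; it shows why DTW matches a time-shifted pattern that timestamp-aligned (Euclidean-style) comparison would penalize.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences,
    computed with the standard dynamic-programming recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping moves
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# a time-shifted copy of a pattern: DTW aligns it perfectly, while
# point-by-point timestamp alignment would report a nonzero distance
a = [0, 0, 1, 2, 1, 0]
b = [0, 1, 2, 1, 0, 0]
print(dtw_distance(a, b))  # → 0.0
```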
Subjects
Algorithms, Environmental Exposure/adverse effects, Environmental Exposure/analysis, Health Impact Assessment, Air Pollutants, Air Pollution, Asthma/diagnosis, Asthma/epidemiology, Asthma/etiology, Environmental Monitoring/methods, Health Impact Assessment/methods, Humans, Neural Networks (Computer), Particulate Matter, Time Factors
ABSTRACT
BACKGROUND: Time-resolved quantification of physical activity can contribute to both personalized medicine and epidemiological research studies, for example, managing and identifying triggers of asthma exacerbations. A growing number of reportedly accurate machine learning algorithms for human activity recognition (HAR) have been developed using data from wearable devices (e.g., smartwatch and smartphone). However, many HAR algorithms depend on fixed-size sampling windows that may poorly adapt to real-world conditions in which activity bouts are of unequal duration. A small sliding window can produce noisy predictions under stable conditions, whereas a large sliding window may miss brief bursts of intense activity. OBJECTIVE: We aimed to create an HAR framework adapted to variable duration activity bouts by (1) detecting the change points of activity bouts in a multivariate time series and (2) predicting activity for each homogeneous window defined by these change points. METHODS: We applied standard fixed-width sliding windows (4-6 different sizes) or greedy Gaussian segmentation (GGS) to identify break points in filtered triaxial accelerometer and gyroscope data. After standard feature engineering, we applied an XGBoost model to predict physical activity within each window and then converted windowed predictions to instantaneous predictions to facilitate comparison across segmentation methods. We applied these methods in 2 datasets: the Human Activity Recognition Using Smartphones (HARuS) dataset, where a total of 30 adults performed activities of approximately equal duration (approximately 20 seconds each) while wearing a waist-worn smartphone, and the Biomedical REAl-Time Health Evaluation for Pediatric Asthma (BREATHE) dataset, where a total of 14 children performed 6 activities for approximately 10 min each while wearing a smartwatch.
To mimic a real-world scenario, we generated artificial unequal activity bout durations in the BREATHE data by randomly subdividing each activity bout into 10 segments and randomly concatenating the 60 activity bouts. Each dataset was divided into ~90% training and ~10% holdout testing. RESULTS: In the HARuS data, GGS produced the least noisy predictions of 6 physical activities and had the second highest accuracy rate of 91.06% (the highest accuracy rate was 91.79% for the sliding window of size 0.8 second). In the BREATHE data, GGS again produced the least noisy predictions and had the highest accuracy rate of 79.4% for 6 physical activities. CONCLUSIONS: In a scenario with variable duration activity bouts, GGS multivariate segmentation produced smart-sized windows with more stable predictions and a higher accuracy rate than traditional fixed-size sliding window approaches. Overall, accuracy was good in both datasets but, as expected, it was slightly lower in the more real-world study using wrist-worn smartwatches in children (BREATHE) than in the more tightly controlled study using waist-worn smartphones in adults (HARuS). We implemented GGS in an offline setting, but it could be adapted for real-time prediction with streaming data.
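One step of the pipeline above, converting windowed predictions into instantaneous (per-time-point) predictions, can be sketched as a majority vote over all windows that cover each time point. The window size, stride, labels, and function name below are illustrative assumptions, not the study's implementation.

```python
from collections import Counter

# Hypothetical sketch: convert per-window activity predictions into
# one prediction per time point by majority vote over all sliding
# windows covering that point. Sizes and labels are illustrative.
def windowed_to_instantaneous(window_preds, window_size, stride, n_samples):
    votes = [[] for _ in range(n_samples)]
    for w, label in enumerate(window_preds):
        start = w * stride
        for t in range(start, min(start + window_size, n_samples)):
            votes[t].append(label)
    # majority vote per time point (ties broken by first-seen label)
    return [Counter(v).most_common(1)[0][0] for v in votes]

# 8 samples, windows of 4 with stride 2: three overlapping windows
preds = windowed_to_instantaneous(["walk", "walk", "run"],
                                  window_size=4, stride=2, n_samples=8)
print(preds)
```

This is what makes fixed-window and change-point-based (GGS) predictions directly comparable at the sample level.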
Subjects
Human Activities/psychology, Recognition (Psychology), Wearable Electronic Devices/standards, Accelerometry/methods, Adult, Female, Human Activities/statistics & numerical data, Humans, Machine Learning/standards, Machine Learning/statistics & numerical data, Male, Middle Aged, Multivariate Analysis, Time Factors, Wearable Electronic Devices/psychology
ABSTRACT
Historical maps are unique sources of retrospective geographical information. Recently, several map archives containing map series covering large spatial and temporal extents have been systematically scanned and made available to the public. The geographical information contained in such data archives makes it possible to extend geospatial analysis retrospectively beyond the era of digital cartography. However, given the large data volumes of such archives (e.g., more than 200,000 map sheets in the United States Geological Survey topographic map archive) and the low graphical quality of older, manually produced map sheets, the process to extract geographical information from these map archives needs to be automated to the highest degree possible. To understand the potential challenges (e.g., salient map characteristics and data quality variations) in automating large-scale information extraction tasks for map archives, it is useful to efficiently assess spatio-temporal coverage, approximate map content, and spatial accuracy of georeferenced map sheets at different map scales. Such preliminary analytical steps are often neglected or ignored in the map processing literature but represent critical phases that lay the foundation for any subsequent computational processes including recognition. Exemplified for the United States Geological Survey topographic map and the Sanborn fire insurance map archives, we demonstrate how such preliminary analyses can be systematically conducted using traditional analytical and cartographic techniques, as well as visual-analytical data mining tools originating from machine learning and data science.
ABSTRACT
Air quality models are important for studying the impact of air pollutants on health conditions at a fine spatiotemporal scale. Existing work typically relies on area-specific, expert-selected attributes of pollution emissions (e.g., transportation) and dispersion (e.g., meteorology) to build a model for each combination of study area, pollutant type, and spatiotemporal scale. In this paper, we present a data mining approach that utilizes publicly available OpenStreetMap (OSM) data to automatically generate an air quality model for the concentrations of fine particulate matter less than 2.5 µm in aerodynamic diameter (PM2.5) at various temporal scales. Our experiment shows that our (domain-) expert-free model could generate accurate PM2.5 concentration predictions, which can be used to improve air quality models that traditionally rely on expert-selected input. Our approach also quantifies the impact on air quality of a variety of geographic features (i.e., how various types of geographic features, such as parking lots and commercial buildings, affect air quality, and from what distance), representing mobile, stationary, and area sources of both natural and anthropogenic air pollution. This approach is particularly important for enabling the construction of context-specific spatiotemporal models of air pollution, allowing investigations of the impact of air pollution exposures on sensitive populations, such as children with asthma, at scale.
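The kind of expert-free geographic feature described above can be sketched as counts of OSM features of each type within concentric distance bands around a sensor location. The feature types, radii, planar coordinates, and function name below are illustrative assumptions, not the paper's actual configuration.

```python
import math

# Hypothetical sketch: for a sensor location, count geographic features
# of each type inside concentric distance bands, yielding model inputs
# that capture both feature type and distance. Radii and feature names
# are illustrative assumptions.
def distance_band_counts(sensor, features, radii=(100, 500, 1000)):
    counts = {}
    for ftype, (x, y) in features:
        d = math.hypot(x - sensor[0], y - sensor[1])
        for r in radii:
            if d <= r:
                counts[(ftype, r)] = counts.get((ftype, r), 0) + 1
    return counts

features = [("parking_lot", (50, 0)),    # 50 m from the sensor
            ("parking_lot", (400, 0)),   # 400 m from the sensor
            ("commercial", (900, 0))]    # 900 m from the sensor
print(distance_band_counts((0, 0), features))
# → {('parking_lot', 100): 1, ('parking_lot', 500): 2,
#    ('parking_lot', 1000): 2, ('commercial', 1000): 1}
```

A model trained on such counts can then reveal from which distance each feature type affects measured PM2.5.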
ABSTRACT
According to the Centers for Disease Control and Prevention, there are 6.8 million children living with asthma in the United States. Despite the importance of the disease, the available prognostic tools are not sufficient for biomedical researchers to thoroughly investigate its potential risks at scale. To overcome these challenges, we present a big data integration and analysis infrastructure developed by our Data and Software Coordination and Integration Center (DSCIC) of the NIBIB-funded Pediatric Research using Integrated Sensor Monitoring Systems (PRISMS) program. Our goal is to help biomedical researchers efficiently predict and prevent asthma attacks. The PRISMS-DSCIC is responsible for collecting, integrating, storing, and analyzing real-time environmental, physiological, and behavioral data obtained from heterogeneous sensor and traditional data sources. Our architecture is based on the Apache Kafka, Spark, and Hadoop frameworks and the PostgreSQL DBMS. A main contribution of this work is extending the Spark framework with a mediation layer, based on logical schema mappings and query rewriting, to facilitate data analysis over a consistent harmonized schema. The system provides both batch and stream analytic capabilities over the massive data generated by wearable and fixed sensors.
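The mediation idea described above, rewriting queries over a harmonized schema into each source's native schema via declarative mappings, can be sketched in miniature. The schemas, mappings, and names below are invented for illustration and are not the actual PRISMS-DSCIC schema.

```python
# Hypothetical sketch: a query against a harmonized schema is rewritten
# into each source's native column names using declarative logical
# mappings. Schema and mapping names are illustrative assumptions.
MAPPINGS = {
    "sensor_a": {"pm25": "pm2_5_ugm3", "ts": "time_utc"},
    "sensor_b": {"pm25": "fine_pm", "ts": "timestamp"},
}

def rewrite(query_cols, source):
    """Translate harmonized column names into a source's native names."""
    mapping = MAPPINGS[source]
    return [mapping[c] for c in query_cols]

# one analysis query, rewritten for two heterogeneous sources
print(rewrite(["ts", "pm25"], "sensor_a"))  # → ['time_utc', 'pm2_5_ugm3']
print(rewrite(["ts", "pm25"], "sensor_b"))  # → ['timestamp', 'fine_pm']
```

Analysts then write queries once, against the harmonized schema, regardless of how each sensor source names its fields.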