ABSTRACT
Identification of areas of high disease risk has been one of the top goals for infectious disease public health surveillance. Accurate prediction of these regions leads to effective resource allocation and faster intervention. This paper proposes a novel prediction surveillance metric based on a Bayesian spatio-temporal model for infectious disease outbreaks. Exceedance probability, which has been commonly used for cluster detection in statistical epidemiology, was extended to predict areas of high risk. The proposed metric consists of three components: the area's risk profile, temporal risk trend, and spatial neighborhood influence. We also introduce a weighting scheme to balance these three components, which accommodates the characteristics of the infectious disease outbreak, spatial properties, and disease trends. Thorough simulation studies were conducted to identify the optimal weighting scheme and evaluate the performance of the proposed prediction surveillance metric. Results indicate that the area's own risk and the neighborhood influence play an important role in making a highly sensitive metric, and the risk trend term is important for the specificity and accuracy of prediction. The proposed prediction metric was applied to the COVID-19 case data of South Carolina from March 12, 2020, and the subsequent 30 weeks of data.
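The exceedance probability mentioned above is usually estimated as the share of posterior draws of an area's relative risk that lie above a threshold. The sketch below illustrates that idea and one possible way to blend the three components. The abstract does not give the formula, so the linear form, the function names, and the default weights are assumptions made for illustration only, not the authors' metric.

```python
from typing import Sequence


def exceedance_probability(samples: Sequence[float], threshold: float = 1.0) -> float:
    """Share of posterior draws of relative risk that exceed the threshold."""
    if not samples:
        raise ValueError("need at least one posterior draw")
    return sum(1 for s in samples if s > threshold) / len(samples)


def combined_metric(
    own_risk: float,
    trend: float,
    neighbor_risks: Sequence[float],
    weights: Sequence[float] = (0.4, 0.2, 0.4),  # hypothetical weights
) -> float:
    """Hypothetical weighted blend of own risk, risk trend, and neighborhood risk."""
    w_own, w_trend, w_nb = weights
    if abs(w_own + w_trend + w_nb - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Neighborhood influence is summarized here as the mean over adjacent areas.
    nb_mean = sum(neighbor_risks) / len(neighbor_risks) if neighbor_risks else 0.0
    return w_own * own_risk + w_trend * trend + w_nb * nb_mean
```

In this form, shifting weight toward the own-risk and neighborhood terms or toward the trend term is a single parameter change, which is the kind of trade-off the simulation study in the abstract explores.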
ABSTRACT
BACKGROUND: The analysis of dental caries has been a major focus of recent work on modeling dental defect data. While a dental caries focus is of major importance in dental research, the examination of developmental defects which could also contribute at an early stage of dental caries formation, is also of potential interest. This paper proposes a set of methods which address the appearance of different combinations of defects across different tooth regions. In our modeling we assess the linkages between tooth region development and both the type of defect and associations with etiological predictors of the defects which could be influential at different times during the tooth crown development. METHODS: We develop different hierarchical model formulations under the Bayesian paradigm to assess exposures during primary central incisor (PMCI) tooth development and PMCI defects. We evaluate the Bayesian hierarchical models under various simulation scenarios to compare their performance with both simulated dental defect data and real data from a motivating application. RESULTS: The proposed model provides inference on identifying a subset of etiological predictors of an individual defect accounting for the correlation between tooth regions and on identifying a subset of etiological predictors for the joint effect of defects. Furthermore, the model provides inference on the correlation between the regions of the teeth as well as between the joint effect of the developmental enamel defects and dental caries. Simulation results show that the proposed model consistently yields steady inferences in identifying etiological biomarkers associated with the outcome of localized developmental enamel defects and dental caries under varying simulation scenarios as deemed by small mean square error (MSE) when comparing the simulation results to real application results. 
CONCLUSION: We evaluate the proposed model under varying simulation scenarios to develop a model for multivariate dental defects and dental caries assuming a flexible covariance structure that can handle regional and joint effects. The proposed model sheds new light on methods for capturing inclusive predictors in different multivariate joint models under the same covariance structure and provides a natural extension to a nested hierarchical model.
Subject(s)
Dental Caries, Incisor, Child, Humans, Bayes Theorem, Deciduous Tooth, Prevalence, Dental Enamel
ABSTRACT
BACKGROUND: Dengue is a mosquito-borne disease that causes over 300 million infections worldwide each year with no specific treatment available. Effective surveillance systems are needed for outbreak detection and resource allocation. Spatial cluster detection methods are commonly used, but no general guidance exists on the most appropriate method for dengue surveillance. Therefore, a comprehensive study is needed to assess different methods and provide guidance for dengue surveillance programs. METHODS: To evaluate the effectiveness of different cluster detection methods for dengue surveillance, we selected and assessed commonly used methods: Getis-Ord Gi*, Local Moran, SaTScan, and Bayesian modeling. We conducted a simulation study to compare their performance in detecting clusters, and applied all methods to a case study of dengue surveillance in Thailand in 2019 to further evaluate their practical utility. RESULTS: In the simulation study, Getis-Ord Gi* and Local Moran had similar performance, with most misdetections occurring at cluster boundaries and isolated hotspots. SaTScan showed better precision but was less effective at detecting inner outliers, although it performed well on large outbreaks. Bayesian convolution modeling had the highest overall precision in the simulation study. In the dengue case study in Thailand, Getis-Ord Gi* and Local Moran missed most disease clusters, while SaTScan was mostly able to detect a large cluster. Bayesian disease mapping seemed to be the most effective, with adaptive detection of irregularly shaped disease anomalies. CONCLUSIONS: Bayesian modeling proved to be the most effective method, demonstrating the best accuracy in adaptively identifying irregularly shaped disease anomalies. In contrast, SaTScan excelled in detecting large outbreaks and regular forms.
This study provides empirical evidence for the selection of appropriate tools for dengue surveillance in Thailand, with potential applicability to other disease control programs in similar settings.
Subject(s)
Dengue, Animals, Humans, Dengue/diagnosis, Dengue/epidemiology, Thailand/epidemiology, Bayes Theorem, Cluster Analysis, Disease Outbreaks/prevention & control, Decision Making
ABSTRACT
INTRODUCTION: Localized non-inheritable developmental defects of tooth enamel (DDE) are classified as enamel hypoplasia (EH), opacity (OP), and post-eruptive breakdown (PEB) using the enamel defects index. To better understand the etiology of DDE, we assessed the linkages amongst exposome variables for these defects during the specific time duration for enamel mineralization of the human primary maxillary central incisor enamel crowns. In general, these two teeth develop between 13 and 14 weeks in utero and 3-4 weeks' postpartum of a full-term delivery, followed by tooth eruption at about 1 year of age. METHODS: We utilized existing datasets for mother-child dyads that encompassed 12 weeks' gestation through birth and early infancy, and child DDE outcomes from digital images of the erupted primary maxillary central incisor teeth. We applied a Bayesian modeling paradigm to assess the important predictors of EH, OP, and PEB. RESULTS: The results of Gibbs variable selection showed a key set of predictors: mother's prepregnancy body mass index (BMI); maternal serum concentrations of calcium and phosphorus at gestational week 28; child's gestational age; and both mother's and child's functional vitamin D deficiency (FVDD). In this sample of healthy mothers and children, significant predictors for OP included the child having a gestational period >36 weeks and FVDD at birth, and for PEB included a mother's prepregnancy BMI <21.5 and higher serum phosphorus concentration at week 28. CONCLUSION: In conclusion, our methodology and results provide a roadmap for assessing timely biomarker measures of exposures during specific tooth development to better understand the etiology of DDE for future prevention.
Subject(s)
Dental Enamel Hypoplasia, Dental Enamel, Newborn Infant, Female, Humans, Incisor, Bayes Theorem, Dental Enamel Hypoplasia/etiology, Prevalence, Phosphorus, Deciduous Tooth
ABSTRACT
BACKGROUND: An association was observed between an inflammation-related risk score (IRRS) and worse overall survival (OS) among a cohort of mostly White women with invasive epithelial ovarian cancer (EOC). Herein, we evaluated the association between the IRRS and OS among Black women with EOC, a population with higher frequencies of pro-inflammatory exposures and worse survival. METHODS: The analysis included 592 Black women diagnosed with EOC from the African American Cancer Epidemiology Study (AACES). Cox proportional hazards models were used to compute hazard ratios (HRs) and 95% confidence intervals (CIs) for the association of the IRRS and OS, adjusting for relevant covariates. Additional inflammation-related exposures, including the energy-adjusted Dietary Inflammatory Index (E-DII™), were evaluated. RESULTS: A dose-response trend was observed showing higher IRRS was associated with worse OS (per quartile HR: 1.11, 95% CI: 1.01-1.22). Adding the E-DII to the model attenuated the association of IRRS with OS, and increasing E-DII, indicating a more pro-inflammatory diet, was associated with shorter OS (per quartile HR: 1.12, 95% CI: 1.02-1.24). Scoring high on both indices was associated with shorter OS (HR: 1.54, 95% CI: 1.16-2.06). CONCLUSION: Higher levels of inflammation-related exposures were associated with decreased EOC OS among Black women.
Subject(s)
Inflammation, Ovarian Neoplasms, Humans, Female, Inflammation/epidemiology, Inflammation/complications, Risk Factors, Diet, Ovarian Epithelial Carcinoma/epidemiology, Ovarian Epithelial Carcinoma/complications, Cohort Studies
ABSTRACT
PURPOSE: The causes of the survival disparity among Black women with epithelial ovarian cancer (EOC) are likely multi-factorial. Here we describe the African American Cancer Epidemiology Study (AACES), the largest cohort of Black women with EOC. METHODS: AACES phase 2 (enrolled 2020 onward) is a multi-site, population-based study focused on overall survival (OS) of EOC. Rapid case ascertainment is used in ongoing patient recruitment in eight U.S. states, both northern and southern. Data collection is composed of a survey, biospecimens, and medical record abstraction. Results characterizing the survival experience of the phase 1 study population (enrolled 2010-2015) are presented. RESULTS: Thus far, ~650 patients with EOC have been enrolled in the AACES. The five-year OS of AACES participants approximates that of Black women in the Surveillance Epidemiology and End Results (SEER) registry who survive at least 10 months past diagnosis and is worse than that of White women in SEER (49% vs. 60%). A high proportion of women in AACES have low levels of household income (45% < $25,000 annually), education (51% ≤ high school education), and insurance coverage (32% uninsured or Medicaid). Those followed annually differ from those without follow-up, with higher levels of localized disease (28% vs. 24%) and higher levels of optimal debulking status (73% vs. 67%). CONCLUSION: AACES is well positioned to evaluate the contribution of social determinants of health to the poor survival of Black women with EOC and advance understanding of the multi-factorial causes of the ovarian cancer survival disparity in Black women.
Subject(s)
Black or African American, Ovarian Epithelial Carcinoma, Ovarian Neoplasms, Female, Humans, Ovarian Epithelial Carcinoma/epidemiology, Ovarian Neoplasms/epidemiology, Registries, United States/epidemiology
ABSTRACT
BACKGROUND: Bayesian models have been applied throughout the Covid-19 pandemic, especially to model time series of case counts or deaths. Fewer examples exist of spatio-temporal modeling, even though the spatial spread of disease is a crucial factor in public health monitoring. The predictive capabilities of infectious disease models are also important. METHODS: In this study, the ability of Bayesian hierarchical models to recover different parts of the variation in disease counts is the focus. It is clear that different measures provide different views of behavior when models are fitted prospectively. Over a series of time horizons, one-step predictions have been generated and compared for different models (for case counts and death counts). These Bayesian SIR models were fitted using MCMC at 28 time horizons to mimic prospective prediction. A range of goodness of prediction measures were analyzed across the different time horizons. RESULTS: A particularly important result is that the peak intensity of case load is often under-estimated, while random spikes in case load can be mimicked using time dependent random effects. It is also clear that during the early wave of the pandemic simpler model forms are favored, but subsequently lagged spatial dependence models for cases are favored, even if the sophisticated models perform better overall. DISCUSSION: The models fitted mimic the situation where at a given time the history of the process is known but the future must be predicted based on the current evolution which has been observed. Using an overall 'best' model for prediction based on retrospective fitting of the complete pandemic waves is an assumption. However, it is also clear that this case count model is well favored over other forms. During the first wave a simpler time series model predicts case counts better for counties than a spatially dependent one. The picture is more varied for mortality.
CONCLUSIONS: From a predictive point of view it is clear that spatio-temporal models applied to county level Covid-19 data within the US vary in how well they fit over time and also how well they predict future events. At different times, SIR case count models and also mortality models with cumulative counts perform better in terms of prediction. A fundamental result is that predictive capability of models varies over time and using the same model could lead to poor predictive performance. In addition it is clear that models addressing the spatial context for case counts (i.e. with lagged neighborhood terms) and cumulative case counts for mortality data are clearly better at modeling spatio-temporal data which is commonly available for the Covid-19 pandemic in different areas of the globe.
Subject(s)
COVID-19, Humans, COVID-19/epidemiology, Bayes Theorem, Prospective Studies, Pandemics, Retrospective Studies
ABSTRACT
BACKGROUND: To control emerging diseases, governments often have to make decisions based on limited evidence. The effective or temporal reproductive number is used to estimate the expected number of new cases caused by an infectious person in a partially susceptible population. While the temporal dynamic is captured in the temporal reproduction number, the dominant approach is currently based on modeling that implicitly treats people within a population as geographically well mixed. METHODS: In this study we aimed to develop a generic and robust methodology for estimating spatiotemporal dynamic measures that can be instantaneously computed for each location and time within a Bayesian model selection and averaging framework. A simulation study was conducted to demonstrate robustness of the method. A case study was provided of a real-world application to COVID-19 national surveillance data in Thailand. RESULTS: Overall, the proposed method allowed for estimation of different scenarios of reproduction numbers in the simulation study. The model selection chose the true serial interval when included in our study whereas model averaging yielded the weighted outcome which could be less accurate than model selection. In the case study of COVID-19 in Thailand, the best model based on model selection and averaging criteria had a similar trend to real data and was consistent with previously published findings in the country. CONCLUSIONS: The method yielded robust estimation in several simulated scenarios of force of transmission with computing flexibility and practical benefits. Thus, this development can be suitable and practically useful for surveillance applications especially for newly emerging diseases. As new outbreak waves continue to develop and the risk changes on both local and global scales, our work can facilitate policymaking for timely disease control.
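As background for the reproduction number and serial interval discussed above, the sketch below computes a plain renewal-equation ratio, R_t = I_t / sum_s w_s I_{t-s}, for a single location. This is a generic textbook estimator chosen for illustration, not the Bayesian model selection and averaging method of the study; the function name and inputs are assumptions.

```python
from typing import List, Optional, Sequence


def reproduction_number(
    incidence: Sequence[float],
    serial_interval: Sequence[float],
) -> List[Optional[float]]:
    """Ratio of new cases at time t to the infection pressure from earlier cases.

    serial_interval[s - 1] is the assumed weight w_s for a lag of s time steps.
    Returns None where no earlier cases contribute (zero denominator).
    """
    estimates: List[Optional[float]] = []
    for t in range(len(incidence)):
        # Infection pressure: past incidence weighted by the serial interval.
        pressure = sum(
            w * incidence[t - s]
            for s, w in enumerate(serial_interval, start=1)
            if t - s >= 0
        )
        estimates.append(incidence[t] / pressure if pressure > 0 else None)
    return estimates
```

Running the function once per location gives a crude space-time surface of estimates; different candidate serial intervals can be passed in to see how sensitive the estimates are to that choice.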
Subject(s)
COVID-19, Emerging Communicable Diseases, Humans, COVID-19/epidemiology, Emerging Communicable Diseases/epidemiology, Bayes Theorem, Computer Simulation, Disease Outbreaks/prevention & control
ABSTRACT
BACKGROUND: COVID-19 brought enormous challenges to public health surveillance and underscored the importance of developing and maintaining robust systems for accurate surveillance. As public health data collection efforts expand, there is a critical need for infectious disease modeling researchers to continue to develop prospective surveillance metrics and statistical models to accommodate the modeling of large disease counts and variability. This paper evaluated different likelihoods for the disease count model and various spatiotemporal mean models for prospective surveillance. METHODS: We evaluated Bayesian spatiotemporal models, which are the foundation for model-based infectious disease surveillance metrics. Bayesian spatiotemporal mean models based on the Poisson and the negative binomial likelihoods were evaluated with the different lengths of past data usage. We compared their goodness of fit and short-term prediction performance with both simulated epidemic data and real data from the COVID-19 pandemic. RESULTS: The simulation results show that the negative binomial likelihood-based models show better goodness of fit results than Poisson likelihood-based models as deemed by smaller deviance information criteria (DIC) values. However, Poisson models yield smaller mean square error (MSE) and mean absolute one-step prediction error (MAOSPE) results when we use a shorter length of the past data such as 7 and 3 time periods. Real COVID-19 data analysis of New Jersey and South Carolina shows similar results for the goodness of fit and short-term prediction results. Negative binomial-based mean models showed better performance when we used the past data of 52 time periods. Poisson-based mean models showed comparable goodness of fit performance and smaller MSE and MAOSPE results when we used the past data of 7 and 3 time periods. 
CONCLUSION: We evaluate these models and provide future infectious disease outbreak modeling guidelines for Bayesian spatiotemporal analysis. Our choice of the likelihood and spatiotemporal mean models was influenced by both historical data length and variability. With a longer length of past data usage and more over-dispersed data, the negative binomial likelihood shows a better model fit than the Poisson likelihood. However, as we use a shorter length of the past data for our surveillance analysis, the difference between the Poisson and the negative binomial models becomes smaller. In this case, the Poisson likelihood shows robust posterior mean estimate and short-term prediction results.
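To make the Poisson versus negative binomial comparison concrete, the sketch below evaluates both log-likelihoods on a set of counts and defines the two prediction-error summaries named in the abstract. It compares plain log-likelihoods at a fixed mean rather than DIC from a fitted Bayesian model, and the parameterization (mean `mu`, dispersion `r`, variance `mu + mu**2 / r`) is an assumption made for illustration.

```python
import math
from typing import Sequence


def poisson_logpmf(y: int, mu: float) -> float:
    """Log-probability of count y under a Poisson with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def negbin_logpmf(y: int, mu: float, r: float) -> float:
    """Log-probability of count y under a negative binomial with mean mu, dispersion r."""
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def maospe(observed: Sequence[float], one_step_predictions: Sequence[float]) -> float:
    """Mean absolute one-step prediction error."""
    return sum(abs(o - p) for o, p in zip(observed, one_step_predictions)) / len(observed)


# Made-up, strongly over-dispersed counts for illustration.
counts = [0, 1, 2, 30, 3, 0, 45, 1]
mean = sum(counts) / len(counts)
poisson_ll = sum(poisson_logpmf(y, mean) for y in counts)
negbin_ll = sum(negbin_logpmf(y, mean, 0.5) for y in counts)
```

On counts this over-dispersed, the negative binomial log-likelihood is higher than the Poisson one, which matches the direction of the goodness-of-fit finding reported for long, variable series.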
Subject(s)
COVID-19, Communicable Diseases, Humans, Bayes Theorem, COVID-19/epidemiology, Likelihood Functions, Pandemics, Prospective Studies, Communicable Diseases/epidemiology
ABSTRACT
BACKGROUND: Diabetes is a public health burden that disproportionately affects military veterans and racial minorities. Studies of racial disparities are inherently observational, and thus may require the use of methods such as Propensity Score Analysis (PSA). While traditional PSA accounts for patient-level factors, this may not be sufficient when patients are clustered at the geographic level and thus important confounders, whether observed or unobserved, vary by geographic location. METHODS: We employ a spatial propensity score matching method to account for "geographic confounding", which occurs when the confounding factors, whether observed or unobserved, vary by geographic region. We augment the propensity score and outcome models with spatial random effects, which are assigned scaled Besag-York-Mollié priors to address spatial clustering and improve inferences by borrowing information across neighboring geographic regions. We apply this approach to a study exploring racial disparities in diabetes specialty care between non-Hispanic black and non-Hispanic white veterans. We construct multiple global estimates of the risk difference in diabetes care: a crude unadjusted estimate, an estimate based solely on patient-level matching, and an estimate that incorporates both patient and spatial information. RESULTS: In simulation we show that in the presence of an unmeasured geographic confounder, ignoring spatial heterogeneity results in increased relative bias and mean squared error, whereas incorporating spatial random effects improves inferences. In our study of racial disparities in diabetes specialty care, the crude unadjusted estimate suggests that specialty care is more prevalent among non-Hispanic blacks, while patient-level matching indicates that it is less prevalent. Hierarchical spatial matching supports the latter conclusion, with a further increase in the magnitude of the disparity. 
CONCLUSIONS: These results highlight the importance of accounting for spatial heterogeneity in propensity score analysis, and suggest the need for clinical care and management strategies that are culturally sensitive and racially inclusive.
Subject(s)
Racial Groups, White People, Bias, Humans, Propensity Score, Spatial Analysis
ABSTRACT
The introduction of spatial and temporal frailty parameters in survival models furnishes a way to represent unmeasured confounding in the outcome of interest. Using a Bayesian accelerated failure time model, we are able to flexibly explore a wide range of spatial and temporal options for structuring frailties as well as examine the benefits of using these different structures in certain settings. A setting of particular interest for this work involved using temporal frailties to capture the impact of events of interest on breast cancer survival. Our results suggest that it is important to include these temporal frailties when there is a true temporal structure to the outcome and including them when a true temporal structure is absent does not sacrifice model fit. Additionally, the frailties are able to correctly recover the truth imposed on simulated data without affecting the fixed effect estimates. In the case study involving Louisiana breast cancer-specific mortality, the temporal frailty played an important role in representing the unmeasured confounding related to improvements in knowledge, education, and disease screenings as well as the impacts of Hurricane Katrina and the passing of the Affordable Care Act. In conclusion, the incorporation of temporal, in addition to spatial, frailties in survival analysis can lead to better fitting models and improved inference by representing both spatially and temporally varying unmeasured risk factors and confounding that could impact survival. Specifically, we successfully estimated changes in survival around the time of events of interest.
Subject(s)
Biostatistics/methods, Breast Neoplasms/mortality, Statistical Models, Survival Analysis, Humans
ABSTRACT
AIM: The aim of this study was to assess biomarkers of calcium homeostasis and tooth development, in mothers during pregnancy and their children at birth, for enamel hypoplasia (EH) in the primary maxillary central incisor teeth. METHODS: Bayesian methodology was used for secondary data analyses from a randomized, controlled trial of prenatal vitamin D3 supplementation in healthy mothers (N = 350) and a follow-up study of a subset of the children. The biomarkers were serum calcium (Ca), phosphorus (P), intact parathyroid hormone (iPTH), total circulating 25-hydroxyvitamin D (25(OH)D), and 1,25-dihydroxyvitamin D (1,25(OH)2D). The maternal biomarkers were assayed monthly during pregnancy, and the child's biomarkers were derived from cord blood. Digital images of the child's 2 teeth were scored for EH using Enamel Defects Index criteria for each of the incisal, middle, and cervical regions for an EH extent score. RESULTS: The child EH prevalence was 41% (60/145), with most defects present in the incisal and middle tooth regions. Cord blood iPTH and 1,25(OH)2D levels were significantly associated with EH extent after controlling for maternal factors. For every 1 pg/mL increase in cord blood iPTH, the EH extent decreased by approximately 6%. For every 10 pg/mL increase in cord blood 1,25(OH)2D, the EH extent increased by almost 30% (holding all other terms constant and adjusting for subject-level heterogeneity). The relationship between maternal 25(OH)D and maternal mean iPTH varied significantly by EH extent. CONCLUSION: The results suggest possible modifiable relationships of maternal and neonatal factors of calcium homeostasis during pregnancy and at birth for EH, contributing to the frontier of knowledge regarding sound tooth development for dental caries prevention.
Subject(s)
Dental Caries, Dental Enamel Hypoplasia, Bayes Theorem, Biomarkers, Calcium, Dental Enamel Hypoplasia/prevention & control, Female, Follow-Up Studies, Homeostasis, Humans, Newborn Infant, Pregnancy
ABSTRACT
BACKGROUND: Discrimination and trust are known barriers to accessing health care. Despite well-documented racial disparities in the ovarian cancer care continuum, the role of these barriers has not been examined. This study evaluated the association of everyday discrimination and trust in physicians with a prolonged interval between symptom onset and ovarian cancer diagnosis (hereafter referred to as prolonged symptom duration). METHODS: Subjects included cases enrolled in the African American Cancer Epidemiology Study, a multisite case-control study of epithelial ovarian cancer among black women. Logistic regression was used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) for associations of everyday discrimination and trust in physicians with a prolonged symptom duration (1 or more symptoms lasting longer than the median symptom-specific duration), and it controlled for access-to-care covariates and potential confounders. RESULTS: Among the 486 cases in this analysis, 302 women had prolonged symptom duration. In the fully adjusted model, a 1-unit increase in the frequency of everyday discrimination increased the odds of prolonged symptom duration 74% (OR, 1.74; 95% CI, 1.22-2.49), but trust in physicians was not associated with prolonged symptom duration (OR, 0.86; 95% CI, 0.66-1.11). CONCLUSIONS: Perceived everyday discrimination was associated with prolonged symptom duration, whereas more commonly evaluated determinants of access to care and trust in physicians were not. These results suggest that more research on the effects of interpersonal barriers affecting ovarian cancer care is warranted.
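The reported odds ratio can be unpacked with a short worked calculation. The percentage change follows directly from the OR; the coefficient and its standard error are back-calculated from the published figures under the assumption of a Wald-type 95% interval on the log-odds scale, so they are approximations rather than values stated in the abstract.

```latex
\begin{align*}
\text{change in odds} &= (\mathrm{OR} - 1) \times 100\% = (1.74 - 1) \times 100\% = 74\% \\
\hat{\beta} &= \ln(\mathrm{OR}) = \ln(1.74) \approx 0.554 \\
\widehat{\mathrm{SE}}(\hat{\beta}) &\approx \frac{\ln(2.49) - \ln(1.22)}{2 \times 1.96}
  = \frac{0.912 - 0.199}{3.92} \approx 0.182
\end{align*}
```

Because the interval 1.22-2.49 excludes 1, the association with everyday discrimination is statistically significant at the 5% level, whereas the interval for trust in physicians (0.66-1.11) includes 1.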
Subject(s)
Black or African American, Healthcare Disparities, Ovarian Neoplasms/epidemiology, Physician-Patient Relations, Racism, Trust, Aged, Case-Control Studies, Comorbidity, Female, Humans, Middle Aged, Neoplasm Staging, Odds Ratio, Ovarian Neoplasms/diagnosis, Ovarian Neoplasms/ethnology, Public Health Surveillance, United States/epidemiology
ABSTRACT
BACKGROUND: Newly emerging diseases are public health concerns for which policy makers have to make decisions in the presence of enormous uncertainty. This is an important challenge in terms of emergency preparation, requiring the operation of effective surveillance systems. A key concept for investigating the dynamics of infectious diseases is the basic reproduction number. However, it is difficult to apply in real situations because of its underlying theoretical assumptions. METHODS: In this paper we propose a robust and flexible methodology for estimating disease strength varying in space and time using an alternative measure of disease transmission within the hierarchical modeling framework. The proposed measure is also extended to allow for incorporating knowledge from related diseases to enhance the performance of surveillance systems. RESULTS: A simulation was conducted to examine the robustness of the proposed methodology, and the simulation results demonstrate that the proposed method allows robust estimation of the disease strength across simulation scenarios. A real data example is provided of an integrative application of Dengue and Zika surveillance in Thailand. The real data example also shows that combining both diseases in an integrated analysis essentially decreases variability of model fitting. CONCLUSIONS: The proposed methodology is robust in several simulated scenarios of spatiotemporal transmission force with computing flexibility and practical benefits. This development has potential for broad applicability as an alternative tool for integrated surveillance of emerging diseases such as Zika.
Subject(s)
Dengue/epidemiology, Dengue/transmission, Epidemiological Monitoring, Zika Virus Infection/epidemiology, Zika Virus Infection/transmission, Emerging Communicable Diseases/epidemiology, Emerging Communicable Diseases/transmission, Dengue Virus, Humans, Multivariate Analysis, Public Health/methods, Spatio-Temporal Analysis, Thailand/epidemiology, Zika Virus
ABSTRACT
It is our primary focus to study the spatial distribution of disease incidence at different geographical levels. Often, spatial data are available in the form of aggregation at multiple scale levels such as census tract, county, state, and so on. When data are aggregated from a fine (e.g. county) to a coarse (e.g. state) geographical level, there will be loss of information. The problem is more challenging when excessive zeros are available at the fine level. After data aggregation, the excessive zeros at the fine level will be reduced at the coarse level. If we ignore the zero inflation and the aggregation effect, we could get inconsistent risk estimates at the fine and coarse levels. Hence, in this paper, we address those problems using zero inflated multiscale models that jointly describe the risk variations at different geographical levels. For the excessive zeros at the fine level, we use a zero inflated convolution model, whereas we consider a regular convolution model for the smoothed data at the coarse level. These methods provide a consistent risk estimate at the fine and coarse levels when high percentages of structural zeros are present in the data.
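The two ideas in this abstract, extra zeros at the fine level and their disappearance after aggregation, can be sketched in a few lines. The zero-inflated Poisson form below is the standard mixture of a point mass at zero and a Poisson count; the parameter names and the toy county and state data are assumptions for illustration, not the multiscale convolution model of the paper.

```python
import math
from typing import Dict, List, Sequence


def zip_pmf(y: int, mu: float, pi: float) -> float:
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(mu)."""
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    return (pi if y == 0 else 0.0) + (1.0 - pi) * poisson


def zero_fraction(counts: Sequence[int]) -> float:
    """Share of areas with a zero count."""
    return sum(1 for c in counts if c == 0) / len(counts)


def aggregate(fine_counts: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum fine-level (e.g. county) counts within each coarse unit (e.g. state)."""
    return {unit: sum(values) for unit, values in fine_counts.items()}


# Toy data: two coarse units, four fine-level areas each.
fine = {"state_A": [0, 0, 3, 0], "state_B": [0, 5, 0, 0]}
fine_zero_share = zero_fraction([c for values in fine.values() for c in values])
coarse_zero_share = zero_fraction(list(aggregate(fine).values()))
```

In the toy data, three quarters of the fine-level areas are zero while neither coarse unit is, which is why a zero-inflated form is used at the fine level and a regular model at the coarse level.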
ABSTRACT
Spatial big data have the velocity, volume, and variety of big data sources and contain additional geographic information. Digital data sources, such as medical claims, mobile phone call data records, and geographically tagged tweets, have entered infectious diseases epidemiology as novel sources of data to complement traditional infectious disease surveillance. In this work, we provide examples of how spatial big data have been used thus far in epidemiological analyses and describe opportunities for these sources to improve disease-mitigation strategies and public health coordination. In addition, we consider the technical, practical, and ethical challenges with the use of spatial big data in infectious disease surveillance and inference. Finally, we discuss the implications of the rising use of spatial big data in epidemiology to health risk communication, and public health policy recommendations and coordination across scales.
Subject(s)
Communicable Diseases/epidemiology, Epidemiological Monitoring, Spatial Analysis, Health Policy, Humans, Public Health Administration/ethics, Medical Topography
ABSTRACT
To describe the spatial distribution of diseases, a number of methods have been proposed to model relative risks within areas. Most models use Bayesian hierarchical methods, in which one models both spatially structured and unstructured extra-Poisson variance present in the data. For modelling a single disease, the conditional autoregressive (CAR) convolution model has been very popular. More recently, a combined model was proposed that 'combines' ideas from the CAR convolution model and the well-known Poisson-gamma model. The combined model was shown to be a good alternative to the CAR convolution model when there was a large amount of uncorrelated extra-variance in the data. Fewer solutions exist for modelling two diseases simultaneously or modelling a disease in two sub-populations simultaneously. Furthermore, existing models are typically based on the CAR convolution model. In this paper, a bivariate version of the combined model is proposed in which the unstructured heterogeneity term is split up into terms that are shared and terms that are specific to the disease or subpopulation, while spatial dependency is introduced via a univariate or multivariate Markov random field. The proposed method is illustrated by analysis of disease data in Georgia (USA) and Limburg (Belgium) and in a simulation study. We conclude that the bivariate combined model constitutes an interesting model when two diseases are possibly correlated. As the choice of the preferred model differs between data sets, we suggest using the new and existing modelling approaches together and choosing the best model via goodness-of-fit statistics. Copyright © 2016 John Wiley & Sons, Ltd.
Subject(s)
Bayes Theorem , Spatial Analysis , Belgium/epidemiology , Georgia/epidemiology , Humans , Statistical Models , Risk
ABSTRACT
To enhance our knowledge regarding biological pathway regulation, we took an integrated approach, using the biomedical literature, ontologies, network analyses and experimental investigation to infer novel genes that could modulate biological pathways. We first constructed a novel gene network via a pairwise comparison of all yeast genes' Ontology Fingerprints--a set of Gene Ontology terms overrepresented in the PubMed abstracts linked to a gene along with those terms' corresponding enrichment P-values. The network was further refined using a Bayesian hierarchical model to identify novel genes that could potentially influence the pathway activities. We applied this method to the sphingolipid pathway in yeast and found that many top-ranked genes indeed displayed altered sphingolipid pathway functions, initially measured by their sensitivity to myriocin, an inhibitor of de novo sphingolipid biosynthesis. Further experiments confirmed the modulation of the sphingolipid pathway by one of these genes, PFA4, encoding a palmitoyl transferase. Comparative analysis showed that few of these novel genes could be discovered by other existing methods. Our novel gene network provides a unique and comprehensive resource to study pathway modulations and systems biology in general.
Subject(s)
Gene Ontology , Gene Regulatory Networks , Bayes Theorem , Fungal Genes , Metabolic Networks and Pathways/genetics , PubMed , Sphingolipids/metabolism , Yeasts/genetics , Yeasts/metabolism
ABSTRACT
One of the main goals in spatial epidemiology is to study the geographical pattern of disease risks. For this purpose, the convolution model composed of correlated and uncorrelated components is often used. However, one of the two components could be predominant in some regions. To investigate the predominance of the correlated or uncorrelated component for multiple scale data, we propose four different spatial mixture multiscale models that mix the correlated heterogeneity (CH) and the uncorrelated heterogeneity (UH) using spatially varying probability weights. The first model assumes that there is no linkage between the different scales and, hence, we consider independent mixture convolution models at each scale. The second model introduces linkage between finer and coarser scales via a shared uncorrelated component of the mixture convolution model. The third model is similar to the second model but the linkage between the scales is introduced through the correlated component. Finally, the fourth model accommodates a scale effect by sharing both CH and UH simultaneously. We applied these models to real and simulated data, and found that the fourth model performed best, followed by the second model.
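The weighting idea behind a mixture convolution model can be sketched for a single scale as follows. This is a minimal illustration, not the authors' implementation: the linear predictor log(theta_i) = alpha + p_i * u_i + (1 - p_i) * v_i is an assumed form, and the function name, the toy values, and the intercept are hypothetical.

```python
import numpy as np

def mixture_log_risk(alpha, p, u, v):
    """Log relative risk under an assumed mixture convolution form:
    alpha + p_i * u_i + (1 - p_i) * v_i, where u is the correlated (CH)
    effect, v the uncorrelated (UH) effect, and p_i the weight on CH."""
    p = np.asarray(p, dtype=float)
    if np.any((p < 0) | (p > 1)):
        raise ValueError("mixing weights must lie in [0, 1]")
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return alpha + p * u + (1.0 - p) * v

# Toy values for three areas (hypothetical, for illustration only).
u = np.array([0.4, 0.3, -0.2])   # correlated heterogeneity (CH)
v = np.array([-0.1, 0.5, 0.0])   # uncorrelated heterogeneity (UH)
p = np.array([1.0, 0.5, 0.0])    # area 1 is all CH, area 3 is all UH
log_theta = mixture_log_risk(0.0, p, u, v)
```

A weight near 1 means the correlated component dominates in that area, and a weight near 0 means the uncorrelated component dominates. In a full Bayesian fit the weights would be estimated along with u and v rather than fixed as they are here.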
Subject(s)
Epidemiology , Statistical Models , Humans , Risk Assessment
ABSTRACT
Choice of neighborhood scale affects associations between environmental attributes and health-related outcomes. This phenomenon, a part of the modifiable areal unit problem, has been described fully in geography but not as it relates to food environment research. Using two administrative-based geographic boundaries (census tracts and block groups), supermarket geographic measures (density, cumulative opportunity and distance to nearest) were created to examine differences by scale and associations with three common U.S. Census-based socioeconomic status (SES) characteristics (median household income, percentage of population living below poverty and percentage of population with at least a high school education) and with a summary neighborhood SES z-score in an eight-county region of South Carolina. General linear mixed models were used. Overall, both supermarket density and cumulative opportunity were higher when using census tract boundaries compared to block groups. In analytic models, higher median household income was significantly associated with lower neighborhood supermarket density and lower cumulative opportunity using either the census tract or block group boundaries, and neighborhood poverty was positively associated with supermarket density and cumulative opportunity. Both median household income and percent high school education were positively associated with distance to nearest supermarket using either boundary definition, whereas neighborhood poverty had an inverse association. Findings from this study support the premise that supermarket measures can differ by choice of geographic scale and can influence associations between measures. Researchers should consider the most appropriate geographic scale carefully when conducting food environment studies.
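The three supermarket measures named in this abstract could be computed for one areal unit roughly as sketched below. The abstract does not give the operational definitions, so everything here is an assumption made for illustration: density as stores per square kilometre, cumulative opportunity as the count of stores within a fixed radius of the unit's centroid, planar coordinates in kilometres, and all function and parameter names.

```python
import math

def supermarket_measures(centroid, area_km2, n_stores_inside, all_stores, radius_km=1.0):
    """Return assumed versions of the three measures for one areal unit.

    centroid        -- (x, y) of the unit's centroid, in km (planar coordinates)
    area_km2        -- area of the unit
    n_stores_inside -- number of supermarkets located inside the unit
    all_stores      -- list of (x, y) supermarket locations in the study region
    radius_km       -- hypothetical threshold for cumulative opportunity
    """
    distances = [math.dist(centroid, store) for store in all_stores]
    return {
        "density": n_stores_inside / area_km2,
        "cumulative_opportunity": sum(d <= radius_km for d in distances),
        "distance_to_nearest": min(distances) if distances else float("inf"),
    }

# Toy unit: one store inside a 2 km^2 area, a second store 5 km from the centroid.
measures = supermarket_measures(
    centroid=(0.0, 0.0),
    area_km2=2.0,
    n_stores_inside=1,
    all_stores=[(0.5, 0.0), (3.0, 4.0)],
)
```

Running the same function on census tracts and on block groups changes the centroid, the area, and the store count used for each unit, which is one way the choice of geographic scale can shift all three measures.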