ABSTRACT
Scientific progress depends on formulating testable hypotheses informed by the literature. In many domains, however, this model is strained because the number of research papers exceeds what humans can read. Here, we developed computational assistance to analyze the biomedical literature by reading PubMed abstracts to suggest new hypotheses. The approach was tested experimentally on the tumor suppressor p53 by ranking its most likely kinases, based on all available abstracts. Many of the best-ranked kinases were found to bind and phosphorylate p53 (P value = 0.005), suggesting six likely p53 kinases so far. One of these, NEK2, was studied in detail. A known mitosis promoter, NEK2 was shown to phosphorylate p53 at Ser315 in vitro and in vivo and to functionally inhibit p53. These bona fide validations of text-based predictions of p53 phosphorylation, and the discovery of an inhibitory p53 kinase of pharmaceutical interest, suggest that automated reasoning using a large body of literature can generate valuable molecular hypotheses and has the potential to accelerate scientific discovery.
Subject(s)
Abstracting and Indexing, NIMA-Related Kinases/metabolism, Tumor Suppressor Protein p53/antagonists & inhibitors, Tumor Suppressor Protein p53/metabolism, HCT116 Cells, HEK293 Cells, Humans, NIMA-Related Kinases/genetics, Natural Language Processing, Phosphorylation, PubMed, Tumor Suppressor Protein p53/genetics
ABSTRACT
Background: A model that jointly simulates infectious diseases with common modes of transmission can serve as a decision-analytic tool to identify optimal intervention combinations for overall disease prevention. In the United States, sexually transmitted infections (STIs) impose a large economic burden, with a large fraction of that burden attributed to HIV. Data also show interactions between HIV and other STIs, such as a higher risk of acquisition and progression of co-infections among persons with HIV compared to persons without. However, given the wide range in prevalence and incidence burdens of STIs, current compartmental or agent-based network simulation methods alone are insufficient or computationally burdensome for joint disease modeling. Further, causal factors for higher risk of coinfection could be both behavioral (i.e., compounding effects of individual behaviors, network structures, and care behaviors) and biological (i.e., presence of one disease can biologically increase the risk of another). However, the data on the fraction attributed to each are limited.
Methods: We present a new mixed agent-based compartmental (MAC) framework for jointly modeling STIs. It uses a combination of a new agent-based evolving network modeling (ABENM) technique for lower-prevalence diseases and compartmental modeling for higher-prevalence diseases. As a demonstration, we applied MAC to simulate lower-prevalence HIV in the United States and a higher-prevalence hypothetical Disease 2, using a range of transmission and progression rates to generate burdens replicative of the wide range of STIs. We simulated sexual transmissions among heterosexual males, heterosexual females, and men who have sex with men (men only and men and women). Setting the biological risk of co-infection to zero, we conducted numerical analyses to evaluate the influence of behavioral factors alone on disease dynamics.
Results: The contribution of behavioral factors to risk of coinfection was sensitive to disease burden, care access, and population heterogeneity and mixing. The contribution of behavioral factors was generally lower than the observed risk of coinfections for the range of hypothetical prevalence studied here, suggesting a potential role for biological factors, which should be investigated further for each specific STI.
Conclusions: The purpose of this study is to present a new simulation technique for jointly modeling infectious diseases that have common modes of transmission but varying epidemiological features. The numerical analysis serves as proof-of-concept for the application to STIs. Interactions between diseases are influenced by behavioral factors, are sensitive to care access and population features, and are likely exacerbated by biological factors. Social and economic conditions are among key drivers of behaviors that increase STI transmission, and thus, structural interventions are a key part of behavioral interventions. Joint modeling of diseases helps comprehensively simulate behavioral and biological factors of disease interactions to evaluate the true impact of common structural interventions on overall disease prevention. The new simulation framework is especially suited to simulate behavior as a function of social determinants, and further, to identify optimal combinations of common structural and disease-specific interventions.
ABSTRACT
Sampling is becoming an essential tool for scalable interactive visual analysis. After outlining prior work by the database community on sampling for visualization of aggregation queries, this article considers how these results might be improved and extended to a broader setting. The goal is to better understand how users interact with sampling to enable wider adoption of sampling for scalable visual analytics.
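As a rough illustration of the kind of approximate aggregate that sampling makes possible, the sketch below estimates a mean from a sample and attaches a normal-approximation 95% interval, which a visualization could draw as an error bar. The function name `approx_mean` and the choice of interval are assumptions made for illustration; they are not taken from the article.

```python
import math
import statistics


def approx_mean(sample, z=1.96):
    """Estimate a mean from a sample and return (estimate, half_width).

    Hypothetical helper: assumes a simple random sample and a
    normal-approximation interval (z = 1.96 for roughly 95% coverage).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two sampled values")
    estimate = statistics.mean(sample)
    # Standard error of the mean from the sample standard deviation.
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return estimate, z * std_error


# Usage: values drawn from a large table, summarized as estimate +/- half_width.
estimate, half_width = approx_mean([1, 2, 3, 4, 5])
```

A wider interval signals that more rows would need to be sampled before the displayed bar can be trusted.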
ABSTRACT
We consider the problem of estimating the number of distinct species S in a study area from the recorded presence or absence of species in each of a sample of quadrats. A generalized jackknife estimator of S is derived, along with an estimate of its variance. It is compared with the jackknife estimator for S proposed by Heltshe and Forrester and the empirical Bayes estimator of Mingoti and Meeden. We show that the empirical Bayes estimator has the form of a generalized jackknife estimator under a specific model for species distribution. We compare the new estimators of S to the empirical Bayes estimator via simulation. We characterize circumstances under which each is superior.
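For orientation, the first-order jackknife of Heltshe and Forrester, which serves as a comparison estimator above, can be sketched as follows. This is not the generalized jackknife derived in the paper. With n quadrats, S_obs observed species, and k "unique" species (those found in exactly one quadrat), the estimate is S_obs + k(n-1)/n, and its variance estimate is ((n-1)/n)(sum of u_i^2 - k^2/n), where u_i is the number of unique species in quadrat i. The function name and the list-of-rows input layout are assumptions made for illustration.

```python
def jackknife_richness(incidence):
    """First-order jackknife estimate of species richness.

    incidence: list of quadrats, each a list of 0/1 flags (one per species).
    Returns (estimate, variance_estimate). Hypothetical helper for illustration.
    """
    n = len(incidence)
    if n < 2:
        raise ValueError("need at least two quadrats")
    n_species = len(incidence[0])

    # Number of quadrats in which each species was recorded.
    counts = [sum(row[j] for row in incidence) for j in range(n_species)]
    s_obs = sum(1 for c in counts if c > 0)

    # "Unique" species occur in exactly one quadrat.
    unique = [j for j, c in enumerate(counts) if c == 1]
    k = len(unique)

    # u[i] = number of unique species recorded in quadrat i.
    u = [sum(row[j] for j in unique) for row in incidence]

    estimate = s_obs + k * (n - 1) / n
    variance = (n - 1) / n * (sum(x * x for x in u) - k * k / n)
    return estimate, variance


# Usage: three quadrats, four candidate species (the last never observed).
s_hat, var_hat = jackknife_richness([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
])
```

In this small example two of the three observed species are unique, so the estimate is pushed above the observed count of three.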