ABSTRACT
In the early phases of growth, resurgent epidemic waves of SARS-CoV-2 incidence have been characterised by localised outbreaks. Therefore, understanding the geographic dispersion of emerging variants at the start of an outbreak is key for situational public health awareness. Using telecoms data, we derived mobility networks describing the movement patterns between local authorities in England, which we have used to inform the spatial structure of a Bayesian BYM2 model. Surge testing interventions can result in spatio-temporal sampling bias, and we account for this by extending the BYM2 model to include a random effect for each timepoint in a given area. Simulated-scenario modelling and real-world analyses of each variant that became dominant in England were conducted using our BYM2 model at local authority level. Simulated datasets were created using a stochastic metapopulation model, with the transmission rates between different areas parameterised using telecoms mobility data. Different scenarios were constructed to reproduce real-world spatial dispersion patterns that could prove challenging for inference, and we used these scenarios to understand the performance characteristics of the BYM2 model. The model performed better than unadjusted test positivity in all simulation scenarios, particularly when sample sizes were small or data were missing for geographical areas. Through the analyses of emerging variant transmission across England, we found that geographic clustering during the early growth phase was reduced for later dominant variants, as England became more interconnected from early 2022 and public health interventions were reduced. We have also shown the recent increased geographic spread and dominance of variants with similar mutations in the receptor binding domain, which may be indicative of convergent evolution of SARS-CoV-2 variants.
Subject(s)
COVID-19, SARS-CoV-2, Humans, Bayes Theorem, SARS-CoV-2/genetics, COVID-19/epidemiology, England/epidemiology
ABSTRACT
New SARS-CoV-2 variants causing COVID-19 are a major risk to public health worldwide due to the potential for phenotypic change and increases in pathogenicity, transmissibility and/or vaccine escape. Recognising the signatures of new variants, in terms of replacement growth and severity, is key to informing the public health response. To assess this, we aimed to investigate key time periods in the course of infection, hospitalisation and death, by variant. We linked datasets on contact tracing (Contact Tracing Advisory Service), testing (the Second-Generation Surveillance System) and hospitalisation (the Admitted Patient Care dataset) for the entire length of contact tracing in England, from March 2020 to March 2022. We modelled, for England, time delay distributions using a Bayesian doubly interval censored modelling approach for the SARS-CoV-2 variants Alpha, Delta, Delta Plus (AY.4.2), Omicron BA.1 and Omicron BA.2. This was conducted for the incubation period, the time from infection to hospitalisation and the time from hospitalisation to death. We further modelled the growth of novel variant replacement using a generalised additive model with a negative binomial error structure, and the relationship between incubation period length and the risk of a fatality using a Bernoulli generalised linear model with a logit link. The mean incubation periods for each variant were: Alpha 4.19 (95% credible interval (CrI) 4.13-4.26) days; Delta 3.87 (95% CrI 3.82-3.93) days; Delta Plus 3.92 (95% CrI 3.87-3.98) days; Omicron BA.1 3.67 (95% CrI 3.61-3.72) days and Omicron BA.2 3.48 (95% CrI 3.43-3.53) days. The mean time from infection to hospitalisation was, for Alpha, 11.31 (95% CrI 11.20-11.41) days; Delta 10.36 (95% CrI 10.26-10.45) days and Omicron BA.1 11.54 (95% CrI 11.38-11.70) days. The mean time from hospitalisation to death was, for Alpha, 14.31 (95% CrI 14.00-14.62) days; Delta 12.81 (95% CrI 12.62-13.00) days and Omicron BA.2 16.02 (95% CrI 15.46-16.60) days.
The 95th percentiles of the incubation periods were: Alpha 11.19 (95% CrI 10.92-11.48) days; Delta 9.97 (95% CrI 9.73-10.21) days; Delta Plus 9.99 (95% CrI 9.78-10.24) days; Omicron BA.1 9.45 (95% CrI 9.23-9.67) days and Omicron BA.2 8.83 (95% CrI 8.62-9.05) days. Shorter incubation periods were associated with greater fatality risk when adjusted for age, sex, variant, vaccination status, vaccine manufacturer and time since last dose, with an odds ratio of 0.83 (95% confidence interval 0.82-0.83) (P value < 0.05). Variants of SARS-CoV-2 that have replaced previously dominant variants have had shorter incubation periods. Conversely, co-existing variants have had very similar and non-distinct incubation period distributions. Shorter incubation periods reflect a generation time advantage, with a reduction in the time to the peak infectious period, and may be a significant factor in the replacement growth of novel variants. Shorter times to hospital admission and death were associated with variant severity: the most severe variant, Delta, led to significantly earlier hospitalisation and death. These measures are likely important for future risk assessment of new variants, and their potential impact on population health.
Subject(s)
COVID-19, SARS-CoV-2, Humans, Bayes Theorem, Contact Tracing
ABSTRACT
Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
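To make the idea of "a log probability function over parameters" and its gradient concrete, here is a minimal sketch in plain Python. It is not Stan code and does not use any Stan interface. The model (normal likelihood with known scale, normal prior on the mean), the data values, and the function names are all assumptions chosen for illustration. Plain gradient ascent stands in for the limited-memory BFGS optimizer named above.

```python
def log_density(mu, data, sigma=1.0, prior_scale=10.0):
    """Unnormalised log posterior of mu (illustrative model, not from the source).

    Likelihood: y ~ normal(mu, sigma); prior: mu ~ normal(0, prior_scale).
    """
    log_lik = sum(-0.5 * ((y - mu) / sigma) ** 2 for y in data)
    log_prior = -0.5 * (mu / prior_scale) ** 2
    return log_lik + log_prior


def gradient(mu, data, sigma=1.0, prior_scale=10.0):
    """Analytic derivative of log_density with respect to mu."""
    return sum(y - mu for y in data) / sigma ** 2 - mu / prior_scale ** 2


def posterior_mode(data, iterations=200):
    """Find the penalised maximum likelihood estimate by gradient ascent.

    The step size 1 / (n + 1) is an ad hoc choice that keeps the update
    stable for this particular quadratic log density.
    """
    step = 1.0 / (len(data) + 1)
    mu = 0.0
    for _ in range(iterations):
        mu += step * gradient(mu, data)
    return mu


if __name__ == "__main__":
    observations = [1.2, 0.8, 1.1, 0.9]  # hypothetical data
    mode = posterior_mode(observations)
    print("posterior mode:", mode)
    print("log density at mode:", log_density(mode, observations))
```

For this conjugate model the mode has a closed form, sum(y) / (n + sigma^2 / prior_scale^2), which gives a simple check on the optimiser.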
ABSTRACT
Cryo-electron microscopy (cryo-EM) has recently become a leading method for obtaining high-resolution structures of biological macromolecules. However, cryo-EM is limited to biomolecular samples with low conformational heterogeneity, where most conformations can be well-sampled at various projection angles. While cryo-EM provides single-molecule data for heterogeneous molecules, most existing reconstruction tools cannot retrieve the ensemble distribution of possible molecular conformations from these data. To overcome these limitations, we build on a previous Bayesian approach and develop an ensemble refinement framework that estimates the ensemble density from a set of cryo-EM particle images by reweighting a prior conformational ensemble, e.g., from molecular dynamics simulations or structure prediction tools. Our work provides a general approach to recovering the equilibrium probability density of the biomolecule directly in conformational space from single-molecule data. To validate the framework, we study the extraction of state populations and free energies for a simple toy model and from synthetic cryo-EM particle images of a simulated protein that explores multiple folded and unfolded conformations.
Subject(s)
Molecular Dynamics Simulation, Proteins, Cryoelectron Microscopy/methods, Bayes Theorem, Molecular Conformation
ABSTRACT
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by heterogeneous cognitive, behavioral and communication impairments. Disruption of the gut-brain axis (GBA) has been implicated in ASD although with limited reproducibility across studies. In this study, we developed a Bayesian differential ranking algorithm to identify ASD-associated molecular and taxa profiles across 10 cross-sectional microbiome datasets and 15 other datasets, including dietary patterns, metabolomics, cytokine profiles and human brain gene expression profiles. We found a functional architecture along the GBA that correlates with heterogeneity of ASD phenotypes, and it is characterized by ASD-associated amino acid, carbohydrate and lipid profiles predominantly encoded by microbial species in the genera Prevotella, Bifidobacterium, Desulfovibrio and Bacteroides and correlates with brain gene expression changes, restrictive dietary patterns and pro-inflammatory cytokine profiles. The functional architecture revealed in age-matched and sex-matched cohorts is not present in sibling-matched cohorts. We also show a strong association between temporal changes in microbiome composition and ASD phenotypes. In summary, we propose a framework to leverage multi-omic datasets from well-defined cohorts and investigate how the GBA influences ASD.
Subject(s)
Autism Spectrum Disorder, Gastrointestinal Microbiome, Humans, Gastrointestinal Microbiome/genetics, Brain-Gut Axis, Autism Spectrum Disorder/genetics, Autism Spectrum Disorder/metabolism, Cross-Sectional Studies, Bayes Theorem, Reproducibility of Results, Cytokines
ABSTRACT
Cryo-electron microscopy (cryo-EM) extracts single-particle density projections of individual biomolecules. Although cryo-EM is widely used for 3D reconstruction, due to its single-particle nature it has the potential to provide information about a biomolecule's conformational variability and underlying free-energy landscape. However, treating cryo-EM as a single-molecule technique is challenging because of the low signal-to-noise ratio (SNR) in individual particles. In this work, we propose the cryo-BIFE method (cryo-EM Bayesian Inference of Free-Energy profiles), which uses a path collective variable to extract free-energy profiles and their uncertainties from cryo-EM images. We test the framework on several synthetic systems where the imaging parameters and conditions were controlled. We found that for realistic cryo-EM environments and relevant biomolecular systems, it is possible to recover the underlying free energy, with the pose accuracy and SNR as crucial determinants. We then use the method to study the conformational transitions of a calcium-activated channel with real cryo-EM particles. Interestingly, we recover not only the most probable conformation (used to generate a high-resolution reconstruction of the calcium-bound state) but also a metastable state that corresponds to the calcium-unbound conformation. As expected for turnover transitions within the same sample, the activation barriers are on the order of [Formula: see text]. We expect our tool for extracting free-energy profiles from cryo-EM images to enable more complete characterization of the thermodynamic ensemble of biomolecules.
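The link between a probability profile along a collective variable and a free-energy profile is the Boltzmann relation, G(s) = -k_B T ln p(s). The sketch below applies only that relation to a set of bin populations. It is not the cryo-BIFE method, which infers the profile and its uncertainty from particle images. The populations and the function name are hypothetical.

```python
import math


def free_energy_profile(populations):
    """Convert positive bin populations into free energies in units of k_B*T.

    The profile is shifted so that its minimum is zero, since only
    free-energy differences are meaningful.
    """
    if any(p <= 0 for p in populations):
        raise ValueError("all populations must be strictly positive")
    total = sum(populations)
    energies = [-math.log(p / total) for p in populations]
    lowest = min(energies)
    return [g - lowest for g in energies]


if __name__ == "__main__":
    # Hypothetical populations along a path variable: two states and a barrier.
    populations = [0.60, 0.05, 0.35]
    for index, g in enumerate(free_energy_profile(populations)):
        print(f"bin {index}: {g:.2f} k_B*T")
```

A sparsely populated bin between two well-populated ones appears as a barrier, and the difference between the two minima gives their relative stability.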
ABSTRACT
When testing for a rare disease, prevalence estimates can be highly sensitive to uncertainty in the specificity and sensitivity of the test. Bayesian inference is a natural way to propagate these uncertainties, with hierarchical modelling capturing variation in these parameters across experiments. Another concern is that the people in the sample may not be representative of the general population. Statistical adjustment cannot, without strong assumptions, correct for selection bias in an opt-in sample, but multilevel regression and post-stratification can at least adjust for known differences between the sample and the population. We demonstrate hierarchical regression and post-stratification models with code in Stan and discuss their application to a controversial recent study of SARS-CoV-2 antibodies in a sample of people from the Stanford University area. Wide posterior intervals make it impossible to evaluate the quantitative claims of that study regarding the number of unreported infections. For future studies, the methods described here should facilitate more accurate estimates of disease prevalence from imperfect tests performed on non-representative samples.
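The sensitivity of a rare-disease prevalence estimate to test specificity can be seen with the classical point correction, prevalence = (positive rate + specificity - 1) / (sensitivity + specificity - 1). The sketch below is that simple correction only, not the hierarchical Bayesian or post-stratification models described above. The positive rate, sensitivity and specificity values are hypothetical.

```python
def adjusted_prevalence(positive_rate, sensitivity, specificity):
    """Correct a raw positive rate for imperfect sensitivity and specificity.

    The result is clamped to [0, 1] because the raw formula can fall
    outside that range when the positive rate is close to the false
    positive rate.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("need sensitivity + specificity > 1")
    estimate = (positive_rate + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    raw_rate = 0.015  # hypothetical share of positive tests in the sample
    for specificity in (0.999, 0.995, 0.990, 0.980):
        estimate = adjusted_prevalence(raw_rate, sensitivity=0.80,
                                       specificity=specificity)
        print(f"specificity={specificity:.3f} -> prevalence={estimate:.4f}")
```

When the raw positive rate is of the same order as the false positive rate, small changes in the assumed specificity move the estimate substantially, which is why propagating that uncertainty matters.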
ABSTRACT
Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of methods were used, and the results varied, with the highest achieved F1 score being 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
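For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives and false negatives. The counts are hypothetical, and the task's own rules for matching gene mentions are not reproduced here.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return the harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, which also avoids
    division by zero.
    """
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for a single system's predicted gene mentions.
    print(f1_score(true_positives=850, false_positives=120, false_negatives=130))
```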