ABSTRACT
Epistemic uncertainty in model-based coronavirus disease (COVID-19) predictions from complex, noisy data greatly affects the accuracy of pandemic trend and state estimates. Quantifying the uncertainty in COVID-19 trends caused by unobserved hidden variables is needed to evaluate the accuracy of predictions from complex compartmental epidemiological models. A new approach for estimating the measurement noise covariance from real COVID-19 pandemic data is presented. It is based on the marginal likelihood (Bayesian evidence) for Bayesian model selection of the stochastic part of the extended Kalman filter (EKF), with a sixth-order nonlinear epidemic model known as the SEIQRD (Susceptible-Exposed-Infected-Quarantined-Recovered-Dead) compartmental model. This study presents a method for testing the noise covariance under dependence or independence between the infected and death errors, to better understand their impact on the predictive accuracy and reliability of EKF statistical models. The proposed approach reduces the error in the quantity of interest compared with arbitrarily chosen values in the EKF estimation.
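As a rough illustration of choosing a measurement noise variance by marginal likelihood, the sketch below scores a few candidate values with the prediction-error decomposition of a Kalman filter. It rests on several assumptions that are not from the study: a scalar, linear random-walk filter stands in for the EKF and the SEIQRD model, the data are simulated, the candidate grid is arbitrary, and the names `kalman_log_evidence` and `simulate` are hypothetical.

```python
import math
import random


def kalman_log_evidence(ys, q, r, x0=0.0, p0=1.0):
    """Log marginal likelihood of ys for a scalar random-walk Kalman filter.

    q is the process noise variance, r the measurement noise variance.
    Assumed toy model: x_t = x_{t-1} + w_t, y_t = x_t + v_t.
    """
    x, p = x0, p0
    log_ev = 0.0
    for y in ys:
        p = p + q                      # predict step (state mean is unchanged)
        s = p + r                      # innovation variance
        e = y - x                      # innovation (one-step prediction error)
        log_ev += -0.5 * (math.log(2.0 * math.pi * s) + e * e / s)
        k = p / s                      # Kalman gain
        x = x + k * e                  # update state estimate
        p = (1.0 - k) * p              # update state variance
    return log_ev


def simulate(n, q, r, seed=1):
    """Simulate noisy observations of a random walk (synthetic data only)."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, math.sqrt(q))
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
    return ys


ys = simulate(2000, q=0.01, r=1.0)
candidates = [0.01, 0.1, 1.0, 10.0]    # hypothetical grid of noise variances
scores = {r: kalman_log_evidence(ys, q=0.01, r=r) for r in candidates}
best_r = max(scores, key=scores.get)   # candidate with the highest evidence
print(best_r)
```

Picking the best value on a fixed grid is a simplification. A full Bayesian model selection would also weigh prior probabilities of the candidate noise models and, for a multivariate filter, compare diagonal against full covariance matrices.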
Subject(s)
COVID-19 , Pandemics , Humans , Saudi Arabia/epidemiology , Bayes Theorem , Reproducibility of Results , COVID-19/epidemiology
ABSTRACT
Nested sampling is an efficient method for calculating Bayesian evidence in data analysis and partition functions of potential energies. It is based on an exploration using a dynamic set of sampling points that evolves toward higher values of the sampled function. When several maxima are present, this exploration can be very difficult. Different codes implement different strategies. Local maxima are generally treated separately by applying cluster recognition, based on machine learning methods, to the sampling points. We present here the development and implementation of different search and clustering methods in the nested_fit code. Slice sampling and the uniform search method are added to the random walk already implemented. Three new cluster recognition methods are also developed. The efficiency of the different strategies, in terms of accuracy and number of likelihood calls, is compared on a series of benchmark tests, including model comparison and a harmonic energy potential. Slice sampling proves to be the most stable and accurate search strategy. The different clustering methods give similar results but with very different computing times and scaling. Different choices of the stopping criterion of the algorithm, another critical issue in nested sampling, are also investigated with the harmonic energy potential.
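The basic loop that such codes build on can be sketched in a few lines. The example below is a minimal illustration only and is not the nested_fit implementation. It assumes a one-dimensional standard normal likelihood, a uniform prior on [-5, 5], and a fixed number of iterations as the stopping rule. New points are found by plain rejection sampling from the prior, which is a simple stand-in for the search strategies discussed above. All function names are hypothetical.

```python
import math
import random


def log_likelihood(x):
    """Toy likelihood: a standard normal density in one dimension."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def nested_sampling(n_live=100, n_iter=600, lower=-5.0, upper=5.0, seed=0):
    """Estimate the log evidence under a uniform prior on [lower, upper]."""
    rng = random.Random(seed)
    live = [rng.uniform(lower, upper) for _ in range(n_live)]
    live_logl = [log_likelihood(x) for x in live]
    log_z = -math.inf
    log_x_prev = 0.0                   # log of remaining prior volume, X_0 = 1
    for i in range(1, n_iter + 1):
        # Discard the live point with the lowest likelihood.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        # Expected prior volume after i shrinkage steps: X_i = exp(-i / n_live).
        log_x = -i / n_live
        log_w = log_x_prev + math.log1p(-math.exp(log_x - log_x_prev))
        log_z = log_add(log_z, logl_min + log_w)
        log_x_prev = log_x
        # Draw a replacement from the prior, constrained to L > L_min.
        while True:
            x_new = rng.uniform(lower, upper)
            logl_new = log_likelihood(x_new)
            if logl_new > logl_min:
                break
        live[worst] = x_new
        live_logl[worst] = logl_new
    # Add the contribution of the remaining live points.
    log_mean_l = -math.inf
    for logl in live_logl:
        log_mean_l = log_add(log_mean_l, logl)
    log_mean_l -= math.log(n_live)
    return log_add(log_z, log_mean_l + log_x_prev)


log_z = nested_sampling()
print(log_z, math.log(0.1))            # estimate and analytic value, for comparison
```

For this toy problem the evidence is very close to 1/10, because the prior density is 1/10 and almost all of the likelihood mass lies inside the prior range. Rejection sampling becomes very inefficient as the constrained region shrinks, which is why random walk, slice sampling, and similar searches are used in practice.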
ABSTRACT
Nested sampling is an efficient algorithm for calculating the Bayesian evidence and posterior parameter probability distributions. It is based on the step-by-step exploration of the parameter space by Monte Carlo sampling with a set of parameter values, called live points, that evolve towards the region of interest, i.e., where the likelihood function is maximal. In the presence of several local likelihood maxima, the algorithm converges with difficulty. Systematic errors can also be introduced by unexplored regions of the parameter volume. To avoid this, different methods have been proposed in the literature for an efficient search for new live points, even in the presence of local maxima. Here we present a new solution based on the mean shift cluster recognition method implemented in a random walk search algorithm. The cluster recognition is integrated within the Bayesian analysis program NestedFit. It is tested on the analysis of some difficult cases. Compared to the analysis results without cluster recognition, the computation time is considerably reduced. At the same time, the entire parameter space is efficiently explored, which translates into a smaller uncertainty in the extracted value of the Bayesian evidence.
ABSTRACT
Steady-state vowels are vowels that are uttered with a momentarily fixed vocal tract configuration and with steady vibration of the vocal folds. In this steady state, the vowel waveform appears as a quasi-periodic string of elementary units called pitch periods. Humans perceive this quasi-periodic regularity as a definite pitch. Likewise, so-called pitch-synchronous methods exploit this regularity by using the duration of the pitch periods as a natural time scale for their analysis. In this work, we present a simple pitch-synchronous method using a Bayesian approach for estimating formants. It slightly generalizes the basic approach of modeling the pitch periods as a superposition of decaying sinusoids, one for each vowel formant, by explicitly taking into account the additional low-frequency content in the waveform, which arises not from formants but from the glottal pulse. We model this low-frequency content in the time domain as a polynomial trend function that is added to the decaying sinusoids. The problem then reduces to a rather familiar one in macroeconomics: estimate the cycles (our decaying sinusoids) independently from the trend (our polynomial trend function); in other words, detrend the waveforms of steady-state vowels. We show how to do this efficiently.
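The model structure described above, a polynomial trend plus one decaying sinusoid per formant, can be sketched as a linear least-squares problem once the decay rates and frequencies are held fixed. This is a simplified illustration and not the Bayesian method of the paper. It assumes time normalized to one pitch period, known (decay, frequency) pairs, noise-free synthetic data, and hypothetical function names.

```python
import math


def design_row(x, poly_order, formants):
    """Basis functions at normalized time x: polynomial terms, then a
    decaying cosine and sine for each (decay, frequency) pair."""
    row = [x ** k for k in range(poly_order + 1)]
    for decay, freq in formants:
        env = math.exp(-decay * x)
        row.append(env * math.cos(2.0 * math.pi * freq * x))
        row.append(env * math.sin(2.0 * math.pi * freq * x))
    return row


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_linear(xs, ys, poly_order, formants):
    """Least-squares amplitudes via the normal equations."""
    rows = [design_row(x, poly_order, formants) for x in xs]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(ata, aty)


# Synthetic pitch period: quadratic trend plus two decaying sinusoids.
formants = [(3.0, 4.0), (5.0, 12.0)]   # assumed (decay, cycles per period)
true_coef = [0.2, -0.5, 0.3, 1.0, 0.4, 0.5, -0.2]
xs = [i / 200 for i in range(200)]
ys = [sum(c * b for c, b in zip(true_coef, design_row(x, 2, formants)))
      for x in xs]

coef = fit_linear(xs, ys, 2, formants)
trend = [sum(coef[k] * x ** k for k in range(3)) for x in xs]
detrended = [y - t for y, t in zip(ys, trend)]   # cycles without the trend
```

Because the trend and the sinusoid amplitudes are estimated jointly, the low-frequency content is not attributed to the formant terms. In a real analysis the decay rates and frequencies are unknown and would have to be inferred as well.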
ABSTRACT
Wastewater monitoring is an efficient and effective way to surveil for various pathogens in communities. This is especially beneficial in areas of high transmission, such as preK-12 schools, where infections may otherwise go unreported. In this work, we apply wastewater disease surveillance using school and community wastewater from across Houston, Texas to monitor three major enteric viruses: astrovirus, sapovirus genogroup GI, and group A rotavirus. We present the results of a 10-week study that included the analysis of 164 wastewater samples for astrovirus, rotavirus, and sapovirus in 10 preK-12 schools, 6 wastewater treatment plants, and 2 lift stations using newly designed RT-ddPCR assays. We show that the RT-ddPCR assays were able to detect astrovirus, rotavirus, and sapovirus in school, lift station, and wastewater treatment plant (WWTP) wastewater, and that a positive detection of a virus in a school sample was paired with a positive detection of the same virus at a downstream lift station or wastewater treatment plant over 97% of the time. Additionally, we show how wastewater detections of rotavirus in schools and WWTPs were significantly associated with citywide viral intestinal infections. School wastewater can play a role in the monitoring of enteric viruses and in the detection of outbreaks, potentially allowing public health officials to quickly implement mitigation strategies to prevent viral spread into surrounding communities.
Subject(s)
Rotavirus , Sapovirus , Schools , Wastewater , Wastewater/virology , Sapovirus/isolation & purification , Rotavirus/isolation & purification , Texas , Environmental Monitoring/methods , Humans , Mamastrovirus/isolation & purification
ABSTRACT
We analyze single molecule FRET burst measurements using Bayesian nested sampling. The MultiNest algorithm produces accurate FRET efficiency distributions from single-molecule data. FRET efficiency distributions recovered by MultiNest and classic maximum entropy are compared for simulated data and for calmodulin labeled at residues 44 and 117. MultiNest compares favorably with maximum entropy analysis for simulated data, judged by the Bayesian evidence. FRET efficiency distributions recovered for calmodulin labeled with two different FRET dye pairs depended on the dye pair and changed upon Ca2+ binding. We also looked at the FRET efficiency distributions of calmodulin bound to the calcium/calmodulin dependent protein kinase II (CaMKII) binding domain. For both dye pairs, the FRET efficiency distribution collapsed to a single peak in the case of calmodulin bound to the CaMKII peptide. These measurements strongly suggest that consideration of dye-protein interactions is crucial in forming an accurate picture of protein conformations from FRET data.
ABSTRACT
Motivation: Integrative structural modeling combines data from experiments, physical principles, statistics of previous structures, and prior models to obtain structures of macromolecular assemblies that are challenging to characterize experimentally. The choice of model representation is a key decision in integrative modeling, as it dictates the accuracy of scoring, efficiency of sampling, and resolution of analysis. But currently, the choice is usually made ad hoc, manually. Results: Here, we report NestOR (Nested Sampling for Optimizing Representation), a fully automated, statistically rigorous method based on Bayesian model selection to identify the optimal coarse-grained representation for a given integrative modeling setup. Given an integrative modeling setup, it determines the optimal representations from given candidate representations based on their model evidence and sampling efficiency. The performance of NestOR was evaluated on a benchmark of four macromolecular assemblies. Availability: NestOR is implemented in the Integrative Modeling Platform (https://integrativemodeling.org) and is available at https://github.com/isblab/nestor.
ABSTRACT
The variation of eleven Cymodocea nodosa metrics was studied along two anthropogenic gradients in the North Aegean Sea, in two separate periods (July 2004 and July 2013). The aim was to make existing monitoring programs more specific to different kinds of human-induced or natural stress, for better decision-making support. Key water variables (N-NO2, N-NO3, N-NH4, P-PO4, Chl-a, attenuation coefficient-K, and suspended solids) along with the stress index MALUSI were also estimated in each sampling effort. All metrics (except one) showed significant differences (p<0.05) and highest variation at the meadow scale in both sampling periods. Body-size metrics (e.g., CymoSkew, total and maximum leaf length, and leaf area (cm2/shoot)), rather than abundance metrics (e.g., shoot density (shoots/m2) and leaf area index (m2/m2)), were related to the anthropogenic eutrophication variables represented by N-NH4, N-NO3, N/P, and MALUSI. The temporal analysis was restricted to the two meadows and the water variables that were common to the two periods. PERMANOVA and PCA of common meadows and metrics within nine years showed significant but not consistent differences. While the most impacted studied site of Viamyl remained unchanged, a significant improvement of water quality was observed in the second most impacted meadow of Nea Karvali, which however was reduced to half of its previous area. On the one hand, this was the result of combined management practices in nearby aquacultures and lower industrial activities due to the economic crisis. On the other hand, dredging and excess siltation from changes in land catchments and construction of permanent structures may decrease seagrass abundance.
Subject(s)
Alismatales/physiology , Eutrophication , Alismatales/anatomy & histology , Chlorophyll/analysis , Chlorophyll A , Environmental Monitoring/methods , Mediterranean Region , Nitrates/analysis , Phosphates/analysis , Plant Leaves/physiology , Plant Shoots/physiology , Seawater/analysis , Seawater/chemistry
ABSTRACT
BACKGROUND: Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. RESULTS: We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. CONCLUSIONS: BCM represents an efficient one-stop shop for computational modelers wishing to use sampler-based Bayesian statistics.
Subject(s)
Computational Biology/methods , Computer Simulation , Software , Algorithms , Bayes Theorem , Kinetics , Models, Biological
ABSTRACT
Weeds tend to aggregate in patches within fields, and there is evidence that this is partly owing to variation in soil properties. Because the processes driving soil heterogeneity operate at various scales, the strength of the relations between soil properties and weed density would also be expected to be scale-dependent. Quantifying these effects of scale on weed patch dynamics is essential to guide the design of discrete sampling protocols for mapping weed distribution. We developed a general method that uses novel within-field nested sampling and residual maximum-likelihood (REML) estimation to explore scale-dependent relations between weeds and soil properties. We validated the method using a case study of Alopecurus myosuroides in winter wheat. Using REML, we partitioned the variance and covariance into scale-specific components and estimated the correlations between the weed counts and soil properties at each scale. We used variograms to quantify the spatial structure in the data and to map variables by kriging. Our methodology successfully captured the effect of scale on a number of edaphic drivers of weed patchiness. The overall Pearson correlations between A. myosuroides and soil organic matter and clay content were weak and masked the stronger correlations at >50 m. Knowing how the variance was partitioned across the spatial scales, we optimised the sampling design to focus sampling effort at those scales that contributed most to the total variance. The methods have the potential to guide patch spraying of weeds by identifying areas of the field that are vulnerable to weed establishment.