Results 1 - 20 of 403
1.
Proc Natl Acad Sci U S A ; 120(18): e2218197120, 2023 May 02.
Article in English | MEDLINE | ID: mdl-37094150

ABSTRACT

System identification learns mathematical models of dynamic systems starting from input-output data. Despite its long history, this research area is still extremely active. New challenges are posed by identification of complex physical processes given by the interconnection of dynamic systems. Examples arise in biology and industry, e.g., in the study of brain dynamics or sensor networks. In recent years, regularized kernel-based identification, with inspiration from machine learning, has emerged as an interesting alternative to the classical approach commonly adopted in the literature. In the linear setting, it uses the class of stable kernels to include fundamental features of physical dynamical systems, e.g., smooth exponential decay of impulse responses. This class also includes unknown continuous parameters, called hyperparameters, which play a role similar to that of the discrete model order in controlling complexity. In this paper, we develop a linear system identification procedure by casting stable kernels in a full Bayesian framework. Our models incorporate hyperparameter uncertainty and consist of a mixture of dynamic systems over a continuous spectrum of dimensions. They are obtained by overcoming drawbacks related to classical Markov chain Monte Carlo schemes that, when applied to stable kernels, are proved to become nearly reducible (i.e., unable to reconstruct posteriors of interest in reasonable time). Numerical experiments show that full Bayes frequently outperforms the state-of-the-art results on typical benchmark problems. Two real applications related to brain dynamics (neural activity) and sensor networks are also included.
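
To make the idea of a stable kernel concrete, the following is a minimal sketch of regularized impulse-response estimation with fixed hyperparameters. It assumes the TC (tuned/correlated) kernel, a common stable kernel that the abstract does not name, and a finite impulse response of length n. The function names and the values of lam, alpha, and sigma2 are hypothetical, and the sketch does not implement the full Bayesian treatment of the hyperparameters that the paper develops.

```python
import numpy as np

def tc_kernel(n, lam, alpha):
    """TC stable kernel (assumed here): K[i, j] = lam * alpha**max(i, j), 1-based indices."""
    idx = np.arange(1, n + 1)
    return lam * alpha ** np.maximum.outer(idx, idx)

def regularized_fir(u, y, n, lam, alpha, sigma2):
    """Posterior mean of the impulse response g under the prior g ~ N(0, K)."""
    N = len(y)
    # Regression matrix: Phi[t, k] = u[t - k], with zeros before the record starts.
    Phi = np.zeros((N, n))
    for k in range(n):
        Phi[k:, k] = u[: N - k]
    K = tc_kernel(n, lam, alpha)
    # g_hat = K Phi^T (Phi K Phi^T + sigma2 I)^{-1} y
    S = Phi @ K @ Phi.T + sigma2 * np.eye(N)
    return K @ Phi.T @ np.linalg.solve(S, y)

# Hypothetical usage on simulated data with an exponentially decaying response.
rng = np.random.default_rng(0)
g_true = 0.8 ** np.arange(30)
u = rng.standard_normal(200)
y = np.convolve(u, g_true)[:200] + 0.1 * rng.standard_normal(200)
g_hat = regularized_fir(u, y, n=30, lam=1.0, alpha=0.8, sigma2=0.01)
```

Here alpha controls how fast the prior variance of the impulse response decays, which is the sense in which continuous hyperparameters take over the role of a discrete model order.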

2.
J Allergy Clin Immunol ; 154(2): 308-315, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38494094

ABSTRACT

BACKGROUND: Single nucleotide polymorphisms (SNPs) in genes on chromosome 17q12-q21 are associated with childhood-onset asthma and rhinovirus-induced wheeze. There are few mechanistic data linking chromosome 17q12-q21 to wheezing illness. OBJECTIVE: We investigated whether 17q12-q21 risk alleles were associated with impaired interferon responses to rhinovirus. METHODS: In a population-based birth cohort of European ancestry, we stimulated peripheral blood mononuclear cells with rhinovirus A1 (RV-A1) and rhinovirus A16 (RV-A16) and measured IFN and IFN-induced C-X-C motif chemokine ligand 10 (CXCL10, also known as IP10) responses in supernatants. We investigated associations between virus-induced cytokines and 6 SNPs in 17q12-q21. Bayesian profile regression was applied to identify clusters of individuals with different immune response profiles and genetic variants. RESULTS: Five SNPs (in high linkage disequilibrium, r² ≥ 0.8) were significantly associated with RV-A1-induced IFN-β (rs9303277, P = .010; rs11557467, P = .012; rs2290400, P = .006; rs7216389, P = .008; rs8079416, P = .005). A reduction in RV-A1-induced IFN-β was observed among individuals with asthma risk alleles. There were no significant associations for RV-A1-induced IFN-α or CXCL10, or for any RV-A16-induced IFN/CXCL10. Bayesian profile regression analysis identified 3 clusters that differed in IFN-β induction to RV-A1 (low, medium, high). The typical genetic profile of the cluster associated with low RV-A1-induced IFN-β responses was characterized by a very high probability of being homozygous for the asthma risk allele for all SNPs. Children with persistent wheeze were almost 3 times more likely to be in clusters with reduced/average RV-A1-induced IFN-β responses than in the high immune response cluster. CONCLUSIONS: Polymorphisms on chromosome 17q12-q21 are associated with rhinovirus-induced IFN-β, suggesting that a novel mechanism, impaired IFN-β induction, links 17q12-q21 risk alleles with asthma/wheeze.


Subject(s)
Chromosomes, Human, Pair 17 , Polymorphism, Single Nucleotide , Rhinovirus , Humans , Chromosomes, Human, Pair 17/genetics , Male , Female , Asthma/genetics , Asthma/immunology , Interferons , Child , Respiratory Sounds/genetics , Respiratory Sounds/immunology , Picornaviridae Infections/immunology , Picornaviridae Infections/genetics , Genetic Predisposition to Disease , Chemokine CXCL10/genetics , Leukocytes, Mononuclear/immunology , Child, Preschool
3.
J Neurophysiol ; 132(2): 347-361, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-38919148

ABSTRACT

Recent work has shown the fundamental role that cognitive strategies play in visuomotor adaptation. Although algorithmic strategies, such as mental rotation, are flexible and generalizable, they are computationally demanding. To avoid this computational cost, people can instead rely on memory retrieval of previously successful visuomotor solutions. However, such a strategy is likely subject to stimulus-response associations and relies heavily on working memory. In a series of five experiments, we sought to estimate the constraints in terms of capacity and precision of working memory retrieval for visuomotor adaptation. This was accomplished by leveraging different variations of visuomotor item-recognition and visuomotor rotation tasks where we associated unique rotations with specific targets in the workspace and manipulated the set size (i.e., number of rotation-target associations). Notably, from experiment 1 to 4, we found key signatures of working memory retrieval and not mental rotation. In particular, participants were less accurate and slower for larger set sizes and less recent items. Using a Bayesian latent-mixture model, we found that such decrease in performance was the result of increasing guessing behavior and less precise memories. In addition, we estimated that participants' working memory capacity was limited to two to five items, after which guessing increasingly dominated performance. Finally, in experiment 5, we showed how the constraints observed across experiments 1 to 4 can be overcome when relying on long-term memory retrieval. Our results point to the opportunity of studying other sources of memories where visuomotor solutions can be stored (e.g., episodic memories) to achieve successful adaptation.
NEW & NOTEWORTHY: We show that humans can adapt to feedback perturbations in different variations of the visuomotor rotation task by retrieving the successful solutions from working memory. In addition, using a Bayesian latent-mixture model, we reveal that guessing and low-precision memories are both responsible for the decrease in participants' performance as the number of solutions to memorize increases. These constraints can be overcome by relying on long-term memory retrieval resulting from extended practice with the visuomotor solutions.


Subject(s)
Memory, Short-Term , Mental Recall , Psychomotor Performance , Humans , Memory, Short-Term/physiology , Psychomotor Performance/physiology , Male , Female , Adult , Mental Recall/physiology , Young Adult , Bayes Theorem , Adaptation, Physiological/physiology , Rotation , Visual Perception/physiology
4.
Am J Epidemiol ; 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38872348

ABSTRACT

Cardiovascular disease is a leading cause of death worldwide. There is limited evidence that exposure to current-use pesticides may contribute to cardiovascular disease risk. We examined the association between residential proximity to the application of agricultural pesticides and cardiovascular risk factors among 484 adult women in the Center for the Health Assessment of Mothers and Children of Salinas (CHAMACOS) study, a cohort based in an agricultural region of California. Outcome assessment was completed between 2010 and 2013. Using participant residential addresses and California's Pesticide Use Reporting database, we estimated agricultural pesticide use within 1 km of residences during the 2-year period preceding outcome assessment. We used Bayesian hierarchical modeling to evaluate associations between exposure to 14 agricultural pesticides and continuous measures of waist circumference, body mass index, and blood pressure. Each 10-fold increase in paraquat application around homes was associated with increased diastolic blood pressure (β = 2.60 mm Hg; 95% credible interval [CrI]: 0.27-4.89) and each 10-fold increase in glyphosate application was associated with increased pulse pressure (β = 2.26 mm Hg; 95% CrI: 0.09-4.41). No meaningful associations were observed for the other pesticides examined. Our results suggest that paraquat and glyphosate pesticides may affect cardiovascular disease development in women with chronic environmental exposure.
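
A coefficient reported "per 10-fold increase" can be rescaled to other exposure contrasts. The short sketch below assumes the exposure enters the model linearly on the log10 scale, which the phrasing implies but the abstract does not state. The function name is hypothetical, and the only number taken from the abstract is the paraquat coefficient of 2.60 mm Hg.

```python
import math

def expected_change(beta_per_tenfold, fold_increase):
    """Outcome change when exposure is multiplied by fold_increase,
    assuming a linear effect of log10(exposure)."""
    return beta_per_tenfold * math.log10(fold_increase)

beta_paraquat_dbp = 2.60  # mm Hg per 10-fold increase, from the abstract
doubling = expected_change(beta_paraquat_dbp, 2)     # effect of doubling exposure
tenfold = expected_change(beta_paraquat_dbp, 10)     # recovers the reported value
```

Under that assumption, doubling paraquat application corresponds to roughly 0.78 mm Hg higher diastolic blood pressure, about 30% of the 10-fold estimate.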

5.
Antimicrob Agents Chemother ; 68(9): e0086324, 2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39136464

ABSTRACT

The rise of multidrug-resistant malaria requires accelerated development of novel antimalarial drugs. Pharmacokinetic-pharmacodynamic (PK-PD) models relate blood antimalarial drug concentrations with the parasite-time profile to inform dosing regimens. We performed a simulation study to assess the utility of a Bayesian hierarchical mechanistic PK-PD model for predicting parasite-time profiles for a Phase 2 study of a new antimalarial drug, cipargamin. We simulated cipargamin concentration- and malaria parasite-profiles based on a Phase 2 study of eight volunteers who received cipargamin 7 days after inoculation with malaria parasites. The cipargamin profiles were generated from a two-compartment PK model and parasite profiles from a previously published biologically informed PD model. One thousand PK-PD data sets of eight patients were simulated, following the sampling intervals of the Phase 2 study. The mechanistic PK-PD model was incorporated in a Bayesian hierarchical framework, and the parameters were estimated. Population PK model parameters describing absorption, distribution, and clearance were estimated with minimal bias (mean relative bias ranged from 1.7% to 8.4%). The PD model was fitted to the parasitaemia profiles in each simulated data set using the estimated PK parameters. Posterior predictive checks demonstrate that our PK-PD model adequately captures the simulated PD profiles. The bias of the estimated population average PD parameters was low-moderate in magnitude. This simulation study demonstrates the viability of our PK-PD model to predict parasitological outcomes in Phase 2 volunteer infection studies. This work will inform the dose-effect relationship of cipargamin, guiding decisions on dosing regimens to be evaluated in Phase 3 trials.
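
As an illustration of the structural PK component only, here is a minimal sketch of a two-compartment model with first-order absorption, integrated with a hand-written fourth-order Runge-Kutta step. Oral dosing, the function name, and all parameter values are assumptions made for the example and are not cipargamin estimates. The Bayesian hierarchical layer and the PD model are not shown.

```python
def two_compartment_oral(dose, ka, cl, vc, q, vp, t_end, dt=0.01):
    """Return (times, central concentrations) for a single oral dose.

    ka: absorption rate (1/h); cl: clearance (L/h); vc, vp: central and
    peripheral volumes (L); q: inter-compartmental clearance (L/h).
    """
    def deriv(state):
        gut, central, periph = state
        d_gut = -ka * gut
        d_central = (ka * gut - (cl / vc) * central
                     - (q / vc) * central + (q / vp) * periph)
        d_periph = (q / vc) * central - (q / vp) * periph
        return (d_gut, d_central, d_periph)

    state = (dose, 0.0, 0.0)  # all drug starts in the absorption compartment
    times, conc = [0.0], [0.0]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        # Classical RK4 update of the three drug amounts.
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(
            s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        times.append(i * dt)
        conc.append(state[1] / vc)  # concentration = central amount / volume
    return times, conc

# Hypothetical parameter values, chosen only to produce a plausible curve.
times, conc = two_compartment_oral(
    dose=10.0, ka=1.0, cl=5.0, vc=50.0, q=10.0, vp=100.0, t_end=48.0
)
```

In a simulation study like the one described, a curve of this kind would be generated per subject with subject-specific parameters and sampled at the study's observation times.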


Subject(s)
Antimalarials , Bayes Theorem , Antimalarials/pharmacokinetics , Antimalarials/therapeutic use , Antimalarials/pharmacology , Humans , Malaria, Falciparum/drug therapy , Malaria, Falciparum/parasitology , Plasmodium falciparum/drug effects , Adult , Parasitemia/drug therapy , Parasitemia/parasitology , Malaria/drug therapy , Male , Computer Simulation , Female
6.
Biostatistics ; 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37805937

ABSTRACT

In recent years, the field of neuroimaging has undergone a paradigm shift, moving away from the traditional brain mapping approach towards the development of integrated, multivariate brain models that can predict categories of mental events. However, large interindividual differences in both brain anatomy and functional localization after standard anatomical alignment remain a major limitation in performing this type of analysis, as it leads to feature misalignment across subjects in subsequent predictive models. This article addresses this problem by developing and validating a new computational technique for reducing misalignment across individuals in functional brain systems by spatially transforming each subject's functional data to a common latent template map. Our proposed Bayesian functional group-wise registration approach allows us to assess differences in brain function across subjects and individual differences in activation topology. We achieve the probabilistic registration with inverse-consistency by utilizing the generalized Bayes framework with a loss function for the symmetric group-wise registration. It models the latent template with a Gaussian process, which helps capture spatial features in the template, producing a more precise estimation. We evaluate the method in simulation studies and apply it to data from an fMRI study of thermal pain, with the goal of using functional brain activity to predict physical pain. We find that the proposed approach allows for improved prediction of reported pain scores over conventional approaches. Received on 2 January 2017. Editorial decision on 8 June 2021.

7.
Malar J ; 23(1): 102, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38594716

ABSTRACT

BACKGROUND: Ghana is among the top 10 highest malaria burden countries, with about 20,000 children dying annually, 25% of whom were under five years. This study aimed to produce interactive web-based disease spatial maps and identify the high-burden malaria districts in Ghana. METHODS: The study used 2016-2021 data extracted from the routine health service nationally representative and comprehensive District Health Information Management System II (DHIMS2) implemented by the Ghana Health Service. Bayesian geospatial modelling and interactive web-based spatial disease mapping methods were employed to quantify spatial variations and clustering in malaria risk across 260 districts. For each district, the study simultaneously mapped the observed malaria counts, district name, standardized incidence rate, and predicted relative risk and their associated standard errors using interactive web-based visualization methods. RESULTS: A total of 32,659,240 malaria cases were reported among children < 5 years from 2016 to 2021. For every 10% increase in the number of children, malaria risk increased by 0.039 (log-mean 0.95, 95% credible interval = -13.82 to 15.73) and for every 10% increase in the number of males, malaria risk decreased by 0.075, albeit not statistically significant (log-mean -1.82, 95% credible interval = -16.59 to 12.95). The study found substantial spatial and temporal differences in malaria risk across the 260 districts. The predicted national relative risk was 1.25 (95% credible interval = 1.23, 1.27). The malaria risk is relatively the same over the entire year. However, a slightly higher relative risk was recorded in 2019, while in 2021 residing in Keta, Abuakwa South, Jomoro, Ahafo Ano South East, Tain, Nanumba North, and Tatale Sanguli districts was associated with the highest malaria risk, ranging from a relative risk of 3.00 to 4.83. The district-level spatial patterns of malaria risks changed over time. 
CONCLUSION: This study identified high malaria risk districts in Ghana where urgent and targeted control efforts are required. Noticeable changes were also observed in malaria risk for certain districts over some periods in the study. The findings provide an effective, actionable tool to arm policymakers and programme managers in their efforts to reduce malaria risk and its associated morbidity and mortality in line with the Sustainable Development Goals (SDG) 3.2 for limited public health resource settings, where universal intervention across all districts is practically impossible.


Subject(s)
Malaria , Male , Child , Humans , Ghana/epidemiology , Bayes Theorem , Malaria/epidemiology , Health Services , Risk
8.
Stat Med ; 43(3): 501-513, 2024 Feb 10.
Article in English | MEDLINE | ID: mdl-38038137

ABSTRACT

We propose a multi-metric flexible Bayesian framework to support efficient interim decision-making in multi-arm multi-stage phase II clinical trials. Multi-arm multi-stage phase II studies increase the efficiency of drug development, but early decisions regarding the futility or desirability of a given arm carry considerable risk since sample sizes are often low and follow-up periods may be short. Further, since intermediate outcomes based on biomarkers of treatment response are rarely perfect surrogates for the primary outcome and different trial stakeholders may have different levels of risk tolerance, a single hypothesis test is insufficient for comprehensively summarizing the state of the collected evidence. We present a Bayesian framework comprised of multiple metrics based on point estimates, uncertainty, and evidence towards desired thresholds (a Target Product Profile) for (1) ranking of arms and (2) comparison of each arm against an internal control. Using a large public-private partnership targeting novel TB arms as a motivating example, we find via simulation study that our multi-metric framework provides sufficient confidence for decision-making with sample sizes as low as 30 patients per arm, even when intermediate outcomes have only moderate correlation with the primary outcome. Our reframing of trial design and the decision-making procedure has been well-received by research partners and is a practical approach to more efficient assessment of novel therapeutics.
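
One kind of metric the abstract mentions, evidence towards a desired threshold, can be illustrated with a posterior tail probability. The sketch below assumes a binary response per patient and a Beta(1, 1) prior, neither of which comes from the abstract, and uses seeded Monte Carlo draws from the standard library. It is a generic Beta-binomial calculation, not the authors' framework, and the counts are hypothetical.

```python
import random

def prob_exceeds(successes, n, threshold, a=1.0, b=1.0, draws=20000, seed=1):
    """Monte Carlo estimate of P(response rate > threshold | data)
    under a Beta(a, b) prior and a binomial likelihood."""
    rng = random.Random(seed)
    post_a = a + successes          # conjugate Beta posterior parameters
    post_b = b + n - successes
    hits = sum(rng.betavariate(post_a, post_b) > threshold for _ in range(draws))
    return hits / draws

# Hypothetical arms with 30 patients each and a target response rate of 0.5.
strong_arm = prob_exceeds(successes=24, n=30, threshold=0.5)
weak_arm = prob_exceeds(successes=9, n=30, threshold=0.5)
```

Reporting such a probability next to a point estimate and an interval lets stakeholders with different risk tolerances apply their own cut-offs to the same output.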


Subject(s)
Research Design , Humans , Bayes Theorem , Sample Size , Uncertainty , Computer Simulation
9.
Stat Med ; 43(18): 3484-3502, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-38857904

ABSTRACT

The rise of cutting-edge precision cancer treatments has led to a growing significance of the optimal biological dose (OBD) in modern oncology trials. These trials now prioritize the consideration of both toxicity and efficacy simultaneously when determining the most desirable dosage for treatment. Traditional approaches in early-phase oncology trials have conventionally relied on the assumption of a monotone relationship between treatment efficacy and dosage. However, this assumption may not hold valid for novel oncology therapies. In reality, the dose-efficacy curve of such treatments may reach a plateau at a specific dose, posing challenges for conventional methods in accurately identifying the OBD. Furthermore, achieving reliable identification of the OBD is typically not possible based on a single small-sample trial. With data from multiple phase I and phase I/II trials, we propose a novel Bayesian random-effects dose-optimization meta-analysis (REDOMA) approach to identify the OBD by synthesizing toxicity and efficacy data from each trial. The REDOMA method can address trials with heterogeneous characteristics. We adopt a curve-free approach based on a Gamma process prior to model the average dose-toxicity relationship. In addition, we utilize a Bayesian model selection framework that uses the spike-and-slab prior as an automatic variable selection technique to eliminate monotonic constraints on the dose-efficacy curve. The good performance of the REDOMA method is confirmed by extensive simulation studies.


Subject(s)
Bayes Theorem , Dose-Response Relationship, Drug , Humans , Neoplasms/drug therapy , Meta-Analysis as Topic , Computer Simulation , Clinical Trials, Phase I as Topic/methods , Antineoplastic Agents/therapeutic use , Antineoplastic Agents/administration & dosage , Clinical Trials, Phase II as Topic/methods , Models, Statistical
10.
BMC Med Res Methodol ; 24(1): 195, 2024 Sep 07.
Article in English | MEDLINE | ID: mdl-39244581

ABSTRACT

The inability to correctly account for unmeasured confounding can lead to bias in parameter estimates, invalid uncertainty assessments, and erroneous conclusions. Sensitivity analysis is an approach to investigate the impact of unmeasured confounding in observational studies. However, the adoption of this approach has been slow given the lack of accessible software. An extensive review of available R packages to account for unmeasured confounding lists deterministic sensitivity analysis methods, but no R packages were listed for probabilistic sensitivity analysis. The R package unmconf is the first available package for probabilistic sensitivity analysis through a Bayesian unmeasured confounding model. The package allows for normal, binary, Poisson, or gamma responses, accounting for one or two unmeasured confounders from the normal or binomial distribution. The goal of unmconf is to provide a user-friendly package that performs Bayesian modeling in the presence of unmeasured confounders, with simple commands on the front end while performing more intensive computation on the back end. We investigate the applicability of this package through novel simulation studies. The results indicate that credible intervals will have near-nominal coverage probability and smaller bias when modeling the unmeasured confounder(s) for varying levels of internal/external validation data across various combinations of response-unmeasured confounder distributional families.


Subject(s)
Bayes Theorem , Confounding Factors, Epidemiologic , Software , Humans , Computer Simulation , Models, Statistical , Algorithms , Bias , Regression Analysis
11.
Clin Trials ; 21(3): 308-321, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38243401

ABSTRACT

In precision oncology, integrating multiple cancer patient subgroups into a single master protocol allows for the simultaneous assessment of treatment effects in these subgroups and promotes the sharing of information between them, ultimately reducing sample sizes and costs and enhancing scientific validity. However, the safety and efficacy of these therapies may vary across different subgroups, resulting in heterogeneous outcomes. Therefore, identifying subgroup-specific optimal doses in early-phase clinical trials is crucial for the development of future trials. In this article, we review various innovative Bayesian information-borrowing strategies that aim to determine and optimize subgroup-specific doses. Specifically, we discuss Bayesian hierarchical modeling, Bayesian clustering, Bayesian model averaging or selection, pairwise borrowing, and other relevant approaches. By employing these Bayesian information-borrowing methods, investigators can gain a better understanding of the intricate relationships between dose, toxicity, and efficacy in each subgroup. This increased understanding significantly improves the chances of identifying an optimal dose tailored to each specific subgroup. Furthermore, we present several practical recommendations to guide the design of future early-phase oncology trials involving multiple subgroups when using the Bayesian information-borrowing methods.


Subject(s)
Bayes Theorem , Neoplasms , Research Design , Humans , Neoplasms/drug therapy , Precision Medicine/methods , Models, Statistical , Dose-Response Relationship, Drug , Clinical Trials as Topic/methods
12.
Clin Trials ; 21(4): 430-439, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38243404

ABSTRACT

BACKGROUND: Knowing the predictive factors of the variation in a center-level continuous outcome of interest is valuable in the design and analysis of parallel-arm cluster randomized trials. The symbolic two-step method for sample size planning that we present incorporates this knowledge while simultaneously accounting for patient-level characteristics. Our approach is illustrated through application to cluster randomized trials in cancer care delivery research. The required number of centers (clusters) depends on the between- and within-center variance; the within-center variance is a function of estimates obtained by regressing the log within-center variance on predictive factors. Obtaining accurate estimates of the components needed to characterize the within-center variation is challenging. METHODS: Using our previously derived sample size formula, our objective in the current research is to directly account for the imprecision in these estimates, using a Bayesian approach, to safeguard against designing an underpowered study when using the symbolic two-step method. Using estimates of the required components, including the number of centers that contribute to those estimates, we make formal allowance for the imprecision in these estimates on which a sample size will be based. RESULTS: The mean of the distribution for power is consistently smaller than the single point estimate that the sample size formula yields. The reduction in power is more pronounced in the presence of increased uncertainty about the estimates with the reduction becoming more attenuated with increased numbers of centers that contribute to the estimates. CONCLUSIONS: Accounting for imprecision in the estimates of the components required for sample size estimation using the symbolic two-step method in the design of a cluster randomized trial yields conservative estimates of power.


Subject(s)
Bayes Theorem , Neoplasms , Randomized Controlled Trials as Topic , Research Design , Humans , Sample Size , Randomized Controlled Trials as Topic/methods , Neoplasms/therapy , Cluster Analysis , Delivery of Health Care , Multicenter Studies as Topic/methods
13.
Biom J ; 66(1): e2200324, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37776057

ABSTRACT

A common practice in clinical trials is to evaluate a treatment effect on an intermediate outcome when the true outcome of interest would be difficult or costly to measure. We consider how to validate intermediate outcomes in a causally-valid way when the trial outcomes are time-to-event. Using counterfactual outcomes, those that would be observed if the counterfactual treatment had been given, the causal association paradigm assesses the relationship of the treatment effect on the surrogate outcome with the treatment effect on the true, primary outcome. In particular, we propose illness-death models to accommodate the censored and semicompeting risk structure of survival data. The proposed causal version of these models involves estimable and counterfactual frailty terms. Via these multistate models, we characterize what a valid surrogate would look like using a causal effect predictiveness plot. We evaluate the estimation properties of a Bayesian method using Markov chain Monte Carlo and assess the sensitivity of our model assumptions. Our motivating data source is a localized prostate cancer clinical trial where the two survival outcomes are time to distant metastasis and time to death.


Subject(s)
Frailty , Models, Statistical , Humans , Bayes Theorem , Biomarkers
14.
Lifetime Data Anal ; 30(3): 600-623, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38806842

ABSTRACT

We consider measurement error models for two variables observed repeatedly and subject to measurement error. One variable is continuous, while the other variable is a mixture of continuous and zero measurements. This second variable has two sources of zeros. The first source is episodic zeros, wherein some of the measurements for an individual may be zero and others positive. The second source is hard zeros, i.e., some individuals will always report zero. An example is the consumption of alcohol from alcoholic beverages: some individuals consume alcoholic beverages episodically, while others never consume alcoholic beverages. However, with a small number of repeat measurements from individuals, it is not possible to determine those who are episodic zeros and those who are hard zeros. We develop a new measurement error model for this problem, and use Bayesian methods to fit it. Simulations and data analyses are used to illustrate our methods. Extensions to parametric models and survival analysis are discussed briefly.
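
The identifiability problem described here can be made concrete with Bayes' rule. The sketch below assumes that an episodic consumer reports a positive amount on any occasion with a fixed probability, independently across repeats. That is a deliberate simplification of the measurement error model in the paper, and all names and numbers are hypothetical.

```python
def prob_hard_zero_given_all_zeros(pi_hard, p_consume, m):
    """P(individual is a never-consumer | all m repeat measurements are zero).

    pi_hard: population share of never-consumers (hard zeros).
    p_consume: per-occasion probability that an episodic consumer reports > 0.
    """
    all_zero_if_episodic = (1.0 - p_consume) ** m
    return pi_hard / (pi_hard + (1.0 - pi_hard) * all_zero_if_episodic)

# Hypothetical values: 20% never-consumers, 30% chance of consumption per occasion.
few_repeats = prob_hard_zero_given_all_zeros(0.2, 0.3, m=2)
many_repeats = prob_hard_zero_given_all_zeros(0.2, 0.3, m=10)
```

With two repeats the posterior probability is only about 0.34, so most all-zero individuals would still be episodic consumers. It takes ten repeats to reach about 0.90, which is why a small number of measurements cannot separate the two sources of zeros.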


Subject(s)
Bayes Theorem , Models, Statistical , Humans , Computer Simulation , Survival Analysis , Alcohol Drinking , Data Interpretation, Statistical
15.
Genet Epidemiol ; 46(7): 446-462, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35753057

ABSTRACT

5-hydroxymethylcytosine (5hmC) is a methylation state linked with gene regulation, commonly found in cells of the central nervous system. 5hmC is associated with demethylation of cytosines from 5-methylcytosine (5mC) to the unmethylated state. The presence of 5hmC can be inferred by a paired experiment involving bisulfite and oxidation-bisulfite treatments on the same sample, followed by a methylation assay using a platform such as the Illumina Infinium MethylationEPIC BeadChip (EPIC). Existing methods for analysis of the resulting EPIC data are not ideal. Most approaches ignore the correlation between the two experiments and any imprecision associated with DNA damage from the additional treatment. Estimates of 5mC/5hmC levels free from these limitations are desirable to reveal associations between methylation states and phenotypes. We propose a hierarchical Bayesian method called Constrained HYdroxy Methylation Estimation (CHYME) to simultaneously estimate 5mC/5hmC signals as well as any associations between these signals and covariates or phenotypes, while accounting for the potential impact of DNA damage and dependencies induced by the experimental design. Simulations show that CHYME has valid type 1 error and better power than a range of alternative methods, including the popular OxyBS method and linear models on transformed proportions. Other methods we examined suffer from hugely inflated type 1 error for inference on 5hmC proportions. We use CHYME to explore genome-wide associations between 5mC/5hmC levels and cause of death in postmortem prefrontal cortex brain tissue samples. These analyses indicate that CHYME is a useful tool to reveal phenotypic associations with 5mC/5hmC levels.


Subject(s)
DNA Methylation , Models, Genetic , Bayes Theorem , Cytosine , DNA Methylation/genetics , Humans , Phenotype
16.
Mol Biol Evol ; 39(8), 2022 Aug 03.
Article in English | MEDLINE | ID: mdl-35733333

ABSTRACT

Single-cell sequencing provides a new way to explore the evolutionary history of cells. Compared to traditional bulk sequencing, where a population of heterogeneous cells is pooled to form a single observation, single-cell sequencing isolates and amplifies genetic material from individual cells, thereby preserving the information about the origin of the sequences. However, single-cell data are more error-prone than bulk sequencing data due to the limited genomic material available per cell. Here, we present error and mutation models for evolutionary inference of single-cell data within a mature and extensible Bayesian framework, BEAST2. Our framework enables integration with biologically informative models such as relaxed molecular clocks and population dynamic models. Our simulations show that modeling errors increase the accuracy of relative divergence times and substitution parameters. We reconstruct the phylogenetic history of a colorectal cancer patient and a healthy patient from single-cell DNA sequencing data. We find that the estimated times of terminal splitting events are shifted forward in time compared to models which ignore errors. We observed that not accounting for errors can overestimate the phylogenetic diversity in single-cell DNA sequencing data. We estimate that 30-50% of the apparent diversity can be attributed to error. Our work enables a full Bayesian approach capable of accounting for errors in the data within the integrative Bayesian software framework BEAST2.


Subject(s)
Neoplasms , Software , Bayes Theorem , Evolution, Molecular , Genomics , Humans , Models, Genetic , Phylogeny
17.
Biostatistics ; 24(1): 85-107, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-34363680

ABSTRACT

Risk prediction models are a crucial tool in healthcare. Risk prediction models with a binary outcome (i.e., binary classification models) are often constructed using methodology which assumes the costs of different classification errors are equal. In many healthcare applications, this assumption is not valid, and the differences between misclassification costs can be quite large. For instance, in a diagnostic setting, the cost of misdiagnosing a person with a life-threatening disease as healthy may be larger than the cost of misdiagnosing a healthy person as a patient. In this article, we present Tailored Bayes (TB), a novel Bayesian inference framework which "tailors" model fitting to optimize predictive performance with respect to unbalanced misclassification costs. We use simulation studies to showcase when TB is expected to outperform standard Bayesian methods in the context of logistic regression. We then apply TB to three real-world applications (a cardiac surgery task, a breast cancer prognostication task, and a breast cancer tumor classification task) and demonstrate the improvement in predictive performance over standard methods.
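
Unequal misclassification costs change the probability cut-off at which a positive prediction becomes the cheaper decision. The sketch below shows the standard decision-theoretic threshold: predict positive when p × cost_fn ≥ (1 − p) × cost_fp, that is, when p ≥ cost_fp / (cost_fp + cost_fn). This is background for the problem setting rather than the TB procedure itself, which tailors the model fitting step. The function names and cost values are hypothetical.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which predicting 'positive' has lower expected cost."""
    return cost_fp / (cost_fp + cost_fn)

def classify(prob_positive, cost_fp, cost_fn):
    """Return True (predict positive) when that minimizes expected cost."""
    return prob_positive >= cost_threshold(cost_fp, cost_fn)

# Hypothetical costs: a missed diagnosis is 9 times as costly as a false alarm.
threshold = cost_threshold(cost_fp=1.0, cost_fn=9.0)
flag_patient = classify(0.2, cost_fp=1.0, cost_fn=9.0)
```

With these costs a patient with a predicted risk of 0.2 is flagged, whereas the equal-cost threshold of 0.5 would not flag them.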


Subject(s)
Breast Neoplasms , Models, Statistical , Humans , Female , Bayes Theorem , Logistic Models , Computer Simulation , Breast Neoplasms/diagnosis
18.
Biometrics ; 79(3): 1986-1995, 2023 Sep.
Article in English | MEDLINE | ID: mdl-36250351

ABSTRACT

Performing causal inference in observational studies requires assuming that confounding variables are correctly adjusted for. In settings with few discrete-valued confounders, standard models can be employed. However, as the number of confounders increases, these models become less feasible because fewer observations are available for each unique combination of confounding variables. In this paper, we propose a new model for estimating treatment effects in observational studies that incorporates both parametric and nonparametric outcome models. By conceptually splitting the data, we can combine these models while maintaining a conjugate framework, allowing us to avoid the use of Markov chain Monte Carlo (MCMC) methods. Approximations using the central limit theorem and random sampling allow our method to be scaled to high-dimensional confounders. Through simulation studies, we show that our method can be competitive with benchmark models while maintaining efficient computation, and we illustrate the method on a large epidemiological health survey.


Subject(s)
Observational Studies as Topic , Causality , Computer Simulation , Markov Chains , Monte Carlo Method
19.
Biometrics ; 79(4): 3650-3663, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36745619

ABSTRACT

Understanding factors that contribute to the increased likelihood of pathogen transmission between two individuals is important for infection control. However, analyzing measures of pathogen relatedness to estimate these associations is complicated by correlation arising from the presence of the same individual across multiple dyadic outcomes, potential spatial correlation caused by unmeasured transmission dynamics, and the distinctive distributional characteristics of some of the outcomes. We develop two novel hierarchical Bayesian spatial methods for analyzing dyadic pathogen genetic relatedness data, in the form of patristic distances and transmission probabilities, that simultaneously address each of these complications. Using individual-level spatially correlated random effect parameters, we account for multiple sources of correlation between the outcomes as well as other important features of their distribution. Through simulation, we show the limitations of existing approaches in terms of estimating key associations of interest, and the ability of the new methodology to correct for these issues across datasets with different levels of correlation. All methods are applied to Mycobacterium tuberculosis data from the Republic of Moldova, where we identify previously unknown factors associated with disease transmission and, through analysis of the random effect parameters, key individuals and areas with increased transmission activity. Model comparisons show the importance of the new methodology in this setting. The methods are implemented in the R package GenePair.


Subject(s)
Mycobacterium tuberculosis , Humans , Mycobacterium tuberculosis/genetics , Bayes Theorem , Computer Simulation
20.
Biometrics ; 79(3): 1840-1852, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35833874

ABSTRACT

Valid surrogate endpoints S can be used as a substitute for a true outcome of interest T to measure treatment efficacy in a clinical trial. We propose a causal inference approach to validate a surrogate by incorporating longitudinal measurements of the true outcomes using a mixed modeling approach, and we define models and quantities for validation that may vary across the study period using principal surrogacy criteria. We consider a surrogate-dependent treatment efficacy curve that allows us to validate the surrogate at different time points. We extend these methods to accommodate a delayed-start treatment design where all patients eventually receive the treatment. Not all parameters are identified in the general setting. We apply a Bayesian approach for estimation and inference, utilizing more informative prior distributions for selected parameters. We consider the sensitivity of these prior assumptions as well as assumptions of independence among certain counterfactual quantities conditional on pretreatment covariates to improve identifiability. We examine the frequentist properties (bias of point and variance estimates, credible interval coverage) of a Bayesian imputation method. Our work is motivated by a clinical trial of a gene therapy where the functional outcomes are measured repeatedly throughout the trial.


Subject(s)
Models, Statistical , Humans , Bayes Theorem , Biomarkers , Treatment Outcome , Causality