Pesquisa | Portal de Pesquisa da BVS

1.

Fast and accurate Bayesian polygenic risk modeling with variational inference.

Zabad, Shadi; Gravel, Simon; Li, Yue.

Am J Hum Genet ; 110(5): 741-761, 2023 05 04.

Artigo em Inglês | MEDLINE | ID: mdl-37030289

RESUMO

The advent of large-scale genome-wide association studies (GWASs) has motivated the development of statistical methods for phenotype prediction with single-nucleotide polymorphism (SNP) array data. These polygenic risk score (PRS) methods use a multiple linear regression framework to infer joint effect sizes of all genetic variants on the trait. Among the subset of PRS methods that operate on GWAS summary statistics, sparse Bayesian methods have shown competitive predictive ability. However, most existing Bayesian approaches employ Markov chain Monte Carlo (MCMC) algorithms, which are computationally inefficient and do not scale favorably to higher dimensions, for posterior inference. Here, we introduce variational inference of polygenic risk scores (VIPRS), a Bayesian summary statistics-based PRS method that utilizes variational inference techniques to approximate the posterior distribution for the effect sizes. Our experiments with 36 simulation configurations and 12 real phenotypes from the UK Biobank dataset demonstrated that VIPRS is consistently competitive with the state-of-the-art in prediction accuracy while being more than twice as fast as popular MCMC-based approaches. This performance advantage is robust across a variety of genetic architectures, SNP heritabilities, and independent GWAS cohorts. In addition to its competitive accuracy on the "White British" samples, VIPRS showed improved transferability when applied to other ethnic groups, with up to 1.7-fold increase in R2 among individuals of Nigerian ancestry for low-density lipoprotein (LDL) cholesterol. To illustrate its scalability, we applied VIPRS to a dataset of 9.6 million genetic markers, which conferred further improvements in prediction accuracy for highly polygenic traits, such as height.

Assuntos

Estudo de Associação Genômica Ampla , Herança Multifatorial , Humanos , Herança Multifatorial/genética , Estudo de Associação Genômica Ampla/métodos , Teorema de Bayes , Polimorfismo de Nucleotídeo Único/genética , Fatores de Risco , Predisposição Genética para Doença

2.

BayesKAT: bayesian optimal kernel-based test for genetic association studies reveals joint genetic effects in complex diseases.

Das Adhikari, Sikta; Cui, Yuehua; Wang, Jianrong.

Brief Bioinform ; 25(3)2024 Mar 27.

Artigo em Inglês | MEDLINE | ID: mdl-38653490

RESUMO

Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type 1 errors, limited scalability, inferior power or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied on a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.

Assuntos

Teorema de Bayes , Estudo de Associação Genômica Ampla , Polimorfismo de Nucleotídeo Único , Humanos , Estudo de Associação Genômica Ampla/métodos , Predisposição Genética para Doença , Algoritmos , Software , Biologia Computacional/métodos , Estudos de Associação Genética/métodos

3.

Integration of Prior Expectations and Suppression of Prediction Errors During Expectancy-Induced Pain Modulation: The Influence of Anxiety and Pleasantness.

Tsai, Hsin-Yun; Lapanan, Kulvara; Lin, Yi-Hsuan; Huang, Cheng-Wei; Lin, Wen-Wei; Lin, Min-Min; Lu, Zheng-Liang; Lin, Feng-Sheng; Tseng, Ming-Tsung.

J Neurosci ; 44(17)2024 Apr 24.

Artigo em Inglês | MEDLINE | ID: mdl-38453467

RESUMO

Pain perception arises from the integration of prior expectations with sensory information. Although recent work has demonstrated that treatment expectancy effects (e.g., placebo hypoalgesia) can be explained by a Bayesian integration framework incorporating the precision level of expectations and sensory inputs, the key factor modulating this integration in stimulus expectancy-induced pain modulation remains unclear. In a stimulus expectancy paradigm combining emotion regulation in healthy male and female adults, we found that participants' voluntary reduction in anticipatory anxiety and pleasantness monotonically reduced the magnitude of pain modulation by negative and positive expectations, respectively, indicating a role of emotion. For both types of expectations, Bayesian model comparisons confirmed that an integration model using the respective emotion of expectations and sensory inputs explained stimulus expectancy effects on pain better than using their respective precision. For negative expectations, the role of anxiety is further supported by our fMRI findings that (1) functional coupling within anxiety-processing brain regions (amygdala and anterior cingulate) reflected the integration of expectations with sensory inputs and (2) anxiety appeared to impair the updating of expectations via suppressed prediction error signals in the anterior cingulate, thus perpetuating negative expectancy effects. Regarding positive expectations, their integration with sensory inputs relied on the functional coupling within brain structures processing positive emotion and inhibiting threat responding (medial orbitofrontal cortex and hippocampus). In summary, different from treatment expectancy, pain modulation by stimulus expectancy emanates from emotion-modulated integration of beliefs with sensory evidence and inadequate belief updating.

Assuntos

Antecipação Psicológica , Ansiedade , Imageamento por Ressonância Magnética , Humanos , Masculino , Feminino , Ansiedade/psicologia , Ansiedade/fisiopatologia , Adulto , Antecipação Psicológica/fisiologia , Adulto Jovem , Percepção da Dor/fisiologia , Dor/psicologia , Dor/fisiopatologia , Teorema de Bayes , Emoções/fisiologia , Encéfalo/diagnóstico por imagem , Encéfalo/fisiopatologia , Encéfalo/fisiologia , Prazer/fisiologia , Mapeamento Encefálico

4.

Meta-analysis of breast cancer risk for individuals with PALB2 pathogenic variants.

Ruberu, Thanthirige L M; Braun, Danielle; Parmigiani, Giovanni; Biswas, Swati.

Genet Epidemiol ; 2024 Apr 23.

Artigo em Inglês | MEDLINE | ID: mdl-38654400

RESUMO

Multigene panel testing now allows efficient testing of many cancer susceptibility genes leading to a larger number of mutation carriers being identified. They need to be counseled about their cancer risk conferred by the specific gene mutation. An important cancer susceptibility gene is PALB2. Multiple studies reported risk estimates for breast cancer (BC) conferred by pathogenic variants in PALB2. Due to the diverse modalities of reported risk estimates (age-specific risk, odds ratio, relative risk, and standardized incidence ratio) and effect sizes, a meta-analysis combining these estimates is necessary to accurately counsel patients with this mutation. However, this is not trivial due to heterogeneity of studies in terms of study design and risk measure. We utilized a recently proposed Bayesian random-effects meta-analysis method that can synthesize estimates from such heterogeneous studies. We applied this method to combine estimates from 12 studies on BC risk for carriers of pathogenic PALB2 mutations. The estimated overall (meta-analysis-based) risk of BC is 12.80% (6.11%-22.59%) by age 50 and 48.47% (36.05%-61.74%) by age 80. Pathogenic mutations in PALB2 makes women more susceptible to BC. Our risk estimates can help clinically manage patients carrying pathogenic variants in PALB2.

5.

A novel statistical method for decontaminating T-cell receptor sequencing data.

Li, Ruoxing; Altan, Mehmet; Reuben, Alexandre; Lin, Ruitao; Heymach, John V; Tran, Hai; Chen, Runzhe; Little, Latasha; Hubert, Shawna; Zhang, Jianjun; Li, Ziyi.

Brief Bioinform ; 24(4)2023 07 20.

Artigo em Inglês | MEDLINE | ID: mdl-37337757

RESUMO

The T-cell receptor (TCR) repertoire is highly diverse among the population and plays an essential role in initiating multiple immune processes. TCR sequencing (TCR-seq) has been developed to profile the T cell repertoire. Similar to other high-throughput experiments, contamination can happen during several steps of TCR-seq, including sample collection, preparation and sequencing. Such contamination creates artifacts in the data, leading to inaccurate or even biased results. Most existing methods assume 'clean' TCR-seq data as the starting point with no ability to handle data contamination. Here, we develop a novel statistical model to systematically detect and remove contamination in TCR-seq data. We summarize the observed contamination into two sources, pairwise and cross-cohort. For both sources, we provide visualizations and summary statistics to help users assess the severity of the contamination. Incorporating prior information from 14 existing TCR-seq datasets with minimum contamination, we develop a straightforward Bayesian model to statistically identify contaminated samples. We further provide strategies for removing the impacted sequences to allow for downstream analysis, thus avoiding any need to repeat experiments. Our proposed model shows robustness in contamination detection compared with a few off-the-shelf detection methods in simulation studies. We illustrate the use of our proposed method on two TCR-seq datasets generated locally.

Assuntos

Receptores de Antígenos de Linfócitos T , Linfócitos T , Humanos , Teorema de Bayes , Receptores de Antígenos de Linfócitos T/genética , Modelos Estatísticos , Sequenciamento de Nucleotídeos em Larga Escala/métodos

6.

Stochastic Character Mapping, Bayesian Model Selection, and Biosynthetic Pathways Shed New Light on the Evolution of Habitat Preference in Cyanobacteria.

Bianchini, Giorgio; Hagemann, Martin; Sánchez-Baracaldo, Patricia.

Syst Biol ; 2024 Jun 27.

Artigo em Inglês | MEDLINE | ID: mdl-38934241

RESUMO

Cyanobacteria are the only prokaryotes to have evolved oxygenic photosynthesis paving the way for complex life. Studying the evolution and ecological niche of cyanobacteria and their ancestors is crucial for understanding the intricate dynamics of biosphere evolution. These organisms frequently deal with environmental stressors such as salinity and drought, and they employ compatible solutes as a mechanism to cope with these challenges. Compatible solutes are small molecules that help maintain cellular osmotic balance in high salinity environments, such as marine waters. Their production plays a crucial role in salt tolerance, which, in turn, influences habitat preference. Among the five known compatible solutes produced by cyanobacteria (sucrose, trehalose, glucosylglycerol, glucosylglycerate, and glycine betaine), their synthesis varies between individual strains. In this study, we work in a Bayesian stochastic mapping framework, integrating multiple sources of information about compatible solute biosynthesis in order to predict the ancestral habitat preference of Cyanobacteria. Through extensive model selection analyses and statistical tests for correlation, we identify glucosylglycerol and glucosylglycerate as the most significantly correlated with habitat preference, while trehalose exhibits the weakest correlation. Additionally, glucosylglycerol, glucosylglycerate, and glycine betaine show high loss/gain rate ratios, indicating their potential role in adaptability, while sucrose and trehalose are less likely to be lost due to their additional cellular functions. Contrary to previous findings, our analyses predict that the last common ancestor of Cyanobacteria (living at around 3180 Ma) had a 97% probability of a high salinity habitat preference and was likely able to synthesise glucosylglycerol and glucosylglycerate. Nevertheless, cyanobacteria likely colonized low-salinity environments shortly after their origin, with an 89% probability of the first cyanobacterium with low-salinity habitat preference arising prior to the Great Oxygenation Event (2460 Ma). Stochastic mapping analyses provide evidence of cyanobacteria inhabiting early marine habitats, aiding in the interpretation of the geological record. Our age estimate of ~2590 Ma for the divergence of two major cyanobacterial clades (Macro- and Microcyanobacteria) suggests that these were likely significant contributors to primary productivity in marine habitats in the lead-up to the Great Oxygenation Event, and thus played a pivotal role in triggering the sudden increase in atmospheric oxygen.

7.

Comparing methods for statistical inference with model uncertainty.

Porwal, Anupreet; Raftery, Adrian E.

Proc Natl Acad Sci U S A ; 119(16): e2120737119, 2022 04 19.

Artigo em Inglês | MEDLINE | ID: mdl-35412893

RESUMO

Probability models are used for many statistical tasks, notably parameter estimation, interval estimation, inference about model parameters, point prediction, and interval prediction. Thus, choosing a statistical model and accounting for uncertainty about this choice are important parts of the scientific process. Here we focus on one such choice, that of variables to include in a linear regression model. Many methods have been proposed, including Bayesian and penalized likelihood methods, and it is unclear which one to use. We compared 21 of the most popular methods by carrying out an extensive set of simulation studies based closely on real datasets that span a range of situations encountered in practical data analysis. Three adaptive Bayesian model averaging (BMA) methods performed best across all statistical tasks. These used adaptive versions of Zellner's g-prior for the parameters, where the prior variance parameter g is a function of sample size or is estimated from the data. We found that for BMA methods implemented with Markov chain Monte Carlo, 10,000 iterations were enough. Computationally, we found two of the three best methods (BMA with g=ân and empirical Bayes-local) to be competitive with the least absolute shrinkage and selection operator (LASSO), which is often preferred as a variable selection technique because of its computational efficiency. BMA performed better than Bayesian model selection (in which just one model is selected).

8.

Detecting Episodic Evolution through Bayesian Inference of Molecular Clock Models.

Tay, John H; Baele, Guy; Duchene, Sebastian.

Mol Biol Evol ; 40(10)2023 Oct 04.

Artigo em Inglês | MEDLINE | ID: mdl-37738550

RESUMO

Molecular evolutionary rate variation is a key aspect of the evolution of many organisms that can be modeled using molecular clock models. For example, fixed local clocks revealed the role of episodic evolution in the emergence of SARS-CoV-2 variants of concern. Like all statistical models, however, the reliability of such inferences is contingent on an assessment of statistical evidence. We present a novel Bayesian phylogenetic approach for detecting episodic evolution. It consists of computing Bayes factors, as the ratio of posterior and prior odds of evolutionary rate increases, effectively quantifying support for the effect size. We conducted an extensive simulation study to illustrate the power of this method and benchmarked it to formal model comparison of a range of molecular clock models using (log) marginal likelihood estimation, and to inference under a random local clock model. Quantifying support for the effect size has higher sensitivity than formal model testing and is straight-forward to compute, because it only needs samples from the posterior and prior distribution. However, formal model testing has the advantage of accommodating a wide range molecular clock models. We also assessed the ability of an automated approach, known as the random local clock, where branches under episodic evolution may be detected without their a priori definition. In an empirical analysis of a data set of SARS-CoV-2 genomes, we find "very strong" evidence for episodic evolution. Our results provide guidelines and practical methods for Bayesian detection of episodic evolution, as well as avenues for further research into this phenomenon.

9.

A flexible Bayesian framework for individualized inference via adaptive borrowing.

Ji, Ziyu; Wolfson, Julian.

Biostatistics ; 24(3): 669-685, 2023 Jul 14.

Artigo em Inglês | MEDLINE | ID: mdl-35024790

RESUMO

The explosion in high-resolution data capture technologies in health has increased interest in making inferences about individual-level parameters. While technology may provide substantial data on a single individual, how best to use multisource population data to improve individualized inference remains an open research question. One possible approach, the multisource exchangeability model (MEM), is a Bayesian method for integrating data from supplementary sources into the analysis of a primary source. MEM was originally developed to improve inference for a single study by asymmetrically borrowing information from a set of similar previous studies and was further developed to apply a more computationally intensive symmetric borrowing in the context of basket trial; however, even for asymmetric borrowing, its computational burden grows exponentially with the number of supplementary sources, making it unsuitable for applications where hundreds or thousands of supplementary sources (i.e., individuals) could contribute to inference on a given individual. In this article, we propose the data-driven MEM (dMEM), a two-stage approach that includes both source selection and clustering to enable the inclusion of an arbitrary number of sources to contribute to individualized inference in a computationally tractable and data-efficient way. We illustrate the application of dMEM to individual-level human behavior and mental well-being data collected via smartphones, where our approach increases individual-level estimation precision by 84% compared with a standard no-borrowing method and outperforms recently proposed competing methods in 80% of individuals.

Assuntos

Modelos Estatísticos , Humanos , Teorema de Bayes

10.

Semi-supervised mixture multi-source exchangeability model for leveraging real-world data in clinical trials.

Haine, Lillian M F; Murry, Thomas A; Nahra, Raquel; Touloumi, Giota; Fernández-Cruz, Eduardo; Petoumenos, Kathy; Koopmeiners, Joseph S.

Biostatistics ; 2023 Sep 11.

Artigo em Inglês | MEDLINE | ID: mdl-37697901

RESUMO

The traditional trial paradigm is often criticized as being slow, inefficient, and costly. Statistical approaches that leverage external trial data have emerged to make trials more efficient by augmenting the sample size. However, these approaches assume that external data are from previously conducted trials, leaving a rich source of untapped real-world data (RWD) that cannot yet be effectively leveraged. We propose a semi-supervised mixture (SS-MIX) multisource exchangeability model (MEM); a flexible, two-step Bayesian approach for incorporating RWD into randomized controlled trial analyses. The first step is a SS-MIX model on a modified propensity score and the second step is a MEM. The first step targets a representative subgroup of individuals from the trial population and the second step avoids borrowing when there are substantial differences in outcomes among the trial sample and the representative observational sample. When comparing the proposed approach to competing borrowing approaches in a simulation study, we find that our approach borrows efficiently when the trial and RWD are consistent, while mitigating bias when the trial and external data differ on either measured or unmeasured covariates. We illustrate the proposed approach with an application to a randomized controlled trial investigating intravenous hyperimmune immunoglobulin in hospitalized patients with influenza, while leveraging data from an external observational study to supplement a subgroup analysis by influenza subtype.

11.

Bayesian adaptive model selection design for optimal biological dose finding in phase I/II clinical trials.

Lin, Ruitao; Yin, Guosheng; Shi, Haolun.

Biostatistics ; 24(2): 277-294, 2023 04 14.

Artigo em Inglês | MEDLINE | ID: mdl-34296266

RESUMO

Identification of the optimal dose presents a major challenge in drug development with molecularly targeted agents, immunotherapy, as well as chimeric antigen receptor T-cell treatments. By casting dose finding as a Bayesian model selection problem, we propose an adaptive design by simultaneously incorporating the toxicity and efficacy outcomes to select the optimal biological dose (OBD) in phase I/II clinical trials. Without imposing any parametric assumption or shape constraint on the underlying dose-response curves, we specify curve-free models for both the toxicity and efficacy endpoints to determine the OBD. By integrating the observed data across all dose levels, the proposed design is coherent in dose assignment and thus greatly enhances efficiency and accuracy in pinning down the right dose. Not only does our design possess a completely new yet flexible dose-finding framework, but it also has satisfactory and robust performance as demonstrated by extensive simulation studies. In addition, we show that our design enjoys desirable coherence properties, while most of existing phase I/II designs do not. We further extend the design to accommodate late-onset outcomes which are common in immunotherapy. The proposed design is exemplified with a phase I/II clinical trial in chronic lymphocytic leukemia.

Assuntos

Antineoplásicos , Humanos , Teorema de Bayes , Relação Dose-Resposta a Droga , Dose Máxima Tolerável , Simulação por Computador , Projetos de Pesquisa

12.

Bayesian joint models for multi-regional clinical trials.

Bean, Nathan W; Ibrahim, Joseph G; Psioda, Matthew A.

Biostatistics ; 2023 Sep 05.

Artigo em Inglês | MEDLINE | ID: mdl-37669215

RESUMO

In recent years, multi-regional clinical trials (MRCTs) have increased in popularity in the pharmaceutical industry due to their ability to accelerate the global drug development process. To address potential challenges with MRCTs, the International Council for Harmonisation released the E17 guidance document which suggests the use of statistical methods that utilize information borrowing across regions if regional sample sizes are small. We develop an approach that allows for information borrowing via Bayesian model averaging in the context of a joint analysis of survival and longitudinal data from MRCTs. In this novel application of joint models to MRCTs, we use Laplace's method to integrate over subject-specific random effects and to approximate posterior distributions for region-specific treatment effects on the time-to-event outcome. Through simulation studies, we demonstrate that the joint modeling approach can result in an increased rejection rate when testing the global treatment effect compared with methods that analyze survival data alone. We then apply the proposed approach to data from a cardiovascular outcomes MRCT.

13.

Bayesian multiregional clinical trials using model averaging.

Bean, Nathan W; Ibrahim, Joseph G; Psioda, Matthew A.

Biostatistics ; 24(2): 262-276, 2023 04 14.

Artigo em Inglês | MEDLINE | ID: mdl-34296263

RESUMO

Multiregional clinical trials (MRCTs) provide the benefit of more rapidly introducing drugs to the global market; however, small regional sample sizes can lead to poor estimation quality of region-specific effects when using current statistical methods. With the publication of the International Conference for Harmonisation E17 guideline in 2017, the MRCT design is recognized as a viable strategy that can be accepted by regional regulatory authorities, necessitating new statistical methods that improve the quality of region-specific inference. In this article, we develop a novel methodology for estimating region-specific and global treatment effects for MRCTs using Bayesian model averaging. This approach can be used for trials that compare two treatment groups with respect to a continuous outcome, and it allows for the incorporation of patient characteristics through the inclusion of covariates. We propose an approach that uses posterior model probabilities to quantify evidence in favor of consistency of treatment effects across all regions, and this metric can be used by regulatory authorities for drug approval. We show through simulations that the proposed modeling approach results in lower MSE than a fixed-effects linear regression model and better control of type I error rates than a Bayesian hierarchical model.

Assuntos

Aprovação de Drogas , Projetos de Pesquisa , Humanos , Teorema de Bayes , Resultado do Tratamento , Tamanho da Amostra , Probabilidade

14.

Rethinking ¹³C-metabolic flux analysis - The Bayesian way of flux inference.

Theorell, Axel; Jadebeck, Johann F; Wiechert, Wolfgang; McFadden, Johnjoe; Nöh, Katharina.

Metab Eng ; 83: 137-149, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38582144

RESUMO

Metabolic reaction rates (fluxes) play a crucial role in comprehending cellular phenotypes and are essential in areas such as metabolic engineering, biotechnology, and biomedical research. The state-of-the-art technique for estimating fluxes is metabolic flux analysis using isotopic labelling (13C-MFA), which uses a dataset-model combination to determine the fluxes. Bayesian statistical methods are gaining popularity in the field of life sciences, but the use of 13C-MFA is still dominated by conventional best-fit approaches. The slow take-up of Bayesian approaches is, at least partly, due to the unfamiliarity of Bayesian methods to metabolic engineering researchers. To address this unfamiliarity, we here outline similarities and differences between the two approaches and highlight particular advantages of the Bayesian way of flux analysis. With a real-life example, re-analysing a moderately informative labelling dataset of E. coli, we identify situations in which Bayesian methods are advantageous and more informative, pointing to potential pitfalls of current 13C-MFA evaluation approaches. We propose the use of Bayesian model averaging (BMA) for flux inference as a means of overcoming the problem of model uncertainty through its tendency to assign low probabilities to both, models that are unsupported by data, and models that are overly complex. In this capacity, BMA resembles a tempered Ockham's razor. With the tempered razor as a guide, BMA-based 13C-MFA alleviates the problem of model selection uncertainty and is thereby capable of becoming a game changer for metabolic engineering by uncovering new insights and inspiring novel approaches.

Assuntos

Teorema de Bayes , Isótopos de Carbono , Escherichia coli , Isótopos de Carbono/metabolismo , Escherichia coli/metabolismo , Escherichia coli/genética , Análise do Fluxo Metabólico/métodos , Modelos Biológicos , Engenharia Metabólica/métodos , Marcação por Isótopo

15.

Incorporating graph information in Bayesian factor analysis with robust and adaptive shrinkage priors.

Zhang, Qiyiwen; Chang, Changgee; Shen, Li; Long, Qi.

Biometrics ; 80(1)2024 Jan 29.

Artigo em Inglês | MEDLINE | ID: mdl-38281768

RESUMO

There has been an increasing interest in decomposing high-dimensional multi-omics data into a product of low-rank and sparse matrices for the purpose of dimension reduction and feature engineering. Bayesian factor models achieve such low-dimensional representation of the original data through different sparsity-inducing priors. However, few of these models can efficiently incorporate the information encoded by the biological graphs, which has been already proven to be useful in many analysis tasks. In this work, we propose a Bayesian factor model with novel hierarchical priors, which incorporate the biological graph knowledge as a tool of identifying a group of genes functioning collaboratively. The proposed model therefore enables sparsity within networks by allowing each factor loading to be shrunk adaptively and by considering additional layers to relate individual shrinkage parameters to the underlying graph information, both of which yield a more accurate structure recovery of factor loadings. Further, this new priors overcome the phase transition phenomenon, in contrast to existing graph-incorporated approaches, so that it is robust to noisy edges that are inconsistent with the actual sparsity structure of the factor loadings. Finally, our model can handle both continuous and discrete data types. The proposed method is shown to outperform several existing factor analysis methods through simulation experiments and real data analyses.

Assuntos

Algoritmos , Teorema de Bayes , Simulação por Computador , Análise Fatorial

16.

Flexible evaluation of surrogate markers with Bayesian model averaging.

Duan, Yunshan; Parast, Layla.

Stat Med ; 43(4): 774-792, 2024 02 20.

Artigo em Inglês | MEDLINE | ID: mdl-38081586

RESUMO

When long-term follow up is required for a primary endpoint in a randomized clinical trial, a valid surrogate marker can help to estimate the treatment effect and accelerate the decision process. Several model-based methods have been developed to evaluate the proportion of the treatment effect that is explained by the treatment effect on the surrogate marker. More recently, a nonparametric approach has been proposed allowing for more flexibility by avoiding the restrictive parametric model assumptions required in the model-based methods. While the model-based approaches suffer from potential mis-specification of the models, the nonparametric method fails to give desirable estimates when the sample size is small, or when the range of the data does not follow certain conditions. In this paper, we propose a Bayesian model averaging approach to estimate the proportion of treatment effect explained by the surrogate marker. Our procedure offers a compromise between the model-based approach and the nonparametric approach by introducing model flexibility via averaging over several candidate models and maintains the strength of parametric models with respect to inference. We compare our approach with previous model-based methods and the nonparametric method. Simulation studies demonstrate the advantage of our method when surrogate supports are inconsistent and sample sizes are small. We illustrate our method using data from the Diabetes Prevention Program study to examine hemoglobin A1c as a surrogate marker for fasting glucose.

Assuntos

Diabetes Mellitus , Humanos , Teorema de Bayes , Simulação por Computador , Tamanho da Amostra , Biomarcadores

17.

Flexible cost-penalized Bayesian model selection: Developing inclusion paths with an application to diagnosis of heart disease.

Porter, Erica M; Franck, Christopher T; Adams, Stephen.

Stat Med ; 2024 May 27.

Artigo em Inglês | MEDLINE | ID: mdl-38800970

RESUMO

We propose a Bayesian model selection approach that allows medical practitioners to select among predictor variables while taking their respective costs into account. Medical procedures almost always incur costs in time and/or money. These costs might exceed their usefulness for modeling the outcome of interest. We develop Bayesian model selection that uses flexible model priors to penalize costly predictors a priori and select a subset of predictors useful relative to their costs. Our approach (i) gives the practitioner control over the magnitude of cost penalization, (ii) enables the prior to scale well with sample size, and (iii) enables the creation of our proposed inclusion path visualization, which can be used to make decisions about individual candidate predictors using both probabilistic and visual tools. We demonstrate the effectiveness of our inclusion path approach and the importance of being able to adjust the magnitude of the prior's cost penalization through a dataset pertaining to heart disease diagnosis in patients at the Cleveland Clinic Foundation, where several candidate predictors with various costs were recorded for patients, and through simulated data.

18.

The fusion point of temporal binding: Promises and perils of multisensory accounts.

Klaffehn, Annika L; Herbort, Oliver; Pfister, Roland.

Cogn Psychol ; 151: 101662, 2024 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-38772251

RESUMO

Performing an action to initiate a consequence in the environment triggers the perceptual illusion of temporal binding. This phenomenon entails that actions and following effects are perceived to occur closer in time than they do outside the action-effect relationship. Here we ask whether temporal binding can be explained in terms of multisensory integration, by assuming either multisensory fusion or partial integration of the two events. We gathered two datasets featuring a wide range of action-effect delays as a key factor influencing integration. We then tested the fit of a computational model for multisensory integration, the statistically optimal cue integration (SOCI) model. Indeed, qualitative aspects of the data on a group-level followed the principles of a multisensory account. By contrast, quantitative evidence from a comprehensive model evaluation indicated that temporal binding cannot be reduced to multisensory integration. Rather, multisensory integration should be seen as one of several component processes underlying temporal binding on an individual level.

Assuntos

Percepção Visual , Humanos , Adulto , Masculino , Feminino , Percepção Visual/fisiologia , Adulto Jovem , Sinais (Psicologia) , Percepção Auditiva/fisiologia , Ilusões , Percepção do Tempo , Modelos Psicológicos

19.

A Bayesian model to identify multiple expression patterns with simultaneous FDR control for a multi-factor RNA-seq experiment.

Bian, Yuanyuan; He, Chong; Qiu, Jing.

Stat Appl Genet Mol Biol ; 22(1)2023 01 01.

Artigo em Inglês | MEDLINE | ID: mdl-37082815

RESUMO

It is often of research interest to identify genes that satisfy a particular expression pattern across different conditions such as tissues, genotypes, etc. One common practice is to perform differential expression analysis for each condition separately and then take the intersection of differentially expressed (DE) genes or non-DE genes under each condition to obtain genes that satisfy a particular pattern. Such a method can lead to many false positives, especially when the desired gene expression pattern involves equivalent expression under one condition. In this paper, we apply a Bayesian partition model to identify genes of all desired patterns while simultaneously controlling their false discovery rates (FDRs). Our simulation studies show that the common practice fails to control group specific FDRs for patterns involving equivalent expression while the proposed Bayesian method simultaneously controls group specific FDRs at all settings studied. In addition, the proposed method is more powerful when the FDR of the common practice is under control for identifying patterns only involving DE genes. Our simulation studies also show that it is an inherently more challenging problem to identify patterns involving equivalent expression than patterns only involving differential expression. Therefore, larger sample sizes are required to obtain the same target power to identify the former types of patterns than the latter types of patterns.

Assuntos

Perfilação da Expressão Gênica , RNA-Seq , Perfilação da Expressão Gênica/métodos , Teorema de Bayes , Simulação por Computador , Sequenciamento do Exoma

20.

A novel non-negative Bayesian stacking modeling method for Cancer survival prediction using high-dimensional omics data.

Shen, Junjie; Wang, Shuo; Sun, Hao; Huang, Jie; Bai, Lu; Wang, Xichao; Dong, Yongfei; Tang, Zaixiang.

BMC Med Res Methodol ; 24(1): 105, 2024 May 03.

Artigo em Inglês | MEDLINE | ID: mdl-38702624

RESUMO

BACKGROUND: Survival prediction using high-dimensional molecular data is a hot topic in the field of genomics and precision medicine, especially for cancer studies. Considering that carcinogenesis has a pathway-based pathogenesis, developing models using such group structures is a closer mimic of disease progression and prognosis. Many approaches can be used to integrate group information; however, most of them are single-model methods, which may account for unstable prediction. METHODS: We introduced a novel survival stacking method that modeled using group structure information to improve the robustness of cancer survival prediction in the context of high-dimensional omics data. With a super learner, survival stacking combines the prediction from multiple sub-models that are independently trained using the features in pre-grouped biological pathways. In addition to a non-negative linear combination of sub-models, we extended the super learner to non-negative Bayesian hierarchical generalized linear model and artificial neural network. We compared the proposed modeling strategy with the widely used survival penalized method Lasso Cox and several group penalized methods, e.g., group Lasso Cox, via simulation study and real-world data application. RESULTS: The proposed survival stacking method showed superior and robust performance in terms of discrimination compared with single-model methods in case of high-noise simulated data and real-world data. The non-negative Bayesian stacking method can identify important biological signal pathways and genes that are associated with the prognosis of cancer. CONCLUSIONS: This study proposed a novel survival stacking strategy incorporating biological group information into the cancer prognosis models. Additionally, this study extended the super learner to non-negative Bayesian model and ANN, enriching the combination of sub-models. The proposed Bayesian stacking strategy exhibited favorable properties in the prediction and interpretation of complex survival data, which may aid in discovering cancer targets.

Assuntos

Teorema de Bayes , Genômica , Neoplasias , Humanos , Neoplasias/genética , Neoplasias/mortalidade , Genômica/métodos , Prognóstico , Algoritmos , Modelos de Riscos Proporcionais , Redes Neurais de Computação , Análise de Sobrevida , Biologia Computacional/métodos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA