ABSTRACT
The aim of fine mapping is to identify genetic variants causally contributing to complex traits or diseases. Existing fine-mapping methods employ Bayesian discrete mixture priors and depend on a pre-specified maximum number of causal variants, which may lead to sub-optimal solutions. In this work, we propose a Bayesian fine-mapping method called h2-D2, utilizing a continuous global-local shrinkage prior. We also present an approach to define credible sets of causal variants in continuous prior settings. Simulation studies demonstrate that h2-D2 outperforms current state-of-the-art fine-mapping methods such as SuSiE and FINEMAP in accurately identifying causal variants and estimating their effect sizes. We further applied h2-D2 to prostate cancer analysis and discovered some previously unknown causal variants. In addition, we inferred 369 target genes associated with the detected causal variants and several pathways that were significantly over-represented by these genes, shedding light on their potential roles in prostate cancer development and progression.
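For comparison with the credible sets mentioned above, the conventional construction under a single-causal-variant assumption ranks variants by posterior probability and keeps the smallest set whose cumulative probability reaches a target coverage. The sketch below shows only that conventional construction. It is not the h2-D2 procedure, and the function name `credible_set` and the example probabilities are hypothetical.

```python
def credible_set(post_prob, coverage=0.95):
    """Smallest set of variant indices whose summed posterior probability
    reaches `coverage` (assumes exactly one causal variant in the region)."""
    total = sum(post_prob)
    # Rank variant indices from most to least probable (ties keep input order).
    order = sorted(range(len(post_prob)), key=lambda i: -post_prob[i])
    chosen, cum = [], 0.0
    for i in order:
        chosen.append(i)
        cum += post_prob[i] / total  # normalise so probabilities sum to one
        if cum >= coverage:
            break
    return chosen

# Hypothetical posterior probabilities for five variants.
probs = [0.02, 0.60, 0.03, 0.30, 0.05]
print(credible_set(probs, coverage=0.8))  # → [1, 3]
```

A smaller set at a fixed coverage indicates a sharper localisation of the causal signal.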
Subject(s)
Prostatic Neoplasms , Quantitative Trait Loci , Male , Humans , Bayes Theorem , Single Nucleotide Polymorphism/genetics , Computer Simulation , Prostatic Neoplasms/genetics , Genome-Wide Association Study/methods
ABSTRACT
Recent advances in microfluidics and sequencing technologies allow researchers to explore cellular heterogeneity at single-cell resolution. In recent years, deep learning frameworks, such as generative models, have brought great changes to the analysis of transcriptomic data. Nevertheless, relying on the latent space of these generative models alone is insufficient to generate biological explanations. In addition, most previous work based on generative models is limited to shallow neural networks with one to three layers of latent variables, which may limit the capabilities of the models. Here, we propose a deep interpretable generative model called d-scIGM for single-cell data analysis. d-scIGM combines sawtooth connectivity techniques and residual networks, thereby constructing a deep generative framework. In addition, d-scIGM incorporates hierarchical prior knowledge of biological domains to enhance the interpretability of the model. We show that d-scIGM achieves excellent performance in a variety of fundamental tasks, including clustering, visualization, and pseudo-temporal inference. Through topic pathway studies, we found that d-scIGM-learned topics are better enriched for biologically meaningful pathways compared to the baseline models. Furthermore, the analysis of drug response data shows that d-scIGM can capture drug response patterns in large-scale experiments, which provides a promising way to elucidate the underlying biological mechanisms. Lastly, in the melanoma dataset, d-scIGM accurately identified different cell types and revealed multiple melanin-related driver genes and key pathways, which are critical for understanding disease mechanisms and drug development.
Subject(s)
Deep Learning , RNA-Seq , Single-Cell Gene Expression Analysis , Humans , Algorithms , Computational Biology/methods , Neural Networks, Computer , RNA-Seq/methods , Single-Cell Gene Expression Analysis/methods
ABSTRACT
New tools for cell signaling pathway inference from multi-omics data that are independent of previous knowledge are needed. Here, we propose a new de novo method, the de novo multi-omics pathway analysis (DMPA), to model and combine omics data into network modules and pathways. DMPA was validated with published omics data and was found accurate in discovering reported molecular associations in transcriptome, interactome, phosphoproteome, methylome, and metabolomics data, and signaling pathways in multi-omics data. DMPA was benchmarked against module discovery and multi-omics integration methods and outperformed previous methods in module and pathway discovery especially when applied to datasets of relatively low sample sizes. Transcription factor, kinase, subcellular location, and function prediction algorithms were devised for transcriptome, phosphoproteome, and interactome modules and pathways, respectively. To apply DMPA in a biologically relevant context, interactome, phosphoproteome, transcriptome, and proteome data were collected from analyses carried out using melanoma cells to address gamma-secretase cleavage-dependent signaling characteristics of the receptor tyrosine kinase TYRO3. The pathways modeled with DMPA reflected the predicted function and its direction in validation experiments.
Subject(s)
Proteomics , Signal Transduction , Humans , Proteomics/methods , Algorithms , Transcriptome , Metabolomics/methods , Computational Biology/methods , Proteome/metabolism , Phosphoproteins/metabolism , Multiomics
ABSTRACT
Epidemiology has been transformed by the advent of Bayesian phylodynamic models that allow researchers to infer the geographic history of pathogen dispersal over a set of discrete geographic areas [1, 2]. These models provide powerful tools for understanding the spatial dynamics of disease outbreaks, but contain many parameters that are inferred from minimal geographic information (i.e., the single area in which each pathogen was sampled). Consequently, inferences under these models are inherently sensitive to our prior assumptions about the model parameters. Here, we demonstrate that the default priors used in empirical phylodynamic studies make strong and biologically unrealistic assumptions about the underlying geographic process. We provide empirical evidence that these unrealistic priors strongly (and adversely) impact commonly reported aspects of epidemiological studies, including: 1) the relative rates of dispersal between areas; 2) the importance of dispersal routes for the spread of pathogens among areas; 3) the number of dispersal events between areas, and; 4) the ancestral area in which a given outbreak originated. We offer strategies to avoid these problems, and develop tools to help researchers specify more biologically reasonable prior models that will realize the full potential of phylodynamic methods to elucidate pathogen biology and, ultimately, inform surveillance and monitoring policies to mitigate the impacts of disease outbreaks.
Subject(s)
Disease Outbreaks , Phylogeny , Bayes Theorem
ABSTRACT
One of the key objectives in geophysics is to characterize the subsurface through the process of analyzing and interpreting geophysical field data that are typically acquired at the surface. Data-driven deep learning methods have enormous potential for accelerating and simplifying the process but also face many challenges, including poor generalizability, weak interpretability, and physical inconsistency. We present three strategies for imposing domain knowledge constraints on deep neural networks (DNNs) to help address these challenges. The first strategy is to integrate constraints into data by generating synthetic training datasets through geological and geophysical forward modeling and properly encoding prior knowledge as part of the input fed into the DNNs. The second strategy is to design nontrainable custom layers of physical operators and preconditioners in the DNN architecture to modify or shape feature maps calculated within the network to make them consistent with the prior knowledge. The final strategy is to implement prior geological information and geophysical laws as regularization terms in loss functions for training the DNNs. We discuss the implementation of these strategies in detail and demonstrate their effectiveness by applying them to geophysical data processing, imaging, interpretation, and subsurface model building.
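The third strategy above, adding prior knowledge as a regularization term in the loss, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the smoothness penalty on adjacent predictions, the function name `regularized_loss`, and the weight `lam` are assumptions chosen purely for illustration.

```python
def regularized_loss(pred, target, lam=0.1):
    """Data misfit plus a weighted prior-knowledge penalty.

    The penalty here is a hypothetical smoothness prior: it discourages
    large jumps between neighbouring predicted values. Needs len(pred) >= 2.
    """
    n = len(pred)
    # Mean squared error between prediction and observed data.
    data_term = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    # Mean squared first difference of the prediction (smoothness prior).
    diffs = [pred[i + 1] - pred[i] for i in range(n - 1)]
    smooth_term = sum(d ** 2 for d in diffs) / len(diffs)
    return data_term + lam * smooth_term

print(regularized_loss([1.0, 2.0, 4.0], [1.0, 2.0, 3.0], lam=0.1))
```

Setting `lam` to zero recovers a purely data-driven loss, while larger values push the solution toward the assumed prior.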
ABSTRACT
In this paper, we introduce an efficient method for computing curves minimizing a variant of the Euler-Mumford elastica energy, with fixed endpoints and tangents at these endpoints, where the bending energy is enhanced with a user-defined and data-driven scalar-valued term referred to as the curvature prior. In order to guarantee that the globally optimal curve is extracted, the proposed method involves the numerical computation of the viscosity solution to a specific static Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE). For that purpose, we derive the explicit Hamiltonian associated with this variant model equipped with a curvature prior, discretize the resulting HJB PDE using an adaptive finite difference scheme, and solve it in a single pass using a generalized fast-marching method. In addition, we also present a practical method for estimating the curvature prior values from image data, designed for the task of accurately tracking curvilinear structure centerlines. Numerical experiments on synthetic and real-image data illustrate the advantages of the considered variant of the elastica model with a prior curvature enhancement in complex scenarios where challenging geometric structures appear.
ABSTRACT
Infectious virus shedding from individuals infected with severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) is used to estimate human-to-human transmission risk. Control of SARS-CoV-2 transmission requires identifying the immune correlates that protect against infectious virus shedding. Mucosal immunity prevents infection by SARS-CoV-2, which replicates in the respiratory epithelium and spreads rapidly to other hosts. However, whether mucosal immunity prevents the shedding of infectious virus in SARS-CoV-2-infected individuals is unknown. We examined the relationship between viral RNA shedding dynamics, duration of infectious virus shedding, and mucosal antibody responses during SARS-CoV-2 infection. Anti-spike secretory IgA antibodies (S-IgA) reduced viral RNA load and infectivity more than anti-spike IgG/IgA antibodies in infected nasopharyngeal samples. Compared with the IgG/IgA response, the anti-spike S-IgA post-infection responses affected the viral RNA shedding dynamics and predicted the duration of infectious virus shedding regardless of the immune history. These findings highlight the importance of anti-spike S-IgA responses in individuals infected with SARS-CoV-2 for preventing infectious virus shedding and SARS-CoV-2 transmission. Developing medical countermeasures to shorten S-IgA response time may help control human-to-human transmission of SARS-CoV-2 infection and prevent future respiratory virus pandemics.
Subject(s)
COVID-19 , Humans , SARS-CoV-2 , Virus Shedding , Antibody Formation , Reaction Time , Viral Antibodies , Viral RNA , Immunoglobulin G , Immunoglobulin A , Secretory Immunoglobulin A
ABSTRACT
Cancer is molecularly heterogeneous, with seemingly similar patients having different molecular landscapes and accordingly different clinical behaviors. In recent studies, gene expression networks have been shown as more effective/informative for cancer heterogeneity analysis than some simpler measures. Gene interconnections can be classified as "direct" and "indirect," where the latter can be caused by shared genomic regulators (such as transcription factors, microRNAs, and other regulatory molecules) and other mechanisms. It has been suggested that incorporating the regulators of gene expressions in network analysis and focusing on the direct interconnections can lead to a deeper understanding of the more essential gene interconnections. Such analysis can be seriously challenged by the large number of parameters (jointly caused by network analysis, incorporation of regulators, and heterogeneity) and often weak signals. To effectively tackle this problem, we propose incorporating prior information contained in the published literature. A key challenge is that such prior information can be partial or even wrong. We develop a two-step procedure that can flexibly accommodate different levels of prior information quality. Simulation demonstrates the effectiveness of the proposed approach and its superiority over relevant competitors. In the analysis of a breast cancer dataset, findings different from the alternatives are made, and the identified sample subgroups have important clinical differences.
ABSTRACT
Determining the causes of deaths (CODs) that occur outside civil registration and vital statistics systems is challenging. A technique called verbal autopsy (VA) is widely adopted to gather information on deaths in practice. A VA consists of interviewing relatives of a deceased person about symptoms of the deceased in the period leading to the death, often resulting in multivariate binary responses. While statistical methods have been devised for estimating the cause-specific mortality fractions (CSMFs) for a study population, continued expansion of VA to new populations (or "domains") necessitates approaches that recognize between-domain differences while capitalizing on potential similarities. In this article, we propose such a domain-adaptive method that integrates external between-domain similarity information encoded by a prespecified rooted weighted tree. Given a cause, we use latent class models to characterize the conditional distributions of the responses that may vary by domain. We specify a logistic stick-breaking Gaussian diffusion process prior along the tree for class mixing weights with node-specific spike-and-slab priors to pool information between the domains in a data-driven way. The posterior inference is conducted via a scalable variational Bayes algorithm. Simulation studies show that the domain adaptation enabled by the proposed method improves CSMF estimation and individual COD assignment. We also illustrate and evaluate the method using a validation dataset. The article concludes with a discussion of limitations and future directions.
Subject(s)
Autopsy , Bayes Theorem , Cause of Death , Humans , Autopsy/methods , Statistical Models , Biostatistics/methods
ABSTRACT
Bayesian graphical models are powerful tools to infer complex relationships in high dimension, yet are often fraught with computational and statistical challenges. If exploited in a principled way, the increasing information collected alongside the data of primary interest constitutes an opportunity to mitigate these difficulties by guiding the detection of dependence structures. For instance, gene network inference may be informed by the use of publicly available summary statistics on the regulation of genes by genetic variants. Here we present a novel Gaussian graphical modeling framework to identify and leverage information on the centrality of nodes in conditional independence graphs. Specifically, we consider a fully joint hierarchical model to simultaneously infer (i) sparse precision matrices and (ii) the relevance of node-level information for uncovering the sought-after network structure. We encode such information as candidate auxiliary variables using a spike-and-slab submodel on the propensity of nodes to be hubs, which allows hypothesis-free selection and interpretation of a sparse subset of relevant variables. As efficient exploration of large posterior spaces is needed for real-world applications, we develop a variational expectation conditional maximization algorithm that scales inference to hundreds of samples, nodes and auxiliary variables. We illustrate and exploit the advantages of our approach in simulations and in a gene network study which identifies hub genes involved in biological pathways relevant to immune-mediated diseases.
ABSTRACT
Protein arginine methylation is an important posttranslational modification (PTM) associated with protein functional diversity and pathological conditions including cancer. Identification of methylation binding sites facilitates a better understanding of the molecular function of proteins. Recent developments in the field of deep neural networks have led to a proliferation of deep learning-based methylation identification studies because of their fast and accurate prediction. In this paper, we propose DeepGpgs, an advanced deep learning model incorporating a Gaussian prior and a gated attention mechanism. We introduce a residual network channel to extract the evolutionary information of proteins. Then we combine the adaptive embedding with bidirectional long short-term memory networks to form a context-shared encoder layer. A gated multi-head attention mechanism follows to obtain global information about the sequence. A Gaussian prior is injected into the sequence to assist in predicting PTMs. We also propose a weighted joint loss function to alleviate the false negative problem. We empirically show that DeepGpgs improves the Matthews correlation coefficient by 6.3% on the arginine methylation independent test set compared with the existing state-of-the-art methylation site prediction methods. Furthermore, DeepGpgs is robust in phosphorylation site prediction for SARS-CoV-2, which indicates that DeepGpgs has good transferability and the potential to be extended to the prediction of other modification sites. The open-source code and data of DeepGpgs can be obtained from https://github.com/saizhou1/DeepGpgs.
Subject(s)
COVID-19 , Deep Learning , Humans , Methylation , Arginine/metabolism , SARS-CoV-2/metabolism , Proteins/metabolism
ABSTRACT
BACKGROUND: Single-cell RNA sequencing (scRNA-seq) technology has proven useful in understanding cell-specific disease mechanisms. However, identifying genes of interest remains a key challenge. Pseudo-bulk methods that pool scRNA-seq counts within the same biological replicates have been commonly used to identify differentially expressed genes. However, such methods may lack power because scRNA-seq experiments can be prohibitively expensive, which limits sample size. RESULTS: Motivated by this, we proposed using the Bayesian-frequentist hybrid (BFH) framework to increase power. We showed that, in simulated scenarios, the proposed BFH was an optimal method compared with other popular single-cell differential expression methods when both false discovery rate (FDR) and power were considered. As an example, the method was applied to an idiopathic pulmonary fibrosis (IPF) case study. CONCLUSION: In our IPF example, we demonstrated that with a proper informative prior, the BFH approach identified more genes of interest. Furthermore, these genes were reasonable based on the current knowledge of IPF. Thus, the BFH offers a unique and flexible framework for future scRNA-seq analyses.
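The pseudo-bulk pooling step described in the background can be sketched as follows. This illustrates only the pooling of per-cell counts within each biological replicate, not the BFH framework itself. The function name `pseudo_bulk` and the toy counts are hypothetical.

```python
from collections import defaultdict

def pseudo_bulk(counts, replicate_ids):
    """Sum per-cell gene counts within each biological replicate.

    counts: list of rows, one row of gene counts per cell.
    replicate_ids: replicate label for each cell, in the same order.
    """
    n_genes = len(counts[0])
    pooled = defaultdict(lambda: [0] * n_genes)
    for row, rep in zip(counts, replicate_ids):
        for g, value in enumerate(row):
            pooled[rep][g] += value  # accumulate this cell into its replicate
    return dict(pooled)

# Four hypothetical cells, two genes, two replicates.
cells = [[1, 0], [2, 3], [0, 4], [5, 1]]
print(pseudo_bulk(cells, ["A", "A", "B", "B"]))  # → {'A': [3, 3], 'B': [5, 5]}
```

After pooling, each replicate contributes a single observation per gene, which is why the effective sample size equals the number of replicates rather than the number of cells.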
Subject(s)
Bayes Theorem , RNA-Seq , RNA Sequence Analysis , Single-Cell Analysis , Single-Cell Analysis/methods , Humans , RNA-Seq/methods , RNA Sequence Analysis/methods , Idiopathic Pulmonary Fibrosis/genetics , Idiopathic Pulmonary Fibrosis/pathology , Gene Expression Profiling/methods , Algorithms
ABSTRACT
Bayesian models of perception posit that percepts result from the optimal integration of new sensory information and prior expectations. In turn, prominent models of perceptual disturbances in psychosis frame hallucination-like phenomena as percepts excessively biased toward perceptual prior expectations. Despite mounting support for this notion, whether this hallucination-related prior bias results secondarily from imprecise sensory representations at early processing stages or directly from alterations in perceptual priors (both suggested candidates potentially consistent with Bayesian models) remains to be tested. Using modified interval timing paradigms designed to arbitrate between these alternative hypotheses, we show in human participants (16 females and 24 males) from a nonclinical population that hallucination proneness correlates with a circumscribed form of prior bias that reflects selective differences in weighting of contextual prior variance, a prior bias that is unrelated to the effect of sensory noise and to a separate index of sensory resolution. Our results thus suggest distinct mechanisms underlying prior biases in perceptual inference and favor the notion that hallucination proneness could reflect direct alterations in the representation or use of perceptual priors independent of sensory noise.
SIGNIFICANCE STATEMENT: Current theories of psychosis posit that hallucination proneness results from excessive influence of prior expectations on perception. It is not clear whether this prior bias represents a primary top-down process related to the representation or use of prior beliefs or instead a secondary bottom-up process stemming from imprecise sensory representations at early processing stages. To address this question, we examined interval timing behaviors captured by Bayesian perceptual-inference models. Our data support the notion that excessive influence of prior expectations associated with hallucination propensity is not directly secondary to sensory imprecision and is instead more consistent with a primary top-down process. These results help refine computational theories of psychosis and may contribute to the development of improved intervention targets.
Subject(s)
Illusions , Psychotic Disorders , Male , Female , Humans , Bayes Theorem , Hallucinations , Bias
ABSTRACT
Prior knowledge has a profound impact on the way we perceive the world. However, it remains unclear how prior knowledge is maintained in our brains and thereby influences subsequent conscious perception. The Dalmatian dog illusion, in which the picture is initially perceived as noise, is a well-suited tool for studying prior knowledge. Once the prior knowledge was introduced, a Dalmatian dog could be consciously seen, and the picture immediately became meaningful. Using pictures with hidden objects as standard stimuli and similar pictures without hidden objects as deviant stimuli, we investigated the neural representation of prior knowledge and its impact on conscious perception in an oddball paradigm using electroencephalogram (EEG) in both male and female human subjects. We found that the neural patterns between the prestimulus alpha band oscillations and poststimulus EEG activity were significantly more similar for the standard stimuli than for the deviant stimuli after prior knowledge was provided. Furthermore, decoding analysis revealed that persistent neural templates were evoked after the introduction of prior knowledge, similar to those evoked in the early stages of visual processing. In conclusion, the current study suggests that prior knowledge uses alpha band oscillations in a multivariate manner in the prestimulus period and induces specific persistent neural templates in the poststimulus period, enabling the conscious perception of the hidden objects.
SIGNIFICANCE STATEMENT: The visual world we live in is not always optimal. In dark or noisy environments, prior knowledge can help us interpret imperfect sensory signals and enable us to consciously perceive hidden objects. However, we still know very little about how prior knowledge works at the neural level. Using the Dalmatian dog illusion and multivariate methods, we found that prior knowledge uses prestimulus alpha band oscillations to carry information about the hidden object and exerts a persistent influence in the poststimulus period by inducing specific neural templates. Our findings provide a window into the neural underpinnings of prior knowledge and offer new insights into the role of alpha band oscillations and neural templates associated with conscious perception.
Subject(s)
Illusions , Animals , Dogs , Humans , Male , Female , Illusions/physiology , Visual Perception/physiology , Electroencephalography/methods , Brain , Consciousness/physiology , Photic Stimulation/methods
ABSTRACT
BACKGROUND: Clustering is a fundamental problem in statistics and has broad applications in various areas. Traditional clustering methods treat features equally and ignore the potential structure arising from differences in the characteristics of features. In cancer diagnosis and treatment especially, several types of biological features are collected and analyzed together. Treating these features equally fails to identify the heterogeneity of both the data structure and cancer itself, which leads to incompleteness and inefficacy of current anti-cancer therapies. OBJECTIVES: In this paper, we propose a clustering framework based on hierarchical heterogeneous data with prior pairwise relationships. The proposed clustering method fully characterizes the differences between features and identifies potential hierarchical structure through rough and refined clusters. RESULTS: The refined clustering further divides the clusters obtained by the rough clustering into different subtypes. It thus provides deeper insight into cancer that cannot be obtained with existing clustering methods. The proposed method is also flexible with respect to prior information: additional pairwise relationships between samples can be incorporated to improve clustering performance. Finally, statistical consistency properties of the proposed method are rigorously established, including the accurate estimation of parameters and determination of clustering structures. CONCLUSIONS: Our proposed method achieves better clustering performance than other methods in simulation studies, and the clustering accuracy increases when prior information is incorporated. Meaningful biological findings are obtained in the analysis of lung adenocarcinoma with clinical imaging data and omics data, showing that the hierarchical structure produced by rough and refined clustering is necessary and reasonable.
Subject(s)
Adenocarcinoma of Lung , Lung Neoplasms , Humans , Cluster Analysis , Computer Simulation
ABSTRACT
BACKGROUND: High-dimensional omics data are increasingly utilized in clinical and public health research for disease risk prediction. Many sparse methods have been proposed that use prior knowledge, e.g., biological group structure information, to guide the model-building process. However, these methods are still based on a single model, often leading to overconfident inferences and inferior generalization. RESULTS: We proposed a novel stacking strategy based on a non-negative spike-and-slab Lasso (nsslasso) generalized linear model (GLM) for disease risk prediction in the context of high-dimensional omics data. Briefly, we used prior biological knowledge to segment omics data into a set of sub-data. Each sub-model was trained separately using the features from the group via a proper base learner. Then, the predictions of sub-models were ensembled by a super learner using nsslasso GLM. The proposed method was compared to several competitors, such as the Lasso, grlasso, and gsslasso, using simulated data and two open-access breast cancer datasets. As a result, the proposed method showed robustly superior prediction performance to the optimal single-model method in high-noise simulated data and real-world data. Furthermore, compared to the traditional stacking method, the proposed nsslasso stacking method can efficiently handle redundant sub-models and identify important sub-models. CONCLUSIONS: The proposed nsslasso method demonstrated favorable predictive accuracy, stability, and biological interpretability. Additionally, the proposed method can also be used to detect new biomarkers and key group structures.
Subject(s)
Breast Neoplasms , Humans , Female , Linear Models , Breast Neoplasms/genetics
ABSTRACT
Currently, the only effect size prior that is routinely implemented in a Bayesian fine-mapping multi-single-nucleotide polymorphism (SNP) analysis is the Gaussian prior. Here, we show how the Laplace prior can be deployed in Bayesian multi-SNP fine mapping studies. We compare the ranking performance of the posterior inclusion probability (PIP) using a Laplace prior with the ranking performance of the corresponding Gaussian prior and FINEMAP. Our results indicate that, for the simulation scenarios we consider here, the Laplace prior can lead to higher PIPs than either the Gaussian prior or FINEMAP, particularly for moderately sized fine-mapping studies. The Laplace prior also appears to have better worst-case scenario properties. We reanalyse the iCOGS case-control data from the CASP8 region on Chromosome 2. Even though this study has a total sample size of nearly 90,000 individuals, there are still some differences in the top few ranked SNPs if the Laplace prior is used rather than the Gaussian prior. R code to implement the Laplace (and Gaussian) prior is available at https://github.com/Kevin-walters/lapmapr.
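To make the contrast between the two effect-size priors concrete, the sketch below compares Gaussian and Laplace log densities matched to the same variance. It is a generic illustration, not code from the lapmapr package. The function names and the prior standard deviation `sigma = 0.1` are hypothetical.

```python
import math

def gaussian_logpdf(beta, sigma):
    """Log density of a zero-mean Gaussian with standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - beta ** 2 / (2 * sigma ** 2)

def laplace_logpdf(beta, sigma):
    """Log density of a zero-mean Laplace with the same variance sigma**2."""
    b = sigma / math.sqrt(2)  # Laplace variance is 2 * b**2
    return -math.log(2 * b) - abs(beta) / b

sigma = 0.1  # hypothetical prior standard deviation of a SNP effect
for beta in (0.0, 0.1, 0.5):
    print(beta, gaussian_logpdf(beta, sigma), laplace_logpdf(beta, sigma))
```

At equal variance the Laplace prior places more density both at zero and in the far tails, and less at intermediate effect sizes, which is the usual motivation for considering it as a sparsity-favouring alternative.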
Subject(s)
Genetic Models , Single Nucleotide Polymorphism , Humans , Bayes Theorem , Computer Simulation , Probability
ABSTRACT
The data-driven approach of supervised learning methods has limited applicability in solving dipole inversion in Quantitative Susceptibility Mapping (QSM) with varying scan parameters across different objects. To address this generalization issue in supervised QSM methods, we propose a novel training-free model-based unsupervised method called MoDIP (Model-based Deep Image Prior). MoDIP comprises a small, untrained network and a Data Fidelity Optimization (DFO) module. The network converges to an interim state, acting as an implicit prior for image regularization, while the optimization process enforces the physical model of QSM dipole inversion. Experimental results demonstrate MoDIP's excellent generalizability in solving QSM dipole inversion across different scan parameters. It exhibits robustness against pathological brain QSM, achieving an accuracy improvement of over 32% relative to supervised deep learning methods. It is also 33% more computationally efficient and runs 4 times faster than conventional DIP-based approaches, enabling 3D high-resolution image reconstruction in under 4.5 min.
Subject(s)
Brain , Felodipine , Humans , Brain/diagnostic imaging , Computer-Assisted Image Processing/methods , Magnetic Resonance Imaging/methods , Brain Mapping/methods , Algorithms
ABSTRACT
To configure our limbs in space, the brain must compute their position based on sensory information provided by mechanoreceptors in the skin, muscles, and joints. Because this information is corrupted by noise, the brain is thought to process it probabilistically and integrate it with prior belief about arm posture, following Bayes' rule. Here, we combined computational modeling with behavioral experimentation to test this hypothesis. The model conceives the perception of arm posture as the combination of a probabilistic kinematic chain composed of the shoulder, elbow, and wrist angles, corrupted by additive Gaussian noise, with a Gaussian prior about these joint angles. We tested whether the model explains errors in a VR-based posture matching task better than a model that assumes a uniform prior. Human participants were required to align their unseen right arm to a target posture, presented as a visual configuration of the arm in the horizontal plane. Results show idiosyncratic biases in how participants matched their unseen arm to the target posture. We used maximum likelihood estimation to fit the Bayesian model to these observations and estimate key parameters including the prior means and their variance-covariance structure. The Bayesian model including a Gaussian prior explained the response biases and variance much better than a model with a uniform prior. The prior varied across participants, consistent with the idiosyncrasies in arm posture perception, and in alignment with previous behavioral research. Our work clarifies the biases in arm posture perception within a new perspective on the nature of proprioceptive computations.
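The integration of a noisy sensory estimate with a Gaussian prior can be illustrated for a single joint angle. The sketch below uses the standard conjugate update for a scalar Gaussian likelihood and Gaussian prior. It is a simplification for illustration, not the multi-joint kinematic-chain model fitted in the study, and the function name and angle values are hypothetical.

```python
def posterior_gaussian(obs, obs_var, prior_mean, prior_var):
    """Posterior mean and variance for a Gaussian likelihood and Gaussian prior."""
    # Weight on the observation grows as the prior becomes less certain.
    w = prior_var / (prior_var + obs_var)
    mean = prior_mean + w * (obs - prior_mean)
    var = prior_var * obs_var / (prior_var + obs_var)
    return mean, var

# Hypothetical elbow angle in degrees: sensed at 90, prior centred on 70,
# with equal sensory and prior variances.
mean, var = posterior_gaussian(obs=90.0, obs_var=16.0, prior_mean=70.0, prior_var=16.0)
print(mean, var)  # → 80.0 8.0
```

The perceived angle is pulled from the sensed value toward the prior mean, which is the kind of systematic bias a Gaussian prior predicts. As the prior variance grows, the weight on the observation approaches one and the bias vanishes, which is the behaviour expected under a uniform prior.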
ABSTRACT
Molecular evolutionary rate variation is a key aspect of the evolution of many organisms that can be modeled using molecular clock models. For example, fixed local clocks revealed the role of episodic evolution in the emergence of SARS-CoV-2 variants of concern. As with all statistical models, however, the reliability of such inferences is contingent on an assessment of statistical evidence. We present a novel Bayesian phylogenetic approach for detecting episodic evolution. It consists of computing Bayes factors as the ratio of posterior and prior odds of evolutionary rate increases, effectively quantifying support for the effect size. We conducted an extensive simulation study to illustrate the power of this method and benchmarked it against formal model comparison of a range of molecular clock models using (log) marginal likelihood estimation, and against inference under a random local clock model. Quantifying support for the effect size has higher sensitivity than formal model testing and is straightforward to compute, because it only needs samples from the posterior and prior distributions. However, formal model testing has the advantage of accommodating a wide range of molecular clock models. We also assessed the ability of an automated approach, known as the random local clock, in which branches under episodic evolution may be detected without their a priori definition. In an empirical analysis of a data set of SARS-CoV-2 genomes, we find "very strong" evidence for episodic evolution. Our results provide guidelines and practical methods for Bayesian detection of episodic evolution, as well as avenues for further research into this phenomenon.
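The Bayes factor described above, the ratio of posterior to prior odds of a rate increase, can be estimated directly from samples. The sketch below is a minimal illustration and rests on one assumption: the effect is parameterised so that positive values correspond to a rate increase. The function name and sample values are hypothetical, and this is not the authors' implementation.

```python
def bayes_factor_increase(posterior_samples, prior_samples):
    """Bayes factor for 'effect > 0' as posterior odds divided by prior odds."""
    def odds(samples):
        # Fraction of samples supporting a rate increase.
        p = sum(1 for s in samples if s > 0) / len(samples)
        # Raises ZeroDivisionError if every sample is positive; with real MCMC
        # output this means support exceeds what the sample size can resolve.
        return p / (1 - p)
    return odds(posterior_samples) / odds(prior_samples)

# Hypothetical effect-size samples (positive means a rate increase).
posterior = [0.5, 1.2, 0.8, -0.1]
prior = [-1.0, 1.0, -0.5, 0.5]
print(bayes_factor_increase(posterior, prior))  # → 3.0
```

Because only the proportion of samples above zero is needed, the calculation requires no marginal likelihood estimation.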