ABSTRACT
This is a commentary for a special issue on predictive processing and rational constructivist models of development. Mainly I use the opportunity to ask a bunch of questions about what these theoretical frameworks show us (and what they do not) and mostly where the open questions still are. To get meta for a moment, I thought these questions were the best way to maximize the value of my commentary: They have the highest probability of leading to the most uncertainty reduction for our field in the long term. Please read in that spirit.
ABSTRACT
Prior beliefs are central to Bayesian accounts of cognition, but many of these accounts do not directly measure priors. More specifically, initial states of belief heavily influence how new information is assumed to be utilized when updating a particular model. Despite this, prior and posterior beliefs are either inferred from sequential participant actions or elicited through impoverished means. We had participants play a version of the game 'Plinko' to first elicit individual participant priors in a theoretically agnostic manner. Subsequent learning and updating of participant beliefs was then directly measured. We show that participants hold various priors that cluster around prototypical probability distributions that in turn influence learning. In follow-up studies, we show that participant priors are stable over time and that the ability to update beliefs is influenced by a simple environmental manipulation (i.e., a short break). These data reveal the importance of directly measuring participant beliefs rather than assuming or inferring them as has been widely done in the literature to date. The Plinko game provides a flexible and fecund means for examining statistical learning and mental model updating.
Subjects
Bayes Theorem, Learning, Psychological Models, Humans, Male, Female, Adult, Young Adult, Cognition/physiology, Culture
ABSTRACT
Epidemic models serve as a useful analytical tool to study how a disease behaves in a given population. Individual-level models (ILMs) can incorporate individual-level covariate information including spatial information, accounting for heterogeneity within the population. However, the high-level data required to parameterize an ILM may often be available only for a sub-population of a larger population (e.g., a given county, province, or country). As a result, parameter estimates may be affected by edge effects caused by infection originating from outside the observed population. Here, we look at how such edge effects can bias parameter estimates within the context of spatial ILMs, and suggest a method to improve model fitting in the presence of edge effects when some global measure of epidemic severity is available from the unobserved part of the population. We apply our models to simulated data, as well as data from the UK 2001 foot-and-mouth disease epidemic.
Subjects
Foot-and-Mouth Disease, Humans, Foot-and-Mouth Disease/epidemiology, United Kingdom/epidemiology, Spatial Analysis, Epidemiological Models, Epidemics, Communicable Diseases/epidemiology, Computer Simulation, Statistical Models
ABSTRACT
Interactions between density and environmental conditions have important effects on vital rates and consequently on population dynamics and can take complex pathways in species whose demography is strongly influenced by social context, such as the African lion, Panthera leo. In populations of such species, the response of vital rates to density can vary depending on the social structure (e.g. effects of group size or composition). However, studies assessing density dependence in populations of lions and other social species have seldom considered the effects of multiple socially explicit measures of density, and, more particularly for lions, of nomadic males. Additionally, vital-rate responses to interactions between the environment and various measures of density remain largely uninvestigated. To fill these knowledge gaps, we aimed to understand how a socially and spatially explicit consideration of density (i.e. at the local scale) and its interaction with environmental seasonality affect vital rates of lions in the Serengeti National Park, Tanzania. We used a Bayesian multistate capture-recapture model and Bayesian generalized linear mixed models to estimate lion stage-specific survival and between-stage transition rates, as well as reproduction probability and recruitment, while testing for season-specific effects of density measures at the group and home-range levels. We found evidence for several such effects. For example, resident-male survival increased more strongly with coalition size in the dry season compared with the wet season, and adult-female abundance affected subadult survival negatively in the wet season, but positively in the dry season. Additionally, while our models showed no effect of nomadic males on adult-female survival, they revealed strong effects of nomads on key processes such as reproduction and takeover dynamics.
Therefore, our results highlight the importance of accounting for seasonality and social context when assessing the effects of density on vital rates of Serengeti lions and of social species more generally.
Subjects
Lions, Population Density, Population Dynamics, Seasons, Animals, Tanzania, Lions/physiology, Male, Female, Social Behavior, Bayes Theorem, Demography, Reproduction
ABSTRACT
The brain exhibits a remarkable ability to learn and execute context-appropriate behaviors. How it achieves such flexibility, without sacrificing learning efficiency, is an important open question. Neuroscience, psychology, and engineering suggest that reusing and repurposing computations are part of the answer. Here, we review evidence that thalamocortical architectures may have evolved to facilitate these objectives of flexibility and efficiency by coordinating distributed computations. Recent work suggests that distributed prefrontal cortical networks compute with flexible codes, and that the mediodorsal thalamus provides regularization to promote efficient reuse. Thalamocortical interactions resemble hierarchical Bayesian computations, and their network implementation can be related to existing gating, synchronization, and hub theories of thalamic function. By reviewing recent findings and providing a novel synthesis, we highlight key research horizons integrating computation, cognition, and systems neuroscience.
Subjects
Cerebral Cortex, Cognition, Learning, Thalamus, Humans, Thalamus/physiology, Learning/physiology, Cognition/physiology, Cerebral Cortex/physiology, Animals, Nerve Net/physiology, Neurological Models, Neural Pathways/physiology
ABSTRACT
Process models specify a series of mental operations necessary to complete a task. We demonstrate how to use process models to analyze response-time data and obtain parameter estimates that have a clear psychological interpretation. A prerequisite for our analysis is a process model that generates a count of elementary information processing steps (EIP steps) for each trial of an experiment. We can estimate the duration of an EIP step by assuming that every EIP step is of random duration, modeled as draws from a gamma distribution. A natural effect of summing several random EIP steps is that the expected spread of the overall response time increases with a higher EIP step count. With modern probabilistic programming tools, it becomes relatively easy to fit Bayesian hierarchical models to data and thus estimate the duration of a step for each individual participant. We present two examples in this paper: The first example is children's performance on simple addition tasks, where the response time is often well predicted by the smaller of the two addends. The second example is response times in a Sudoku task. Here, the process model contains some random decisions and the EIP step count thus becomes latent. We show how our EIP regression model can be extended to such a case. We believe this approach can be used to bridge the gap between classical cognitive modeling and statistical inference and will be easily applicable to many use cases.
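The claim that response-time spread grows with the EIP step count can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' model: the function names, the gamma shape (2.0) and scale (0.15 s), and the trial count are all hypothetical values chosen for illustration, and no hierarchical fitting is attempted.

```python
import random
import statistics


def simulate_response_times(n_steps, shape, scale, n_trials, seed=0):
    """Simulate response times as sums of independent gamma-distributed step durations.

    Each trial consists of `n_steps` elementary steps; every step duration is an
    independent draw from Gamma(shape, scale). All parameter values are
    illustrative assumptions, not estimates from the paper.
    """
    rng = random.Random(seed)
    return [
        sum(rng.gammavariate(shape, scale) for _ in range(n_steps))
        for _ in range(n_trials)
    ]


def summarize(n_steps, shape=2.0, scale=0.15, n_trials=5000):
    """Return the sample mean and sample standard deviation for a given step count."""
    times = simulate_response_times(n_steps, shape, scale, n_trials)
    return statistics.mean(times), statistics.stdev(times)


if __name__ == "__main__":
    # Theory for a sum of n iid Gamma(shape, scale) draws:
    # mean = n * shape * scale, variance = n * shape * scale**2.
    for n in (2, 4, 8):
        mean, sd = summarize(n)
        print(f"steps={n}: mean={mean:.3f} s, sd={sd:.3f} s")
```

Because the sum of n independent Gamma(shape, scale) durations has mean n·shape·scale and variance n·shape·scale², both the expected response time and its spread rise with the step count, which is the qualitative pattern the abstract describes.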
Subjects
Bayes Theorem, Reaction Time, Humans, Reaction Time/physiology, Regression Analysis, Statistical Models, Child
ABSTRACT
Background: Studying how bull sharks aggregate and how they can be driven by life history traits such as reproduction, prey availability, predator avoidance and social interaction in a National Park such as Cabo Pulmo is key to understanding and protecting the species. Methods: The occurrence variability of 32 bull sharks tracked with passive acoustic telemetry was investigated via a hierarchical logistic regression model, with inference conducted in a Bayesian framework, comparing the sexes and their response to temperature and chlorophyll. Results: Based on the fitted model, occurrence probability varied by sex and length. Juvenile females had the highest values, whereas adult males had the lowest. A strong seasonality (day-of-year effect) was recorded, with sharks generally absent during September-November. However, some sharks did not show the common pattern, being detected for only a short period. This is one of the first studies in which a Bayesian framework is used to analyze passive acoustic telemetry data, demonstrating its potential for use in further studies.
Subjects
Bayes Theorem, Seasons, Sharks, Animals, Sharks/physiology, Female, Male, California, Telemetry
ABSTRACT
BACKGROUND: Distributed lag non-linear models (DLNMs) are the reference framework for modelling lagged non-linear associations. They are usually used in large-scale multi-location studies. Attempts to study these associations in small areas either did not include the lagged non-linear effects, did not allow for geographically-varying risks or downscaled risks from larger spatial units through socioeconomic and physical meta-predictors when the estimation of the risks was not feasible due to low statistical power. METHODS: Here we proposed spatial Bayesian DLNMs (SB-DLNMs) as a new framework for the estimation of reliable small-area lagged non-linear associations, and demonstrated the methodology for the case study of the temperature-mortality relationship in the 73 neighbourhoods of the city of Barcelona. We generalized location-independent DLNMs to the Bayesian framework (B-DLNMs), and extended them to SB-DLNMs by incorporating spatial models in a single-stage approach that accounts for the spatial dependence between risks. RESULTS: The results of the case study highlighted the benefits of incorporating the spatial component for small-area analysis. Estimates obtained from independent B-DLNMs were unstable and unreliable, particularly in neighbourhoods with very low numbers of deaths. SB-DLNMs addressed these instabilities by incorporating spatial dependencies, resulting in more plausible and coherent estimates and revealing hidden spatial patterns. In addition, the Bayesian framework enriches the range of estimates and tests that can be used in both large- and small-area studies. CONCLUSIONS: SB-DLNMs account for spatial structures in the risk associations across small areas. By modelling spatial differences, SB-DLNMs facilitate the direct estimation of non-linear exposure-response lagged associations at the small-area level, even in areas with as few as 19 deaths. 
The manuscript includes illustrative code to reproduce the results and to facilitate the implementation of other case studies by other researchers.
Subjects
Air Pollution, Humans, Air Pollution/analysis, Nonlinear Dynamics, Bayes Theorem, Temperature
ABSTRACT
Yellowfin tuna, Thunnus albacares, represents an important component of commercial and recreational fisheries in the Gulf of Mexico (GoM). We investigated the influence of environmental conditions on the spatiotemporal distribution of yellowfin tuna using fisheries' catch data spanning 2012-2019 within Mexican waters. We implemented hierarchical Bayesian regression models with spatial and temporal random effects and fixed effects of several environmental covariates to predict habitat suitability (HS) for the species. The best model included spatial and interannual anomalies of the absolute dynamic topography of the ocean surface (ADTSA and ADTIA, respectively), bottom depth, and a seasonal cyclical random effect. High catches occurred mainly towards anticyclonic features at bottom depths > 1000 m. The spatial extent of HS was higher in years with positive ADTIA, which implies more anticyclonic activity. The highest values of HS (> 0.7) generally occurred at positive ADTSA in oceanic waters of the central and northern GoM. However, high HS values (> 0.6) were observed in the southern GoM, in waters with cyclonic activity during summer. Our results highlight the importance of mesoscale features for the spatiotemporal distribution of yellowfin tunas and could help to develop dynamic fisheries management strategies in Mexico and the U.S. for this valuable resource.
Subjects
Ecosystem, Tuna, Animals, Gulf of Mexico, Bayes Theorem, Oceans and Seas
ABSTRACT
Causal reasoning, the ability to reason about causal relations between events, is fundamental to understanding how the world works. This paper reviews two prominent theories on early causal learning and offers possibilities for theory bridging. Both theories grow out of computational modeling and have significant areas of overlap while differing in several respects. Explanation-Based Learning (EBL) focuses on young infants' learning about causal concepts of physical objects and events, whereas Bayesian models have been used to describe causal reasoning beyond infancy across various concept domains. Connecting the two models offers a more integrated approach to clarifying the developmental processes in causal reasoning from early infancy through later childhood. We further suggest that everyday language practices offer a promising space for theory bridging. We provide a review of selective work on caregiver-child conversations, in particular, on the use of scaffolding language including causal talk and pedagogical questions. Linking the research on language practices to the two cognitive theories, we point out directions for further research to integrate EBL and Bayesian models and clarify how causal learning unfolds in real life. This article is categorized under: Psychology > Learning; Cognitive Biology > Cognitive Development.
Subjects
Bayes Theorem, Learning, Humans, Infant, Preschool Child, Child Development/physiology, Language, Language Development, Concept Formation, Cognition
ABSTRACT
Inferring the cancer-type specificities of ultra-rare, genome-wide somatic mutations is an open problem. Traditional statistical methods cannot handle such data due to their ultra-high dimensionality and extreme data sparsity. To harness information in rare mutations, we have recently proposed a formal multilevel multilogistic "hidden genome" model. Through its hierarchical layers, the model condenses information in ultra-rare mutations through meta-features embodying mutation contexts to characterize cancer types. Consistent, scalable point estimation of the model can incorporate 10s of millions of variants across thousands of tumors and permit impressive prediction and attribution. However, principled statistical inference is infeasible due to the volume, correlation, and noninterpretability of mutation contexts. In this paper, we propose a novel framework that leverages topic models from computational linguistics to effectuate dimension reduction of mutation contexts producing interpretable, decorrelated meta-feature topics. We propose an efficient MCMC algorithm for implementation that permits rigorous full Bayesian inference at a scale that is orders of magnitude beyond the capability of existing out-of-the-box inferential high-dimensional multi-class regression methods and software. Applying our model to the Pan Cancer Analysis of Whole Genomes dataset reveals interesting biological insights including somatic mutational topics associated with UV exposure in skin cancer, aging in colorectal cancer, and strong influence of epigenome organization in liver cancer. Under cross-validation, our model demonstrates highly competitive predictive performance against blackbox methods of random forest and deep learning.
Subjects
Algorithms, Bayes Theorem, Mutation, Neoplasms, Humans, Neoplasms/genetics, Statistical Models, Skin Neoplasms/genetics
ABSTRACT
BACKGROUND: Population size, prevalence, and incidence are essential metrics that influence public health programming and policy. However, stakeholders are frequently tasked with setting performance targets, reporting global indicators, and designing policies based on multiple (often incongruous) estimates of these variables, and they often do so in the absence of a formal, transparent framework for reaching a consensus estimate. OBJECTIVE: This study aims to describe a model to synthesize multiple study estimates while incorporating stakeholder knowledge, introduce an R Shiny app to implement the model, and demonstrate the model and app using real data. METHODS: In this study, we developed a Bayesian hierarchical model to synthesize multiple study estimates that allows the user to incorporate the quality of each estimate as a confidence score. The model was implemented as a user-friendly R Shiny app aimed at practitioners of population size estimation. The underlying Bayesian model was programmed in Stan for efficient sampling and computation. RESULTS: The app was demonstrated using biobehavioral survey-based population size estimates (and accompanying confidence scores) of female sex workers and men who have sex with men from 3 survey locations in a country in sub-Saharan Africa. The consensus results incorporating confidence scores are compared with the case where they are absent, and the results with confidence scores are shown to perform better according to an app-supplied metric for unaccounted-for variation. CONCLUSIONS: The utility of the triangulator model, including the incorporation of confidence scores, as a user-friendly app is demonstrated using a use case example. Our results offer empirical evidence of the model's effectiveness in producing an accurate consensus estimate and emphasize the significant impact that the accessible model and app offer for public health.
It offers a solution to the long-standing problem of synthesizing multiple estimates, potentially leading to more informed and evidence-based decision-making processes. The Triangulator has broad utility and flexibility to be adapted and used in various other contexts and regions to address similar challenges.
Subjects
Sex Workers, Sexual and Gender Minorities, Male, Humans, Female, Prevalence, Bayes Theorem, Consensus, Male Homosexuality, Population Density
ABSTRACT
Population pharmacokinetic (pop-PK) models constructed for model-informed precision dosing often have limited utility due to the low number of patients recruited. To augment such models, an approach is presented for generating fully artificial quasi-models which can be employed to make individual estimates of pharmacokinetic parameters. Based on 72 concentrations obtained in 12 patients, one- and two-compartment pop-PK models with or without creatinine clearance as a covariate were generated for piperacillin using the nonparametric adaptive grid algorithm. Thirty quasi-models were subsequently generated for each model type, and nonparametric maximum a posteriori probability Bayesian estimates were established for each patient. A significant difference in performance was found between one- and two-compartment models. Acceptable agreement was found between predicted and observed piperacillin concentrations, and between the estimates of the random-effect pharmacokinetic variables obtained using the so-called support points of the pop-PK models or the quasi-models as priors. The mean squared errors of the predictions made using the quasi-models were similar to, or even considerably lower than those obtained when employing the pop-PK models. Conclusion: fully artificial nonparametric quasi-models can efficiently augment pop-PK models containing few support points, to make individual pharmacokinetic estimates in the clinical setting.
ABSTRACT
Understanding the processes that underlie the development of population genetic structure is central to the study of evolution. Patterns of genetic structure, in turn, can reveal signatures of isolation by distance (IBD), barriers to gene flow, or even the genesis of speciation. However, it is unclear how severe range restriction might impact the processes that dominate the development of genetic structure. In narrow endemic species, is population structure likely to be adaptive in nature, or rather the result of genetic drift? In this study, we investigated patterns of genetic diversity and structure in the narrow endemic Hayden's ringlet butterfly. Specifically, we asked to what degree genetic structure in the Hayden's ringlet can be explained by IBD, isolation by resistance (IBR) (in the form of geographic or ecological barriers to migration between populations), and isolation by environment (in the form of differences in host plant availability and preference). We employed a genotyping-by-sequencing (GBS) approach coupled with host preference assays, Bayesian modelling, and population genomic analyses to answer these questions. Our results suggest that despite their restricted range, levels of genetic diversity in the Hayden's ringlet are comparable to those seen in more widespread butterfly species. Hayden's ringlets showed a strong preference for feeding on grasses relative to sedges, but neither larval preference nor potential host availability at sampling sites correlated with genetic structure. We conclude that geography, in the form of IBR and simple IBD, was the major driver of contemporary patterns of differentiation in this narrow endemic species.
Subjects
Butterflies, Genetic Variation, Animals, Butterflies/genetics, Bayes Theorem, Genetic Drift, Geography, Population Genetics
ABSTRACT
Species distribution models and maps from large-scale biodiversity data are necessary for conservation management. One current issue is that biodiversity data are prone to taxonomic misclassifications. Methods to account for these misclassifications in multi-species distribution models have assumed that the classification probabilities are constant throughout the study. In reality, classification probabilities are likely to vary with several covariates. Failure to account for such heterogeneity can lead to biased prediction of species distributions. Here, we present a general multi-species distribution model that accounts for heterogeneity in the classification process. The proposed model assumes a multinomial generalised linear model for the classification confusion matrix. We compare the performance of the heterogeneous classification model to that of the homogeneous classification model by assessing how well they estimate the parameters in the model and their predictive performance on hold-out samples. We applied the model to gull data from Norway, Denmark and Finland, obtained from the Global Biodiversity Information Facility. Our simulation study showed that accounting for heterogeneity in the classification process increased the precision of true species' identity predictions by 30% and accuracy and recall by 6%. Since all the models in this study accounted for misclassification of some sort, there was no significant effect of accounting for heterogeneity in the classification process on the inference about the ecological process. Applying the model framework to the gull dataset did not improve the predictive performance between the homogeneous and heterogeneous models (with parametric distributions) due to the smaller misclassified sample sizes. However, when machine learning predictive scores were used as weights to inform the species distribution models about the classification process, the precision increased by 70%. 
We recommend multiple multinomial regression to be used to model the variation in the classification process when the data contains relatively larger misclassified samples. Machine learning prediction scores should be used when the data contains relatively smaller misclassified samples.
ABSTRACT
Genomic selection (GS) has great potential to increase genetic gain in poultry breeding. However, the performance of genomic prediction in duck growth and breast morphological (BM) traits remains largely unknown. The objective of this study was to evaluate the benefits of genomic prediction for duck growth and BM traits using methods such as GBLUP, single-step GBLUP, Bayesian models, and different marker densities. This study collected phenotypic data for 14 growth and BM traits in a crossbreed population of 1893 Pekin duck × mallard, which included 941 genotyped ducks. The estimation of genetic parameters indicated high heritabilities for body weight (0.54-0.72) and moderate-to-high heritabilities for average daily gain traits (0.21-0.57). The heritabilities of BM traits ranged from low to moderate (0.18-0.39). The prediction ability of GS on growth and BM traits increased by 7.6% on average compared to the pedigree-based BLUP method. The single-step GBLUP outperformed GBLUP in most traits with an average of 0.3% higher reliability in our study. Most of the Bayesian models had better performance on predictive reliability, except for BayesR. BayesN emerged as the top-performing model for genomic prediction of both growth and BM traits, exhibiting an average increase in reliability of 3.0% compared to GBLUP. The permutation studies revealed that 50 K markers achieved ideal prediction reliability, while 3 K markers still achieved 90.8% predictive capability, which would further reduce the cost of genomic prediction for duck growth and BM traits. This study provides promising evidence for the application of GS in improving duck growth and BM traits. Our findings offer some useful strategies for optimizing the predictive ability of GS in growth and BM traits and provide theoretical foundations for designing a low-density panel in ducks.
ABSTRACT
When people use samples of evidence to make inferences, they consider both the sample contents and how the sample was generated ("sampling assumptions"). The current studies examined whether people can update their sampling assumptions - whether they can revise a belief about sample generation that is discovered to be incorrect, and reinterpret old data in light of the new belief. We used a property induction task where learners saw a sample of instances that shared a novel property and then inferred whether it generalized to other items. Assumptions about how the sample was selected were manipulated between conditions: in the property sampling frame condition, items were selected because they shared a property, while in the category sampling frame condition, items were selected because they belonged to a particular category. Experiment 1 found that these frames affected patterns of property generalization regardless of whether they were presented before or after the sample data was observed: in both cases, generalization was narrower under a property than a category frame. In Experiments 2 and 3, an initial category or property frame was presented before the sample, and was later retracted and replaced with the complementary frame. Learners were able to update their beliefs about sample generation, basing their property generalization on the more recent correct frame. These results show that learners can revise incorrect beliefs about data selection and adjust their inductive inferences accordingly.
Subjects
Psychological Generalization, Humans
ABSTRACT
Dynamic Contrast Enhanced Magnetic Resonance Imaging (DCE-MRI) can be used as a non-invasive method for the assessment of myocardial perfusion. The acquired images can be utilised to analyse the spatial extent and severity of myocardial ischaemia (regions with impaired microvascular blood flow). In the present paper, we propose a novel generalisable spatio-temporal hierarchical Bayesian model (GST-HBM) to automate the detection of ischaemic lesions and improve the in silico prediction accuracy by systematically integrating spatio-temporal context information. We present a computational inference procedure with an adequate trade-off between accuracy and computational efficiency, whereby model parameters are sampled from the posterior distribution with Gibbs sampling, while lower-level hyperparameters are selected using model selection strategies based on the Watanabe Akaike information criterion (WAIC). We have assessed our method on both synthetic (in silico) data with known gold-standard and 12 sets of clinical first-pass myocardial perfusion DCE-MRI datasets. We have also carried out a comparative performance evaluation with four established alternative methods: Gaussian mixture model (GMM), opening and closing operations based on Gaussian mixture model (GMMC&Omax), Markov random field constrained Gaussian mixture model (GMM-MRF) and model-based hierarchical Bayesian model (M-HBM). Our results show that the proposed GST-HBM method achieves much higher in silico prediction accuracy than the established alternative methods. Furthermore, this method appears to provide a more robust delineation of ischaemic lesions in datasets affected by spatially variant noise.
Subjects
Coronary Artery Disease, Magnetic Resonance Imaging, Humans, Bayes Theorem, Magnetic Resonance Imaging/methods
ABSTRACT
The causal modelling of Bell experiments relies on three fundamental assumptions: locality, freedom of choice and arrow-of-time. It turns out that nature violates Bell inequalities, which implies the failure of at least one of those assumptions. Since rejecting any of them, even partially, is sufficient to explain the observed correlations, it is natural to inquire about the cost in each case. This paper builds upon the findings in Blasiak et al. 2021 Proc. Natl Acad. Sci. USA 118, e2020569118 (doi:10.1073/pnas.2020569118) showing the equivalence between the locality and free choice assumptions. Here, we include retrocausal models to complete the picture of causal explanations of the observed correlations. Furthermore, we refine the discussion by considering more challenging causal scenarios which allow only single-arrow type violations of a given assumption. The figure of merit chosen for the comparison of the causal cost is defined as the minimal frequency of violation of the respective assumption required for a simulation of the observed experimental statistics. This article is part of the theme issue 'Quantum contextuality, causality and freedom of choice'.
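As background to the statement that nature violates Bell inequalities, the sketch below works through the standard CHSH form of the inequality. It is not taken from the paper: the function names are hypothetical, the singlet-state correlation E(a, b) = -cos(a - b) and the measurement angles are the usual textbook choices, and the paper's own figure of merit (minimal frequency of violation of an assumption) is not computed here.

```python
import itertools
import math


def singlet_correlation(a, b):
    """Quantum prediction for the spin correlation of a singlet pair measured at angles a and b."""
    return -math.cos(a - b)


def chsh_value(a, a_alt, b, b_alt, corr=singlet_correlation):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return corr(a, b) - corr(a, b_alt) + corr(a_alt, b) + corr(a_alt, b_alt)


def max_local_deterministic_chsh():
    """Largest |S| over all local deterministic assignments of +/-1 outcomes.

    Each party's outcome depends only on its own setting, so there are
    2**4 = 16 assignments (A(a), A(a'), B(b), B(b')) to enumerate.
    """
    best = 0
    for A, A_alt, B, B_alt in itertools.product((-1, 1), repeat=4):
        s = A * B - A * B_alt + A_alt * B + A_alt * B_alt
        best = max(best, abs(s))
    return best


if __name__ == "__main__":
    s_quantum = chsh_value(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
    print("local deterministic bound:", max_local_deterministic_chsh())
    print("quantum |S| at the chosen angles:", abs(s_quantum))
```

The enumeration gives a bound of 2 for local deterministic strategies, whereas the singlet correlations at these angles give |S| = 2√2, so at least one of the causal assumptions behind the bound must fail.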
ABSTRACT
Detection error can bias observations of ecological processes, especially when some species are never detected during sampling. In many communities, the probable identity of these missing species is known from previous research and natural history collections, but this information is rarely incorporated into subsequent models. Here, I present prior aggregation as a method for including information from external sources in Bayesian hierarchical detection models. Prior aggregation combines information from multiple prior distributions, in this case, an ecologically informative, species-level prior, and an uninformative community-level prior. This approach incorporates external information into the model without sacrificing the advantages of modeling species in the context of the community. Using simulated data supplied to a multispecies occupancy model, I demonstrated that prior aggregation improves estimates of (1) metacommunity richness and (2) environmental covariates associated with species-specific occupancy probabilities. When applied to a dataset of small mammals in Vermont, prior aggregation allowed the model to estimate occupancy correlates of the Eastern cottontail Sylvilagus floridanus, a species observed at several sites in the region but never captured. Prior aggregation can be used to improve the analysis of several important metrics in population and community ecology, including abundance, survivorship, and diversity.