ABSTRACT
Bayesian hypothesis testing presents an attractive alternative to p-value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP (http://www.jasp-stats.org), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
Subjects
Bayes Theorem, Psychology, Software, Humans, Research Design
ABSTRACT
Bayesian parameter estimation and Bayesian hypothesis testing present attractive alternatives to classical inference using confidence intervals and p values. In part I of this series we outline ten prominent advantages of the Bayesian approach. Many of these advantages translate to concrete opportunities for pragmatic researchers. For instance, Bayesian hypothesis testing allows researchers to quantify evidence and monitor its progression as data come in, without needing to know the intention with which the data were collected. We end by countering several objections to Bayesian hypothesis testing. Part II of this series discusses JASP, a free and open source software program that makes it easy to conduct Bayesian estimation and testing for a range of popular statistical scenarios (Wagenmakers et al. this issue).
Subjects
Bayes Theorem, Psychology, Humans, Research Design
ABSTRACT
This article provides a Bayes factor approach to multiway analysis of variance (ANOVA) that allows researchers to state graded evidence for effects or invariances as determined by the data. ANOVA is conceptualized as a hierarchical model where levels are clustered within factors. The development is comprehensive in that it includes Bayes factors for fixed and random effects and for within-subjects, between-subjects, and mixed designs. Different model construction and comparison strategies are discussed, and an example is provided. We show how Bayes factors may be computed with the BayesFactor package in R and with the JASP statistical package.
Subjects
Analysis of Variance, Bayes Theorem, Statistical Models, Research Design, Humans
ABSTRACT
The field of psychology, including cognitive science, is vexed by a crisis of confidence. Although the causes and solutions are varied, we focus here on a common logical problem in inference. The default mode of inference is significance testing, which has a free lunch property where researchers need not make detailed assumptions about the alternative to test the null hypothesis. We present the argument that there is no free lunch; that is, valid testing requires that researchers test the null against a well-specified alternative. We show how this requirement follows from the basic tenets of conventional and Bayesian probability. Moreover, we show in both the conventional and Bayesian framework that not specifying the alternative may lead to rejections of the null hypothesis with scant evidence. We review both frequentist and Bayesian approaches to specifying alternatives, and we show how such specifications improve inference. The field of cognitive science will benefit because consideration of reasonable alternatives will undoubtedly sharpen the intellectual underpinnings of research.
Subjects
Cognitive Science, Bayes Theorem, Humans, Research Design
ABSTRACT
We present a suite of Bayes factor hypothesis tests that allow researchers to grade the decisiveness of the evidence that the data provide for the presence versus the absence of a correlation between two variables. For concreteness, we apply our methods to the recent work of Donnellan et al. (in press) who conducted nine replication studies with over 3,000 participants and failed to replicate the phenomenon that lonely people compensate for a lack of social warmth by taking warmer baths or showers. We show how the Bayes factor hypothesis test can quantify evidence in favor of the null hypothesis, and how the prior specification for the correlation coefficient can be used to define a broad range of tests that address complementary questions. Specifically, we show how the prior specification can be adjusted to create a two-sided test, a one-sided test, a sensitivity analysis, and a replication test.
Subjects
Bayes Theorem, Statistics as Topic/methods, Humans
ABSTRACT
In a series of four experiments, Topolinski and Sparenberg (2012) found support for the conjecture that clockwise movements induce psychological states of temporal progression and an orientation toward the future and novelty. Here we report the results of a preregistered replication attempt of Experiment 2 from Topolinski and Sparenberg (2012). Participants turned kitchen rolls either clockwise or counterclockwise while answering items from a questionnaire assessing openness to experience. Data from 102 participants showed that the effect went slightly in the direction opposite to that predicted by Topolinski and Sparenberg (2012), and a preregistered Bayes factor hypothesis test revealed that the data were 10.76 times more likely under the null hypothesis than under the alternative hypothesis. Our findings illustrate the theoretical importance and practical advantages of preregistered Bayes factor replication studies, both for psychological science and for empirical work in general.
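To illustrate what a Bayes factor of 10.76 in favor of the null hypothesis implies, the following minimal sketch converts it into a posterior probability. It assumes equal prior probabilities for the two hypotheses, which the abstract does not state; the function name `posterior_prob_null` is hypothetical.

```python
def posterior_prob_null(bf01: float, prior_prob_null: float = 0.5) -> float:
    """Posterior probability of H0 given a Bayes factor BF01.

    Uses: posterior odds = Bayes factor * prior odds.
    The default prior of 0.5 is an illustrative assumption.
    """
    prior_odds = prior_prob_null / (1.0 - prior_prob_null)
    posterior_odds = bf01 * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Bayes factor reported in the abstract, with assumed equal prior odds.
print(round(posterior_prob_null(10.76), 3))  # 0.915
```

With a different prior probability for the null, the same Bayes factor yields a different posterior probability; the Bayes factor itself only describes how the data shift the odds.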
ABSTRACT
Within the literature on emotion and behavioral action, studies on approach-avoidance take up a prominent place. Several experimental paradigms feature successful conceptual replications but many original studies have not yet been replicated directly. We present such a direct replication attempt of two seminal experiments originally conducted by Chen and Bargh (1999). In their first experiment, participants affectively evaluated attitude objects by pulling or pushing a lever. Participants who had to pull the lever with positively valenced attitude objects and push the lever with negatively valenced attitude objects (i.e., congruent instruction) did so faster than participants who had to follow the reverse (i.e., incongruent) instruction. In Chen and Bargh's second experiment, the explicit evaluative instructions were absent and participants merely responded to the attitude objects by either always pushing or always pulling the lever. Similar results were obtained as in Experiment 1. Based on these findings, Chen and Bargh concluded that (1) attitude objects are evaluated automatically; and (2) attitude objects automatically trigger a behavioral tendency to approach or avoid. We attempted to replicate both experiments and failed to find the effects reported by Chen and Bargh as indicated by our pre-registered Bayesian data analyses; nevertheless, the evidence in favor of the null hypotheses was only anecdotal, and definitive conclusions await further study.
ABSTRACT
A recent 'crisis of confidence' has emerged in the empirical sciences. Several studies have suggested that questionable research practices (QRPs) such as optional stopping and selective publication may be relatively widespread. These QRPs can result in a high proportion of false-positive findings, decreasing the reliability and replicability of research output. A potential solution is to register experiments prior to data acquisition and analysis. In this study we attempted to replicate studies that relate brain structure to behavior and cognition. These structural brain-behavior (SBB) correlations occasionally receive much attention in science and in the media. Given the impact of these studies, it is important to investigate their replicability. Here, we attempt to replicate five SBB correlation studies comprising a total of 17 effects. To prevent the impact of QRPs we employed a preregistered, purely confirmatory replication approach. For all but one of the 17 findings under scrutiny, confirmatory Bayesian hypothesis tests indicated evidence in favor of the null hypothesis ranging from anecdotal (Bayes factor < 3) to strong (Bayes factor > 10). In several studies, effect size estimates were substantially lower than in the original studies. To our knowledge, this is the first multi-study confirmatory replication of SBB correlations. With this study, we hope to encourage other researchers to undertake similar replication attempts.
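The verbal evidence labels used above can be sketched as a small lookup. The abstract gives only two thresholds (anecdotal below 3, strong above 10); the label "moderate" for the range in between is an assumption added for illustration, as is the function name `evidence_label`.

```python
def evidence_label(bf: float) -> str:
    """Map a Bayes factor (expressed as >= 1, in favor of the
    preferred hypothesis) to a verbal evidence category.

    Thresholds 3 and 10 come from the abstract; the "moderate"
    label for the middle range is an illustrative assumption.
    """
    if bf < 1:
        raise ValueError("express the Bayes factor in the favored direction (>= 1)")
    if bf < 3:
        return "anecdotal"
    if bf <= 10:
        return "moderate"  # assumed label, not given in the abstract
    return "strong"


for bf in (1.8, 5.0, 12.4):  # made-up values for demonstration
    print(bf, evidence_label(bf))
```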
Subjects
Behavior/physiology, Brain/anatomy & histology, Cognition/physiology, Neural Pathways/anatomy & histology, Adolescent, Bayes Theorem, Diffusion Magnetic Resonance Imaging, Female, Humans, Magnetic Resonance Imaging, Male, Organ Size, Reproducibility of Results, Young Adult
ABSTRACT
The power fallacy refers to the misconception that what holds on average (across an ensemble of hypothetical experiments) also holds for each case individually. According to the fallacy, high-power experiments always yield more informative data than do low-power experiments. Here we expose the fallacy with concrete examples, demonstrating that a particular outcome from a high-power experiment can be completely uninformative, whereas a particular outcome from a low-power experiment can be highly informative. Although power is useful in planning an experiment, it is less useful, and sometimes even misleading, for making inferences from observed data. To make inferences from data, we recommend the use of likelihood ratios or Bayes factors, which are the extension of likelihood ratios beyond point hypotheses. These methods of inference do not average over hypothetical replications of an experiment, but instead condition on the data that have actually been observed. In this way, likelihood ratios and Bayes factors rationally quantify the evidence that a particular data set provides for or against the null or any other hypothesis.
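The point about conditioning on the observed data can be sketched with a likelihood ratio for two point hypotheses in a binomial setting. The hypotheses (success rates 0.5 and 0.75), the sample sizes, and the outcomes below are all made up for illustration and are not the examples used in the article.

```python
import math


def binomial_likelihood_ratio(k: int, n: int, theta1: float, theta0: float) -> float:
    """Likelihood ratio p(data | theta1) / p(data | theta0) for k successes
    in n trials. The binomial coefficient cancels, so it is omitted.
    Computed on the log scale to avoid underflow for large n.
    """
    log_lr = k * math.log(theta1 / theta0) + (n - k) * math.log((1 - theta1) / (1 - theta0))
    return math.exp(log_lr)


# Large experiment (n = 100) with an outcome between the two hypotheses:
# the ratio is close to 1, so this particular outcome is nearly uninformative.
print(binomial_likelihood_ratio(63, 100, theta1=0.75, theta0=0.5))

# Small experiment (n = 10) with an extreme outcome:
# the ratio equals 1.5 ** 10, strongly favoring theta1.
print(binomial_likelihood_ratio(10, 10, theta1=0.75, theta0=0.5))
```

The larger experiment would have higher power for any fixed test of these two hypotheses, yet for these specific outcomes it is the smaller experiment whose data carry more evidence.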
Subjects
Statistics as Topic, Bayes Theorem, Humans, Sample Size
ABSTRACT
Replication attempts are essential to the empirical sciences. Successful replication attempts increase researchers' confidence in the presence of an effect, whereas failed replication attempts induce skepticism and doubt. However, it is often unclear to what extent a replication attempt results in success or failure. To quantify replication outcomes we propose a novel Bayesian replication test that compares the adequacy of 2 competing hypotheses. The 1st hypothesis is that of the skeptic and holds that the effect is spurious; this is the null hypothesis that postulates a zero effect size, H0 : δ = 0. The 2nd hypothesis is that of the proponent and holds that the effect is consistent with the one found in the original study, an effect that can be quantified by a posterior distribution. Hence, the 2nd hypothesis (the replication hypothesis) is given by Hr : δ ~ "posterior distribution from original study." The weighted-likelihood ratio between H0 and Hr quantifies the evidence that the data provide for replication success and failure. In addition to the new test, we present several other Bayesian tests that address different but related questions concerning a replication study. These tests pertain to the independent conclusions of the separate experiments, the difference in effect size between the original experiment and the replication attempt, and the overall conclusion based on the pooled results. Together, this suite of Bayesian tests allows a relatively complete formalization of the way in which the result of a replication attempt alters our knowledge of the phenomenon at hand. The use of all Bayesian replication tests is illustrated with 3 examples from the literature. For experiments analyzed using the t test, computation of the new replication test only requires the t values and the numbers of participants from the original study and the replication study.
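One way to write out the weighted-likelihood ratio described above is the following sketch. The symbols Y_orig and Y_rep (data from the original and the replication study) and the name BF_r0 are notation assumed here for illustration; the abstract defines only H0 and Hr.

```latex
\[
\mathrm{BF}_{r0}
  = \frac{p(Y_{\mathrm{rep}} \mid H_r)}{p(Y_{\mathrm{rep}} \mid H_0)}
  = \frac{\int p(Y_{\mathrm{rep}} \mid \delta)\, p(\delta \mid Y_{\mathrm{orig}})\, d\delta}
         {p(Y_{\mathrm{rep}} \mid \delta = 0)}
\]
```

Under this reading, values above 1 favor the replication hypothesis Hr and values below 1 favor the skeptic's null hypothesis H0.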
Subjects
Bayes Theorem, Reproducibility of Results, Research Design, Humans
ABSTRACT
Longitudinal surveys measuring physical or mental health status are a common method to evaluate treatments. Multiple items are administered repeatedly to assess changes in the underlying health status of the patient. Traditional models to analyze the resulting data assume that the characteristics of at least some items are identical over measurement occasions. When this assumption is not met, this can result in ambiguous latent health status estimates. Changes in item characteristics over occasions are allowed in the proposed measurement model, which includes truncated and correlated random effects and a growth model for item parameters. In a joint estimation procedure adopting MCMC methods, both item and latent health status parameters are modeled as longitudinal random effects. Simulation study results show accurate parameter recovery. Data from a randomized clinical trial concerning the treatment of depression by increasing psychological acceptance showed significant item parameter shifts. For some items, the probability of responding in the middle category versus the highest or lowest category increased significantly over time. The resulting latent depression scores decreased more over time for the experimental group than for the control group and the amount of decrease was related to the increase in acceptance level.