ABSTRACT
Researchers would be more willing to prioritize research quality over quantity if the incentive structure of the academic system aligned with this goal. The winner of a 2023 Einstein Foundation Award for Promoting Quality in Research explains how they rose to this challenge.
Subjects
Awards and Prizes, Reward, Humans, Motivation, Research Personnel
ABSTRACT
Although many studies supported the use of actuarial risk assessment instruments (ARAIs) because they outperformed unstructured judgments, it remains an ongoing challenge to seek potentials for improvement of their predictive performance. Machine learning (ML) algorithms, like random forests, are able to detect patterns in data useful for prediction purposes without explicitly programming them (e.g., by considering nonlinear effects between risk factors and the criterion). Therefore, the current study aims to compare conventional logistic regression analyses with the random forest algorithm on a sample of N = 511 adult male individuals convicted of sexual offenses. Data were collected at the Federal Evaluation Center for Violent and Sexual Offenders in Austria within a prospective-longitudinal research design and participants were followed-up for an average of M = 8.2 years. The Static-99, containing static risk factors, and the Stable-2007, containing stable dynamic risk factors, were included as predictors. The results demonstrated no superior predictive performance of the random forest compared with logistic regression; furthermore, methods of interpretable ML did not point to any robust nonlinear effects. Altogether, results supported the statistical use of logistic regression for the development and clinical application of ARAIs.
Subjects
Recidivism, Sex Offenses, Adult, Humans, Male, Random Forest, Logistic Models, Prospective Studies, Risk Assessment/methods
ABSTRACT
In many research fields, the widespread use of questionable research practices has jeopardized the credibility of scientific results. One of the most prominent questionable research practices is p-hacking. Typically, p-hacking is defined as a compound of strategies targeted at rendering non-significant hypothesis testing results significant. However, a comprehensive overview of these p-hacking strategies is missing, and current meta-scientific research often ignores the heterogeneity of strategies. Here, we compile a list of 12 p-hacking strategies based on an extensive literature review, identify factors that control their level of severity, and demonstrate their impact on false-positive rates using simulation studies. We also use our simulation results to evaluate several approaches that have been proposed to mitigate the influence of questionable research practices. Our results show that investigating p-hacking at the level of strategies can provide a better understanding of the process of p-hacking, as well as a broader basis for developing effective countermeasures. By making our analyses available through a Shiny app and R package, we facilitate future meta-scientific research aimed at investigating the ramifications of p-hacking across multiple strategies, and we hope to start a broader discussion about different manifestations of p-hacking in practice.
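The inflation of false-positive rates described above can be illustrated with a small simulation. The sketch below is not the authors' R package or Shiny app. It covers only one strategy (selectively reporting the smallest p-value among several outcome variables) and assumes independent outcomes, a known standard deviation of 1, and a z-test; the function names and default settings are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_test_p(group_a, group_b):
    """p-value of a two-sample z-test (equal group sizes, known SD = 1)."""
    n = len(group_a)
    mean_diff = sum(group_a) / n - sum(group_b) / n
    return two_sided_p(mean_diff / math.sqrt(2 / n))


def false_positive_rate(n_outcomes, n_per_group=20, n_sims=2000,
                        alpha=0.05, seed=1):
    """Share of null-effect studies with at least one significant outcome.

    Each simulated study measures `n_outcomes` independent outcome
    variables with no true group difference and "reports" the smallest
    p-value (assumption: outcomes are uncorrelated).
    """
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_outcomes):
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(z_test_p(a, b))
        if min(p_values) < alpha:
            significant += 1
    return significant / n_sims


if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, "outcome(s):", false_positive_rate(k))
```

With independent outcomes, the expected rate is 1 - (1 - alpha)^k, so the simulated rate should rise well above the nominal 5% as the number of outcomes grows. Correlated outcomes, which are more realistic, would inflate the rate less.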
ABSTRACT
BACKGROUND: The scale of the global mental health burden indicates the inadequacy not only of current treatment options, but also the pace of the standard treatment development process. The 'leapfrog' trial design is a newly-developed simple Bayesian adaptive trial design with potential to accelerate treatment development. A first leapfrog trial was conducted to provide a demonstration and test feasibility, applying the method to a low-intensity internet-delivered intervention targeting anhedonia. METHODS: At the start of this online, single-blind leapfrog trial, participants self-reporting depression were randomized to an initial control arm comprising four weeks of weekly questionnaires, or one of two versions of a four-week cognitive training intervention, imagery cognitive bias modification (imagery CBM). Intervention arms were compared to control on an ongoing basis via sequential Bayesian analyses, based on a primary outcome of anhedonia at post-intervention. Results were used to eliminate and replace arms, or to promote them to become the control condition based on pre-specified Bayes factor and sample size thresholds. Two further intervention arms (variants of imagery CBM) were added into the trial as it progressed. RESULTS: N = 188 participants were randomized across the five trial arms. The leapfrog methodology was successfully implemented to identify a 'winning' version of the imagery CBM, i.e. the version most successful in reducing anhedonia, following sequential elimination of the other arms. CONCLUSIONS: The study demonstrates feasibility of the leapfrog design and provides a foundation for its adoption as a method to accelerate treatment development in mental health. Registration: clinicaltrials.gov, NCT04791137.
Subjects
Anhedonia, Psychosocial Intervention, Humans, Bayes Theorem, Single-Blind Method, Surveys and Questionnaires, Treatment Outcome
ABSTRACT
The last 25 years have shown a steady increase in attention for the Bayes factor as a tool for hypothesis evaluation and model selection. The present review highlights the potential of the Bayes factor in psychological research. We discuss six types of applications: Bayesian evaluation of point null, interval, and informative hypotheses, Bayesian evidence synthesis, Bayesian variable selection and model averaging, and Bayesian evaluation of cognitive models. We elaborate what each application entails, give illustrative examples, and provide an overview of key references and software with links to other applications. The article is concluded with a discussion of the opportunities and pitfalls of Bayes factor applications and a sketch of corresponding future research lines. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Subjects
Bayes Theorem, Behavioral Research, Psychology, Humans, Behavioral Research/methods, Psychology/methods, Software, Research Design
ABSTRACT
OBJECTIVE: People's psychological tendencies are attuned to their sociocultural context, and culture-specific ways of being, feeling, and thinking are believed to assist individuals in successfully navigating their environment. Supporting this idea, a stronger "fit" with one's cultural environment has often been linked to positive psychological outcomes. The current research expands the cultural, conceptual, and methodological space of cultural fit research by exploring the link between well-being and honor, a central driver of social behavior in the Mediterranean region. METHOD: Drawing on a multi-national sample from eight circum-Mediterranean countries (N = 2257), we examined the relationship between cultural fit in honor and well-being at the distal level (fit with one's perceived society) using response surface analysis (RSA) and at the proximal level (fit with one's university gender group) using profile analysis. RESULTS: We found positive links between fit and well-being in both distal (for some, but not all, honor facets) and proximal fit analyses (across all honor facets). Furthermore, most fit effects in the RSA were complemented with positive level effects of the predictors, with higher average honor levels predicting higher well-being. CONCLUSIONS: Our findings highlight the interplay between individual and environmental factors in honor as well as the important role honor plays in well-being in the Mediterranean region.
ABSTRACT
Condition-based regression analysis (CRA) is a statistical method for testing self-enhancement effects. That is, CRA indicates whether, in a set of empirical data, people with higher values on the directed discrepancy self-view S minus reality criterion R (i.e., S-R) tend to have higher values on some outcome variable H (e.g., happiness). In a critical comment, Fiedler (2021) claims that CRA yields inaccurate conclusions in data with a suppressor effect. Here, we show that Fiedler's critique is unwarranted. All data that are simulated in his comment show a positive association between S-R and H, which is accurately detected by CRA. By construction, CRA indicates an association between S-R and H only when it is present in the data. In contrast to Fiedler's claim, it also yields valid conclusions when the outcome variable is related only to the self-view or when there is a suppressor effect. Our clarifications provide guidance for evaluating Fiedler's comment, clear up the common heuristic that suppressor effects are always problematic, and assist readers in fully understanding CRA. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
Subjects
Self Concept, Humans, Regression Analysis
ABSTRACT
This article presents an integrative conceptual model of motivational interdependence in couples, the MIC model. Based on theoretical tenets in motivation psychology, personality psychology, and research on interpersonal perception, the MIC model postulates that two partners' motive dispositions fundamentally interact in shaping their individual motivation and behavior. On a functional level, a partner's motivated behavior is conceptualized as an environmental cue that can contribute to an actor's motive expression and satisfaction. However, the partner's motivated behavior is considered to gain this motivational relevance only via the actor's subjective perception. Multilevel analyses of an extensive experience sampling study on partner-related communal motivation (N = up to 60,803 surveys from 508 individuals nested in 258 couples) supported the MIC model. Participants, particularly those with strong communal motive dispositions, behaved more communally at moments when they perceived their partners to behave more communally. In addition, participants experienced momentary boosts in satisfaction when they behaved more communally and, at the same time, perceived their partners' behavior as similarly communal. Broader implications of the MIC model for research on romantic relationships are discussed.
ABSTRACT
In a sequential hypothesis test, the analyst checks at multiple steps during data collection whether sufficient evidence has accrued to make a decision about the tested hypotheses. As soon as sufficient information has been obtained, data collection is terminated. Here, we compare two sequential hypothesis testing procedures that have recently been proposed for use in psychological research: Sequential Probability Ratio Test (SPRT; Psychological Methods, 25(2), 206-226, 2020) and the Sequential Bayes Factor Test (SBFT; Psychological Methods, 22(2), 322-339, 2017). We show that although the two methods have different philosophical roots, they share many similarities and can even be mathematically regarded as two instances of an overarching hypothesis testing framework. We demonstrate that the two methods use the same mechanisms for evidence monitoring and error control, and that differences in efficiency between the methods depend on the exact specification of the statistical models involved, as well as on the population truth. Our simulations indicate that when deciding on a sequential design within a unified sequential testing framework, researchers need to balance the needs of test efficiency, robustness against model misspecification, and appropriate uncertainty quantification. We provide guidance for navigating these design decisions based on individual preferences and simulation-based design analyses.
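The stopping logic described above can be sketched with Wald's classic sequential probability ratio test for a normal mean. This is a simplified illustration, not the t-test-based procedure from the cited articles. It assumes two simple hypotheses (mu0 versus mu1), a known standard deviation, and Wald's approximate decision thresholds; the function name and return format are hypothetical.

```python
import math


def sprt_normal_mean(observations, mu0, mu1, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Returns (decision, number of observations used, final log likelihood ratio).
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0
    log_lr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Log likelihood ratio contribution of one normal observation.
        log_lr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if log_lr >= upper:
            return ("accept_h1", n, log_lr)
        if log_lr <= lower:
            return ("accept_h0", n, log_lr)
    # Data ran out before either threshold was crossed.
    return ("undecided", n, log_lr)


if __name__ == "__main__":
    data = [0.9, 0.1, 0.7, 1.2, 0.4, 0.8, 0.6, 1.1, 0.3, 0.9]
    print(sprt_normal_mean(data, mu0=0.0, mu1=0.5))
```

A sequential Bayes factor test follows the same monitoring scheme, but replaces the likelihood ratio of two point hypotheses with a Bayes factor that averages the likelihood over a prior distribution under the alternative.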
Subjects
Research Design, Humans, Bayes Theorem
ABSTRACT
Congruence hypotheses play a major role in many areas of psychology. They refer to, for example, the consequences of person-environment fit, similarity, or self-other agreement. For example, are people psychologically better adjusted when their self-view is in line with their reputation? A valid statistical approach that can be applied to investigate congruence hypotheses of this kind is quadratic Response Surface Analysis (RSA) in which a second-order polynomial model is fit to the data and appropriately interpreted. However, quadratic RSA does not allow researchers to investigate more precise expectations about a congruence effect. Do the data support an asymmetric congruence effect, in the sense that congruence leads to the highest (or lowest) outcome, but incongruence in one direction (e.g., self-view exceeds reputation) affects the outcome differently than incongruence in the other direction (e.g., self-view falls behind reputation)? Is there a level-dependent congruence effect, such that the amount of congruence is more strongly related to the outcome variable for some levels of the predictors (e.g., high self-view and reputation) than for others (e.g., low self-view and reputation)? Such complex congruence hypotheses have frequently been suggested in the literature, but they could not be investigated because an appropriate statistical approach has yet to be developed. Here, we present analytical strategies, based on third-order polynomial models, that enable users to investigate asymmetric and level-dependent congruence effects, respectively. To facilitate the correct application of the suggested approaches, we provide respective step-by-step guidelines, corresponding R syntax, and illustrative analyses using simulated and real data. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
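As background for the quadratic baseline mentioned above, the following sketch evaluates a second-order polynomial surface Z = b0 + b1*X + b2*Y + b3*X^2 + b4*X*Y + b5*Y^2 and derives the surface parameters a1 to a4 that are conventionally inspected in quadratic RSA. It does not implement the third-order extension the abstract proposes, and it is not the authors' R syntax. The function names and the example coefficients are hypothetical.

```python
def predict(x, y, b0, b1, b2, b3, b4, b5):
    """Predicted outcome of the second-order polynomial model."""
    return b0 + b1 * x + b2 * y + b3 * x ** 2 + b4 * x * y + b5 * y ** 2


def surface_parameters(b1, b2, b3, b4, b5):
    """Slopes and curvatures of the surface along two diagnostic lines.

    a1, a2: linear slope and curvature along the line of congruence (X = Y).
    a3, a4: linear slope and curvature along the line of incongruence (X = -Y).
    """
    return {
        "a1": b1 + b2,
        "a2": b3 + b4 + b5,
        "a3": b1 - b2,
        "a4": b3 - b4 + b5,
    }


if __name__ == "__main__":
    # Hypothetical coefficients for a pure congruence effect: Z = -(X - Y)^2.
    coefs = {"b1": 0.0, "b2": 0.0, "b3": -1.0, "b4": 2.0, "b5": -1.0}
    print(surface_parameters(**coefs))
    print(predict(1.0, 1.0, b0=0.0, **coefs))   # congruent predictor values
    print(predict(1.0, -1.0, b0=0.0, **coefs))  # incongruent predictor values
```

In this example the surface is flat along the line of congruence (a1 = a2 = 0) and curves downward along the line of incongruence (a4 < 0). A quadratic surface of this kind is symmetric around the line of congruence by construction, which is why asymmetric or level-dependent effects require higher-order terms.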
Subjects
Statistical Models, Humans
ABSTRACT
The investigation of within-person process models, often done in experience sampling designs, requires a reliable assessment of within-person change. In this paper, we focus on dyadic intensive longitudinal designs where both partners of a couple are assessed multiple times each day across several days. We introduce a statistical model for variance decomposition based on generalizability theory (extending P. E. Shrout & S. P. Lane, 2012), which can estimate the relative proportion of variability on four hierarchical levels: moments within a day, days, persons, and couples. Based on these variance estimates, four reliability coefficients are derived: between-couples, between-persons, within-persons/between-days, and within-persons/between-moments. We apply the model to two dyadic intensive experience sampling studies (n1 = 130 persons, 5 surveys each day for 14 days, ≥ 7508 unique surveys; n2 = 508 persons, 5 surveys each day for 28 days, ≥ 47764 unique surveys). Five different scales in the domain of motivational processes and relationship quality were assessed with 2 to 5 items: State relationship satisfaction, communal motivation, and agentic motivation; the latter consists of two subscales, namely power and independence motivation. Largest variance components were on the level of persons, moments, couples, and days, where within-day variance was generally larger than between-day variance. Reliabilities ranged from .32 to .76 (couple level), .93 to .98 (person level), .61 to .88 (day level), and .28 to .72 (moment level). Scale intercorrelations reveal differential structures between and within persons, which has consequences for theory building and statistical modeling.
Subjects
Motivation, Humans, Reproducibility of Results, Surveys and Questionnaires
ABSTRACT
Replication (an important, uncommon, and misunderstood practice) is gaining appreciation in psychology. Achieving replicability is important for making research progress. If findings are not replicable, then prediction and theory development are stifled. If findings are replicable, then interrogation of their meaning and validity can advance knowledge. Assessing replicability can be productive for generating and testing hypotheses by actively confronting current understandings to identify weaknesses and spur innovation. For psychology, the 2010s might be characterized as a decade of active confrontation. Systematic and multi-site replication projects assessed current understandings and observed surprising failures to replicate many published findings. Replication efforts highlighted sociocultural challenges such as disincentives to conduct replications and a tendency to frame replication as a personal attack rather than a healthy scientific practice, and they raised awareness that replication contributes to self-correction. Nevertheless, innovation in doing and understanding replication and its cousins, reproducibility and robustness, has positioned psychology to improve research practices and accelerate progress.
Subjects
Research Design, Humans, Reproducibility of Results
ABSTRACT
We present two openly accessible databases related to the assessment of implicit motives using Picture Story Exercises (PSEs): (a) A database of 183,415 German sentences, nested in 26,389 stories provided by 4,570 participants, which have been coded by experts using Winter's coding system for the implicit affiliation/intimacy, achievement, and power motives, and (b) a database of 54 classic and new pictures which have been used as PSE stimuli. Updated picture norms are provided which can be used to select appropriate pictures for PSE applications. Based on an analysis of the relations between raw motive scores, word count, and sentence count, we give recommendations on how to control motive scores for story length, and validate the recommendation with a meta-analysis on gender differences in the implicit affiliation motive that replicates existing findings. We discuss to what extent the guiding principles of the story length correction can be generalized to other content coding systems for narrative material. Several potential applications of the databases are discussed, including (un)supervised machine learning of text content, psychometrics, and better reproducibility of PSE research.
Subjects
Achievement, Psychological Identification, Interpersonal Relations, Self Concept, Thematic Apperception Test/standards, Adult, Germany, Humans, Male, Motivation, Psychometrics, Reproducibility of Results, Sex Factors, Surveys and Questionnaires
ABSTRACT
The present study explored the interrelations between a broad set of appraisal ratings and five physiological signals, including facial EMG, electrodermal activity, and heart rate variability, that were assessed in 157 participants watching 10 emotionally charged videos. A total of 134 features were extracted from the physiological data, and a benchmark comparing different kinds of machine learning algorithms was conducted to test how well the appraisal dimensions can be predicted from these features. For 13 out of 21 appraisals, a robust positive R² was attained, indicating that the dimensions are actually related to the considered physiological channels. The highest R² (.407) was reached for the appraisal dimension intrinsic pleasantness. Moreover, the comparison of linear and nonlinear algorithms and the inspection of the links between the appraisals and single physiological features using accumulated local effects plots indicate that the relationship between physiology and appraisals is nonlinear. By constructing different importance measures for the assessed physiological channels, we showed that for the 13 predictable appraisals, the five channels explained different amounts of variance and that only a few blocks incrementally explained variance beyond the other physiological channels.
Subjects
Emotions, Facial Expression, Attention, Face, Humans, Machine Learning
ABSTRACT
Response surface analysis (RSA) is a statistical approach that enables researchers to test congruence hypotheses; the proposition that the degree of congruence between people's values in 2 psychological constructs should be positively or negatively related to their value in an outcome variable. This is done by estimating a polynomial regression model and using the graph of the model and several parameters as a guide to interpret the resulting regression coefficients in terms of the congruence hypothesis. One problem with using RSA in applied research is that the model and the interpretation of the model's parameters in terms of congruence effects have only been thoroughly developed for single-level data. Here, we present an extension of RSA to multilevel data. Among other things we show how the standard errors can be computed and how researchers can decide whether the occurrence of a congruence effect depends on a Level 2 covariate. We illustrate the suggested extension with 2 examples that guide readers through the test of congruence effects in the case of multilevel data. We also provide R scripts that researchers can adopt to conduct multilevel RSA. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
Subjects
Biostatistics/methods, Statistical Models, Multilevel Analysis, Psychology/methods, Regression Analysis, Adult, Age Factors, Aged, Aged 80 and over, Female, Humans, Male, Middle Aged, Personal Satisfaction
ABSTRACT
Well-designed experiments are likely to yield compelling evidence with efficient sample sizes. Bayes Factor Design Analysis (BFDA) is a recently developed methodology that allows researchers to balance the informativeness and efficiency of their experiment (Schönbrodt & Wagenmakers, Psychonomic Bulletin & Review, 25(1), 128-142, 2018). With BFDA, researchers can control the rate of misleading evidence and, in addition, plan for a target strength of evidence. BFDA can be applied to fixed-N and sequential designs. In this tutorial paper, we provide an introduction to BFDA and analyze how the use of informed prior distributions affects the results of the BFDA. We also present a user-friendly web-based BFDA application that allows researchers to conduct BFDAs with ease. Two practical examples highlight how researchers can use a BFDA to plan for informative and efficient research designs.
Subjects
Bayes Theorem, Factor Analysis, Research Design, Sample Size
ABSTRACT
Empirical research on the (mal-)adaptiveness of favorable self-perceptions, self-enhancement, and self-knowledge has typically applied a classical null-hypothesis testing approach and provided mixed and even contradictory findings. Using data from 5 studies (laboratory and field, total N = 2,823), we used an information-theoretic approach combined with Response Surface Analysis to provide the first competitive test of 6 popular hypotheses: that more favorable self-perceptions are adaptive versus maladaptive (Hypotheses 1 and 2: Positivity of self-view hypotheses), that higher levels of self-enhancement (i.e., a higher discrepancy of self-viewed and objectively assessed ability) are adaptive versus maladaptive (Hypotheses 3 and 4: Self-enhancement hypotheses), that accurate self-perceptions are adaptive (Hypothesis 5: Self-knowledge hypothesis), and that a slight degree of self-enhancement is adaptive (Hypothesis 6: Optimal margin hypothesis). We considered self-perceptions and objective ability measures in two content domains (reasoning ability, vocabulary knowledge) and investigated 6 indicators of intra- and interpersonal psychological adjustment. Results showed that most adjustment indicators were best predicted by the positivity of self-perceptions. There were some specific self-enhancement effects, and evidence generally spoke against the self-knowledge and optimal margin hypotheses. Our results highlight the need for comprehensive and simultaneous tests of competing hypotheses. Implications for the understanding of underlying processes are discussed. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
Subjects
Emotional Adjustment, Emotions, Self Concept, Adolescent, Adult, Female, Germany, Humans, Male, Netherlands, Personality, Young Adult
ABSTRACT
A sizeable literature exists on the use of frequentist power analysis in the null-hypothesis significance testing (NHST) paradigm to facilitate the design of informative experiments. In contrast, there is almost no literature that discusses the design of experiments when Bayes factors (BFs) are used as a measure of evidence. Here we explore Bayes Factor Design Analysis (BFDA) as a useful tool to design studies for maximum efficiency and informativeness. We elaborate on three possible BF designs, (a) a fixed-n design, (b) an open-ended Sequential Bayes Factor (SBF) design, where researchers can test after each participant and can stop data collection whenever there is strong evidence for either H1 or H0, and (c) a modified SBF design that defines a maximal sample size where data collection is stopped regardless of the current state of evidence. We demonstrate how the properties of each design (i.e., expected strength of evidence, expected sample size, expected probability of misleading evidence, expected probability of weak evidence) can be evaluated using Monte Carlo simulations and equip researchers with the necessary information to compute their own Bayesian design analyses.
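The Monte Carlo evaluation of a sequential design with a maximal sample size, design (c) above, can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' software. It uses a one-sample setting with known standard deviation 1 and a Normal(0, tau^2) prior on the mean under the alternative, which yields a closed-form Bayes factor, rather than the default Bayes factors typically used in practice. All function names, thresholds, and sample-size limits are hypothetical.

```python
import math
import random


def bf10_normal(sample_mean, n, tau=1.0):
    """Bayes factor for H1: mu ~ Normal(0, tau^2) versus H0: mu = 0.

    Assumes observations are Normal(mu, 1), so the sample mean is sufficient.
    """
    shrink = 1.0 + n * tau ** 2
    exponent = (n * sample_mean) ** 2 * tau ** 2 / (2.0 * shrink)
    return math.sqrt(1.0 / shrink) * math.exp(exponent)


def simulate_sequential_design(true_mu, threshold=10.0, n_min=10, n_max=100,
                               n_sims=500, seed=2018):
    """Simulate studies that stop at BF10 >= threshold, BF10 <= 1/threshold,
    or at n_max, and summarize the design's operating characteristics."""
    rng = random.Random(seed)
    counts = {"h1": 0, "h0": 0, "inconclusive": 0}
    total_n = 0
    for _ in range(n_sims):
        running_sum = 0.0
        outcome = "inconclusive"
        n = 0
        while n < n_max:
            running_sum += rng.gauss(true_mu, 1.0)
            n += 1
            if n < n_min:
                continue  # no interim look before the minimum sample size
            bf = bf10_normal(running_sum / n, n)
            if bf >= threshold:
                outcome = "h1"
                break
            if bf <= 1.0 / threshold:
                outcome = "h0"
                break
        counts[outcome] += 1
        total_n += n
    summary = {key: value / n_sims for key, value in counts.items()}
    summary["mean_n"] = total_n / n_sims
    return summary


if __name__ == "__main__":
    print("true effect 0.5:", simulate_sequential_design(0.5))
    print("true effect 0.0:", simulate_sequential_design(0.0))
```

When the true effect is nonzero, the share of studies ending with evidence for H0 estimates the probability of misleading evidence, and the share labeled inconclusive estimates the probability of weak evidence at the maximal sample size. Running the simulation under both a null and a nonzero population effect gives the full picture of the design.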
Subjects
Bayes Theorem, Research Design, Data Collection, Humans, Probability, Sample Size
ABSTRACT
Despite a large body of literature and ongoing refinements of analytical techniques, research on the consequences of self-enhancement (SE) is still vague about how to define SE effects, and empirical results are inconsistent. In this paper, we point out that part of this confusion is due to a lack of conceptual and methodological differentiation between effects of individual differences in how much people enhance themselves (SE) and in how positively they view themselves (positivity of self-view; PSV). We show that methods commonly used to analyze SE effects are biased because they cannot differentiate between the effects of PSV and the effects of SE. We provide a new condition-based regression analysis (CRA) that unequivocally identifies effects of SE by testing intuitive and mathematically derived conditions on the coefficients in a bivariate linear regression. Using data from 3 studies on intellectual SE (total N = 566), we then illustrate that the CRA provides novel results as compared with traditional methods. Results suggest that many previously identified SE effects are in fact effects of PSV alone. The new CRA approach thus provides a clear and unbiased understanding of the consequences of SE. It can be applied to all conceptualizations of SE and, more generally, to every context in which the effects of the discrepancy between 2 variables on a third variable are examined. (PsycINFO Database Record
Subjects
Emotional Adjustment, Self Concept, Adolescent, Adult, Female, Humans, Male, Regression Analysis, Young Adult
ABSTRACT
[This corrects the article DOI: 10.1098/rsos.160426.].