How pre-processing decisions affect the reliability and validity of the approach-avoidance task: Evidence from simulations and multiverse analyses with six datasets.
Kahveci, Sercan; Rinck, Mike; van Alebeek, Hannah; Blechert, Jens.
Affiliation
  • Kahveci S; Department of Psychology, Paris-Lodron-University of Salzburg, Hellbrunner Straße 34, 5020, Salzburg, Austria; Centre for Cognitive Neuroscience, Paris-Lodron-University of Salzburg, Salzburg, Austria. sercan.kahveci@plus.ac.at.
  • Rinck M; Behavioural Science Institute, Radboud University, Nijmegen, The Netherlands.
  • van Alebeek H; Department of Psychology, Paris-Lodron-University of Salzburg, Salzburg, Austria.
  • Blechert J; Department of Psychology, Paris-Lodron-University of Salzburg, Hellbrunner Straße 34, 5020, Salzburg, Austria.
Behav Res Methods; 56(3): 1551-1582, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37221345
ABSTRACT
Reaction time (RT) data are often pre-processed before analysis by rejecting outliers and errors and aggregating the data. In stimulus-response compatibility paradigms such as the approach-avoidance task (AAT), researchers often decide how to pre-process the data without an empirical basis, leading to the use of methods that may harm data quality. To provide this empirical basis, we investigated how different pre-processing methods affect the reliability and validity of the AAT. Our literature review revealed 108 unique pre-processing pipelines among 163 examined studies. Using empirical datasets, we found that validity and reliability were negatively affected by retaining error trials, by replacing error RTs with the mean RT plus a penalty, and by retaining outliers. In the relevant-feature AAT, bias scores were more reliable and valid if computed with D-scores; medians were less reliable and more unpredictable, while means were also less valid. Simulations revealed that bias scores were likely to be less accurate if computed by contrasting a single aggregate of all compatible conditions with that of all incompatible conditions, rather than by contrasting separate averages per condition. We also found that multilevel model random effects were less reliable, valid, and stable, arguing against their use as bias scores. We call upon the field to drop these suboptimal practices to improve the psychometric properties of the AAT. We also call for similar investigations into related RT-based bias measures, such as the implicit association task, as their commonly accepted pre-processing practices involve many of the aforementioned discouraged methods.
HIGHLIGHTS
  • Rejecting RTs deviating more than 2 or 3 SD from the mean gives more reliable and valid results than other outlier rejection methods in empirical data
  • Removing error trials gives more reliable and valid results than retaining them or replacing them with the block mean and an added penalty
  • Double-difference scores are more reliable than compatibility scores under most circumstances
  • More reliable and valid results are obtained both in simulated and real data by using double-difference D-scores, which are obtained by dividing a participant's double mean difference score by the SD of their RTs
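To make the recommended pipeline concrete, below is a minimal Python sketch of the practices the highlights endorse: dropping error trials, rejecting RTs beyond a fixed SD cutoff, and computing a double-difference D-score from separate per-condition means. This is an illustration under assumptions, not the authors' implementation; the column names (rt, error, movement, stimulus), the target/control design labels, and the 2.5-SD cutoff (chosen from within the 2-3 SD range the highlights recommend) are all hypothetical.

```python
import pandas as pd

def double_difference_d_score(trials: pd.DataFrame, sd_cutoff: float = 2.5) -> float:
    """Illustrative double-difference D-score for one participant's AAT trials.

    Expects long-format data with hypothetical columns:
    rt (ms), error (0/1), movement ('approach'/'avoid'),
    stimulus ('target'/'control').
    """
    # 1. Remove error trials: more reliable and valid than retaining them
    #    or replacing them with the block mean plus a penalty.
    clean = trials[trials["error"] == 0].copy()

    # 2. Reject outliers deviating more than `sd_cutoff` SDs from the
    #    participant's mean RT.
    m, s = clean["rt"].mean(), clean["rt"].std()
    clean = clean[(clean["rt"] - m).abs() <= sd_cutoff * s]

    # 3. Aggregate separately per movement x stimulus cell, rather than
    #    pooling all compatible vs. all incompatible trials.
    cell_means = clean.groupby(["movement", "stimulus"])["rt"].mean()

    # 4. Double mean difference: approach bias toward the target stimulus,
    #    corrected by the same contrast for the control stimulus.
    double_diff = (
        (cell_means[("avoid", "target")] - cell_means[("approach", "target")])
        - (cell_means[("avoid", "control")] - cell_means[("approach", "control")])
    )

    # 5. D-score: divide the double difference by the SD of the
    #    participant's remaining RTs.
    return double_diff / clean["rt"].std()
```

The double-difference structure subtracts the control-stimulus contrast from the target-stimulus contrast, so a participant's general tendency to be faster at one movement than the other cancels out; dividing by the participant's own RT SD then corrects for individual differences in RT variability, which is what distinguishes the D-score from a plain mean difference.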

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Data Reliability Study type: Prognostic_studies Limits: Humans Language: English Publication year: 2024 Document type: Article
