Evaluating CloudResearch's Approved Group as a solution for problematic data quality on MTurk.

Hauser, David J; Moss, Aaron J; Rosenzweig, Cheskie; Jaffe, Shalom N; Robinson, Jonathan; Litman, Leib

Affiliation

Hauser DJ; Department of Psychology, Queen's University, Kingston, ON, Canada. david.hauser@queensu.ca.
Moss AJ; CloudResearch, Queens, NY, USA.
Rosenzweig C; CloudResearch, Queens, NY, USA.
Jaffe SN; Department of Clinical Psychology, Columbia University, New York, NY, USA.
Robinson J; CloudResearch, Queens, NY, USA.
Litman L; Department of Psychology, Lander College, Flushing, NY, USA.

Behav Res Methods; 55(8): 3953-3964, 2023 Dec.

Article in En | MEDLINE | ID: mdl-36326997

ABSTRACT

Maintaining data quality on Amazon Mechanical Turk (MTurk) has always been a concern for researchers. These concerns have grown recently due to the bot crisis of 2018 and observations that past safeguards of data quality (e.g., approval ratings of 95%) no longer work. To address data quality concerns, CloudResearch, a third-party website that interfaces with MTurk, has assessed ~165,000 MTurkers and categorized them into those that provide high- (~100,000, Approved) and low- (~65,000, Blocked) quality data. Here, we examined the predictive validity of CloudResearch's vetting. In a pre-registered study, participants (N = 900) from the Approved and Blocked groups, along with a Standard MTurk sample (95% HIT acceptance ratio, 100+ completed HITs), completed an array of data-quality measures. Across several indices, Approved participants (i) identified the content of images more accurately, (ii) answered more reading comprehension questions correctly, (iii) responded to reversed coded items more consistently, (iv) passed a greater number of attention checks, (v) self-reported less cheating and actually left the survey window less often on easily Googleable questions, (vi) replicated classic psychology experimental effects more reliably, and (vii) answered AI-stumping questions more accurately than Blocked participants, who performed at chance on multiple outcomes. Data quality of the Standard sample was generally in between the Approved and Blocked groups. We discuss how MTurk's Approval Rating system is no longer an effective data-quality control, and we discuss the advantages afforded by using the Approved group for scientific studies on MTurk.

Subjects

Crowdsourcing; Data Accuracy; Humans; Surveys and Questionnaires; Self Report; Attention; Crowdsourcing/methods

Keywords

Data quality; Participant recruitment; Response bias; Test validity

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Crowdsourcing / Data Accuracy | Type of study: Prognostic_studies | Limit: Humans | Language: En | Publication year: 2023 | Document type: Article
