Implicit Value Updating Explains Transitive Inference Performance: The Betasort Model.
Jensen, Greg; Muñoz, Fabian; Alkan, Yelda; Ferrera, Vincent P; Terrace, Herbert S.
Affiliations
  • Jensen G; Department of Neuroscience, Columbia University, New York, New York, United States of America; Department of Psychology, Columbia University, New York, New York, United States of America.
  • Muñoz F; Department of Neuroscience, Columbia University, New York, New York, United States of America.
  • Alkan Y; Department of Neuroscience, Columbia University, New York, New York, United States of America.
  • Ferrera VP; Department of Neuroscience, Columbia University, New York, New York, United States of America; Department of Psychiatry, Columbia University, New York, New York, United States of America.
  • Terrace HS; Department of Psychology, Columbia University, New York, New York, United States of America.
PLoS Comput Biol; 11(9): e1004523, 2015.
Article in English | MEDLINE | ID: mdl-26407227
ABSTRACT
Transitive inference (the ability to infer that B > D given that B > C and C > D) is a widespread characteristic of serial learning, observed in dozens of species. Despite these robust behavioral effects, reinforcement learning models reliant on reward prediction error or associative strength routinely fail to perform these inferences. We propose an algorithm called betasort, inspired by cognitive processes, which performs transitive inference at low computational cost. This is accomplished by (1) representing stimulus positions along a unit span using beta distributions, (2) treating positive and negative feedback asymmetrically, and (3) updating the position of every stimulus during every trial, whether that stimulus was visible or not. Performance was compared for rhesus macaques, humans, and the betasort algorithm, as well as Q-learning, an established reward-prediction error (RPE) model. Of these, only Q-learning failed to respond above chance during critical test trials. Betasort's success (when compared to RPE models) and its computational efficiency (when compared to full Markov decision process implementations) suggest that the study of reinforcement learning in organisms will be best served by a feature-driven approach to comparing formal models.
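The three mechanisms listed in the abstract can be illustrated with a toy simulation. The sketch below is a simplification under stated assumptions, not the authors' published implementation: the function name `run_betasort`, the unit increments, and the exact update rules (shifting the presented pair on every trial, and implicitly updating absent stimuli only after errors, as a stand-in for the model's feedback asymmetry) are all illustrative choices. Each stimulus keeps two counts defining a beta distribution over its position on the unit span.

```python
import random

def run_betasort(order, n_trials=2000, seed=1):
    """Toy betasort-style learner (a simplified sketch, not the
    published algorithm). Each stimulus s keeps counts U[s], L[s];
    its estimated position on the unit span is the mean of
    Beta(U[s], L[s])."""
    rng = random.Random(seed)
    U = {s: 1.0 for s in order}  # evidence that s sits high
    L = {s: 1.0 for s in order}  # evidence that s sits low

    def pos(s):
        # Mean of Beta(U, L): estimated position on the unit span.
        return U[s] / (U[s] + L[s])

    for _ in range(n_trials):
        # Training presents only adjacent pairs; hi outranks lo.
        i = rng.randrange(len(order) - 1)
        hi, lo = order[i], order[i + 1]
        # Positions before this trial's updates, for implicit updating.
        snapshot = {s: pos(s) for s in order}
        # Choose by sampling each presented stimulus's beta distribution.
        choice = (hi if rng.betavariate(U[hi], L[hi])
                  >= rng.betavariate(U[lo], L[lo]) else lo)
        # Feedback shifts the presented pair apart on every trial...
        U[hi] += 1.0
        L[lo] += 1.0
        if choice != hi:
            # ...and, after an error only (an illustrative asymmetry),
            # implicitly updates every absent stimulus so the whole
            # hierarchy stays mutually consistent: items estimated
            # above the pair move up, items below it move down.
            for s in order:
                if s in (hi, lo):
                    continue
                if snapshot[s] > snapshot[hi]:
                    U[s] += 1.0
                elif snapshot[s] < snapshot[lo]:
                    L[s] += 1.0

    return {s: pos(s) for s in order}

est = run_betasort(list("ABCDE"))
# Critical transitive test: B is estimated above D even though the
# pair B-D was never presented during training.
print(est["B"] > est["D"])
```

Because every stimulus's position can move on error trials whether or not it was shown, non-adjacent comparisons such as B versus D come out correctly without ever being trained, which is the behavior that RPE models like Q-learning fail to reproduce.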
Full text: 1 Database: MEDLINE Main subject: Machine Learning / Learning / Models, Neurological Study type: Prognostic_studies / Risk_factors_studies Limits: Animals / Humans / Male Language: En Publication year: 2015 Document type: Article