Combined model-free and model-sensitive reinforcement learning in non-human primates.

Miranda, Bruno; Malalasekera, W M Nishantha; Behrens, Timothy E; Dayan, Peter; Kennerley, Steven W

Affiliations

Miranda B; Institute of Neurology, Department of Clinical and Movement Neurosciences, University College London, London, United Kingdom.
Malalasekera WMN; International Neuroscience Doctoral Programme, Champalimaud Foundation, Lisbon, Portugal.
Behrens TE; Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal.
Dayan P; Institute of Neurology, Department of Clinical and Movement Neurosciences, University College London, London, United Kingdom.
Kennerley SW; Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, Oxford, United Kingdom.

PLoS Comput Biol; 16(6): e1007944, June 2020.

Article in English | MEDLINE | ID: mdl-32569311

ABSTRACT

Contemporary reinforcement learning (RL) theory suggests that potential choices can be evaluated by strategies that may or may not be sensitive to the computational structure of tasks. A paradigmatic model-free (MF) strategy simply repeats actions that have been rewarded in the past; by contrast, model-sensitive (MS) strategies exploit richer information associated with knowledge of task dynamics. MF and MS strategies should typically be combined, because they have complementary statistical and computational strengths; however, this tradeoff between MF and MS RL has mostly been demonstrated only in humans, often with only modest numbers of trials. We trained rhesus monkeys to perform a two-stage decision task designed to elicit and discriminate the use of MF and MS methods. A descriptive analysis of choice behaviour revealed directly that the structure of the task (of MS importance) and the reward history (of MF and MS importance) significantly influenced both choice and response vigour. A detailed, trial-by-trial computational analysis confirmed that choices were made according to a combination of strategies, with a dominant influence of a particular form of model sensitivity that persisted over weeks of testing. The residuals from this model necessitated development of a new combined RL model which incorporates a particular credit assignment weighting procedure. Finally, response vigour exhibited a subtly different collection of MF and MS influences. These results shed new light on RL behavioural processes in non-human primates.

Subjects

Models, Theoretical; Primates/physiology; Animals; Computational Biology; Decision Making; Humans

Full text: 1 | Database: MEDLINE | Main subject: Primates / Models, Theoretical | Type of study: Diagnostic_studies / Prognostic_studies | Limits: Animals / Humans | Language: En | Publication year: 2020 | Document type: Article