Active inference and learning.

Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O'Doherty, John; Pezzulo, Giovanni


Affiliation

Friston K; The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom. Electronic address: k.friston@ucl.ac.uk.
FitzGerald T; The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom; Max-Planck-UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom. Electronic address: thomas.fitzgerald@ucl.ac.uk.
Rigoli F; The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom. Electronic address: f.rigoli@ucl.ac.uk.
Schwartenbeck P; The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom; Max-Planck-UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom; Centre for Neurocognitive Research, University of Salzburg, Salzburg, Austria; Neuroscience Institute, Christian
O'Doherty J; Caltech Brain Imaging Center, California Institute of Technology, Pasadena, USA. Electronic address: jdoherty@hss.caltech.edu.
Pezzulo G; Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy. Electronic address: giovanni.pezzulo@istc.cnr.it.

Neurosci Biobehav Rev ; 68: 862-879, 2016 Sep.

Article in En | MEDLINE | ID: mdl-27375276

ABSTRACT


This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity.
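The split between ambiguity-sensitive (epistemic) and risk-sensitive (pragmatic) behaviour can be illustrated with a small numerical sketch. It assumes a common one-step form of expected free energy from the active inference literature, risk plus ambiguity, which is not spelled out in the abstract itself. The function name `expected_free_energy`, the matrices and the preference vector are all hypothetical choices made for illustration.

```python
import numpy as np

def expected_free_energy(A, qs, log_C):
    """One-step expected free energy split into risk and ambiguity.

    Assumed (illustrative) form:
      risk      = KL[ Q(o|pi) || P(o) ]
      ambiguity = E_{Q(s|pi)}[ H[P(o|s)] ]

    A     : likelihood P(o|s), shape (n_obs, n_states), columns sum to 1
    qs    : predicted state beliefs Q(s|pi), shape (n_states,)
    log_C : log prior preferences over outcomes ln P(o), shape (n_obs,)
    """
    eps = 1e-16                                   # avoids log(0)
    qo = A @ qs                                   # predicted outcomes Q(o|pi)
    risk = np.sum(qo * (np.log(qo + eps) - log_C))
    H = -np.sum(A * np.log(A + eps), axis=0)      # outcome entropy per state
    ambiguity = H @ qs
    return risk, ambiguity

# Hypothetical two-state, two-outcome example.
qs = np.array([1.0, 0.0])                         # agent expects state 0
log_C = np.log(np.array([0.9, 0.1]))              # outcome 0 is preferred

A_clear = np.eye(2)                               # states map to outcomes unambiguously
A_ambiguous = np.full((2, 2), 0.5)                # outcomes say nothing about states

risk_clear, amb_clear = expected_free_energy(A_clear, qs, log_C)
risk_amb, amb_ambiguous = expected_free_energy(A_ambiguous, qs, log_C)
```

Under these assumptions, the unambiguous mapping gives an ambiguity term of zero, so only the risk (preference-related) term distinguishes policies. This is in the spirit of the abstract's closing remark about the absence of ambiguity, though the sketch does not reproduce the paper's full scheme.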

Subjects

Learning; Choice Behavior; Dopamine; Habits; Reward

Keywords

Active inference; Bayesian inference; Bayesian surprise; Epistemic value; Exploitation; Exploration; Free energy; Goal-directed; Habit learning; Information gain


Full text: 1 Database: MEDLINE Main subject: Learning Language: En Year of publication: 2016 Document type: Article
