'Proactive' use of cue-context congruence for building reinforcement learning's reward function.
Zsuga, Judit; Biro, Klara; Tajti, Gabor; Szilasi, Magdolna Emma; Papp, Csaba; Juhasz, Bela; Gesztelyi, Rudolf.
Affiliation
  • Zsuga J; Department of Health Systems Management and Quality Management for Health Care, Faculty of Public Health, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary. zsuga.judit@med.unideb.hu.
  • Biro K; Department of Health Systems Management and Quality Management for Health Care, Faculty of Public Health, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary.
  • Tajti G; Department of Health Systems Management and Quality Management for Health Care, Faculty of Public Health, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary.
  • Szilasi ME; Department of Pharmacology, Faculty of Pharmacy, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary.
  • Papp C; Department of Health Systems Management and Quality Management for Health Care, Faculty of Public Health, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary.
  • Juhasz B; Department of Pharmacology and Pharmacotherapy, Faculty of Medicine, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary.
  • Gesztelyi R; Department of Pharmacology, Faculty of Pharmacy, University of Debrecen, Debrecen, Nagyerdei krt. 98, 4032, Hungary.
BMC Neurosci. 2016 Oct 28;17(1):70.
Article in English | MEDLINE | ID: mdl-27793098
BACKGROUND: Reinforcement learning is a fundamental form of learning that may be formalized using the Bellman equation. Accordingly, an agent determines the value of a state as the sum of the immediate reward and the discounted value of future states. The value of a state is thus determined by agent-related attributes (action set, policy, discount factor) and by the agent's knowledge of the environment, embodied in the reward function, together with hidden environmental factors captured by the transition probabilities. The central objective of reinforcement learning is to learn these two functions, which lie outside the agent's control, either with or without a model of the environment. RESULTS: In the present paper, using the proactive model of reinforcement learning, we offer insight into how the brain creates simplified representations of the environment and how these representations are organized to support the identification of relevant stimuli and actions. Furthermore, we identify neurobiological correlates of our model by suggesting that the reward and policy functions, attributes of the Bellman equation, are built by the orbitofrontal cortex (OFC) and the anterior cingulate cortex (ACC), respectively. CONCLUSIONS: Based on this, we propose that the OFC assesses cue-context congruence to activate the most relevant context frame. Furthermore, given the bidirectional neuroanatomical link between the OFC and model-free structures, we suggest that model-based input is incorporated into the reward prediction error (RPE) signal and, conversely, that the RPE signal may be used to update the reward-related information of context frames and the policy underlying action selection in the OFC and ACC, respectively. Finally, clinical implications for cognitive behavioral interventions are discussed.
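For orientation, the Bellman equation referenced in the abstract is conventionally written as (standard reinforcement-learning notation, not quoted from the paper):

    V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma V^{\pi}(s') \right]

where \pi is the policy, \gamma the discount factor, R the reward function, and P the transition probability. The reward prediction error (RPE) discussed in the conclusions corresponds, in model-free temporal-difference learning, to the mismatch delta = r + \gamma V(s') - V(s). Below is a minimal Python sketch of this update; all names and toy values are illustrative assumptions, not taken from the paper:

    import numpy as np

    # Toy parameters -- illustrative assumptions, not from the paper.
    n_states = 5      # size of a toy state space
    alpha = 0.1       # learning rate
    gamma = 0.9       # discount factor (an agent-related attribute)
    V = np.zeros(n_states)

    def td_update(s, r, s_next):
        """One model-free TD(0) step: compute the RPE and nudge V toward the target."""
        delta = r + gamma * V[s_next] - V[s]   # reward prediction error (RPE)
        V[s] += alpha * delta                  # value update driven by the RPE
        return delta

    # Example: a transition from state 0 to state 1 yielding reward 1.0
    rpe = td_update(0, 1.0, 1)

In the paper's proposal, such an RPE signal would both receive model-based input (via the OFC's context frames) and feed back to update the reward-related information and the action-selection policy in the OFC and ACC.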

Full text: 1 Database: MEDLINE Main subject: Reward / Cerebral Cortex / Cues (Psychology) / Models, Neurological / Models, Psychological Study type: Prognostic_studies Limits: Animals / Humans Language: En Journal: BMC Neurosci Journal subject: NEUROLOGY Year: 2016 Document type: Article Country of affiliation: Hungary
