Results 1 - 8 of 8
1.
Neural Comput ; : 1-74, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39212963

ABSTRACT

Adaptive behavior often requires predicting future events. The theory of reinforcement learning prescribes what kinds of predictive representations are useful and how to compute them. This review integrates these theoretical ideas with work on cognition and neuroscience. We pay special attention to the successor representation and its generalizations, which have been widely applied as both engineering tools and models of brain function. This convergence suggests that particular kinds of predictive representations may function as versatile building blocks of intelligence.
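As a concrete illustration of the successor representation discussed in this abstract, here is a minimal Python sketch (not taken from the review; the function name and discount value are illustrative) computing the SR in closed form for a fixed policy:

```python
import numpy as np

def successor_representation(T, gamma=0.95):
    """Closed-form successor representation for a fixed policy.

    T     : (n_states, n_states) one-step transition matrix under the policy
    gamma : discount factor in [0, 1)

    Returns M with M[s, s'] = expected discounted future occupancy of s'
    starting from s, i.e. M = (I - gamma * T)^{-1}.
    """
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# Values then follow by combining the SR with a reward vector: V = M @ r,
# one reason the SR works as a versatile building block for prediction.
```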

2.
J Neurosci ; 41(32): 6892-6904, 2021 08 11.
Article in English | MEDLINE | ID: mdl-34244363

ABSTRACT

Attributing outcomes to your own actions or to external causes is essential for appropriately learning which actions lead to reward and which actions do not. Our previous work showed that this type of credit assignment is best explained by a Bayesian reinforcement learning model which posits that beliefs about the causal structure of the environment modulate reward prediction errors (RPEs) during action value updating. In this study, we investigated the brain networks underlying reinforcement learning that are influenced by causal beliefs using functional magnetic resonance imaging while human participants (n = 31; 13 males, 18 females) completed a behavioral task that manipulated beliefs about causal structure. We found evidence that RPEs modulated by causal beliefs are represented in dorsal striatum, while standard (unmodulated) RPEs are represented in ventral striatum. Further analyses revealed that beliefs about causal structure are represented in anterior insula and inferior frontal gyrus. Finally, structural equation modeling revealed effective connectivity from anterior insula to dorsal striatum. Together, these results are consistent with a possible neural architecture in which causal beliefs in anterior insula are integrated with prediction error signals in dorsal striatum to update action values.

SIGNIFICANCE STATEMENT: Learning which actions lead to reward (a process known as reinforcement learning) is essential for survival. Inferring the causes of observed outcomes (a process known as causal inference) is crucial for appropriately assigning credit to one's own actions and restricting learning to effective action-outcome contingencies. Previous studies have linked reinforcement learning to the striatum, and causal inference to prefrontal regions, yet how these neural processes interact to guide adaptive behavior remains poorly understood. Here, we found evidence that causal beliefs represented in the prefrontal cortex modulate action value updating in posterior striatum, separately from the unmodulated action value update in ventral striatum posited by standard reinforcement learning models.
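A minimal sketch of the kind of belief-modulated update this abstract describes (our hypothetical reading of the model class, not the paper's implementation; all names and the learning rate are assumptions):

```python
def update_action_value(q, reward, belief_self_caused, alpha=0.1):
    """One belief-modulated value update (hypothetical sketch).

    q                  : current value of the chosen action
    reward             : observed outcome
    belief_self_caused : P(outcome caused by own action), in [0, 1]
    alpha              : learning rate

    The standard RPE is scaled by the causal belief, so credit accrues to
    the action only to the extent the agent believes it caused the outcome.
    """
    rpe = reward - q  # standard (unmodulated) RPE
    return q + alpha * belief_self_caused * rpe
```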


Subject(s)
Brain/physiology; Learning/physiology; Reinforcement, Psychology; Reward; Adolescent; Bayes Theorem; Brain Mapping/methods; Female; Humans; Magnetic Resonance Imaging/methods; Male; Nerve Net/physiology; Young Adult
3.
PLoS Comput Biol ; 16(4): e1007594, 2020 04.
Article in English | MEDLINE | ID: mdl-32251444

ABSTRACT

We propose that humans spontaneously organize environments into clusters of states that support hierarchical planning, enabling them to tackle challenging problems by breaking them down into sub-problems at various levels of abstraction. People constantly rely on such hierarchical representations to accomplish tasks big and small, from planning one's day to organizing a wedding to getting a PhD, often succeeding on the very first attempt. We formalize a Bayesian model of hierarchy discovery that explains how humans discover such useful abstractions. Building on principles developed in structure learning and robotics, the model predicts that hierarchy discovery should be sensitive to the topological structure, reward distribution, and distribution of tasks in the environment. In five simulations, we show that the model accounts for previously reported effects of environment structure on planning behavior, such as detection of bottleneck states and transitions. We then test the novel predictions of the model in eight behavioral experiments, demonstrating how the distribution of tasks and rewards can influence planning behavior via the discovered hierarchy, sometimes facilitating and sometimes hindering performance. We find evidence that the hierarchy discovery process unfolds incrementally across trials. Finally, we propose how hierarchy discovery and hierarchical planning might be implemented in the brain. Together, these findings present an important advance in our understanding of how the brain might use Bayesian inference to discover and exploit the hidden hierarchical structure of the environment.
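To make Bayesian hierarchy discovery concrete, here is a toy scoring function, a sketch under our own assumptions (the edge probabilities and names are illustrative, not the paper's generative model), that evaluates how well a candidate clustering of states explains an environment's observed transitions:

```python
import numpy as np

def log_score_clustering(edges, clusters, p_within=0.8, p_between=0.2):
    """Score a candidate state clustering against observed transitions.

    edges     : iterable of (s1, s2) transitions observed in the environment
    clusters  : dict mapping state -> cluster id (the candidate hierarchy)
    p_within  : assumed probability of a within-cluster transition
    p_between : assumed probability of a between-cluster transition

    Higher scores mean the clustering better explains the topology; the
    rare between-cluster edges play the role of bottleneck transitions.
    """
    log_p = 0.0
    for s1, s2 in edges:
        same = clusters[s1] == clusters[s2]
        log_p += np.log(p_within if same else p_between)
    return log_p
```

Combined with a prior over clusterings, a score like this supports posterior inference over hierarchies, e.g. by Markov chain Monte Carlo sampling, consistent with the Markov chain and Monte Carlo methods indexed below.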


Subject(s)
Bayes Theorem; Brain/physiology; Learning/physiology; Algorithms; Computer Simulation; Female; Humans; Male; Markov Chains; Models, Neurological; Monte Carlo Method; Reward; Video Games
4.
J Neurosci ; 38(32): 7143-7157, 2018 08 08.
Article in English | MEDLINE | ID: mdl-29959234

ABSTRACT

Behavioral evidence suggests that beliefs about causal structure constrain associative learning, determining which stimuli can enter into association, as well as the functional form of that association. Bayesian learning theory provides one mechanism by which structural beliefs can be acquired from experience, but the neural basis of this mechanism is poorly understood. We studied this question with a combination of behavioral, computational, and neuroimaging techniques. Male and female human subjects learned to predict an outcome based on cue and context stimuli while being scanned using fMRI. Using a model-based analysis of the fMRI data, we show that structure learning signals are encoded in posterior parietal cortex, lateral prefrontal cortex, and the frontal pole. These structure learning signals are distinct from associative learning signals. Moreover, representational similarity analysis and information mapping revealed that the multivariate patterns of activity in posterior parietal cortex and anterior insula encode the full posterior distribution over causal structures. Variability in the encoding of the posterior across subjects predicted variability in their subsequent behavioral performance. These results provide evidence for a neural architecture in which structure learning guides the formation of associations.

SIGNIFICANCE STATEMENT: Animals are able to infer the hidden structure behind causal relations between stimuli in the environment, allowing them to generalize this knowledge to stimuli they have never experienced before. A recently published computational model based on this idea provided a parsimonious account of a wide range of phenomena reported in the animal learning literature, suggesting a dedicated neural mechanism for learning this hidden structure. Here, we validate this model by measuring brain activity during a task that involves both structure learning and associative learning. We show that a distinct network of regions supports structure learning and that the neural signal corresponding to beliefs about structure predicts future behavioral performance.
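A compact sketch of the Bayesian computation the abstract attributes to these regions: maintaining a posterior over candidate causal structures and updating it trial by trial (a generic Bayes update in log space; the array names are ours):

```python
import numpy as np

def update_structure_posterior(log_prior, log_likelihoods):
    """One Bayesian update of beliefs over candidate causal structures.

    log_prior       : (n_structures,) current log beliefs
    log_likelihoods : (n_structures,) log likelihood of the latest
                      (cue, context, outcome) trial under each structure

    Returns the normalized log posterior; applying this across trials
    yields the full posterior over structures that the multivariate
    analyses decode from parietal and insular activity patterns.
    """
    log_post = log_prior + log_likelihoods
    return log_post - np.logaddexp.reduce(log_post)  # renormalize
```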


Subject(s)
Anticipation, Psychological/physiology; Association Learning/physiology; Brain Mapping; Causality; Frontal Lobe/physiology; Models, Neurological; Models, Psychological; Parietal Lobe/physiology; Prefrontal Cortex/physiology; Bayes Theorem; Cues; Female; Humans; Magnetic Resonance Imaging; Male
5.
Neuron ; 111(8): 1331-1344.e8, 2023 04 19.
Article in English | MEDLINE | ID: mdl-36898374

ABSTRACT

Humans learn internal models of the world that support planning and generalization in complex environments. Yet it remains unclear how such internal models are represented and learned in the brain. We approach this question using theory-based reinforcement learning, a strong form of model-based reinforcement learning in which the model is a kind of intuitive theory. We analyzed fMRI data from human participants learning to play Atari-style games. We found evidence of theory representations in prefrontal cortex and of theory updating in prefrontal cortex, occipital cortex, and fusiform gyrus. Theory updates coincided with transient strengthening of theory representations. Effective connectivity during theory updating suggests that information flows from prefrontal theory-coding regions to posterior theory-updating regions. Together, our results are consistent with a neural architecture in which top-down theory representations originating in prefrontal regions shape sensory predictions in visual areas, where factored theory prediction errors are computed and trigger bottom-up updates of the theory.
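The abstract describes theory-based RL only at the level of representations and updates; the following schematic loop is our loose sketch of that compute pattern (every name, the per-factor error, and the threshold are assumptions, not the paper's model):

```python
def theory_step(theory, observation, predict, revise, threshold=3.0):
    """Schematic theory-updating loop (hypothetical sketch).

    theory      : current intuitive theory of the game
    observation : tuple of observed attribute values for this frame
    predict     : function theory -> tuple of predicted attribute values
    revise      : function (theory, surprise) -> revised theory
    threshold   : aggregate surprise that triggers a theory update

    Prediction errors are computed per factor (attribute); a large
    aggregate surprise triggers a bottom-up revision of the theory,
    mirroring the top-down prediction / bottom-up update architecture
    suggested by the connectivity results.
    """
    predicted = predict(theory)
    surprise = sum(p != o for p, o in zip(predicted, observation))
    if surprise > threshold:
        theory = revise(theory, surprise)
    return theory
```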


Subject(s)
Learning; Reinforcement, Psychology; Humans; Prefrontal Cortex/diagnostic imaging; Magnetic Resonance Imaging/methods
6.
bioRxiv ; 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37732217

ABSTRACT

The ability to make advantageous decisions is critical for animals to ensure their survival. Patch foraging is a natural decision-making process in which animals decide when to leave a patch of depleting resources to search for a new one. To study the algorithmic and neural basis of patch foraging behavior in a controlled laboratory setting, we developed a virtual foraging task for head-fixed mice. Mouse behavior could be explained by ramp-to-threshold models integrating time and rewards antagonistically. Accurate behavioral modeling required inclusion of a slowly varying "patience" variable, which modulated sensitivity to time. To investigate the neural basis of this decision-making process, we performed dense electrophysiological recordings with Neuropixels probes broadly throughout frontal cortex and underlying subcortical areas. We found that decision variables from the reward integrator model were represented in neural activity, most robustly in frontal cortical areas. Regression modeling followed by unsupervised clustering identified a subset of neurons with ramping activity. These neurons' firing rates ramped up gradually in single trials over long time scales (up to tens of seconds), were inhibited by rewards, and were better described as being generated by a continuous ramp rather than a discrete stepping process. Together, these results identify reward integration via a continuous ramping process in frontal cortex as a likely candidate for the mechanism by which the mammalian brain solves patch foraging problems.
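A runnable toy version of the ramp-to-threshold account (parameter values are illustrative, not fitted to the mouse data; the slowly varying "patience" variable is omitted for brevity):

```python
def patch_leave_time(reward_times, slope=1.0, reward_kick=-5.0,
                     threshold=20.0, dt=0.1, t_max=60.0):
    """Simulate one trial of a ramp-to-threshold patch-leaving model.

    The decision variable ramps up with time in the patch and is pushed
    down (inhibited) by each reward; the animal leaves when the ramp
    crosses threshold, so richer patches are occupied longer.
    """
    x, t = 0.0, 0.0
    pending = sorted(reward_times)
    while t < t_max:
        x += slope * dt                   # time integrates upward
        while pending and pending[0] <= t:
            x += reward_kick              # each reward pushes the ramp down
            pending.pop(0)
        if x >= threshold:
            return t                      # leave the patch
        t += dt
    return t_max
```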

7.
Nat Hum Behav ; 5(6): 764-773, 2021 06.
Article in English | MEDLINE | ID: mdl-33510391

ABSTRACT

The ability to transfer knowledge across tasks and generalize to novel ones is an important hallmark of human intelligence. Yet not much is known about human multitask reinforcement learning. We study participants' behaviour in a two-step decision-making task with multiple features and changing reward functions. We compare their behaviour with two algorithms for multitask reinforcement learning, one that maps previous policies and encountered features to new reward functions and one that approximates value functions across tasks, as well as with standard model-based and model-free algorithms. Across three exploratory experiments and a large preregistered confirmatory experiment, our results provide evidence that participants who are able to learn the task use a strategy that maps previously learned policies to novel scenarios. These results enrich our understanding of human reinforcement learning in complex environments with changing task demands.
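The "map previously learned policies to novel scenarios" strategy is closely related to generalized policy improvement over successor features; here is a minimal sketch of that idea (our framing; the paper's exact algorithms may differ):

```python
import numpy as np

def gpi_action(successor_features, w_new):
    """Re-evaluate cached policies on a new task and act greedily (sketch).

    successor_features : (n_policies, n_actions, n_features) array of
                         psi_i(s, a) at the current state for each
                         previously learned policy i
    w_new              : (n_features,) reward weights of the new task

    Each old policy's value on the new reward function is psi . w_new;
    the chosen action maximizes over both policies and actions.
    """
    q = successor_features @ w_new              # (n_policies, n_actions)
    return int(np.unravel_index(np.argmax(q), q.shape)[1])
```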


Subject(s)
Decision Making; Reinforcement, Psychology; Transfer, Psychology; Adult; Female; Humans; Learning; Male; Middle Aged; Models, Theoretical; Young Adult
8.
Nat Commun ; 11(1): 2371, 2020 05 12.
Article in English | MEDLINE | ID: mdl-32398675

ABSTRACT

Most real-world decisions involve a delicate balance between exploring unfamiliar alternatives and committing to the best known option. Previous work has shown that humans rely on different forms of uncertainty to negotiate this "explore-exploit" trade-off, yet the neural basis of the underlying computations remains unclear. Using fMRI (n = 31), we find that relative uncertainty is represented in right rostrolateral prefrontal cortex and drives directed exploration, while total uncertainty is represented in right dorsolateral prefrontal cortex and drives random exploration. The decision value signal combining relative and total uncertainty to compute choice is reflected in motor cortex activity. The variance of this signal scales with total uncertainty, consistent with a sampling mechanism for random exploration. Overall, these results are consistent with a hybrid computational architecture in which different uncertainty computations are performed separately and then combined by downstream decision circuits to compute choice.
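A sketch of the hybrid choice rule this line of work uses, in which relative uncertainty adds a directed exploration bonus and total uncertainty scales choice randomness (the weights and exact functional form are illustrative, not the paper's fitted model):

```python
import numpy as np
from scipy.stats import norm

def p_choose_option1(mu1, mu2, sigma1, sigma2, w_ru=1.0, w_tu=1.0):
    """Probability of choosing option 1 under hybrid exploration (sketch).

    mu, sigma : posterior mean and standard deviation of each option's value
    Relative uncertainty (RU) contributes a directed exploration bonus;
    total uncertainty (TU) divides the value signal, making choices more
    random when overall uncertainty is high.
    """
    v = mu1 - mu2                            # estimated value difference
    ru = sigma1 - sigma2                     # relative uncertainty
    tu = np.sqrt(sigma1 ** 2 + sigma2 ** 2)  # total uncertainty
    return norm.cdf(v / (w_tu * tu) + w_ru * ru)
```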


Subject(s)
Choice Behavior/physiology; Exploratory Behavior/physiology; Models, Neurological; Prefrontal Cortex/physiology; Uncertainty; Adolescent; Adult; Analysis of Variance; Female; Humans; Magnetic Resonance Imaging; Male; Prefrontal Cortex/diagnostic imaging; Young Adult