Results 1 - 20 of 70
1.
Cell; 183(6): 1600-1616.e25, 2020 Dec 10.
Article in English | MEDLINE | ID: mdl-33248024

ABSTRACT

Rapid phasic activity of midbrain dopamine neurons is thought to signal reward prediction errors (RPEs), resembling temporal difference errors used in machine learning. However, recent studies describing slowly increasing dopamine signals have instead proposed that they represent state values and arise independently of somatic spiking activity. Here we developed experimental paradigms using virtual reality that disambiguate RPEs from values. We examined dopamine circuit activity at various stages, including somatic spiking, calcium signals at somata and axons, and striatal dopamine concentrations. Our results demonstrate that ramping dopamine signals are consistent with RPEs rather than value, and this ramping is observed at all stages examined. Ramping dopamine signals can be driven by a dynamic stimulus that indicates a gradual approach to a reward. We provide a unified computational understanding of rapid phasic and slowly ramping dopamine signals: dopamine neurons perform a derivative-like computation over values on a moment-by-moment basis.
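
The abstract's closing claim, that dopamine neurons perform a derivative-like computation over values, can be illustrated with a minimal temporal-difference (TD) sketch. This is our toy example, not the authors' code; the value trajectory, reward schedule, and discount factor are hypothetical:

```python
def td_errors(values, rewards, gamma=0.99):
    """TD error at each step: delta_t = r_t + gamma * V_{t+1} - V_t
    (the value beyond the final step is taken to be 0)."""
    deltas = []
    for t in range(len(values)):
        v_next = values[t + 1] if t + 1 < len(values) else 0.0
        deltas.append(rewards[t] + gamma * v_next - values[t])
    return deltas

# Hypothetical convexly increasing value as a reward is approached.
V = [0.1 * 1.5 ** t for t in range(10)]
r = [0.0] * 9 + [1.0]          # reward arrives only at the final step
d = td_errors(V, r)

# Because delta approximates a discounted derivative of V, a convex value
# ramp produces TD errors that themselves ramp up before the reward.
assert all(d[t + 1] > d[t] for t in range(8))
```

Under a value-coding account the signal would track V itself; on the derivative view, a ramp appears only when value grows faster than discounting shrinks it.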


Subjects
Dopamine/metabolism; Signal Transduction; Action Potentials/physiology; Animals; Axons/metabolism; Calcium/metabolism; Calcium Signaling; Cell Body/metabolism; Cues (Psychology); Dopaminergic Neurons/physiology; Fluorometry; Male; Mice, Inbred C57BL; Models, Neurological; Photic Stimulation; Reward; Sensation; Time Factors; Ventral Tegmental Area/metabolism; Virtual Reality
2.
Cell; 160(6): 1046-8, 2015 Mar 12.
Article in English | MEDLINE | ID: mdl-25768902

ABSTRACT

Haroush and Williams trained pairs of monkeys to play in a prisoner's dilemma game, a model of social interactions. Recording from the dorsal anterior cingulate cortex (dACC), they find neurons whose activity reflects the anticipation of the opponent's yet unknown choice, which may be important in guiding animals' performance in the game.


Subjects
Gyrus Cinguli/physiology; Macaca mulatta/psychology; Neurons/physiology; Social Behavior; Animals; Male
3.
Nature; 614(7946): 108-117, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36653449

ABSTRACT

Spontaneous animal behaviour is built from action modules that are concatenated by the brain into sequences1,2. However, the neural mechanisms that guide the composition of naturalistic, self-motivated behaviour remain unknown. Here we show that dopamine systematically fluctuates in the dorsolateral striatum (DLS) as mice spontaneously express sub-second behavioural modules, despite the absence of task structure, sensory cues or exogenous reward. Photometric recordings and calibrated closed-loop optogenetic manipulations during open field behaviour demonstrate that DLS dopamine fluctuations increase sequence variation over seconds, reinforce the use of associated behavioural modules over minutes, and modulate the vigour with which modules are expressed, without directly influencing movement initiation or moment-to-moment kinematics. Although the reinforcing effects of optogenetic DLS dopamine manipulations vary across behavioural modules and individual mice, these differences are well predicted by observed variation in the relationships between endogenous dopamine and module use. Consistent with the possibility that DLS dopamine fluctuations act as a teaching signal, mice build sequences during exploration as if to maximize dopamine. Together, these findings suggest a model in which the same circuits and computations that govern action choices in structured tasks have a key role in sculpting the content of unconstrained, high-dimensional, spontaneous behaviour.


Subjects
Behavior, Animal; Reinforcement, Psychology; Reward; Animals; Mice; Corpus Striatum/metabolism; Dopamine/metabolism; Cues (Psychology); Optogenetics; Photometry
4.
Nature; 577(7792): 671-675, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31942076

ABSTRACT

Since its introduction, the reward prediction error theory of dopamine has explained a wealth of empirical phenomena, providing a unifying framework for understanding the representation of reward and value in the brain1-3. According to the now canonical theory, reward predictions are represented as a single scalar quantity, which supports learning about the expectation, or mean, of stochastic outcomes. Here we propose an account of dopamine-based reinforcement learning inspired by recent artificial intelligence research on distributional reinforcement learning4-6. We hypothesized that the brain represents possible future rewards not as a single mean, but instead as a probability distribution, effectively representing multiple future outcomes simultaneously and in parallel. This idea implies a set of empirical predictions, which we tested using single-unit recordings from mouse ventral tegmental area. Our findings provide strong evidence for a neural realization of distributional reinforcement learning.
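
The distributional hypothesis can be made concrete with a small sketch (our illustration with invented learning rates, not the paper's analysis code): value "cells" that weight positive and negative prediction errors asymmetrically converge to different points of the reward distribution, so the population encodes the distribution rather than only its mean:

```python
import random

def distributional_update(values, reward, lr_pos, lr_neg):
    """Each cell scales positive and negative prediction errors by its own
    learning rates; for a Bernoulli reward its fixed point sits at
    lr_pos / (lr_pos + lr_neg), a point between the two outcomes."""
    new = []
    for v, ap, an in zip(values, lr_pos, lr_neg):
        delta = reward - v
        new.append(v + (ap if delta > 0 else an) * delta)
    return new

random.seed(0)
lr_pos = [0.02, 0.04, 0.06, 0.08, 0.10]   # hypothetical asymmetries,
lr_neg = [0.10, 0.08, 0.06, 0.04, 0.02]   # from pessimistic to optimistic
vals = [0.0] * 5
for _ in range(20000):
    outcome = random.choice([0.0, 1.0])   # 50/50 reward of 0 or 1
    vals = distributional_update(vals, outcome, lr_pos, lr_neg)

# The population fans out across the reward distribution instead of
# collapsing onto the single mean of 0.5.
assert vals[0] < 0.4 < 0.6 < vals[4]
```

A single-scalar learner corresponds to the special case where every cell shares the same symmetric learning rates.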


Subjects
Dopamine/metabolism; Learning/physiology; Models, Neurological; Reinforcement, Psychology; Reward; Animals; Artificial Intelligence; Dopaminergic Neurons/metabolism; GABAergic Neurons/metabolism; Mice; Optimism; Pessimism; Probability; Statistical Distributions; Ventral Tegmental Area/cytology; Ventral Tegmental Area/physiology
5.
Annu Rev Neurosci; 40: 373-394, 2017 Jul 25.
Article in English | MEDLINE | ID: mdl-28441114

ABSTRACT

Dopamine neurons facilitate learning by calculating reward prediction error, or the difference between expected and actual reward. Despite two decades of research, it remains unclear how dopamine neurons make this calculation. Here we review studies that tackle this problem from a diverse set of approaches, from anatomy to electrophysiology to computational modeling and behavior. Several patterns emerge from this synthesis: that dopamine neurons themselves calculate reward prediction error, rather than inherit it passively from upstream regions; that they combine multiple separate and redundant inputs, which are themselves interconnected in a dense recurrent network; and that despite the complexity of inputs, the output from dopamine neurons is remarkably homogeneous and robust. The more we study this simple arithmetic computation, the knottier it appears to be, suggesting a daunting (but stimulating) path ahead for neuroscience more generally.


Subjects
Brain/physiology; Dopamine/physiology; Learning/physiology; Nerve Net/physiology; Reward; Animals; Humans; Neural Pathways/physiology
6.
Nat Rev Neurosci; 20(11): 703-714, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31570826

ABSTRACT

Midbrain dopamine signals are widely thought to report reward prediction errors that drive learning in the basal ganglia. However, dopamine has also been implicated in various probabilistic computations, such as encoding uncertainty and controlling exploration. Here, we show how these different facets of dopamine signalling can be brought together under a common reinforcement learning framework. The key idea is that multiple sources of uncertainty impinge on reinforcement learning computations: uncertainty about the state of the environment, the parameters of the value function and the optimal action policy. Each of these sources plays a distinct role in the prefrontal cortex-basal ganglia circuit for reinforcement learning and is ultimately reflected in dopamine activity. The view that dopamine plays a central role in the encoding and updating of beliefs brings the classical prediction error theory into alignment with more recent theories of Bayesian reinforcement learning.


Subjects
Basal Ganglia/metabolism; Dopamine/metabolism; Learning/physiology; Nerve Net/metabolism; Prefrontal Cortex/metabolism; Animals; Humans
7.
PLoS Comput Biol; 19(9): e1011067, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37695776

ABSTRACT

To behave adaptively, animals must learn to predict future reward, or value. To do this, animals are thought to learn reward predictions using reinforcement learning. However, in contrast to classical models, animals must learn to estimate value using only incomplete state information. Previous work suggests that animals estimate value in partially observable tasks by first forming "beliefs"-optimal Bayesian estimates of the hidden states in the task. Although this is one way to solve the problem of partial observability, it is not the only way, nor is it the most computationally scalable solution in complex, real-world environments. Here we show that a recurrent neural network (RNN) can learn to estimate value directly from observations, generating reward prediction errors that resemble those observed experimentally, without any explicit objective of estimating beliefs. We integrate statistical, functional, and dynamical systems perspectives on beliefs to show that the RNN's learned representation encodes belief information, but only when the RNN's capacity is sufficiently large. These results illustrate how animals can estimate value in tasks without explicitly estimating beliefs, yielding a representation useful for systems with limited capacity.
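
The "beliefs" this abstract refers to are Bayesian posteriors over hidden states, and the value of a partially observed situation is a belief-weighted average of state values. A minimal sketch of that baseline computation (our toy two-state example; the transition and observation probabilities are invented, not taken from the paper):

```python
def belief_update(belief, obs, p_obs, transition):
    """One step of Bayesian filtering over two hidden states:
    predict through the transition matrix, then correct with Bayes rule."""
    pred = [sum(belief[j] * transition[j][i] for j in range(2))
            for i in range(2)]
    post = [pred[i] * p_obs[i][obs] for i in range(2)]
    z = sum(post)
    return [p / z for p in post]

transition = [[0.9, 0.1], [0.1, 0.9]]   # hypothetical 2-state dynamics
p_obs = [[0.8, 0.2], [0.3, 0.7]]        # P(observation | state)
state_values = [0.0, 1.0]               # only state 1 precedes reward

b = [0.5, 0.5]
for obs in [1, 1, 1]:                   # evidence favouring state 1
    b = belief_update(b, obs, p_obs, transition)

# Value under partial observability: belief-weighted state values.
value = sum(bi * vi for bi, vi in zip(b, state_values))
assert b[1] > 0.9
```

The paper's point is that an RNN can learn a representation carrying this same information directly from observations, without being told to compute the posterior.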


Subjects
Learning; Reinforcement, Psychology; Animals; Bayes Theorem; Reward; Neural Networks, Computer
8.
Nature; 556(7701): 326-331, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29643503

ABSTRACT

Parenting is essential for the survival and wellbeing of mammalian offspring. However, we lack a circuit-level understanding of how distinct components of this behaviour are coordinated. Here we investigate how galanin-expressing neurons in the medial preoptic area (MPOAGal) of the hypothalamus coordinate motor, motivational, hormonal and social aspects of parenting in mice. These neurons integrate inputs from a large number of brain areas and the activation of these inputs depends on the animal's sex and reproductive state. Subsets of MPOAGal neurons form discrete pools that are defined by their projection sites. While the MPOAGal population is active during all episodes of parental behaviour, individual pools are tuned to characteristic aspects of parenting. Optogenetic manipulation of MPOAGal projections mirrors this specificity, affecting discrete parenting components. This functional organization, reminiscent of the control of motor sequences by pools of spinal cord neurons, provides a new model for how discrete elements of a social behaviour are generated at the circuit level.


Subjects
Maternal Behavior/physiology; Maternal Behavior/psychology; Neural Pathways; Paternal Behavior/physiology; Paternal Behavior/psychology; Social Behavior; Animals; Female; Galanin/metabolism; Hormones/metabolism; Logic; Male; Mice; Motivation; Neurons/metabolism; Optogenetics; Parenting; Preoptic Area/cytology; Preoptic Area/physiology; Reproduction/physiology; Sex Characteristics
9.
Annu Rev Neurosci; 37: 363-85, 2014.
Article in English | MEDLINE | ID: mdl-24905594

ABSTRACT

How is sensory information represented in the brain? A long-standing debate in neural coding is whether and how timing of spikes conveys information to downstream neurons. Although we know that neurons in the olfactory bulb (OB) exhibit rich temporal dynamics, the functional relevance of temporal coding remains hotly debated. Recent recording experiments in awake behaving animals have elucidated highly organized temporal structures of activity in the OB. In addition, the analysis of neural circuits in the piriform cortex (PC) demonstrated the importance of not only OB afferent inputs but also intrinsic PC neural circuits in shaping odor responses. Furthermore, new experiments involving stimulation of the OB with specific temporal patterns allowed for testing the relevance of temporal codes. Together, these studies suggest that the relative timing of neuronal activity in the OB conveys odor information and that neural circuits in the PC possess various mechanisms to decode temporal patterns of OB input.


Subjects
Brain Mapping; Neurons/physiology; Olfactory Pathways/physiology; Olfactory Perception/physiology; Animals; Humans; Models, Neurological
10.
Nature; 525(7568): 243-6, 2015 Sep 10.
Article in English | MEDLINE | ID: mdl-26322583

ABSTRACT

Dopamine neurons are thought to facilitate learning by comparing actual and expected reward. Despite two decades of investigation, little is known about how this comparison is made. To determine how dopamine neurons calculate prediction error, we combined optogenetic manipulations with extracellular recordings in the ventral tegmental area while mice engaged in classical conditioning. Here we demonstrate, by manipulating the temporal expectation of reward, that dopamine neurons perform subtraction, a computation that is ideal for reinforcement learning but rarely observed in the brain. Furthermore, selectively exciting and inhibiting neighbouring GABA (γ-aminobutyric acid) neurons in the ventral tegmental area reveals that these neurons are a source of subtraction: they inhibit dopamine neurons when reward is expected, causally contributing to prediction-error calculations. Finally, bilaterally stimulating ventral tegmental area GABA neurons dramatically reduces anticipatory licking to conditioned odours, consistent with an important role for these neurons in reinforcement learning. Together, our results uncover the arithmetic and local circuitry underlying dopamine prediction errors.
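
The subtraction this paper identifies can be contrasted with the main alternative, divisive scaling, in a toy sketch (ours, with arbitrary numbers, not the paper's model): only subtraction shifts responses down by the same amount at every reward size when expectation rises:

```python
def subtractive_rpe(reward_input, expectation):
    # Prediction error as a pure difference (the computation supported here).
    return reward_input - expectation

def divisive_rpe(reward_input, expectation, k=1.0):
    # Divisive alternative: expectation rescales rather than shifts responses.
    return reward_input / (k + expectation)

rewards = (0.5, 1.0, 2.0)
sub_shift = [subtractive_rpe(r, 0.2) - subtractive_rpe(r, 0.8)
             for r in rewards]
div_shift = [divisive_rpe(r, 0.2) - divisive_rpe(r, 0.8)
             for r in rewards]

assert all(abs(s - 0.6) < 1e-9 for s in sub_shift)   # uniform shift
assert max(div_shift) - min(div_shift) > 0.1         # size-dependent shift
```

On this picture, the VTA GABA neurons supply the `expectation` term that is subtracted from the reward input.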


Subjects
Dopamine/metabolism; Dopaminergic Neurons/metabolism; Models, Neurological; Neural Pathways/physiology; Ventral Tegmental Area/cytology; Ventral Tegmental Area/physiology; Animals; Conditioning, Classical; GABAergic Neurons/metabolism; Male; Mice; Mice, Inbred C57BL; Odorants/analysis; Optogenetics; Reinforcement, Psychology; Reward; Time Factors; gamma-Aminobutyric Acid/metabolism
11.
Nature; 507(7491): 238-42, 2014 Mar 13.
Article in English | MEDLINE | ID: mdl-24487620

ABSTRACT

Hunger is a hard-wired motivational state essential for survival. Agouti-related peptide (AgRP)-expressing neurons in the arcuate nucleus (ARC) at the base of the hypothalamus are crucial to the control of hunger. They are activated by caloric deficiency and, when naturally or artificially stimulated, they potently induce intense hunger and subsequent food intake. Consistent with their obligatory role in regulating appetite, genetic ablation or chemogenetic inhibition of AgRP neurons decreases feeding. Excitatory input to AgRP neurons is important in caloric-deficiency-induced activation, and is notable for its remarkable degree of caloric-state-dependent synaptic plasticity. Despite the important role of excitatory input, its source(s) has been unknown. Here, through the use of Cre-recombinase-enabled, cell-specific neuron mapping techniques in mice, we have discovered strong excitatory drive that, unexpectedly, emanates from the hypothalamic paraventricular nucleus, specifically from subsets of neurons expressing thyrotropin-releasing hormone (TRH) and pituitary adenylate cyclase-activating polypeptide (PACAP, also known as ADCYAP1). Chemogenetic stimulation of these afferent neurons in sated mice markedly activates AgRP neurons and induces intense feeding. Conversely, acute inhibition in mice with caloric-deficiency-induced hunger decreases feeding. Discovery of these afferent neurons capable of triggering hunger advances understanding of how this intense motivational state is regulated.


Subjects
Agouti-Related Protein/metabolism; Hunger/physiology; Neural Pathways/physiology; Neurons/metabolism; Paraventricular Hypothalamic Nucleus/physiology; Agouti-Related Protein/deficiency; Animals; Appetite/drug effects; Appetite/physiology; Arcuate Nucleus of Hypothalamus/cytology; Arcuate Nucleus of Hypothalamus/metabolism; Brain Mapping; Cell Tracking; Clozapine/analogs & derivatives; Clozapine/pharmacology; Dependovirus/genetics; Eating/drug effects; Eating/physiology; Female; Food Deprivation; Hunger/drug effects; Integrases/metabolism; Male; Mice; Neural Pathways/drug effects; Neuronal Plasticity/drug effects; Neuronal Plasticity/physiology; Neurons/drug effects; Neurons, Afferent/drug effects; Neurons, Afferent/metabolism; Paraventricular Hypothalamic Nucleus/cytology; Peptide Fragments/deficiency; Peptide Fragments/metabolism; Pituitary Adenylate Cyclase-Activating Polypeptide/metabolism; Rabies virus/genetics; Satiety Response/physiology; Thyrotropin-Releasing Hormone/metabolism
12.
Nature; 482(7383): 85-8, 2012 Jan 18.
Article in English | MEDLINE | ID: mdl-22258508

ABSTRACT

Dopamine has a central role in motivation and reward. Dopaminergic neurons in the ventral tegmental area (VTA) signal the discrepancy between expected and actual rewards (that is, reward prediction error), but how they compute such signals is unknown. We recorded the activity of VTA neurons while mice associated different odour cues with appetitive and aversive outcomes. We found three types of neuron based on responses to odours and outcomes: approximately half of the neurons (type I, 52%) showed phasic excitation after reward-predicting odours and rewards in a manner consistent with reward prediction error coding; the other half of neurons showed persistent activity during the delay between odour and outcome that was modulated positively (type II, 31%) or negatively (type III, 18%) by the value of outcomes. Whereas the activity of type I neurons was sensitive to actual outcomes (that is, when the reward was delivered as expected compared to when it was unexpectedly omitted), the activity of type II and type III neurons was determined predominantly by reward-predicting odours. We 'tagged' dopaminergic and GABAergic neurons with the light-sensitive protein channelrhodopsin-2 and identified them based on their responses to optical stimulation while recording. All identified dopaminergic neurons were of type I and all GABAergic neurons were of type II. These results show that VTA GABAergic neurons signal expected reward, a key variable for dopaminergic neurons to calculate reward prediction error.


Subjects
Dopaminergic Neurons/metabolism; GABAergic Neurons/metabolism; Punishment; Reward; Ventral Tegmental Area/cytology; Ventral Tegmental Area/physiology; Animals; Channelrhodopsins; Cues (Psychology); Dopamine/metabolism; Male; Mice; Mice, Inbred C57BL; Odorants/analysis; Principal Component Analysis; gamma-Aminobutyric Acid/metabolism
14.
Nature; 455(7210): 227-31, 2008 Sep 11.
Article in English | MEDLINE | ID: mdl-18690210

ABSTRACT

Humans and other animals must often make decisions on the basis of imperfect evidence. Statisticians use measures such as P values to assign degrees of confidence to propositions, but little is known about how the brain computes confidence estimates about decisions. We explored this issue using behavioural analysis and neural recordings in rats in combination with computational modelling. Subjects were trained to perform an odour categorization task that allowed decision confidence to be manipulated by varying the distance of the test stimulus to the category boundary. To understand how confidence could be computed along with the choice itself, using standard models of decision-making, we defined a simple measure that quantified the quality of the evidence contributing to a particular decision. Here we show that the firing rates of many single neurons in the orbitofrontal cortex match closely to the predictions of confidence models and cannot be readily explained by alternative mechanisms, such as learning stimulus-outcome associations. Moreover, when tested using a delayed reward version of the task, we found that rats' willingness to wait for rewards increased with confidence, as predicted by the theoretical model. These results indicate that confidence estimates, previously suggested to require 'metacognition' and conscious awareness, are available even in the rodent brain, can be computed with relatively simple operations, and can drive adaptive behaviour. We suggest that confidence estimation may be a fundamental and ubiquitous component of decision-making.
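
The "simple measure" of decision confidence can be sketched in signal-detection terms (our simplification, not the paper's exact model): the decision variable is a noisy percept of the stimulus, the choice is its sign, and confidence is its distance from the category boundary, which makes confidence predict accuracy:

```python
import random

def trial(stimulus, noise_sd=1.0):
    """One simulated categorization trial under additive Gaussian noise."""
    percept = stimulus + random.gauss(0.0, noise_sd)
    choice = percept > 0          # category = sign of the noisy percept
    confidence = abs(percept)     # distance from the boundary at 0
    correct = choice == (stimulus > 0)
    return confidence, correct

random.seed(1)
# Hypothetical stimuli near and far from the category boundary.
results = [trial(random.choice([-2.0, -0.5, 0.5, 2.0]))
           for _ in range(20000)]
hi = [ok for conf, ok in results if conf > 1.0]
lo = [ok for conf, ok in results if conf <= 1.0]

# Accuracy grows with confidence, the behavioural signature in the task.
assert sum(hi) / len(hi) > sum(lo) / len(lo)
```

Stimulus distance from the boundary manipulates the confidence distribution, exactly the handle the task exploits.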


Subjects
Behavior, Animal/physiology; Decision Making/physiology; Models, Neurological; Neurons/physiology; Animals; Confidence Intervals; Frontal Lobe/physiology; Linear Models; Male; Odorants/analysis; Rats; Rats, Long-Evans; Reward; Smell/physiology; Uncertainty
15.
Neuron; 112(6): 1001-1019.e6, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38278147

ABSTRACT

Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs), but the mechanisms underlying RPE computation, particularly the contributions of different neurotransmitters, remain poorly understood. Here, we used a genetically encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons in mice. We found that glutamate inputs exhibit virtually all of the characteristics of RPE rather than conveying a specific component of RPE computation, such as reward or expectation. Notably, whereas glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics altered dopamine negative responses to aversive stimuli into more positive responses, whereas excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computations; dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on valences, with competitive interactions playing a role in responses to aversive stimuli.


Subjects
Dopaminergic Neurons; Glutamic Acid; Mice; Animals; Dopaminergic Neurons/physiology; Dopamine/physiology; Reward; Mesencephalon; Ventral Tegmental Area/physiology
16.
bioRxiv; 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39005260

ABSTRACT

Postural control circuitry performs the essential function of maintaining balance and body position in response to perturbations that are either self-generated (e.g. reaching to pick up an object) or externally delivered (e.g. being pushed by another person). Human studies have shown that anticipation of predictable postural disturbances can modulate such responses. This indicates that postural control could involve higher-level neural structures associated with predictive functions, rather than being purely reactive. However, the underlying neural circuitry remains largely unknown. To enable studies of predictive postural control circuits, we developed a novel task for mice. In this task, modeled after human studies, a dynamic platform generated reproducible translational perturbations. While mice stood bipedally atop a perch to receive water rewards, they experienced backward translations that were either unpredictable or preceded by an auditory cue. To validate the task, we investigated the effect of the auditory cue on postural responses to perturbations across multiple days in three mice. These preliminary results serve to validate a new postural control model, opening the door to the types of neural recordings and circuit manipulations that are currently possible only in mice.

17.
bioRxiv; 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38260354

ABSTRACT

Machine learning research has achieved large performance gains on a wide range of tasks by expanding the learning target from mean rewards to entire probability distributions of rewards - an approach known as distributional reinforcement learning (RL)1. The mesolimbic dopamine system is thought to underlie RL in the mammalian brain by updating a representation of mean value in the striatum2,3, but little is known about whether, where, and how neurons in this circuit encode information about higher-order moments of reward distributions4. To fill this gap, we used high-density probes (Neuropixels) to acutely record striatal activity from well-trained, water-restricted mice performing a classical conditioning task in which reward mean, reward variance, and stimulus identity were independently manipulated. In contrast to traditional RL accounts, we found robust evidence for abstract encoding of variance in the striatum. Remarkably, chronic ablation of dopamine inputs disorganized these distributional representations in the striatum without interfering with mean value coding. Two-photon calcium imaging and optogenetics revealed that the two major classes of striatal medium spiny neurons - D1 and D2 MSNs - contributed to this code by preferentially encoding the right and left tails of the reward distribution, respectively. We synthesize these findings into a new model of the striatum and mesolimbic dopamine that harnesses the opponency between D1 and D2 MSNs5-15 to reap the computational benefits of distributional RL.

18.
bioRxiv; 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38260512

ABSTRACT

The widespread adoption of deep learning to build models that capture the dynamics of neural populations is typically based on "black-box" approaches that lack an interpretable link between neural activity and function. Here, we propose to apply algorithm unrolling, a method for interpretable deep learning, to design the architecture of sparse deconvolutional neural networks and obtain a direct interpretation of network weights in relation to stimulus-driven single-neuron activity through a generative model. We characterize our method, referred to as deconvolutional unrolled neural learning (DUNL), and show its versatility by applying it to deconvolve single-trial local signals across multiple brain areas and recording modalities. To exemplify use cases of our decomposition method, we uncover multiplexed salience and reward prediction error signals from midbrain dopamine neurons in an unbiased manner, perform simultaneous event detection and characterization in somatosensory thalamus recordings, and characterize the responses of neurons in the piriform cortex. Our work leverages the advances in interpretable deep learning to gain a mechanistic understanding of neural dynamics.

19.
bioRxiv; 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38370735

ABSTRACT

Associative learning depends on contingency, the degree to which a stimulus predicts an outcome. Despite its importance, the neural mechanisms linking contingency to behavior remain elusive. Here we examined the dopamine activity in the ventral striatum - a signal implicated in associative learning - in a Pavlovian contingency degradation task in mice. We show that both anticipatory licking and dopamine responses to a conditioned stimulus decreased when additional rewards were delivered uncued, but remained unchanged if additional rewards were cued. These results conflict with contingency-based accounts using a traditional definition of contingency or a novel causal learning model (ANCCR), but can be explained by temporal difference (TD) learning models equipped with an appropriate inter-trial-interval (ITI) state representation. Recurrent neural networks trained within a TD framework develop state representations like our best 'handcrafted' model. Our findings suggest that the TD error can be a measure that describes both contingency and dopaminergic activity.
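
The role of the ITI state can be shown in a minimal tabular TD sketch (our simplification with arbitrary parameters, not one of the paper's models): uncued rewards raise the ITI state's value, which shrinks the prediction error at cue onset, so contingency degradation falls out of TD once the ITI is represented:

```python
import random

def cue_onset_rpe(p_uncued, gamma=0.9, lr=0.05, steps=20000, seed=0):
    """Tabular TD(0) on a two-state cycle: ITI -> cue -> (reward) -> ITI.
    Uncued rewards occur during the ITI with probability p_uncued."""
    rng = random.Random(seed)
    V = {"iti": 0.0, "cue": 0.0}
    for _ in range(steps):
        # ITI step: possible uncued reward, then transition to the cue.
        r = 1.0 if rng.random() < p_uncued else 0.0
        V["iti"] += lr * (r + gamma * V["cue"] - V["iti"])
        # Cue step: the cue reliably leads to reward, then back to the ITI.
        V["cue"] += lr * (1.0 + gamma * V["iti"] - V["cue"])
    # Cue-onset RPE: the value jump at the ITI -> cue transition.
    return gamma * V["cue"] - V["iti"]

baseline = cue_onset_rpe(0.0)
degraded = cue_onset_rpe(0.4)

# Extra uncued rewards inflate the ITI value and so reduce the
# cue-evoked prediction error, mimicking contingency degradation.
assert baseline > degraded + 0.1
```

Cued extra rewards would instead be predicted by their own cue and leave the ITI value, and hence the original cue response, largely intact.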

20.
Nat Neurosci; 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39054370

ABSTRACT

The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
