Results 1 - 14 of 14
1.
PLoS Comput Biol ; 18(3): e1009897, 2022 03.
Article in English | MEDLINE | ID: mdl-35333867

ABSTRACT

There is no single way to represent a task. Indeed, despite experiencing the same task events and contingencies, different subjects may form distinct task representations. As experimenters, we often assume that subjects represent the task as we envision it. However, such a representation cannot be taken for granted, especially in animal experiments where we cannot deliver explicit instruction regarding the structure of the task. Here, we tested how rats represent an odor-guided choice task in which two odor cues indicated which of two responses would lead to reward, whereas a third odor indicated free choice among the two responses. A parsimonious task representation would allow animals to learn from the forced trials which option is better to choose on the free-choice trials. However, animals may not necessarily generalize across odors in this way. We fit reinforcement-learning models that use different task representations to trial-by-trial choice behavior of individual rats performing this task, and quantified the degree to which each animal used the more parsimonious representation, generalizing across trial types. Model comparison revealed that most rats did not acquire this representation despite extensive experience. Our results demonstrate the importance of formally testing the possible task representations that can afford the observed behavior, rather than assuming that animals' task representations abide by the generative task structure that governs the experimental design.
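The model-comparison logic can be illustrated with a toy Q-learning agent on forced and free-choice trials. The reward probabilities, learning rate, and softmax rule below are illustrative assumptions, not the parameters or exact models of the study:

```python
import numpy as np

def simulate_free_choice(shared, n_trials=500, alpha=0.2, beta=3.0,
                         p_reward=(0.9, 0.1), seed=0):
    """Q-learning on interleaved forced and free-choice trials.

    shared=True: the parsimonious representation, where forced-trial
    outcomes update the same Q-values consulted on free-choice trials.
    shared=False: free-choice trials keep their own Q-values and learn
    nothing from the forced trials.
    """
    rng = np.random.default_rng(seed)
    q_forced = np.zeros(2)
    q_free = q_forced if shared else np.zeros(2)   # alias when shared
    free_choices = []
    for _ in range(n_trials):
        if rng.random() < 0.5:          # forced trial: cue dictates the response
            a = int(rng.integers(2))
            r = float(rng.random() < p_reward[a])
            q_forced[a] += alpha * (r - q_forced[a])
        else:                           # free-choice trial: softmax over q_free
            p1 = 1.0 / (1.0 + np.exp(-beta * (q_free[1] - q_free[0])))
            a = int(rng.random() < p1)
            r = float(rng.random() < p_reward[a])
            q_free[a] += alpha * (r - q_free[a])
            free_choices.append(a)
    return float(np.mean(free_choices))  # fraction of free choices to option 1
```

Fitting both variants to each rat's trial-by-trial choices and comparing their likelihoods is the kind of model comparison the abstract describes; only the shared learner exploits forced-trial experience on free-choice trials.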


Subjects
Odorants; Reward; Animals; Cues; Generalization, Psychological; Humans; Rats; Reinforcement, Psychology
3.
bioRxiv ; 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38585868

ABSTRACT

Lack of cognitive flexibility is a hallmark of substance use disorders and has been associated with drug-induced synaptic plasticity in the dorsomedial striatum (DMS). Yet the possible impact of altered plasticity on real-time striatal neural dynamics during decision-making is unclear. Here, we identified persistent impairments induced by chronic ethanol (EtOH) exposure on cognitive flexibility and striatal decision signals. After a substantial withdrawal period from prior EtOH vapor exposure, male, but not female, rats exhibited reduced adaptability and exploratory behavior during a dynamic decision-making task. Reinforcement learning models showed that prior EtOH exposure enhanced learning from rewards over omissions. Notably, neural signals in the DMS related to the decision outcome were enhanced, while those related to choice and choice-outcome conjunction were reduced, in EtOH-treated rats compared to the controls. These findings highlight the profound impact of chronic EtOH exposure on adaptive decision-making, pinpointing specific changes in striatal representations of actions and outcomes as underlying mechanisms for cognitive deficits.

4.
bioRxiv ; 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37781610

ABSTRACT

The orbitofrontal cortex (OFC) and hippocampus (HC) are both implicated in forming the cognitive or task maps that support flexible behavior. Previously, we used the dopamine neurons as a sensor or tool to measure the functional effects of OFC lesions (Takahashi et al., 2011). We recorded midbrain dopamine neurons as rats performed an odor-based choice task, in which errors in the prediction of reward were induced by manipulating the number or timing of the expected rewards across blocks of trials. We found that OFC lesions ipsilateral to the recording electrodes caused prediction errors to be degraded, consistent with a loss in the resolution of the task states, particularly under conditions where hidden information was critical to sharpening the predictions. Here we have repeated this experiment, along with computational modeling of the results, in rats with ipsilateral HC lesions. The results show that the HC also shapes the map of our task; however, unlike the OFC, which provides information local to the trial, the HC appears to be necessary for estimating the upper-level hidden states based on information that is discontinuous or separated by longer timescales. The results contrast the respective roles of the OFC and HC in cognitive mapping and add to evidence that the dopamine neurons access a rich information set from distributed regions regarding the predictive structure of the environment, potentially enabling this powerful teaching signal to support complex learning and behavior.

5.
Nat Neurosci ; 26(5): 830-839, 2023 05.
Article in English | MEDLINE | ID: mdl-37081296

ABSTRACT

Dopamine neuron activity is tied to the prediction error in temporal difference reinforcement learning models. These models make significant simplifying assumptions, particularly with regard to the structure of the predictions fed into the dopamine neurons, which consist of a single chain of timepoint states. Although this predictive structure can explain error signals observed in many studies, it cannot cope with settings where subjects might infer multiple independent events and outcomes. In the present study, we recorded dopamine neurons in the ventral tegmental area in such a setting to test the validity of the single-stream assumption. Rats were trained in an odor-based choice task, in which the timing and identity of one of several rewards delivered in each trial changed across trial blocks. This design revealed an error signaling pattern that requires the dopamine neurons to access and update multiple independent predictive streams reflecting the subject's belief about timing and potentially unique identities of expected rewards.
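The "single chain of timepoint states" assumption being tested can be written down in a few lines of tabular TD(0); the chain length, learning rate, and discount below are illustrative. After learning, the prediction error at the time of reward is trained away and value has propagated back toward the cue, which is the behavior of the one-stream model that the recorded error patterns challenge:

```python
import numpy as np

def td_chain(n_states=10, reward_time=9, alpha=0.1, gamma=0.98, n_trials=200):
    """TD(0) over a single chain of timepoint states: one value per
    timepoint, updated with delta_t = r_t + gamma*V[t+1] - V[t]."""
    V = np.zeros(n_states + 1)            # extra terminal state with value 0
    for _ in range(n_trials):
        for t in range(n_states):
            r = 1.0 if t == reward_time else 0.0
            V[t] += alpha * (r + gamma * V[t + 1] - V[t])
    # post-learning prediction errors at each timepoint
    deltas = [(1.0 if t == reward_time else 0.0) + gamma * V[t + 1] - V[t]
              for t in range(n_states)]
    return V, deltas
```

A setting with several independently timed rewards per trial has no single such chain, which is why the observed error signals imply multiple predictive streams.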


Subjects
Reinforcement, Psychology; Ventral Tegmental Area; Rats; Animals; Ventral Tegmental Area/physiology; Learning/physiology; Reward; Dopaminergic Neurons/physiology; Dopamine/physiology
6.
Curr Opin Behav Sci ; 41: 114-121, 2021 Oct.
Article in English | MEDLINE | ID: mdl-36341023

ABSTRACT

Reinforcement learning is a powerful framework for modelling the cognitive and neural substrates of learning and decision making. Contemporary research in cognitive neuroscience and neuroeconomics typically uses value-based reinforcement-learning models, which assume that decision-makers choose by comparing learned values for different actions. However, another possibility is suggested by a simpler family of models, called policy-gradient reinforcement learning. Policy-gradient models learn by optimizing a behavioral policy directly, without the intermediate step of value-learning. Here we review recent behavioral and neural findings that are more parsimoniously explained by policy-gradient models than by value-based models. We conclude that, despite the ubiquity of 'value' in reinforcement-learning models of decision making, policy-gradient models provide a lightweight and compelling alternative model of operant behavior.
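The contrast can be made concrete with a minimal REINFORCE agent on a two-armed bandit: the policy's logits are adjusted directly along reward times the gradient of log-probability, with no intermediate value estimates. The learning rate and reward probabilities are invented for illustration:

```python
import numpy as np

def reinforce_bandit(n_trials=2000, lr=0.1, p_reward=(0.8, 0.2), seed=1):
    """Policy-gradient (REINFORCE) learning on a two-armed bandit."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)                     # policy parameters (logits)
    for _ in range(n_trials):
        p = np.exp(theta - theta.max())
        p /= p.sum()                        # softmax policy
        a = rng.choice(2, p=p)
        r = float(rng.random() < p_reward[a])
        grad_logpi = -p.copy()
        grad_logpi[a] += 1.0                # d log pi(a) / d theta
        theta += lr * r * grad_logpi        # no value estimate anywhere
    p = np.exp(theta - theta.max())
    return p / p.sum()                      # final choice probabilities
```

The agent comes to prefer the richer arm without ever representing either arm's expected value, which is the behavioral signature the review argues is hard to distinguish from value-based learning.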

7.
Trends Cogn Sci ; 24(7): 499-501, 2020 07.
Article in English | MEDLINE | ID: mdl-32423707

ABSTRACT

Dopamine (DA) responses are synonymous with the 'reward prediction error' of reinforcement learning (RL), and are thought to update neural estimates of expected value. A recent study by Dabney et al. enriches this picture, demonstrating that DA neurons track variability in rewards, providing a readout of risk in the brain.
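The variability readout arises because units that weight positive and negative prediction errors asymmetrically converge to different expectiles of the reward distribution rather than its mean. The learning rates and two-point reward distribution below are invented for illustration of that mechanism, not taken from the study:

```python
import numpy as np

def run_value_updates(alpha_pos, alpha_neg, rewards):
    """Value updated with distinct rates for positive vs negative
    prediction errors; asymmetric rates settle near an expectile of
    the reward distribution instead of its mean."""
    v = 0.0
    for r in rewards:
        delta = r - v
        v += (alpha_pos if delta > 0 else alpha_neg) * delta
    return v

rng = np.random.default_rng(0)
rewards = rng.choice([0.1, 1.0], size=5000)           # variable rewards, p=0.5 each
optimistic = run_value_updates(0.09, 0.01, rewards)   # weights upside errors
pessimistic = run_value_updates(0.01, 0.09, rewards)  # weights downside errors
balanced = run_value_updates(0.05, 0.05, rewards)     # tracks the mean (~0.55)
```

A population spanning many asymmetries thereby encodes the spread of possible rewards, i.e., a readout of risk.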


Subjects
Dopamine; Reinforcement, Psychology; Brain; Humans; Learning; Reward
8.
Behav Processes ; 167: 103891, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31381985

ABSTRACT

We review the abstract concept of a 'state' - an internal representation posited by reinforcement learning theories to be used by an agent, whether animal, human or artificial, to summarize the features of the external and internal environment that are relevant for future behavior on a particular task. Armed with this summary representation, an agent can make decisions and perform actions to interact effectively with the world. Here, we review recent findings from the neurobiological and behavioral literature to ask: 'what is a state?' with respect to the internal representations that organize learning and decision making across a range of tasks. We find that state representations include information beyond a straightforward summary of the immediate cues in the environment, providing timing or contextual information from the recent or more distant past, which allows these additional factors to influence decision making and other goal-directed behaviors in complex and perhaps unexpected ways.


Subjects
Decision Making; Learning; Reinforcement, Psychology; Animals; Cues; Humans; Psychological Theory; Reward
9.
Psychopharmacology (Berl) ; 236(8): 2543-2556, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31256220

ABSTRACT

RATIONALE: Pairing rewarding outcomes with audiovisual cues in simulated gambling games increases risky choice in both humans and rats. However, the cognitive mechanism through which this sensory enhancement biases decision-making is unknown. OBJECTIVES: To assess the computational mechanisms that promote risky choice during gambling, we applied a series of reinforcement learning models to a large dataset of choices acquired from rats as they each performed one of two variants of a rat gambling task (rGT), in which rewards on "win" trials were delivered either with or without salient audiovisual cues. METHODS: We used a sampling technique based on Markov chain Monte Carlo to obtain posterior estimates of model parameters for a series of RL models of increasing complexity, in order to assess the relative contribution of learning about positive and negative outcomes to the latent valuation of each choice option on the cued and uncued rGT. RESULTS: Rats which develop a preference for the risky options on the rGT substantially down-weight the equivalent cost of the time-out punishments during these tasks. For each model tested, the reduction in learning from the negative time-outs correlated with the degree of risk preference in individual rats. We found no apparent relationship between risk preference and the parameters that govern learning from the positive rewards. CONCLUSIONS: The emergence of risk-preferring choice on the rGT derives from a relative insensitivity to the cost of the time-out punishments, as opposed to a relative hypersensitivity to rewards. This hyposensitivity to punishment is more likely to be induced in individual rats by the addition of salient audiovisual cues to rewards delivered on win trials.
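The conclusion, that down-weighting time-out punishments inflates the value of risky options, can be sketched with the kind of asymmetric-learning-rate rule this model series uses. The outcome magnitudes and rates below are illustrative, and the paper's actual fitting used MCMC posterior estimation over likelihoods, not this forward simulation:

```python
import numpy as np

def learned_risky_value(alpha_gain, alpha_loss, n_trials=2000, seed=0):
    """Q-value of a risky option paying a cued win (+1) half the time
    and a time-out punishment (-2) otherwise, with separate learning
    rates for positive and negative prediction errors."""
    rng = np.random.default_rng(seed)
    q = 0.0
    for _ in range(n_trials):
        outcome = 1.0 if rng.random() < 0.5 else -2.0
        delta = outcome - q
        q += (alpha_gain if delta > 0 else alpha_loss) * delta
    return q

q_insensitive = learned_risky_value(0.1, 0.01)  # down-weights time-outs
q_symmetric = learned_risky_value(0.02, 0.02)   # learns equally from both
```

The punishment-insensitive learner assigns the risky option a positive value even though its expected outcome is negative, reproducing the mechanism behind the risk-preferring phenotype.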


Subjects
Cues; Decision Making/physiology; Gambling/psychology; Punishment/psychology; Reward; Animals; Choice Behavior/physiology; Conditioning, Operant/physiology; Humans; Male; Rats; Rats, Long-Evans; Reinforcement, Psychology; Time Factors
10.
Curr Opin Neurobiol ; 49: 1-7, 2018 04.
Article in English | MEDLINE | ID: mdl-29096115

ABSTRACT

Phasic dopamine responses are thought to encode a prediction-error signal consistent with model-free reinforcement learning theories. However, a number of recent findings highlight the influence of model-based computations on dopamine responses, and suggest that dopamine prediction errors reflect more dimensions of an expected outcome than scalar reward value. Here, we review a selection of these recent results and discuss the implications and complications of model-based predictions for computational theories of dopamine and learning.


Subjects
Computer Simulation; Dopamine/physiology; Learning/physiology; Models, Neurological; Animals; Humans; Reinforcement, Psychology
11.
Neuron ; 94(4): 700-702, 2017 May 17.
Article in English | MEDLINE | ID: mdl-28521123

ABSTRACT

In this issue of Neuron, Murakami et al. (2017) relate neural activity in frontal cortex to stochastic and deterministic components of waiting behavior in rats; they find that mPFC biases waiting time, while M2 is ultimately responsible for trial-to-trial variability in decisions about how long to wait.


Subjects
Frontal Lobe; Prefrontal Cortex; Animals; Neurons; Rats
12.
Neuron ; 91(1): 182-93, 2016 07 06.
Article in English | MEDLINE | ID: mdl-27292535

ABSTRACT

Dopamine neurons signal reward prediction errors. This requires accurate reward predictions. It has been suggested that the ventral striatum provides these predictions. Here we tested this hypothesis by recording from putative dopamine neurons in the VTA of rats performing a task in which prediction errors were induced by shifting reward timing or number. In controls, the neurons exhibited error signals in response to both manipulations. However, dopamine neurons in rats with ipsilateral ventral striatal lesions exhibited errors only to changes in number and failed to respond to changes in timing of reward. These results, supported by computational modeling, indicate that predictions about the temporal specificity and the number of expected rewards are dissociable and that dopaminergic prediction-error signals rely on the ventral striatum for the former but not the latter.


Subjects
Basal Ganglia/physiology; Dopamine/metabolism; Dopaminergic Neurons/metabolism; Reward; Ventral Striatum/physiology; Ventral Tegmental Area/physiology; Animals; Rats, Long-Evans
13.
Phys Rev E Stat Nonlin Soft Matter Phys ; 86(6 Pt 1): 061903, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23367972

ABSTRACT

The minimal integrate-and-fire-or-burst neuron model succinctly describes both tonic firing and postinhibitory rebound bursting of thalamocortical cells in the sensory relay. Networks of integrate-and-fire-or-burst (IFB) neurons with slow inhibitory synaptic interactions have been shown to support stable rhythmic states, including globally synchronous and cluster oscillations, in which network-mediated inhibition cyclically generates bursting in coherent subgroups of neurons. In this paper, we introduce a reduced IFB neuronal population model to study synchronization of inhibition-mediated oscillatory bursting states to periodic excitatory input. Using numeric methods, we demonstrate the existence and stability of 1:1 phase-locked bursting oscillations in the sinusoidally forced IFB neuronal population model. Phase locking is shown to arise when periodic excitation is sufficient to pace the onset of bursting in an IFB cluster without counteracting the inhibitory interactions necessary for burst generation. Phase-locked bursting states are thus found to destabilize when periodic excitation increases in strength or frequency. Further study of the IFB neuronal population model with pulse-like periodic excitatory input illustrates that this synchronization mechanism generalizes to a broad range of n:m phase-locked bursting states across both globally synchronous and clustered oscillatory regimes.
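The single-cell mechanism behind the population model can be sketched with an Euler integration of one minimal IFB unit: a leaky integrate-and-fire neuron plus a slow T-current gate h that de-inactivates during hyperpolarization and drives a burst on rebound. The parameter values below are illustrative, not those of the paper:

```python
def simulate_ifb(current, t_max=500.0, dt=0.1):
    """Euler simulation of a minimal integrate-and-fire-or-burst neuron.
    Returns spike times (ms). Parameters are illustrative only."""
    C, gL, vL = 2.0, 0.035, -65.0        # capacitance and leak
    gT, vh, vT = 0.07, -70.0, 120.0      # T-current conductance, gate, reversal
    v_theta, v_reset = -35.0, -50.0      # spike threshold and reset
    tau_minus, tau_plus = 20.0, 100.0    # h inactivation / de-inactivation
    v, h = vL, 0.0
    spikes = []
    for i in range(int(t_max / dt)):
        t = i * dt
        gate = v >= vh                   # T current flows only above vh
        dv = (current(t) - gL * (v - vL)
              - (gT * h * (v - vT) if gate else 0.0)) / C
        dh = -h / tau_minus if gate else (1.0 - h) / tau_plus
        v += dt * dv
        h += dt * dh
        if v >= v_theta:                 # fire and reset
            spikes.append(t)
            v = v_reset
    return spikes

tonic = simulate_ifb(lambda t: 1.3)                         # regular firing
rebound = simulate_ifb(lambda t: -0.5 if t < 200 else 0.0)  # burst at release
```

Sustained depolarizing current gives tonic spiking (h stays inactivated), while release from a hyperpolarizing pulse lets the recovered T current fire a postinhibitory burst, the event that network inhibition paces in the clustered oscillations studied here.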


Subjects
Biophysics/methods; Neurons/physiology; Thalamus/physiology; Animals; Brain/metabolism; Calcium/metabolism; Computer Simulation; Humans; Models, Neurological; Neurons/metabolism; Oscillometry/methods
14.
Prog Biophys Mol Biol ; 105(1-2): 58-66, 2011 Mar.
Article in English | MEDLINE | ID: mdl-20869386

ABSTRACT

Cortical population responses to sensory input arise from the interaction between external stimuli and the intrinsic dynamics of the densely interconnected neuronal population. Although there is a large body of knowledge regarding single neuron responses to periodic stimuli, responses at the scale of cortical populations are incompletely understood. The characteristics of large-scale neuronal activity during periodic stimulation speak directly to the mechanisms underlying collective neuronal activity. Their accurate elucidation is hence a vital prelude to constructing and evaluating large-scale computational and biophysical models of the brain. Electroencephalographic data were recorded from eight human subjects while periodic vibrotactile stimuli were applied to the fingertip. Time-frequency decomposition was performed on the multi-channel data in order to investigate relative changes in the power and phase distributions at stimulus-related frequencies. We observed phase locked oscillatory activity at multiple stimulus-specific frequencies, in particular at ratios of 1:1, 2:1 and 2:3 to the stimulus frequency. These phase locked components were found to be modulated differently across the range of stimulus frequencies, with oscillatory responses most robustly sustained around 30 Hz. In contrast, no robust frequency-locked responses were apparent in the power changes. These results demonstrate n:m phase synchronization between cortical oscillations in the somatosensory system and an external periodic signal. We argue that neuronal populations evidence a collective nonlinear response to periodic sensory input. The existence of n:m phase synchronization demonstrates the contribution of intrinsic cortical dynamics to stimulus encoding and provides a novel phenomenological criterion for the validation of large-scale models of the brain.
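The n:m phase-synchronization measure reduces to a circular mean of the generalized phase difference n·φ1 − m·φ2. The synthetic 20 Hz / 30 Hz phases and jitter level below are invented for illustration; in practice the instantaneous phases would be extracted from the EEG (e.g., via a Hilbert or wavelet transform):

```python
import numpy as np

def nm_phase_locking(phi1, phi2, n, m):
    """n:m phase-locking index |<exp(i*(n*phi1 - m*phi2))>|: 1 for
    perfect n:m synchronization, near 0 for unrelated phases."""
    return float(np.abs(np.mean(np.exp(1j * (n * phi1 - m * phi2)))))

# synthetic example: 20 Hz stimulus phase, jittered 30 Hz response phase
rng = np.random.default_rng(0)
t = np.arange(0.0, 10.0, 1e-3)                # 10 s sampled at 1 kHz
phi_stim = 2 * np.pi * 20 * t
phi_resp = 2 * np.pi * 30 * t + 0.3 * rng.standard_normal(t.size)
locked = nm_phase_locking(phi_stim, phi_resp, 3, 2)    # 3*20 Hz = 2*30 Hz
unlocked = nm_phase_locking(phi_stim, phi_resp, 1, 1)  # no 1:1 relation
```

A high index at a non-trivial integer ratio, with a low index at 1:1, is the signature of the nonlinear n:m locking reported in the somatosensory responses.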


Subjects
Brain/physiology; Electroencephalography/methods; Neurons/physiology; Somatosensory Cortex/physiology; Adult; Female; Humans; Male; Models, Neurological; Oscillometry; Periodicity; Time Factors