Results 1 - 6 of 6
1.
PLoS Comput Biol ; 20(5): e1012175, 2024 May.
Article in English | MEDLINE | ID: mdl-38805546

ABSTRACT

The structural credit assignment problem arises when the causal structure between actions and subsequent outcomes is hidden from direct observation. To solve this problem and enable goal-directed behavior, an agent has to infer structure and form a representation thereof. In the scope of this study, we investigate a possible solution in the human brain. We recorded behavioral and electrophysiological data from human participants in a novel variant of the bandit task, where multiple actions lead to multiple outcomes. Crucially, the mapping between actions and outcomes was hidden and not instructed to the participants. Human choice behavior revealed clear hallmarks of credit assignment and learning. Moreover, a computational model which formalizes action selection as the competition between multiple representations of the hidden structure was fit to account for participants' data. Starting in a state of uncertainty about the correct representation, the central mechanism of this model is the arbitration of action control towards the representation which minimizes surprise about outcomes. Crucially, single-trial latent-variable analysis reveals that the neural patterns clearly support central quantitative predictions of this surprise minimization model. The results suggest that neural activity is not only related to reinforcement learning under correct as well as incorrect task representations but also reflects central mechanisms of credit assignment and behavioral arbitration.


Subject(s)
Choice Behavior , Humans , Male , Choice Behavior/physiology , Female , Adult , Brain/physiology , Models, Neurological , Computational Biology , Young Adult , Learning/physiology , Reinforcement, Psychology , Computer Simulation , Electroencephalography
2.
Behav Res Methods ; 2023 Oct 16.
Article in English | MEDLINE | ID: mdl-37845425

ABSTRACT

Social stimuli seem to be processed more easily and efficiently than non-social stimuli. The current study tested whether social feedback stimuli improve reward learning in a probabilistic reward task (PRT), in which one response option is usually rewarded more often than the other via presentation of non-social reward stimuli. In a pre-registered online study with 305 participants, 75 participants were presented with a non-social feedback stimulus (a star) and information about gains, which is typically used in published PRT studies. Three other groups (with 73-82 participants each) were presented with one of three social feedback stimuli: verbal praise, an attractive happy face, or a "thumbs up"-picture. The data were analysed based on classical signal detection theory, drift diffusion modelling, and Bayesian analyses of null effects. All PRT variants yielded the expected behavioural preference for the more frequently rewarded response. There was no processing advantage of social over non-social feedback stimuli. Bayesian analyses further supported the observation that social feedback stimuli neither increased nor decreased behavioural preferences in the PRT. The current findings suggest that the PRT is a robust experimental paradigm independent of the applied feedback stimuli. They also suggest that the occurrence of a processing advantage for social feedback stimuli is dependent on the experimental task and design.

3.
Brain Stimul ; 16(4): 1001-1008, 2023.
Article in English | MEDLINE | ID: mdl-37348704

ABSTRACT

BACKGROUND: Transcutaneous auricular vagus nerve stimulation (taVNS) has been tested as a potential treatment for pharmaco-resistant epilepsy and depression. Its clinical efficacy is thought to depend on taVNS-induced activation of the locus coeruleus and other neuromodulator systems. However, unlike for invasive VNS in rodents, there is little evidence for an effect of taVNS on noradrenergic activity.
OBJECTIVE: We attempted to replicate recently published findings by Sharon et al. (2021), showing that short bursts of taVNS transiently increased pupil size and decreased EEG alpha power, two correlates of central noradrenergic activity.
METHODS: Following the original study, we used a single-blind, sham-controlled, randomized cross-over design. Human volunteers (n = 29) received short-term (3.4 s) taVNS at the maximum level below the pain threshold, while we collected resting-state pupil-size and EEG data. To analyze the data, we used scripts provided by Sharon and colleagues.
RESULTS: Consistent with Sharon et al. (2021), pupil dilation was significantly larger during taVNS than during sham stimulation (p = .009; Bayes factor supporting the difference = 7.45). However, we failed to replicate the effect of taVNS on EEG alpha power (p = .37); the data were four times more likely under the null hypothesis (BF10 = 0.28).
CONCLUSION: Our findings support the effectiveness of short-term taVNS in inducing transient pupil dilation, a correlate of phasic noradrenergic activity. However, we failed to replicate the recent finding by Sharon et al. (2021) that taVNS attenuates EEG alpha activity. Overall, this study highlights the need for continued research on the neural mechanisms underlying taVNS efficacy and its potential as a treatment option for pharmaco-resistant conditions. It also highlights the need for direct replications of influential taVNS studies.
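The relation between the reported BF10 and the "times more likely under the null hypothesis" phrasing can be checked with a short calculation. The sketch below is only an arithmetic illustration; the function name `bf01_from_bf10` is hypothetical and is not taken from the study's scripts.

```python
def bf01_from_bf10(bf10: float) -> float:
    """Convert a Bayes factor for H1 over H0 (BF10) into H0 over H1 (BF01)."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    # BF01 is the reciprocal of BF10 by definition.
    return 1.0 / bf10


# EEG alpha power result quoted in the abstract: BF10 = 0.28
bf01 = bf01_from_bf10(0.28)
print(round(bf01, 2))  # 3.57
```

A BF01 of about 3.57 is consistent with the abstract's statement that the data were roughly four times more likely under the null hypothesis.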


Subject(s)
Transcutaneous Electric Nerve Stimulation , Vagus Nerve Stimulation , Humans , Pupil/physiology , Single-Blind Method , Bayes Theorem , Vagus Nerve/physiology , Electroencephalography
5.
J Cogn Neurosci ; 34(1): 34-53, 2021 Dec 06.
Article in English | MEDLINE | ID: mdl-34879392

ABSTRACT

The goal of temporal difference (TD) reinforcement learning is to maximize outcomes and improve future decision-making. It does so by utilizing a prediction error (PE), which quantifies the difference between the expected and the obtained outcome. In gambling tasks, however, decision-making cannot be improved because of the lack of learnability. On the basis of the idea that TD utilizes two independent bits of information from the PE (valence and surprise), we asked which of these aspects is affected when a task is not learnable. We contrasted behavioral data and ERPs in a learning variant and a gambling variant of a simple two-armed bandit task, in which outcome sequences were matched across tasks. Participants were explicitly informed that feedback could be used to improve performance in the learning task but not in the gambling task, and we predicted a corresponding modulation of the aspects of the PE. We used a model-based analysis of ERP data to extract the neural footprints of the valence and surprise information in the two tasks. Our results revealed that task learnability modulates reinforcement learning via the suppression of surprise processing but leaves the processing of valence unaffected. On the basis of our model and the data, we propose that task learnability can selectively suppress TD learning as well as alter behavioral adaptation based on a flexible cost-benefit arbitration.
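The split of a prediction error into valence and surprise can be illustrated with a minimal sketch. It assumes a simple delta-rule update for a single bandit arm, treats valence as the sign of the PE and surprise as its absolute size, and uses hypothetical function names and an arbitrary learning rate; it is not the authors' model.

```python
def prediction_error(expected: float, obtained: float) -> float:
    """Signed prediction error: obtained outcome minus expected outcome."""
    return obtained - expected


def decompose(pe: float) -> tuple[int, float]:
    """Split a prediction error into valence (sign) and surprise (magnitude)."""
    valence = (pe > 0) - (pe < 0)  # +1 better than expected, -1 worse, 0 as expected
    surprise = abs(pe)             # unsigned size of the deviation
    return valence, surprise


def update_value(expected: float, obtained: float, learning_rate: float = 0.1) -> float:
    """Delta-rule update: move the expectation a fraction of the way toward the outcome."""
    return expected + learning_rate * prediction_error(expected, obtained)


# One trial: the arm was expected to pay 0.5 and paid 1.0.
pe = prediction_error(0.5, 1.0)
valence, surprise = decompose(pe)
new_expected = update_value(0.5, 1.0)
```

In a task where feedback cannot improve performance, a learner of this kind could keep the valence signal while scaling down its use of the surprise term, which is one way to read the dissociation the abstract describes.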


Subject(s)
Gambling , Reward , Decision Making , Evoked Potentials , Humans , Learning , Reinforcement, Psychology
6.
Cogn Affect Behav Neurosci ; 20(5): 1070-1089, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32812148

ABSTRACT

Decision making relies on the interplay between two distinct learning mechanisms, namely habitual model-free learning and goal-directed model-based learning. Recent literature suggests that this interplay is significantly shaped by the environmental structure as represented by an internal model. We employed a modified two-stage but one-decision Markov decision task to investigate how two internal models differing in the predictability of stage transitions influence the neural correlates of feedback processing. Our results demonstrate that fronto-central theta and the feedback-related negativity (FRN), two correlates of reward prediction errors in the medial frontal cortex, are independent of the internal representations of the environmental structure. In contrast, centro-parietal delta and the P3, two correlates possibly reflecting feedback evaluation in working memory, were highly susceptible to the underlying internal model. Model-based analyses of single-trial activity showed a comparable pattern, indicating that while the computation of unsigned reward prediction errors is represented by theta and the FRN irrespective of the internal models, the P3 adapts to the internal representation of an environment. Our findings further substantiate the assumption that the feedback-locked components under investigation reflect distinct mechanisms of feedback processing and that different internal models selectively influence these mechanisms.


Subject(s)
Adaptation, Psychological/physiology , Delta Rhythm/physiology , Evoked Potentials/physiology , Feedback, Psychological/physiology , Psychomotor Performance/physiology , Reward , Theta Rhythm/physiology , Adult , Event-Related Potentials, P300/physiology , Female , Humans , Male , Young Adult