Results 1 - 20 of 20
1.
J Neurosci ; 44(20), 2024 May 15.
Article in English | MEDLINE | ID: mdl-38569923

ABSTRACT

Our prior research has identified neural correlates of cognitive control in the anterior cingulate cortex (ACC), leading us to hypothesize that the ACC is necessary for increasing attention as rats flexibly learn new contingencies during a complex reward-guided decision-making task. Here, we tested this hypothesis by using optogenetics to transiently inhibit the ACC, while rats of either sex performed the same two-choice task. ACC inhibition had a profound impact on behavior that extended beyond deficits in attention during learning when expected outcomes were uncertain. We found that ACC inactivation slowed and reduced the number of trials rats initiated and impaired both their accuracy and their ability to complete sessions. Furthermore, drift-diffusion model analysis suggested that free-choice performance and evidence accumulation (i.e., reduced drift rates) were degraded during initial learning, leading to weaker associations that were more easily overridden in later trial blocks (i.e., stronger bias). Together, these results suggest that in addition to attention-related functions, the ACC contributes to the ability to initiate trials and generally stay on task.


Subjects
Gyrus Cinguli , Optogenetics , Long-Evans Rats , Animals , Gyrus Cinguli/physiology , Male , Rats , Female , Attention/physiology , Reward , Choice Behavior/physiology , Decision Making/physiology , Neural Inhibition/physiology
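
The abstract above reports reduced drift rates in a drift-diffusion model. As a rough illustration of what a drift rate does, the following minimal sketch simulates a generic two-boundary diffusion process. It is not the authors' model or fitting procedure; the function names, the symmetric boundaries, and every parameter value are assumptions chosen only for illustration.

```python
import random


def simulate_ddm(drift, threshold=1.0, noise=1.0, dt=0.001, max_time=5.0, rng=None):
    """Simulate one trial of a two-boundary drift-diffusion process.

    Returns (choice, rt): choice is 1 for the upper boundary, 0 for the
    lower boundary, or None if no boundary is reached within max_time.
    All parameter values are illustrative assumptions.
    """
    rng = rng or random.Random()
    evidence, elapsed = 0.0, 0.0
    step_sd = noise * dt ** 0.5  # standard deviation of the noise per time step
    while elapsed < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
        if evidence >= threshold:
            return 1, elapsed
        if evidence <= -threshold:
            return 0, elapsed
    return None, elapsed


def accuracy(drift, n_trials=300, seed=0):
    """Fraction of simulated trials ending at the upper ("correct") boundary."""
    rng = random.Random(seed)
    choices = [simulate_ddm(drift, rng=rng)[0] for _ in range(n_trials)]
    return sum(choice == 1 for choice in choices) / n_trials
```

Under these assumptions, comparing `accuracy(2.0)` with `accuracy(0.2)` shows that a lower drift rate yields fewer upper-boundary responses, which is the qualitative sense in which reduced drift rates indicate degraded evidence accumulation.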
2.
Proc Natl Acad Sci U S A ; 118(1), 2021 01 05.
Article in English | MEDLINE | ID: mdl-33443150

ABSTRACT

Real-life decisions are often repeated. Whether considering taking a job in a new city, or doing something mundane like checking if the stove is off, decisions are frequently revisited even if no new information is available. This mode of behavior takes a particularly pathological form in obsessive-compulsive disorder (OCD), which is marked by individuals' redeliberating previously resolved decisions. Surprisingly, little is known about how information is transferred across decision episodes in such circumstances, and whether and how such transfer varies in OCD. In two experiments, data from a repeated decision-making task and computational modeling revealed that both implicit and explicit memories of previous decisions affected subsequent decisions by biasing the rate of evidence integration. Further, we replicated previous work demonstrating impairments in baseline decision-making as a function of self-reported OCD symptoms, and found that information transfer effects specifically due to implicit memory were reduced, offering computational insight into checking behavior.


Subjects
Decision Making/physiology , Memory/physiology , Obsessive-Compulsive Disorder/physiopathology , Adult , Female , Humans , Male , Theoretical Models , Obsessive Behavior/metabolism , Obsessive Behavior/physiopathology
3.
PLoS Comput Biol ; 18(5): e1010047, 2022 05.
Article in English | MEDLINE | ID: mdl-35511764

ABSTRACT

A large literature has accumulated suggesting that human and animal decision making is driven by at least two systems, and that important functions of these systems can be captured by reinforcement learning algorithms. The "model-free" system caches and uses stimulus-value or stimulus-response associations, and the "model-based" system implements more flexible planning using a model of the world. However, it is not clear how the two systems interact during deliberation and how a single decision emerges from this process, especially when they disagree. Most previous work has assumed that while the systems operate in parallel, they do so independently, and they combine linearly to influence decisions. Using an integrated reinforcement learning/drift-diffusion model, we tested the hypothesis that the two systems interact in a non-linear fashion similar to other situations with cognitive conflict. We differentiated two forms of conflict: action conflict, a binary state representing whether the systems disagreed on the best action, and value conflict, a continuous measure of the extent to which the two systems disagreed on the difference in value between the available options. We found that decisions with greater value conflict were characterized by reduced model-based control and increased caution both with and without action conflict. Action conflict itself (the binary state) acted in the opposite direction, although its effects were less prominent. We also found that between-system conflict was highly correlated with within-system conflict, and although it is less clear a priori why the latter might influence the strength of each system above its standard linear contribution, we could not rule it out. Our work highlights the importance of non-linear conflict effects, and provides new constraints for more detailed process models of decision making. 
It also presents new avenues to explore with relation to disorders of compulsivity, where an imbalance between systems has been implicated.


Subjects
Decision Making , Psychological Reinforcement , Algorithms , Animals , Decision Making/physiology
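
The abstract above distinguishes a binary action conflict from a continuous value conflict between the model-free and model-based systems. One way these two quantities might be computed for a two-option choice is sketched below. The abstract does not give exact formulas, so the definitions, the function names, and the example values are assumptions.

```python
def action_conflict(q_model_free, q_model_based):
    """Binary conflict: True when the two systems prefer different actions."""
    best_mf = max(range(len(q_model_free)), key=lambda a: q_model_free[a])
    best_mb = max(range(len(q_model_based)), key=lambda a: q_model_based[a])
    return best_mf != best_mb


def value_conflict(q_model_free, q_model_based):
    """Continuous conflict for two options (an assumed definition).

    Measures how much the systems disagree about the difference in value
    between option 0 and option 1.
    """
    diff_mf = q_model_free[0] - q_model_free[1]
    diff_mb = q_model_based[0] - q_model_based[1]
    return abs(diff_mf - diff_mb)


# Hypothetical action values for two options under each system.
q_mf = [0.75, 0.25]
q_mb = [0.25, 0.5]
print(action_conflict(q_mf, q_mb))  # the systems prefer different options
print(value_conflict(q_mf, q_mb))   # size of the disagreement in value differences
```

Note that the two measures can dissociate: the systems may agree on the best action while still disagreeing substantially about how much better it is.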
4.
Proc Natl Acad Sci U S A ; 112(37): 11708-13, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26324932

ABSTRACT

Research on the dynamics of reward-based, goal-directed decision making has largely focused on simple choice, where participants decide among a set of unitary, mutually exclusive options. Recent work suggests that the deliberation process underlying simple choice can be understood in terms of evidence integration: Noisy evidence in favor of each option accrues over time, until the evidence in favor of one option is significantly greater than the rest. However, real-life decisions often involve not one, but several steps of action, requiring a consideration of cumulative rewards and a sensitivity to recursive decision structure. We present results from two experiments that leveraged techniques previously applied to simple choice to shed light on the deliberation process underlying multistep choice. We interpret the results from these experiments in terms of a new computational model, which extends the evidence accumulation perspective to multiple steps of action.


Subjects
Choice Behavior , Decision Making , Bayes Theorem , Computer Simulation , Humans , Learning , Neurological Models , Psychological Reinforcement , Reproducibility of Results , Reward
5.
PLoS Comput Biol ; 10(8): e1003779, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25122479

ABSTRACT

Human behavior has long been recognized to display hierarchical structure: actions fit together into subtasks, which cohere into extended goal-directed activities. Arranging actions hierarchically has well established benefits, allowing behaviors to be represented efficiently by the brain, and allowing solutions to new tasks to be discovered easily. However, these payoffs depend on the particular way in which actions are organized into a hierarchy, the specific way in which tasks are carved up into subtasks. We provide a mathematical account for what makes some hierarchies better than others, an account that allows an optimal hierarchy to be identified for any set of tasks. We then present results from four behavioral experiments, suggesting that human learners spontaneously discover optimal action hierarchies.


Subjects
Behavior/physiology , Goals , Learning/physiology , Neurological Models , Adolescent , Adult , Computational Biology , Female , Humans , Male , Middle Aged , Young Adult
6.
J Affect Disord ; 360: 345-353, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38806064

ABSTRACT

BACKGROUND: Functional connectivity has garnered interest as a potential biomarker of psychiatric disorders including borderline personality disorder (BPD). However, small sample sizes and lack of within-study replications have led to divergent findings with no clear spatial foci. AIMS: Evaluate discriminative performance and generalizability of functional connectivity markers for BPD. METHOD: Whole-brain fMRI resting state functional connectivity in matched subsamples of 116 BPD and 72 control individuals defined by three grouping strategies. We predicted BPD status using classifiers with repeated cross-validation based on multiscale functional connectivity within and between regions of interest (ROIs) covering the whole brain (global ROI-based network, seed-based ROI-connectivity, functional consistency, and voxel-to-voxel connectivity) and evaluated the generalizability of the classification in the left-out portion of non-matched data. RESULTS: Full-brain connectivity allowed classification (∼70 %) of BPD patients vs. controls in matched inner cross-validation. The classification remained significant when applied to unmatched out-of-sample data (∼61-70 %). Highest seed-based accuracies were in a similar range to global accuracies (∼70-75 %), but spatially more specific. The most discriminative seed regions included midline, temporal and somatomotor regions. Univariate connectivity values were not predictive of BPD after multiple comparison corrections, but weak local effects coincided with the most discriminative seed-ROIs. Highest accuracies were achieved with a full clinical interview while self-report results remained at chance level. LIMITATIONS: The accuracies vary considerably between random sub-samples of the population, global signal and covariates limiting the practical applicability. CONCLUSIONS: Spatially distributed functional connectivity patterns are moderately predictive of BPD despite heterogeneity of the patient population.


Subjects
Borderline Personality Disorder , Brain , Machine Learning , Magnetic Resonance Imaging , Humans , Borderline Personality Disorder/physiopathology , Borderline Personality Disorder/diagnosis , Female , Adult , Male , Brain/physiopathology , Brain/diagnostic imaging , Young Adult , Connectome/methods , Case-Control Studies , Brain Mapping/methods
7.
Behav Res Methods ; 45(4): 1293-312, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23549683

ABSTRACT

Recent advances in neuroimaging and neural recording techniques have enabled researchers to make significant progress in understanding the neural mechanisms underlying human spatial navigation. Because these techniques generally require participants to remain stationary, computer-generated virtual environments are used. We introduce PandaEPL, a programming library for the Python language designed to simplify the creation of computer-controlled spatial-navigation experiments. PandaEPL is built on top of Panda3D, a modern open-source game engine. It allows users to construct three-dimensional environments that participants can navigate from a first-person perspective. Sound playback and recording and also joystick support are provided through the use of additional optional libraries. PandaEPL also handles many tasks common to all cognitive experiments, including managing configuration files, logging all internal and participant-generated events, and keeping track of the experiment state. We describe how PandaEPL compares with other software for building spatial-navigation experiments and walk the reader through the process of creating a fully functional experiment.


Subjects
Behavioral Sciences/instrumentation , Behavioral Sciences/methods , Digital Libraries , Software , Spatial Behavior/physiology , User-Computer Interface , Adult , Computers , Data Display , Environment Design , Humans , Research Design
8.
Mem Cognit ; 40(2): 177-90, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22057363

ABSTRACT

The well-known finding that responses in serial recall tend to be clustered around the position of the target item has bolstered positional-coding theories of serial order memory. In the present study, we show that this effect is confounded with another well-known finding: that responses in serial recall also tend to be clustered around the position of the prior recall (temporal clustering). The confound can be alleviated by conditioning each analysis on the positional accuracy of the previously recalled item. The revised analyses show that temporal clustering is much more prevalent in serial recall than is positional clustering. A simple associative chaining model with asymmetric neighboring, remote associations, and a primacy gradient can account for these effects. Using the same parameter values, the model produces reasonable serial position curves and captures the changes in item and order information across study-test trials. In contrast, a prominent positional coding model cannot account for the pattern of clustering uncovered by the new analyses.


Subjects
Association , Mental Recall/physiology , Psychological Models , Serial Learning/physiology , Humans , Time Factors
9.
Comput Psychiatr ; 6(1): 79-95, 2022.
Article in English | MEDLINE | ID: mdl-38774779

ABSTRACT

Computational models of decision making have identified a relationship between obsessive-compulsive symptoms (OCS), both in the general population and in patients, and impairments in perceptual evidence accumulation. Some studies have interpreted these deficits to reflect global disease traits which give rise to clusters of OCS. Such assumptions are not uncommon, even if implicit, in computational psychiatry more broadly. However, it is well established that state- and trait-symptom scores are often correlated (e.g., state and trait anxiety), and the extent to which perceptual deficits are actually explained by state-based symptoms is unclear. State-based symptoms may give rise to information processing differences in a number of ways, including the mechanistically less interesting possibility of tying up working memory and attentional resources for off-task processing. In a general population sample (N = 150), we investigated the extent to which previously identified impairments in perceptual evidence accumulation were related to trait vs state-based OCS. In addition, we tested whether differences in working memory capacity moderated state-based impairments, such that impairments were worse in individuals with lower working memory capacity. We replicated previous work demonstrating a negative relationship between the rate of evidence accumulation and trait-based OCS when state-based symptoms were unaccounted for. When state-based effects were included in the model, they captured a significant degree of impairment while trait-based effects were attenuated, although they did not disappear completely. We did not find evidence that working memory capacity moderated the state-based effects. Our work suggests that investigating the relationship between information processing and state-based symptoms may be important more generally in computational psychiatry beyond this specific context.

10.
Neuroimage Clin ; 34: 102975, 2022.
Article in English | MEDLINE | ID: mdl-35255416

ABSTRACT

Previous studies have demonstrated that the rate of evidence integration during perceptual decision making, a specific computationally defined parameter, is negatively correlated with both subclinical symptoms of OCD measured on a continuum and categorically diagnosed patient status. However, the neural mechanisms underlying this deficit are unknown. Separate work has shown that both gamma and beta-band power are related to evidence integration, and differences in beta-band power in particular have been hypothesized to hinder flexible behavioral control. We sought to unify these two disparate literatures, one on OCD-related information processing differences constrained by behavioral data alone, and the other on the neural correlates of evidence integration. Using computational modeling and scalp EEG, we tested (N = 67) the relationships between subclinical symptom scores, drift rate, and gamma/beta-band activity during perceptual decision making. We replicated both prior work showing deficits in evidence integration as a function of OCD symptoms, and work showing a relationship between evidence integration and gamma and beta-band power. As predicted, the slope of beta-band power was correlated with OCD symptoms. However, the relationships between OCD symptoms and drift rate and the slopes of gamma and beta-band power and drift rate remained unchanged when simultaneously accounting for all variables, speaking against the hypothesis that differences in beta-band power explain drift rate deficits.


Subjects
Obsessive-Compulsive Disorder , Decision Making , Electroencephalography , Humans
11.
J Psychiatr Res ; 138: 428-435, 2021 06.
Article in English | MEDLINE | ID: mdl-33962130

ABSTRACT

Deficits in primary recognition memory and confidence have previously been tested as potential contributors to excessive checking behavior in obsessive-compulsive disorder. Studies have tested both recognition for actions and, hypothesizing that recognition may be disrupted more generally across content domains, verbal recognition memory. However, studies of verbal recognition memory have yielded mixed results. We revisited this work with the benefit of hindsight, running two new experiments with larger samples, the manipulation of recognition difficulty, and a computational model-based approach to data analysis. In both datasets, we found that discriminability, defined as the difference in drift rate for old versus new stimuli in the drift-diffusion model, was reduced as a function of subclinical OCD symptoms in the general population. Paralleling work on drift rate deficits in perceptual decision making in OCD, these reductions were larger for easier recognition decisions. We also asked participants about their confidence in each recognition decision and parcellated confidence into bias, or the difference in overall confidence, and sensitivity, which represents the ability to appropriately map confidence to objective accuracy. We found no consistent evidence of a relationship between OCD symptoms and either quantity.


Subjects
Obsessive-Compulsive Disorder , Humans , Memory , Memory Disorders/etiology , Psychological Recognition
12.
JAMA Psychiatry ; 78(10): 1113-1122, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34319349

ABSTRACT

Importance: Major depressive disorder is prevalent and impairing. Parsing neurocomputational substrates of reinforcement learning in individuals with depression may facilitate a mechanistic understanding of the disorder and suggest new cognitive therapeutic targets. Objective: To determine associations among computational model-derived reinforcement learning parameters, depression symptoms, and symptom changes after treatment. Design, Setting, and Participants: In this mixed cross-sectional-cohort study, individuals performed reward and loss variants of a probabilistic learning task during functional magnetic resonance imaging at baseline and follow-up. A volunteer sample with and without a depression diagnosis was recruited from the community. Participants were assessed from July 2011 to February 2017, and data were analyzed from May 2017 to May 2021. Main Outcomes and Measures: Computational model-based analyses of participants' choices assessed a priori hypotheses about associations between components of reward-based and loss-based learning with depression symptoms. Changes in both learning parameters and symptoms were then assessed in a subset of participants who received cognitive behavioral therapy (CBT). Results: Of 101 included adults, 69 (68.3%) were female, and the mean (SD) age was 34.4 (11.2) years. A total of 69 participants with a depression diagnosis and 32 participants without a depression diagnosis were included at baseline; 48 participants (28 with depression who received CBT and 20 without depression) were included at follow-up (mean [SD] of 115.1 [15.6] days). Computational model-based analyses of behavioral choices and neural data identified associations of learning with symptoms during reward learning and loss learning, respectively. 
During reward learning only, anhedonia (and not negative affect or arousal) was associated with model-derived learning parameters (learning rate: posterior mean regression β = -0.14; 95% credible interval [CrI], -0.12 to -0.03; outcome sensitivity: posterior mean regression β = 0.18; 95% CrI, 0.02 to 0.37) and neural learning signals (moderation of association between striatal prediction error and expected value signals: t97 = -2.10; P = .04). During loss learning only, negative affect (and not anhedonia or arousal) was associated with learning parameters (outcome shift: posterior mean regression β = -0.11; 95% CrI, -0.20 to -0.01) and disrupted neural encoding of learning signals (association with subgenual anterior cingulate prediction error signals: r = -0.28; P = .005). Symptom improvement following CBT was associated with normalization of learning parameters that were disrupted at baseline (reward learning rate: posterior mean regression β = 0.15; 90% CrI, 0.001 to 0.41; loss outcome shift: posterior mean regression β = 0.42; 90% CrI, 0.09 to 0.77). Conclusions and Relevance: In this study, the mapping of reinforcement learning components to symptoms of major depression revealed mechanistic features associated with these symptoms and points to possible learning-based therapeutic processes and targets.


Subjects
Cognitive Behavioral Therapy , Major Depressive Disorder/physiopathology , Major Depressive Disorder/therapy , Gyrus Cinguli/physiopathology , Psychological Reinforcement , Ventral Striatum/physiopathology , Adult , Brain Mapping , Cross-Sectional Studies , Major Depressive Disorder/diagnostic imaging , Female , Gyrus Cinguli/diagnostic imaging , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Probability Learning , Reward , Ventral Striatum/diagnostic imaging , Young Adult
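
The abstract above refers to a learning rate and an outcome sensitivity as model-derived reinforcement learning parameters. A common way such parameters enter a value update is sketched below. This is an assumed, generic Rescorla-Wagner-style parameterization, not the study's actual model; the function name and argument names are hypothetical.

```python
def update_value(value, outcome, learning_rate, outcome_sensitivity):
    """One value update with a sensitivity-scaled outcome (assumed form).

    The outcome is scaled by outcome_sensitivity before the prediction
    error is computed; learning_rate sets how far the value moves toward it.
    """
    prediction_error = outcome_sensitivity * outcome - value
    return value + learning_rate * prediction_error


# Hypothetical example: the same reward moves the value further
# when outcome sensitivity is higher.
print(update_value(0.0, 1.0, learning_rate=0.5, outcome_sensitivity=1.0))
print(update_value(0.0, 1.0, learning_rate=0.5, outcome_sensitivity=2.0))
```

In this form the two parameters play distinct roles: the learning rate controls the speed of updating, while outcome sensitivity controls the asymptotic value an outcome can support.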
13.
Behav Res Methods ; 42(1): 141-7, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20160294

ABSTRACT

Studies of human memory often generate data on the sequence and timing of recalled items, but scoring such data using conventional methods is difficult or impossible. We describe a Python-based semiautomated system that greatly simplifies this task. This software, called PyParse, can easily be used in conjunction with many common experiment authoring systems. Scored data is output in a simple ASCII format and can be accessed with the programming language of choice, allowing for the identification of features such as correct responses, prior-list intrusions, extra-list intrusions, and repetitions.


Subjects
Statistical Data Interpretation , Short-Term Memory , Programming Languages , Humans , Psychological Recognition , Vocabulary
14.
Front Neurosci ; 13: 915, 2019.
Article in English | MEDLINE | ID: mdl-31555082

ABSTRACT

Reward-based decision making is thought to be driven by at least two different types of decision systems: a simple stimulus-response cache-based system which embodies the common-sense notion of "habit," for which model-free reinforcement learning serves as a computational substrate, and a more deliberate, prospective, model-based planning system. Previous work has shown that loss aversion, a well-studied measure of how much more on average individuals weigh losses relative to gains during decision making, is reduced when participants take all possible decisions and outcomes into account including future ones, relative to when they myopically focus on the current decision. Model-based control offers a putative mechanism for implementing such foresight. Using a well-powered data set (N = 117) in which participants completed two different tasks designed to measure each of the two quantities of interest, and four models of choice data for these tasks, we found consistent evidence of a relationship between loss aversion and model-based control but in the direction opposite to that expected based on previous work: loss aversion had a positive relationship with model-based control. We did not find evidence for a relationship between either decision system and risk aversion, a related aspect of subjective utility.

15.
Sci Rep ; 7: 43119, 2017 02 22.
Article in English | MEDLINE | ID: mdl-28225034

ABSTRACT

The laboratory study of how humans and other animals trade off value and time has a long and storied history, and is the subject of a vast literature. However, despite a long history of study, there is no agreed-upon mechanistic explanation of how intertemporal choice preferences arise. Several theorists have recently proposed model-based reinforcement learning as a candidate framework. This framework describes a suite of algorithms by which a model of the environment, in the form of a state transition function and reward function, can be converted on-line into a decision. The state transition function allows the model-based system to make decisions based on projected future states, while the reward function assigns value to each state, together capturing the necessary components for successful intertemporal choice. Empirical work has also pointed to a possible relationship between increased prospection and reduced discounting. In the current paper, we look for direct evidence of a relationship between temporal discounting and model-based control in a large new data set (n = 168). However, testing the relationship under several different modeling formulations revealed no indication that the two quantities are related.


Subjects
Choice Behavior , Decision Making , Delay Discounting , Forecasting , Humans
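
The abstract above concerns temporal discounting, the trade-off between reward value and delay. A widely used functional form is hyperbolic discounting, sketched below. The abstract does not say which discounting formulations were tested, so the hyperbolic form, the function name, and the example numbers are assumptions for illustration.

```python
def discounted_value(amount, delay, discount_rate):
    """Hyperbolic discounting: V = A / (1 + k * D).

    amount is the reward magnitude A, delay is D (in arbitrary time units),
    and discount_rate is k; larger k means steeper discounting.
    """
    return amount / (1.0 + discount_rate * delay)


# Hypothetical choice between a smaller immediate and a larger delayed reward.
immediate = discounted_value(60, delay=0, discount_rate=0.1)
delayed = discounted_value(100, delay=10, discount_rate=0.1)
print("choose delayed" if delayed > immediate else "choose immediate")
```

Fitting the discount rate to a participant's choices gives a single number summarizing how steeply that person devalues delayed rewards.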
16.
Nat Neurosci ; 16(9): 1188-90, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23912946

ABSTRACT

Grid cells in the entorhinal cortex appear to represent spatial location via a triangular coordinate system. Such cells, which have been identified in rats, bats and monkeys, are believed to support a wide range of spatial behaviors. Recording neuronal activity from neurosurgical patients performing a virtual-navigation task, we identified cells exhibiting grid-like spiking patterns in the human brain, suggesting that humans and simpler animals rely on homologous spatial-coding schemes.


Subjects
Brain Mapping , Entorhinal Cortex/cytology , Neurons/physiology , Space Perception/physiology , Spatial Behavior/physiology , Action Potentials/physiology , Entorhinal Cortex/diagnostic imaging , Entorhinal Cortex/pathology , Epilepsy/pathology , Epilepsy/surgery , Humans , Computer-Assisted Image Processing , Magnetic Resonance Imaging , Neurological Models , Movement , X-Ray Computed Tomography , User-Computer Interface
17.
Science ; 342(6162): 1111-4, 2013 Nov 29.
Article in English | MEDLINE | ID: mdl-24288336

ABSTRACT

In many species, spatial navigation is supported by a network of place cells that exhibit increased firing whenever an animal is in a certain region of an environment. Does this neural representation of location form part of the spatiotemporal context into which episodic memories are encoded? We recorded medial temporal lobe neuronal activity as epilepsy patients performed a hybrid spatial and episodic memory task. We identified place-responsive cells active during virtual navigation and then asked whether the same cells activated during the subsequent recall of navigation-related memories without actual navigation. Place-responsive cell activity was reinstated during episodic memory retrieval. Neuronal firing during the retrieval of each memory was similar to the activity that represented the locations in the environment where the memory was initially encoded.


Subjects
Hippocampus/physiology , Episodic Memory , Neurons/physiology , Space Perception/physiology , Cell Separation , Implanted Electrodes , Epilepsy , Hippocampus/cytology , Humans , Temporal Lobe/cytology , Temporal Lobe/physiology , User-Computer Interface
18.
Psychol Rev ; 119(1): 120-54, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22229491

ABSTRACT

Recent work has given rise to the view that reward-based decision making is governed by two key controllers: a habit system, which stores stimulus-response associations shaped by past reward, and a goal-oriented system that selects actions based on their anticipated outcomes. The current literature provides a rich body of computational theory addressing habit formation, centering on temporal-difference learning mechanisms. Less progress has been made toward formalizing the processes involved in goal-directed decision making. We draw on recent work in cognitive neuroscience, animal conditioning, cognitive and developmental psychology, and machine learning to outline a new theory of goal-directed decision making. Our basic proposal is that the brain, within an identifiable network of cortical and subcortical structures, implements a probabilistic generative model of reward, and that goal-directed decision making is effected through Bayesian inversion of this model. We present a set of simulations implementing the account, which address benchmark behavioral and neuroscientific findings, and give rise to a set of testable predictions. We also discuss the relationship between the proposed framework and other models of decision making, including recent models of perceptual choice, to which our theory bears a direct connection.


Subjects
Decision Making/physiology , Goals , Learning/physiology , Neurological Models , Neurosciences , Probability Learning , Algorithms , Animals , Artificial Intelligence , Bayes Theorem , Animal Behavior/physiology , Brain/physiology , Computer Simulation , Psychological Conditioning/physiology , Habits , Humans , Maze Learning , Psychological Models , Nerve Net , Problem Solving , Rats , Reward
20.
Neuron ; 71(2): 370-9, 2011 Jul 28.
Article in English | MEDLINE | ID: mdl-21791294

ABSTRACT

Human behavior displays hierarchical structure: simple actions cohere into subtask sequences, which work together to accomplish overall task goals. Although the neural substrates of such hierarchy have been the target of increasing research, they remain poorly understood. We propose that the computations supporting hierarchical behavior may relate to those in hierarchical reinforcement learning (HRL), a machine-learning framework that extends reinforcement-learning mechanisms into hierarchical domains. To test this, we leveraged a distinctive prediction arising from HRL. In ordinary reinforcement learning, reward prediction errors are computed when there is an unanticipated change in the prospects for accomplishing overall task goals. HRL entails that prediction errors should also occur in relation to task subgoals. In three neuroimaging studies we observed neural responses consistent with such subgoal-related reward prediction errors, within structures previously implicated in reinforcement learning. The results reported support the relevance of HRL to the neural processes underlying hierarchical behavior.


Subjects
Brain Mapping , Brain Waves/physiology , Brain/physiology , Psychological Reinforcement , Adolescent , Adult , Brain/blood supply , Computer Simulation , Electroencephalography/methods , Female , Humans , Computer-Assisted Image Processing , Linear Models , Magnetic Resonance Imaging/methods , Male , Biological Models , Oxygen/blood , Psychomotor Performance/physiology , Young Adult
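
The abstract above contrasts reward prediction errors tied to overall task goals with prediction errors tied to subgoals. The sketch below shows the standard temporal-difference error and applies it separately to a top-level and a subgoal-level value estimate. The value numbers are hypothetical and the two-level setup is a simplified assumption, not the study's task or model.

```python
def td_error(reward, value_next, value_current, gamma=1.0):
    """Temporal-difference prediction error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


# Hypothetical event: the subgoal becomes harder to reach (its value estimate
# drops) while the prospects for the overall goal stay the same.
top_level_error = td_error(reward=0.0, value_next=0.5, value_current=0.5)
subgoal_error = td_error(reward=0.0, value_next=0.25, value_current=0.5)

print(top_level_error)  # no change in overall prospects, so no top-level error
print(subgoal_error)    # a negative error arises only at the subgoal level
```

An event of this kind is diagnostic because ordinary reinforcement learning predicts no error signal, whereas a hierarchical account predicts one at the subgoal level.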