Búsqueda | Portal Regional de la BVS

1.

Individuals with problem gambling and obsessive-compulsive disorder learn through distinct reinforcement mechanisms.

Suzuki, Shinsuke; Zhang, Xiaoliu; Dezfouli, Amir; Braganza, Leah; Fulcher, Ben D; Parkes, Linden; Fontenelle, Leonardo F; Harrison, Ben J; Murawski, Carsten; Yücel, Murat; Suo, Chao.

PLoS Biol ; 21(3): e3002031, 2023 03.

Artículo en Inglés | MEDLINE | ID: mdl-36917567

RESUMEN

Obsessive-compulsive disorder (OCD) and pathological gambling (PG) are accompanied by deficits in behavioural flexibility. In reinforcement learning, this inflexibility can reflect asymmetric learning from outcomes above and below expectations. In alternative frameworks, it reflects perseveration independent of learning. Here, we examine evidence for asymmetric reward-learning in OCD and PG by leveraging model-based functional magnetic resonance imaging (fMRI). Compared with healthy controls (HC), OCD patients exhibited a lower learning rate for worse-than-expected outcomes, which was associated with the attenuated encoding of negative reward prediction errors in the dorsomedial prefrontal cortex and the dorsal striatum. PG patients showed higher and lower learning rates for better- and worse-than-expected outcomes, respectively, accompanied by higher encoding of positive reward prediction errors in the anterior insula than HC. Perseveration did not differ considerably between the patient groups and HC. These findings elucidate the neural computations of reward-learning that are altered in OCD and PG, providing a potential account of behavioural inflexibility in those mental disorders.

Asunto(s)

Juego de Azar , Trastorno Obsesivo Compulsivo , Humanos , Refuerzo en Psicología , Recompensa , Trastorno Obsesivo Compulsivo/diagnóstico por imagen , Corteza Prefrontal/diagnóstico por imagen , Imagen por Resonancia Magnética

2.

The Neural Bases of Action-Outcome Learning in Humans.

Morris, Richard W; Dezfouli, Amir; Griffiths, Kristi R; Le Pelley, Mike E; Balleine, Bernard W.

J Neurosci ; 42(17): 3636-3647, 2022 04 27.

Artículo en Inglés | MEDLINE | ID: mdl-35296548

RESUMEN

From an associative perspective the acquisition of new goal-directed actions requires the encoding of specific action-outcome (AO) associations and, therefore, sensitivity to the validity of an action as a predictor of a specific outcome relative to other events. Although competitive architectures have been proposed within associative learning theory to achieve this kind of identity-based selection, whether and how these architectures are implemented by the brain is still a matter of conjecture. To investigate this issue, we trained human participants to encode various AO associations while undergoing functional neuroimaging (fMRI). We then degraded one AO contingency by increasing the probability of the outcome in the absence of its associated action while keeping other AO contingencies intact. We found that this treatment selectively reduced performance of the degraded action. Furthermore, when a signal predicted the unpaired outcome, performance of the action was restored, suggesting that the degradation effect reflects competition between the action and the context for prediction of the specific outcome. We used a Kalman filter to model the contribution of different causal variables to AO learning and found that activity in the medial prefrontal cortex (mPFC) and the dorsal anterior cingulate cortex (dACC) tracked changes in the association of the action and context, respectively, with regard to the specific outcome. Furthermore, we found the mPFC participated in a network with the striatum and posterior parietal cortex to segregate the influence of the various competing predictors to establish specific AO associations.SIGNIFICANCE STATEMENT Humans and other animals learn the consequences of their actions, allowing them to control their environment in a goal-directed manner. Nevertheless, it is unknown how we parse environmental causes from the effects of our own actions to establish these specific action-outcome (AO) relationships. Here, we show that the brain learns the causal structure of the environment by segregating the unique influence of actions from other causes in the medial prefrontal and anterior cingulate cortices and, through a network of structures, including the caudate nucleus and posterior parietal cortex, establishes the distinct causal relationships from which specific AO associations are formed.

Asunto(s)

Giro del Cíngulo , Aprendizaje , Animales , Cuerpo Estriado , Humanos , Imagen por Resonancia Magnética , Lóbulo Parietal , Corteza Prefrontal , Aprendizaje Basado en Problemas

3.

Adversarial vulnerabilities of human decision-making.

Dezfouli, Amir; Nock, Richard; Dayan, Peter.

Proc Natl Acad Sci U S A ; 117(46): 29221-29228, 2020 11 17.

Artículo en Inglés | MEDLINE | ID: mdl-33148802

RESUMEN

Adversarial examples are carefully crafted input patterns that are surprisingly poorly classified by artificial and/or natural neural networks. Here we examine adversarial vulnerabilities in the processes responsible for learning and choice in humans. Building upon recent recurrent neural network models of choice processes, we propose a general framework for generating adversarial opponents that can shape the choices of individuals in particular decision-making tasks toward the behavioral patterns desired by the adversary. We show the efficacy of the framework through three experiments involving action selection, response inhibition, and social decision-making. We further investigate the strategy used by the adversary in order to gain insights into the vulnerabilities of human choice. The framework may find applications across behavioral sciences in helping detect and avoid flawed choice.

Asunto(s)

Toma de Decisiones/fisiología , Aprendizaje/fisiología , Recompensa , Conducta de Elección/fisiología , Simulación por Computador , Humanos , Redes Neurales de la Computación , Refuerzo en Psicología

4.

Learning the structure of the world: The adaptive nature of state-space and action representations in multi-stage decision-making.

Dezfouli, Amir; Balleine, Bernard W.

PLoS Comput Biol ; 15(9): e1007334, 2019 09.

Artículo en Inglés | MEDLINE | ID: mdl-31490932

RESUMEN

State-space and action representations form the building blocks of decision-making processes in the brain; states map external cues to the current situation of the agent whereas actions provide the set of motor commands from which the agent can choose to achieve specific goals. Although these factors differ across environments, it is currently unknown whether or how accurately state and action representations are acquired by the agent because previous experiments have typically provided this information a priori through instruction or pre-training. Here we studied how state and action representations adapt to reflect the structure of the world when such a priori knowledge is not available. We used a sequential decision-making task in rats in which they were required to pass through multiple states before reaching the goal, and for which the number of states and how they map onto external cues were unknown a priori. We found that, early in training, animals selected actions as if the task was not sequential and outcomes were the immediate consequence of the most proximal action. During the course of training, however, rats recovered the true structure of the environment and made decisions based on the expanded state-space, reflecting the multiple stages of the task. Similarly, we found that the set of actions expanded with training, although the emergence of new action sequences was sensitive to the experimental parameters and specifics of the training procedure. We conclude that the profile of choices shows a gradual shift from simple representations to more complex structures compatible with the structure of the world.

Asunto(s)

Biología Computacional/métodos , Toma de Decisiones/fisiología , Aprendizaje/fisiología , Algoritmos , Animales , Conducta Animal , Señales (Psicología) , Masculino , Modelos Biológicos , Ratas , Ratas Wistar

5.

Hierarchical Bayesian inference for concurrent model fitting and comparison for group studies.

Piray, Payam; Dezfouli, Amir; Heskes, Tom; Frank, Michael J; Daw, Nathaniel D.

PLoS Comput Biol ; 15(6): e1007043, 2019 06.

Artículo en Inglés | MEDLINE | ID: mdl-31211783

RESUMEN

Computational modeling plays an important role in modern neuroscience research. Much previous research has relied on statistical methods, separately, to address two problems that are actually interdependent. First, given a particular computational model, Bayesian hierarchical techniques have been used to estimate individual variation in parameters over a population of subjects, leveraging their population-level distributions. Second, candidate models are themselves compared, and individual variation in the expressed model estimated, according to the fits of the models to each subject. The interdependence between these two problems arises because the relevant population for estimating parameters of a model depends on which other subjects express the model. Here, we propose a hierarchical Bayesian inference (HBI) framework for concurrent model comparison, parameter estimation and inference at the population level, combining previous approaches. We show that this framework has important advantages for both parameter estimation and model comparison theoretically and experimentally. The parameters estimated by the HBI show smaller errors compared to other methods. Model comparison by HBI is robust against outliers and is not biased towards overly simplistic models. Furthermore, the fully Bayesian approach of our theory enables researchers to make inference on group-level parameters by performing HBI t-test.

Asunto(s)

Teorema de Bayes , Biología Computacional/métodos , Modelos Neurológicos , Simulación por Computador , Toma de Decisiones/fisiología , Humanos , Aprendizaje/fisiología

6.

Models that learn how humans learn: The case of decision-making and its disorders.

Dezfouli, Amir; Griffiths, Kristi; Ramos, Fabio; Dayan, Peter; Balleine, Bernard W.

PLoS Comput Biol ; 15(6): e1006903, 2019 06.

Artículo en Inglés | MEDLINE | ID: mdl-31185008

RESUMEN

Popular computational models of decision-making make specific assumptions about learning processes that may cause them to underfit observed behaviours. Here we suggest an alternative method using recurrent neural networks (RNNs) to generate a flexible family of models that have sufficient capacity to represent the complex learning and decision- making strategies used by humans. In this approach, an RNN is trained to predict the next action that a subject will take in a decision-making task and, in this way, learns to imitate the processes underlying subjects' choices and their learning abilities. We demonstrate the benefits of this approach using a new dataset drawn from patients with either unipolar (n = 34) or bipolar (n = 33) depression and matched healthy controls (n = 34) making decisions on a two-armed bandit task. The results indicate that this new approach is better than baseline reinforcement-learning methods in terms of overall performance and its capacity to predict subjects' choices. We show that the model can be interpreted using off-policy simulations and thereby provides a novel clustering of subjects' learning processes-something that often eludes traditional approaches to modelling and behavioural analysis.

Asunto(s)

Simulación por Computador , Toma de Decisiones/fisiología , Aprendizaje/fisiología , Modelos Psicológicos , Adulto , Trastorno Bipolar/fisiopatología , Biología Computacional , Trastorno Depresivo/fisiopatología , Femenino , Humanos , Masculino , Persona de Mediana Edad , Redes Neurales de la Computación , Adulto Joven

7.

Optimizing the depth and the direction of prospective planning using information values.

Sezener, Can Eren; Dezfouli, Amir; Keramati, Mehdi.

PLoS Comput Biol ; 15(3): e1006827, 2019 03.

Artículo en Inglés | MEDLINE | ID: mdl-30861001

RESUMEN

Evaluating the future consequences of actions is achievable by simulating a mental search tree into the future. Expanding deep trees, however, is computationally taxing. Therefore, machines and humans use a plan-until-habit scheme that simulates the environment up to a limited depth and then exploits habitual values as proxies for consequences that may arise in the future. Two outstanding questions in this scheme are "in which directions the search tree should be expanded?", and "when should the expansion stop?". Here we propose a principled solution to these questions based on a speed/accuracy tradeoff: deeper expansion in the appropriate directions leads to more accurate planning, but at the cost of slower decision-making. Our simulation results show how this algorithm expands the search tree effectively and efficiently in a grid-world environment. We further show that our algorithm can explain several behavioral patterns in animals and humans, namely the effect of time-pressure on the depth of planning, the effect of reward magnitudes on the direction of planning, and the gradual shift from goal-directed to habitual behavior over the course of training. The algorithm also provides several predictions testable in animal/human experiments.

Asunto(s)

Técnicas de Planificación , Algoritmos , Animales , Conducta de Elección , Humanos , Estudios Prospectivos , Recompensa

8.

Hierarchical Action Control: Adaptive Collaboration Between Actions and Habits.

Balleine, Bernard W; Dezfouli, Amir.

Front Psychol ; 10: 2735, 2019.

Artículo en Inglés | MEDLINE | ID: mdl-31920796

RESUMEN

It is now commonly accepted that instrumental actions can reflect goal-directed control; i.e., they can show sensitivity to changes in the relationship to and the value of their consequences. With overtraining, stress, neurodegeneration, psychiatric conditions, or after exposure to various drugs of abuse, goal-directed control declines and instrumental actions are performed independently of their consequences. Although this latter insensitivity has been argued to reflect the development of habitual control, the lack of a positive definition of habits has rendered this conclusion controversial. Here we consider various alternative definitions of habit, including recent suggestions they reflect chunked action sequences, to derive criteria with which to categorize responses as habitual. We consider various theories regarding the interaction between goal-directed and habitual controllers and propose a collaborative model based on their hierarchical integration. We argue that this model is consistent with the available data, can be instantiated both at an associative level and computationally and generates interesting predictions regarding the influence of this collaborative integration on behavior.

9.

Optimal response vigor and choice under non-stationary outcome values.

Dezfouli, Amir; Balleine, Bernard W; Nock, Richard.

Psychon Bull Rev ; 26(1): 182-204, 2019 Feb.

Artículo en Inglés | MEDLINE | ID: mdl-29971644

RESUMEN

Within a rational framework, a decision-maker selects actions based on the reward-maximization principle, which stipulates that they acquire outcomes with the highest value at the lowest cost. Action selection can be divided into two dimensions: selecting an action from various alternatives, and choosing its vigor, i.e., how fast the selected action should be executed. Both of these dimensions depend on the values of outcomes, which are often affected as more outcomes are consumed together with their associated actions. Despite this, previous research has only addressed the computational substrate of optimal actions in the specific condition that the values of outcomes are constant. It is not known what actions are optimal when the values of outcomes are non-stationary. Here, based on an optimal control framework, we derive a computational model for optimal actions when outcome values are non-stationary. The results imply that, even when the values of outcomes are changing, the optimal response rate is constant rather than decreasing. This finding shows that, in contrast to previous theories, commonly observed changes in action rate cannot be attributed solely to changes in outcome value. We then prove that this observation can be explained based on uncertainty about temporal horizons; e.g., the session duration. We further show that, when multiple outcomes are available, the model explains probability matching as well as maximization strategies. The model therefore provides a quantitative analysis of optimal action and explicit predictions for future testing.

Asunto(s)

Conducta de Elección , Modelos Psicológicos , Recompensa , Humanos

10.

Medial Orbitofrontal Cortex Mediates Outcome Retrieval in Partially Observable Task Situations.

Bradfield, Laura A; Dezfouli, Amir; van Holstein, Mieke; Chieng, Billy; Balleine, Bernard W.

Neuron ; 88(6): 1268-1280, 2015 Dec 16.

Artículo en Inglés | MEDLINE | ID: mdl-26627312

RESUMEN

Choice between actions often requires the ability to retrieve action consequences in circumstances where they are only partially observable. This capacity has recently been argued to depend on orbitofrontal cortex; however, no direct evidence for this hypothesis has been reported. Here, we examined whether activity in the medial orbitofrontal cortex (mOFC) underlies this critical determinant of decision-making in rats. First, we simulated predictions from this hypothesis for various tests of goal-directed action by removing the assumption that rats could retrieve partially observable outcomes and then tested those predictions experimentally using manipulations of the mOFC. The results closely followed predictions; consistent deficits only emerged when action consequences had to be retrieved. Finally, we put action selection based on observable and unobservable outcomes into conflict and found that whereas intact rats selected actions based on the value of retrieved outcomes, mOFC rats relied solely on the value of observable outcomes.

Asunto(s)

Conducta de Elección/fisiología , Corteza Prefrontal/fisiología , Desempeño Psicomotor/fisiología , Recompensa , Animales , Masculino , Ratas , Ratas Long-Evans

11.

Habits as action sequences: hierarchical action control and changes in outcome value.

Dezfouli, Amir; Lingawi, Nura W; Balleine, Bernard W.

Philos Trans R Soc Lond B Biol Sci ; 369(1655)2014 Nov 05.

Artículo en Inglés | MEDLINE | ID: mdl-25267824

RESUMEN

Goal-directed action involves making high-level choices that are implemented using previously acquired action sequences to attain desired goals. Such a hierarchical schema is necessary for goal-directed actions to be scalable to real-life situations, but results in decision-making that is less flexible than when action sequences are unfolded and the decision-maker deliberates step-by-step over the outcome of each individual action. In particular, from this perspective, the offline revaluation of any outcomes that fall within action sequence boundaries will be invisible to the high-level planner resulting in decisions that are insensitive to such changes. Here, within the context of a two-stage decision-making task, we demonstrate that this property can explain the emergence of habits. Next, we show how this hierarchical account explains the insensitivity of over-trained actions to changes in outcome value. Finally, we provide new data that show that, under extended extinction conditions, habitual behaviour can revert to goal-directed control, presumably as a consequence of decomposing action sequences into single actions. This hierarchical view suggests that the development of action sequences and the insensitivity of actions to changes in outcome value are essentially two sides of the same coin, explaining why these two aspects of automatic behaviour involve a shared neural structure.

Asunto(s)

Toma de Decisiones/fisiología , Objetivos , Hábitos , Aprendizaje/fisiología , Animales , Humanos , Roedores

12.

Action-value comparisons in the dorsolateral prefrontal cortex control choice between goal-directed actions.

Morris, Richard W; Dezfouli, Amir; Griffiths, Kristi R; Balleine, Bernard W.

Nat Commun ; 5: 4390, 2014 Jul 23.

Artículo en Inglés | MEDLINE | ID: mdl-25055179

RESUMEN

It is generally assumed that choice between different actions reflects the difference between their action values yet little direct evidence confirming this assumption has been reported. Here we assess whether the brain calculates the absolute difference between action values or their relative advantage, that is, the probability that one action is better than the other alternatives. We use a two-armed bandit task during functional magnetic resonance imaging and modelled responses to determine both the size of the difference between action values (D) and the probability that one action value is better (P). The results show haemodynamic signals corresponding to P in right dorsolateral prefrontal cortex (dlPFC) together with evidence that these signals modulate motor cortex activity in an action-specific manner. We find no significant activity related to D. These findings demonstrate that a distinct neuronal population mediates action-value comparisons, and reveals how these comparisons are implemented to mediate value-based decision-making.

Asunto(s)

Conducta de Elección/fisiología , Toma de Decisiones/fisiología , Objetivos , Corteza Prefrontal/anatomía & histología , Corteza Prefrontal/fisiología , Adolescente , Adulto , Teorema de Bayes , Mapeo Encefálico , Femenino , Humanos , Aprendizaje/fisiología , Imagen por Resonancia Magnética , Masculino , Modelos Estadísticos , Análisis y Desempeño de Tareas , Adulto Joven

13.

Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized.

Dezfouli, Amir; Balleine, Bernard W.

PLoS Comput Biol ; 9(12): e1003364, 2013.

Artículo en Inglés | MEDLINE | ID: mdl-24339762

RESUMEN

Behavioral evidence suggests that instrumental conditioning is governed by two forms of action control: a goal-directed and a habit learning process. Model-based reinforcement learning (RL) has been argued to underlie the goal-directed process; however, the way in which it interacts with habits and the structure of the habitual process has remained unclear. According to a flat architecture, the habitual process corresponds to model-free RL, and its interaction with the goal-directed process is coordinated by an external arbitration mechanism. Alternatively, the interaction between these systems has recently been argued to be hierarchical, such that the formation of action sequences underlies habit learning and a goal-directed process selects between goal-directed actions and habitual sequences of actions to reach the goal. Here we used a two-stage decision-making task to test predictions from these accounts. The hierarchical account predicts that, because they are tied to each other as an action sequence, selecting a habitual action in the first stage will be followed by a habitual action in the second stage, whereas the flat account predicts that the statuses of the first and second stage actions are independent of each other. We found, based on subjects' choices and reaction times, that human subjects combined single actions to build action sequences and that the formation of such action sequences was sufficient to explain habitual actions. Furthermore, based on Bayesian model comparison, a family of hierarchical RL models, assuming a hierarchical interaction between habit and goal-directed processes, provided a better fit of the subjects' behavior than a family of flat models. Although these findings do not rule out all possible model-free accounts of instrumental conditioning, they do show such accounts are not necessary to explain habitual actions and provide a new basis for understanding how goal-directed and habitual action control interact.

Asunto(s)

Objetivos , Teorema de Bayes , Toma de Decisiones , Humanos , Motivación , Tiempo de Reacción

14.

Treatment outcome predictors in flexible dose-duration methadone detoxification program.

Ekhtiari, Hamed; Dezfouli, Amir; Zamanian, Behnam; Ghodousi, Arash; Mokri, Azarakhsh.

Arch Iran Med ; 16(10): 599-601, 2013 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-24093142

RESUMEN

Methadone detoxification is among the widely used treatment programs for opioid dependence. The aims of this study were to identify which patient baseline factors and treatment regimen features are predictors of the treatment outcome in an outpatient flexible dose-duration methadone detoxification program. We studied 126 opioid dependents in a naturalistic nonexperimental clinical setting. The patients were assessed for baseline demographic characteristics, and drug abuse characteristics. Treatment regimen features were recorded during the program. Successful treatment completion was defined as the last daily dose of methadone being less than 15 mg, negative urine analysis in the last two weeks of treatment, and based on the final clinician-client's decision. Out of 126 patients, 60 patients completed detoxification successfully. Younger age, longer duration of the opioid abuse, and higher subjective opiate intoxication severity before treatment entry were all significantly associated with negative treatment outcome. Among treatment regimen features, higher maximum methadone dose had a marginally significant independent effect on treatment failure. Patients with maximum methadone dose of more than 75 mg per day had around ten times worse success rate when compared to those who received lesser doses. The study findings could be used to predict treatment outcome and prognosis in a more individualized and patient-tailored approach in the real clinical setting. Guideline development for treatment selection and outcome monitoring in addiction medicine based on similar studies could enhance treatment outcome in clinical services.

Asunto(s)

Metadona/uso terapéutico , Trastornos Relacionados con Opioides/tratamiento farmacológico , Adolescente , Adulto , Femenino , Humanos , Masculino , Estudios Prospectivos , Factores de Tiempo , Resultado del Tratamiento

15.

Habits, action sequences and reinforcement learning.

Dezfouli, Amir; Balleine, Bernard W.

Eur J Neurosci ; 35(7): 1036-51, 2012 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-22487034

RESUMEN

It is now widely accepted that instrumental actions can be either goal-directed or habitual; whereas the former are rapidly acquired and regulated by their outcome, the latter are reflexive, elicited by antecedent stimuli rather than their consequences. Model-based reinforcement learning (RL) provides an elegant description of goal-directed action. Through exposure to states, actions and rewards, the agent rapidly constructs a model of the world and can choose an appropriate action based on quite abstract changes in environmental and evaluative demands. This model is powerful but has a problem explaining the development of habitual actions. To account for habits, theorists have argued that another action controller is required, called model-free RL, that does not form a model of the world but rather caches action values within states allowing a state to select an action based on its reward history rather than its consequences. Nevertheless, there are persistent problems with important predictions from the model; most notably the failure of model-free RL correctly to predict the insensitivity of habitual actions to changes in the action-reward contingency. Here, we suggest that introducing model-free RL in instrumental conditioning is unnecessary, and demonstrate that reconceptualizing habits as action sequences allows model-based RL to be applied to both goal-directed and habitual actions in a manner consistent with what real animals do. This approach has significant implications for the way habits are currently investigated and generates new experimental predictions.

Asunto(s)

Objetivos , Hábitos , Aprendizaje/fisiología , Refuerzo en Psicología , Animales , Humanos , Distribución Aleatoria

16.

Speed/accuracy trade-off between the habitual and the goal-directed processes.

Keramati, Mehdi; Dezfouli, Amir; Piray, Payam.

PLoS Comput Biol ; 7(5): e1002055, 2011 May.

Artículo en Inglés | MEDLINE | ID: mdl-21637741

RESUMEN

Instrumental responses are hypothesized to be of two kinds: habitual and goal-directed, mediated by the sensorimotor and the associative cortico-basal ganglia circuits, respectively. The existence of the two heterogeneous associative learning mechanisms can be hypothesized to arise from the comparative advantages that they have at different stages of learning. In this paper, we assume that the goal-directed system is behaviourally flexible, but slow in choice selection. The habitual system, in contrast, is fast in responding, but inflexible in adapting its behavioural strategy to new conditions. Based on these assumptions and using the computational theory of reinforcement learning, we propose a normative model for arbitration between the two processes that makes an approximately optimal balance between search-time and accuracy in decision making. Behaviourally, the model can explain experimental evidence on behavioural sensitivity to outcome at the early stages of learning, but insensitivity at the later stages. It also explains that when two choices with equal incentive values are available concurrently, the behaviour remains outcome-sensitive, even after extensive training. Moreover, the model can explain choice reaction time variations during the course of learning, as well as the experimental observation that as the number of choices increases, the reaction time also increases. Neurobiologically, by assuming that phasic and tonic activities of midbrain dopamine neurons carry the reward prediction error and the average reward signals used by the model, respectively, the model predicts that whereas phasic dopamine indirectly affects behaviour through reinforcing stimulus-response associations, tonic dopamine can directly affect behaviour through manipulating the competition between the habitual and the goal-directed systems and thus, affect reaction time.

Asunto(s)

Conducta de Elección/fisiología , Toma de Decisiones/fisiología , Aprendizaje/fisiología , Modelos Neurológicos , Algoritmos , Animales , Conducta Animal , Simulación por Computador , Dopamina/fisiología , Objetivos , Humanos , Cadenas de Markov , Aprendizaje por Laberinto , Neuronas/fisiología , Ratas , Refuerzo en Psicología , Reproducibilidad de los Resultados

17.

Individual differences in nucleus accumbens dopamine receptors predict development of addiction-like behavior: a computational approach.

Piray, Payam; Keramati, Mohammad Mahdi; Dezfouli, Amir; Lucas, Caro; Mokri, Azarakhsh.

Neural Comput ; 22(9): 2334-68, 2010 Sep 01.

Artículo en Inglés | MEDLINE | ID: mdl-20569176

RESUMEN

Clinical and experimental observations show individual differences in the development of addiction. Increasing evidence supports the hypothesis that dopamine receptor availability in the nucleus accumbens (NAc) predisposes drug reinforcement. Here, modeling striatal-midbrain dopaminergic circuit, we propose a reinforcement learning model for addiction based on the actor-critic model of striatum. Modeling dopamine receptors in the NAc as modulators of learning rate for appetitive--but not aversive--stimuli in the critic--but not the actor--we define vulnerability to addiction as a relatively lower learning rate for the appetitive stimuli, compared to aversive stimuli, in the critic. We hypothesize that an imbalance in this learning parameter used by appetitive and aversive learning systems can result in addiction. We elucidate that the interaction between the degree of individual vulnerability and the duration of exposure to drug has two progressive consequences: deterioration of the imbalance and establishment of an abnormal habitual response in the actor. Using computational language, the proposed model describes how development of compulsive behavior can be a function of both degree of drug exposure and individual vulnerability. Moreover, the model describes how involvement of the dorsal striatum in addiction can be augmented progressively. The model also interprets other forms of addiction, such as obesity and pathological gambling, in a common mechanism with drug addiction. Finally, the model provides an answer for the question of why behavioral addictions are triggered in Parkinson's disease patients by D2 dopamine agonist treatments.

Asunto(s)

Conducta Adictiva/fisiopatología , Individualidad , Núcleo Accumbens/fisiopatología , Receptores Dopaminérgicos/fisiología , Refuerzo en Psicología , Simulación por Computador , Humanos , Modelos Neurológicos , Red Nerviosa/fisiopatología

18.

A neurocomputational model for cocaine addiction.

Dezfouli, Amir; Piray, Payam; Keramati, Mohammad Mahdi; Ekhtiari, Hamed; Lucas, Caro; Mokri, Azarakhsh.

Neural Comput ; 21(10): 2869-93, 2009 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-19635010

RESUMEN

Based on the dopamine hypotheses of cocaine addiction and the assumption of decrement of brain reward system sensitivity after long-term drug exposure, we propose a computational model for cocaine addiction. Utilizing average reward temporal difference reinforcement learning, we incorporate the elevation of basal reward threshold after long-term drug exposure into the model of drug addiction proposed by Redish. Our model is consistent with the animal models of drug seeking under punishment. In the case of nondrug reward, the model explains increased impulsivity after long-term drug exposure. Furthermore, the existence of a blocking effect for cocaine is predicted by our model.

Asunto(s)

Encéfalo/efectos de los fármacos , Encéfalo/fisiopatología , Trastornos Relacionados con Cocaína/fisiopatología , Cocaína/farmacología , Simulación por Computador , Recompensa , Algoritmos , Animales , Química Encefálica/efectos de los fármacos , Química Encefálica/fisiología , Toma de Decisiones/efectos de los fármacos , Toma de Decisiones/fisiología , Modelos Animales de Enfermedad , Dopamina/metabolismo , Inhibidores de Captación de Dopamina/farmacología , Humanos , Conducta Impulsiva/inducido químicamente , Conducta Impulsiva/fisiopatología , Aprendizaje/efectos de los fármacos , Aprendizaje/fisiología , Refuerzo en Psicología

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA