Results 1 - 11 of 11
1.
PLoS Comput Biol; 19(2): e1010864, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36745688

ABSTRACT

To adapt to their environments, animals learn associations between sensory stimuli and unconditioned stimuli. In invertebrates, olfactory associative learning primarily occurs in the mushroom body, which is segregated into separate compartments. Within each compartment, Kenyon cells (KCs) encoding sparse odor representations project onto mushroom body output neurons (MBONs) whose outputs guide behavior. Associated with each compartment is a dopamine neuron (DAN) that modulates plasticity of the KC-MBON synapses within the compartment. Interestingly, DAN-induced plasticity of the KC-MBON synapse is imbalanced in the sense that it only weakens the synapse and is temporally sparse. We propose a normative mechanistic model of the MBON as a linear discriminant analysis (LDA) classifier that predicts the presence of an unconditioned stimulus (class identity) given a KC odor representation (feature vector). Starting from a principled LDA objective function and under the assumption of temporally sparse DAN activity, we derive an online algorithm which maps onto the mushroom body compartment. Our model accounts for the imbalanced learning at the KC-MBON synapse and makes testable predictions that provide clear contrasts with existing models.
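
The depression-only rule can be made concrete with a toy simulation. The sketch below is a loose illustration of DAN-gated, purely depressive KC->MBON plasticity, not the authors' LDA-derived update; the KC count, sparsity level, and learning rate are invented for the example.

```python
import numpy as np

# Toy sketch of imbalanced, DAN-gated plasticity at KC->MBON synapses.
# Assumptions (not from the paper): 2000 KCs, ~5% sparse binary odor codes,
# and a purely depressive update gated by temporally sparse DAN events.
rng = np.random.default_rng(0)
n_kc, sparsity, lr = 2000, 0.05, 0.2

w = np.ones(n_kc)                              # KC->MBON weights start strong
punished = rng.random(n_kc) < sparsity         # KC pattern paired with the US
neutral = rng.random(n_kc) < sparsity          # control odor, never punished

for trial in range(20):
    dan_active = True                          # US arrives: a sparse DAN event
    if dan_active:
        w[punished] *= (1 - lr)                # depression only: weights never grow

mbon = lambda odor: w @ odor.astype(float)     # MBON drive guides behavior
print(mbon(punished), mbon(neutral))           # punished odor drives the MBON less
```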


Subjects
Learning, Mushroom Bodies, Animals, Mushroom Bodies/physiology, Discriminant Analysis, Learning/physiology, Smell/physiology, Drosophila melanogaster/physiology, Odorants, Dopaminergic Neurons/physiology
2.
J Neurosci; 38(18): 4383-4398, 2018 May 2.
Article in English | MEDLINE | ID: mdl-29626169

ABSTRACT

Monkeys and other animals appear to share with humans two risk attitudes predicted by prospect theory: an inverse-S-shaped probability-weighting (PW) function and a steeper utility curve for losses than for gains. These findings suggest that such preferences are stable traits with common neural substrates. We hypothesized instead that animals tailor their preferences to subtle changes in task contexts, making risk attitudes flexible. Previous studies used a limited number of outcomes, trial types, and contexts. To gain a broader perspective, we examined two large datasets of male macaques' risky choices: one from a task with real (juice) gains and another from a token task with gains and losses. In contrast to previous findings, monkeys were risk seeking for both gains and losses (i.e., lacked a reflection effect) and showed steeper gain than loss curves (loss seeking). Utility curves for gains were substantially different in the two tasks. Monkeys showed nearly linear PWs in one task and S-shaped ones in the other; neither task produced a consistent inverse-S-shaped curve. To account for these observations, we developed and tested various computational models of the processes involved in the construction of reward value. We found that adaptive differential weighting of prospective gamble outcomes could partially account for the observed differences in the utility functions across the two experiments and thus provide a plausible mechanism underlying flexible risk attitudes. Together, our results support the idea that risky choices are constructed flexibly at the time of elicitation and place important constraints on neural models of economic choice.

SIGNIFICANCE STATEMENT: We respond in reliable ways to risk, but are our risk preferences stable traits or ephemeral states? Using various computational models, we examined two large datasets of macaque risky choices in two different tasks. We observed several deviations from "classic" risk preferences seen in humans and monkeys: no reflection effect, loss seeking as opposed to loss aversion, and linear and S-shaped, as opposed to inverse-S-shaped, probability distortion. These results challenge the idea that our risk attitudes are evolved traits shared with the last common ancestor of macaques and humans, suggesting instead that behavioral flexibility is the hallmark of risky choice in primates. We show how this flexibility can emerge partly as a result of interactions between attentional and reward systems.
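
For readers unfamiliar with the prospect-theory quantities being fit here, the standard Tversky-Kahneman functional forms are easy to state in code; the sketch below uses illustrative parameter values, not the values fitted to these datasets.

```python
import numpy as np

# Standard prospect-theory forms (Tversky & Kahneman, 1992), stated here only
# to make the fitted quantities concrete; parameter values are illustrative.
def utility(x, alpha=0.8, beta=0.8, lam=2.25):
    # lam > 1 gives loss aversion; the monkeys' choices imply lam < 1 (loss seeking)
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, np.abs(x) ** alpha, -lam * np.abs(x) ** beta)

def prob_weight(p, gamma=0.61):
    # gamma < 1: inverse-S; gamma > 1: S-shaped; gamma = 1: linear
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

p = np.linspace(0.01, 0.99, 5)
print(prob_weight(p, 0.61))   # overweights small p, underweights large p
print(prob_weight(p, 1.4))    # S-shaped, as observed in one of the two tasks
```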


Subjects
Attitude, Risk-Taking, Algorithms, Animals, Computer Simulation, Decision Making, Female, Gambling/psychology, Macaca mulatta, Male, Reward
3.
PLoS Comput Biol; 14(3): e1006070, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29584717

ABSTRACT

When making choices, collecting more information is beneficial but comes at the cost of sacrificing time that could be allocated to making other potentially rewarding decisions. To investigate how the brain balances these costs and benefits, we conducted a series of novel experiments in humans and simulated various computational models. Under six levels of time pressure, subjects made decisions either by integrating sensory information over time or by dynamically combining sensory and reward information over time. We found that during sensory integration, time pressure reduced performance as the deadline approached, and choice was more strongly influenced by the most recent sensory evidence. By fitting performance and reaction time with various models, we found that our experimental results are more compatible with either leaky integration of sensory information with an urgency signal or a decision process based on stochastic transitions between discrete states modulated by an urgency signal. When combining sensory and reward information, subjects spent less time on integration than optimally prescribed when reward decreased slowly over time, and the most recent evidence did not have the maximal influence on choice. The suboptimal pattern of reaction time was partially mitigated in an equivalent control experiment in which sensory integration over time was not required, indicating that the suboptimal response times were influenced by subjects' perception of imperfect sensory integration. Meanwhile, during the combination of sensory and reward information, performance did not drop as the deadline approached, and response time did not differ between correct and incorrect trials. These results indicate a decision process different from the one involved in the integration of sensory information over time. Together, our results not only reveal limitations in sensory integration over time but also illustrate how these limitations influence the dynamic combination of sensory and reward information.
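
A minimal simulation shows how a leaky accumulator with an urgency signal, one of the two model classes favored by the fits, produces the reported recency effect and deadline-sensitive choices. The dynamics and all parameters below are illustrative, not the fitted models.

```python
import numpy as np

# Minimal leaky accumulator with a multiplicative urgency signal, one of the
# two model classes the fits favor. Dynamics and parameters are illustrative.
rng = np.random.default_rng(1)
dt, tau, bound, deadline = 0.01, 0.5, 1.0, 2.0    # seconds

def trial(drift=0.1, noise=1.0):
    x, t = 0.0, 0.0
    while t < deadline:
        # the leak pulls x back toward 0, so recent evidence dominates choice
        x += dt * (-x / tau + drift) + np.sqrt(dt) * noise * rng.normal()
        t += dt
        urgency = 1.0 + t / deadline              # effective bound collapses in time
        if abs(x) * urgency >= bound:
            return x > 0, t                       # (choice, RT)
    return x > 0, deadline                        # forced response at the deadline

choices, rts = zip(*(trial() for _ in range(500)))
print(np.mean(choices), np.mean(rts))             # accuracy and mean RT
```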


Subjects
Choice Behavior/physiology, Decision Making/ethics, Adult, Brain, Computer Simulation, Decision Making/physiology, Female, Humans, Learning, Male, Models, Neurological, Perception, Photic Stimulation/methods, Psychomotor Performance/physiology, Reaction Time/physiology, Reward, Time, Young Adult
4.
Nat Neurosci; 26(2): 339-349, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36635497

ABSTRACT

Recent experiments have revealed that neural population codes in many brain areas continuously change even when animals have fully learned and stably perform their tasks. This representational 'drift' naturally leads to questions about its causes, dynamics and functions. Here we explore the hypothesis that neural representations optimize a representational objective with a degenerate solution space, and noisy synaptic updates drive the network to explore this (near-)optimal space, causing representational drift. We illustrate this idea and explore its consequences in simple, biologically plausible Hebbian/anti-Hebbian network models of representation learning. We find that the drifting receptive fields of individual neurons can be characterized by a coordinated random walk, with effective diffusion constants that depend on various parameters such as learning rate, noise amplitude and input statistics. Despite such drift, the representational similarity of population codes is stable over time. Our model recapitulates experimental observations in the hippocampus and posterior parietal cortex and makes testable predictions that can be probed in future experiments.
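
The degeneracy-plus-noise mechanism can be demonstrated with a much simpler stand-in than the paper's networks: a single neuron learning with Oja's rule on inputs whose top two directions carry equal variance, so the solution space is a ring. This is a hypothetical reduction, not the authors' Hebbian/anti-Hebbian model.

```python
import numpy as np

# Single-neuron stand-in for the degeneracy-plus-noise idea: Oja's rule on
# inputs whose top two directions carry equal variance, so every unit vector
# in that plane is an equally good solution (a ring of optima). Noisy updates
# then diffuse the receptive field along the ring: drift without performance loss.
rng = np.random.default_rng(2)
d, lr, noise = 10, 0.01, 0.02
std = np.array([2.0, 2.0] + [0.5] * (d - 2))     # degenerate top subspace

w = rng.normal(size=d)
w /= np.linalg.norm(w)
angles = []
for t in range(20000):
    x = std * rng.normal(size=d)
    y = w @ x
    w += lr * y * (x - y * w) + noise * rng.normal(size=d)   # Oja + synaptic noise
    angles.append(np.arctan2(w[1], w[0]))        # position along the solution ring

angles = np.unwrap(np.array(angles))
msd = lambda lag: np.mean((angles[lag:] - angles[:-lag]) ** 2)
print(msd(1000), msd(4000))   # displacement grows with lag: diffusive drift
```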


Subjects
Brain, Learning, Animals, Learning/physiology, Neurons/physiology, Hippocampus, Head, Models, Neurological
5.
Nat Commun; 12(1): 7191, 2021 Dec 10.
Article in English | MEDLINE | ID: mdl-34893597

ABSTRACT

Learning appropriate representations of the reward environment is challenging in the real world, where there are many options, each with multiple attributes or features. Despite the existence of alternative solutions for this challenge, the neural mechanisms underlying the emergence and adoption of value representations and learning strategies remain unknown. To address this, we measure learning and choice during a multi-dimensional probabilistic learning task in humans and train recurrent neural networks (RNNs) to capture our experimental observations. We find that human participants estimate stimulus-outcome associations by learning and combining estimates of reward probabilities associated with the informative feature, followed by those of informative conjunctions. By analyzing the representations, connectivity, and lesioning of the RNNs, we demonstrate that this mixed learning strategy relies on a distributed neural code and on opponency between excitatory and inhibitory neurons through value-dependent disinhibition. Together, our results suggest computational and neural mechanisms underlying the emergence of complex learning strategies in naturalistic settings.
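
The behavioral strategy, learning the informative feature and informative conjunctions in parallel and combining their estimates, can be sketched without the RNNs. The task structure, mixing weights, and learning rate below are assumptions for illustration only.

```python
import numpy as np

# Behavioral-level sketch of the mixed "feature + conjunction" strategy; the
# paper's RNNs are not reproduced here. Assumed task: three binary dimensions,
# with reward probability set by dimension 0 (the informative feature) and by
# the conjunction of dimensions 1 and 2. All parameters are illustrative.
rng = np.random.default_rng(3)
lr = 0.1
v_feat = np.zeros(2)          # estimate per level of the informative feature
v_conj = np.zeros((2, 2))     # estimate per informative conjunction

def p_reward(stim):           # hypothetical ground-truth reward probabilities
    return 0.2 + 0.4 * stim[0] + 0.3 * (stim[1] == stim[2])

for t in range(5000):
    stim = rng.integers(0, 2, size=3)
    est = 0.5 * v_feat[stim[0]] + 0.5 * v_conj[stim[1], stim[2]]  # combined estimate
    r = float(rng.random() < p_reward(stim))
    v_feat[stim[0]] += lr * (r - est)     # both estimates share the same error
    v_conj[stim[1], stim[2]] += lr * (r - est)

print(v_feat)
print(v_conj)
```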

6.
Cognition; 205: 104425, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32958287

ABSTRACT

Most cognitive processes are studied using abstract or synthetic stimuli with specific features to fully control what is presented to subjects. However, recent studies have revealed enhancements of cognitive capacities (such as working memory) when processing naturalistic versus abstract stimuli. Using abstract stimuli constructed from distinct visual features (e.g., color and shape), we have recently shown that human subjects can learn multidimensional stimulus-reward associations by initially estimating the reward value of individual features (feature-based learning) before gradually switching to learning the reward value of individual stimuli (object-based learning). Here, we examined whether similar strategies are adopted during learning about naturalistic stimuli that are clearly perceived as objects (rather than as combinations of features) and contain both task-relevant and irrelevant features. We found that, as with abstract stimuli, subjects initially adopted feature-based learning more strongly before transitioning to object-based learning. However, there were three key differences between learning about naturalistic and abstract stimuli. First, compared with abstract stimuli, the initial learning strategy was less feature-based for naturalistic stimuli. Second, subjects transitioned to object-based learning faster for naturalistic stimuli. Third, unexpectedly, subjects were more likely to adopt feature-based learning for naturalistic stimuli, both at the steady state and overall. These results suggest that despite the stronger tendency to perceive naturalistic stimuli as objects, which leads to a greater likelihood of using object-based learning as the initial strategy and a faster transition to object-based learning, the influence of individual features on learning is stronger for these stimuli, such that ultimately the object-based strategy is adopted less. Overall, our findings suggest that feature-based learning is a general initial strategy for learning about the reward value of all types of multi-dimensional stimuli.
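
One common way to formalize such a transition (a hypothetical arbitration model, not necessarily the one used in this paper) is a decaying weight on feature-based value with a nonzero asymptote; the three reported differences then map onto three parameters.

```python
import numpy as np

# Hypothetical arbitration between feature-based and object-based values.
rng = np.random.default_rng(4)
n_colors, n_shapes, lr = 3, 3, 0.1
v_color, v_shape = np.zeros(n_colors), np.zeros(n_shapes)
v_obj = np.zeros((n_colors, n_shapes))

def feature_weight(t, w0=0.9, w_inf=0.1, tau=200.0):
    # w0: initial reliance on features; tau: speed of the transition;
    # w_inf: steady-state feature weight. The abstract's three findings map to
    # naturalistic stimuli having smaller w0, smaller tau, and larger w_inf.
    return w_inf + (w0 - w_inf) * np.exp(-t / tau)

def value(c, s, t):
    w = feature_weight(t)
    return w * 0.5 * (v_color[c] + v_shape[s]) + (1 - w) * v_obj[c, s]

p_reward = rng.random((n_colors, n_shapes))   # arbitrary stimulus-reward map
for t in range(2000):
    c, s = rng.integers(n_colors), rng.integers(n_shapes)
    err = float(rng.random() < p_reward[c, s]) - value(c, s, t)
    v_color[c] += lr * err; v_shape[s] += lr * err; v_obj[c, s] += lr * err

print(value(0, 0, 2000), p_reward[0, 0])      # learned vs true value of one stimulus
```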


Subjects
Learning, Reward, Humans
7.
Nat Hum Behav; 3(11): 1215-1224, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31501543

ABSTRACT

A fundamental but rarely contested assumption in economics and neuroeconomics is that decision-makers compute subjective values of risky options by multiplying functions of reward probability and magnitude. By contrast, an additive strategy for valuation allows the flexible combination of reward information that is required in uncertain or changing environments. We hypothesized that the level of uncertainty in the reward environment should determine the strategy used for valuation and choice. To test this hypothesis, we examined choice between risky options in humans and rhesus macaques across three tasks with different levels of uncertainty. We found that whereas humans and monkeys adopted a multiplicative strategy under risk, when probabilities were known, both species spontaneously adopted an additive strategy under uncertainty, when probabilities had to be learned. Additionally, the level of volatility influenced the relative weighting of certain and uncertain reward information, and this was reflected in the encoding of reward magnitude by neurons in the dorsolateral prefrontal cortex.
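
The two strategies being contrasted reduce to simple formulas. The sketch below takes the probability- and magnitude-transforming functions as identities and uses an invented weighting parameter, purely to make the additive/multiplicative distinction concrete.

```python
import numpy as np

# The two valuation strategies in their simplest form; the probability- and
# magnitude-transforming functions are taken as identities, and p and m are
# assumed to be normalized to a common scale. Parameters are invented.
def sv_multiplicative(p, m):
    return p * m                      # expected-value-like: risk, probabilities known

def sv_additive(p, m, w=0.6):
    return w * p + (1 - w) * m        # uncertainty: probabilities learned; w can
                                      # shift with the volatility of the environment

def p_choose_a(sv_a, sv_b, beta=5.0): # softmax choice between two options
    return 1.0 / (1.0 + np.exp(-beta * (sv_a - sv_b)))

print(p_choose_a(sv_multiplicative(0.8, 0.5), sv_multiplicative(0.4, 1.0)))
print(p_choose_a(sv_additive(0.8, 0.5), sv_additive(0.4, 1.0)))
```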


Subjects
Comprehension, Reward, Uncertainty, Adolescent, Animals, Choice Behavior, Decision Making, Female, Humans, Macaca mulatta/psychology, Male, Probability, Risk, Young Adult
8.
PLoS One; 13(5): e0197263, 2018.
Article in English | MEDLINE | ID: mdl-29787566

ABSTRACT

Measurements of response time (RT) have long been used to infer the neural processes underlying various cognitive functions such as working memory, attention, and decision making. However, it is currently unknown whether RT is also informative about various stages of value-based choice, particularly how reward values are constructed. To investigate these questions, we analyzed the pattern of RT during a set of multi-dimensional learning and decision-making tasks that can prompt subjects to adopt different learning strategies. In our experiments, subjects could use reward feedback to directly learn the reward values associated with possible choice options (object-based learning). Alternatively, they could learn the reward values of options' features (e.g., color, shape) and combine these values to estimate the reward values of individual options (feature-based learning). We found that RT was slower when the difference between subjects' estimates of reward probabilities for the two alternative objects on a given trial was smaller. Moreover, RT was overall faster when the preceding trial was rewarded or when the previously selected object was present. These effects, however, were mediated by an interaction between the two factors, such that subjects were faster when the previously selected object was present rather than absent, but only after unrewarded trials. Finally, RT reflected the learning strategy (i.e., the object-based or feature-based approach) adopted by the subject on a trial-by-trial basis, indicating an overall faster construction of reward value and/or value comparison during object-based learning. Altogether, these results demonstrate that the pattern of RT can be informative about how reward values are learned and constructed during complex value-based learning and decision making.
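
A purely descriptive model of these RT effects might look like the sketch below; the linear form and every coefficient are invented for illustration, not fit to the data.

```python
# Descriptive model of the reported RT effects; the linear form and every
# coefficient are invented for illustration, not fit to the data.
def expected_rt(dp, prev_rewarded, prev_obj_present,
                base=0.9, b_dp=0.6, b_rew=0.05, b_inter=0.12):
    rt = base - b_dp * abs(dp)            # smaller value difference: slower choice
    rt -= b_rew * prev_rewarded           # faster after a rewarded trial
    if not prev_rewarded and prev_obj_present:
        rt -= b_inter                     # previously chosen object speeds RT,
                                          # but mainly after unrewarded trials
    return rt

print(expected_rt(0.3, prev_rewarded=False, prev_obj_present=True))
print(expected_rt(0.3, prev_rewarded=False, prev_obj_present=False))
```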


Subjects
Choice Behavior, Learning, Reaction Time, Reward, Anticipation, Psychological, Feedback, Psychological, Female, Humans, Male, Models, Psychological
9.
Nat Commun; 8(1): 1768, 2017 Nov 24.
Article in English | MEDLINE | ID: mdl-29170381

ABSTRACT

Learning from reward feedback is essential for survival but can become extremely challenging with myriad choice options. Here, we propose that learning reward values of individual features can provide a heuristic for estimating reward values of choice options in dynamic, multi-dimensional environments. We hypothesize that this feature-based learning occurs not just because it can reduce dimensionality, but more importantly because it can increase adaptability without compromising precision of learning. We experimentally test this hypothesis and find that in dynamic environments, human subjects adopt feature-based learning even when this approach does not reduce dimensionality. Even in static, low-dimensional environments, subjects initially adopt feature-based learning and gradually switch to learning reward values of individual options, depending on how accurately objects' values can be predicted by combining feature values. Our computational models reproduce these results and highlight the importance of neurons coding feature values for parallel learning of values for features and objects.
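
The dimensionality-reduction side of the argument is simple arithmetic, as the short example below shows; the point of the paper is that subjects favor features even when this count advantage is absent.

```python
# Option values grow multiplicatively with the number of dimensions,
# feature values only additively.
n_dims, n_values = 4, 3
print(n_values ** n_dims)   # 81 option values to learn (object-based)
print(n_dims * n_values)    # 12 feature values to learn (feature-based)
```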


Subjects
Choice Behavior, Learning, Adolescent, Adult, Computer Simulation, Female, Humans, Male, Neurons/physiology, Reward, Young Adult
10.
Neuron; 94(2): 401-414.e6, 2017 Apr 19.
Article in English | MEDLINE | ID: mdl-28426971

ABSTRACT

Value-based decision making often involves integration of reward outcomes over time, but this becomes considerably more challenging if reward assignments on alternative options are probabilistic and non-stationary. Despite the existence of various models for optimally integrating reward under uncertainty, the underlying neural mechanisms are still unknown. Here we propose that reward-dependent metaplasticity (RDMP) can provide a plausible mechanism for both integration of reward under uncertainty and estimation of uncertainty itself. We show that a model based on RDMP can robustly perform the probabilistic reversal learning task via dynamic adjustment of learning based on reward feedback, while changes in its activity signal unexpected uncertainty. The model predicts time-dependent and choice-specific learning rates that strongly depend on reward history. Key predictions from this model were confirmed with behavioral data from non-human primates. Overall, our results suggest that metaplasticity can provide a neural substrate for adaptive learning and choice under uncertainty.
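
A cascade-style toy version of metaplastic synapses shows how reward history can set the effective learning rate; the transition structure and parameters below are illustrative, not the paper's exact RDMP model.

```python
import numpy as np

# Toy reward-dependent metaplasticity (RDMP) ensemble: binary synapses whose
# meta-state depth sets how hard they are to change, so the effective learning
# rate adapts to reward history. Structure and parameters are illustrative.
rng = np.random.default_rng(5)
n_syn, depth, q = 1000, 4, 0.4
# states -depth..-1 are weak, +1..+depth are strong; |state| is meta-depth
state = rng.choice([-1, 1], size=n_syn)

def update(potentiate):
    global state
    escape = rng.random(n_syn) < q ** np.abs(state)   # deeper states change less often
    if potentiate:
        moved = np.where(state < 0, 1, np.minimum(state + 1, depth))
    else:
        moved = np.where(state > 0, -1, np.maximum(state - 1, -depth))
    state = np.where(escape, moved, state)

for r in [1, 1, 1, 1, 0]:                  # a run of rewards, then one omission
    update(potentiate=bool(r))
    print(round((state > 0).mean(), 3))    # fraction strong ~ value estimate
```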


Subjects
Adaptation, Psychological/physiology, Brain/physiology, Choice Behavior/physiology, Reversal Learning/physiology, Uncertainty, Animals, Behavior, Animal, Macaca mulatta, Male, Neuronal Plasticity
11.
Nat Commun; 7: 11393, 2016 Apr 26.
Article in English | MEDLINE | ID: mdl-27116102

ABSTRACT

Decision making often requires simultaneously learning about and combining evidence from various sources of information. However, when making inferences from these sources, humans show systematic biases that are often attributed to heuristics or limitations in cognitive processes. Here we use a combination of experimental and modelling approaches to reveal the neural substrates of probabilistic inference and the corresponding biases. We find systematic deviations from normative accounts of inference when alternative options are not equally rewarding; subjects' choice behaviour is biased towards the more rewarding option, whereas their inferences about individual cues show the opposite bias. Moreover, the inference bias about combinations of cues depends on the number of cues. Using a biophysically plausible model, we link these biases to synaptic plasticity mechanisms modulated by reward expectation and attention. We demonstrate that inference relies on direct estimation of posteriors, not on a combination of likelihoods and priors. Our work reveals novel mechanisms underlying cognitive biases and the contributions of interactions between reward-dependent learning, decision making and attention to high-level reasoning.
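
The normative benchmark that subjects deviate from is ordinary log-odds combination of independent cues; the sketch below adds a stand-in reward-bias term only to show the direction of the reported effect. All numbers are invented.

```python
import numpy as np

# Normative benchmark for the inference task: with conditionally independent
# cues, posterior log odds = prior log odds + summed cue log-likelihood ratios.
# The reward_bias term is a stand-in for the reward-expectation effect the
# paper links to plasticity; all numbers are invented for illustration.
log_lr = {"cue_a": 0.9, "cue_b": 0.4, "cue_c": -0.6}   # evidence for option 1

def posterior_p1(cues, prior_odds=1.0, reward_bias=0.0):
    log_odds = np.log(prior_odds) + sum(log_lr[c] for c in cues) + reward_bias
    return 1.0 / (1.0 + np.exp(-log_odds))             # reward_bias=0: normative

print(posterior_p1(["cue_a", "cue_b"]))                   # normative inference
print(posterior_p1(["cue_a", "cue_b"], reward_bias=0.5))  # biased toward option 1
```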


Subjects
Cognition, Decision Making, Neuronal Plasticity, Adolescent, Attention, Bias, Choice Behavior, Cues, Female, Humans, Male, Models, Statistical, Problem Solving, Reward, Young Adult