Results 1 - 11 of 11
1.
PLoS Comput Biol ; 20(3): e1011950, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38552190

ABSTRACT

Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
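
As an illustration only, one common way to combine learned values with action bias and hysteresis is to add them as separate terms inside a softmax choice rule. The abstract does not give the study's equations, so everything in the sketch below is an assumption: the function name `choice_probabilities`, the inverse temperature `beta`, the per-action `bias` vector, and the `hysteresis_weights` applied to the most recent actions (positive for repetition, negative for alternation).

```python
import math


def choice_probabilities(q_values, bias, hysteresis_weights, previous_actions, beta=3.0):
    """Softmax over learned values plus bias and hysteresis terms.

    Hypothetical parameterization, not the published model:
    q_values            learned value of each action
    bias                constant preference for each action, independent of reward
    hysteresis_weights  weight for the action taken 1, 2, ... trials back
                        (positive favors repetition, negative favors alternation)
    previous_actions    indices of past actions, most recent first
    """
    logits = []
    for action, q in enumerate(q_values):
        logit = beta * q + bias[action]
        # Add the hysteresis weight for every past trial on which this action was chosen.
        for weight, past_action in zip(hysteresis_weights, previous_actions):
            if past_action == action:
                logit += weight
        logits.append(logit)

    # Subtract the maximum logit for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Equal values and no bias, but an alternation tendency after choosing action 0 twice.
probs = choice_probabilities([0.0, 0.0], [0.0, 0.0], [-1.0, -0.5], [0, 0])
print(probs)
```

With equal values and zero bias, the negative hysteresis weights shift probability toward the action that was not chosen recently, which is the alternation pattern the abstract describes as more common.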


Subject(s)
Learning; Reinforcement, Psychology; Humans; Reward; Problem-Based Learning; Bias
2.
Hum Brain Mapp ; 43(15): 4750-4790, 2022 Oct 15.
Article in English | MEDLINE | ID: mdl-35860954

ABSTRACT

The model-free algorithms of "reinforcement learning" (RL) have gained clout across disciplines, but so too have model-based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This "generalized reinforcement learning" (GRL) model, a frugal extension of RL, parsimoniously retains the single reward-prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal-learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high-resolution high-field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value-based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
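
The idea of relaying a single reward-prediction error to value estimates beyond the experienced state and action can be sketched as follows. This is a loose illustration, not the published GRL model: the function name `grl_update`, the learning rate `alpha`, and the signed generalization weights `g_state` and `g_action` (negative values standing in for inverse generalization) are all assumptions introduced here.

```python
def grl_update(q, state, action, reward, alpha=0.3, g_state=-0.5, g_action=-0.5):
    """Update a table of action values with one shared prediction error.

    Hypothetical sketch: q is a list of lists indexed as q[state][action].
    The experienced pair gets the standard update; every other pair gets the
    same error scaled by g_state and/or g_action (assumed parameters).
    """
    # A single reward-prediction error, computed only for the experienced pair.
    delta = reward - q[state][action]

    for s in range(len(q)):
        for a in range(len(q[s])):
            weight = 1.0
            if s != state:
                weight *= g_state   # generalization across states
            if a != action:
                weight *= g_action  # generalization across actions
            q[s][a] += alpha * weight * delta
    return delta


# Two states and two actions, all values starting at zero.
values = [[0.0, 0.0], [0.0, 0.0]]
grl_update(values, state=0, action=0, reward=1.0)
print(values)
```

Setting both generalization weights to zero reduces the sketch to an ordinary single-pair update, which reflects the abstract's framing of GRL as an extension of RL rather than a replacement.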


Subject(s)
Magnetic Resonance Imaging; Reinforcement, Psychology; Generalization, Psychological; Humans; Learning; Magnetic Resonance Imaging/methods; Reward
3.
PLoS Comput Biol ; 13(10): e1005810, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29049406

ABSTRACT

Prediction-error signals consistent with formal models of "reinforcement learning" (RL) have repeatedly been found within dopaminergic nuclei of the midbrain and dopaminoceptive areas of the striatum. However, the precise form of the RL algorithms implemented in the human brain is not yet well determined. Here, we created a novel paradigm optimized to dissociate the subtypes of reward-prediction errors that function as the key computational signatures of two distinct classes of RL models, namely "actor/critic" models and action-value-learning models (e.g., the Q-learning model). The state-value-prediction error (SVPE), which is independent of actions, is a hallmark of the actor/critic architecture, whereas the action-value-prediction error (AVPE) is the distinguishing feature of action-value-learning algorithms. To test for the presence of these prediction-error signals in the brain, we scanned human participants with a high-resolution functional magnetic-resonance imaging (fMRI) protocol optimized to enable measurement of neural activity in the dopaminergic midbrain as well as the striatal areas to which it projects. In keeping with the actor/critic model, the SVPE signal was detected in the substantia nigra. The SVPE was also clearly present in both the ventral striatum and the dorsal striatum. However, alongside these purely state-value-based computations we also found evidence for AVPE signals throughout the striatum. These high-resolution fMRI findings suggest that model-free aspects of reward learning in humans can be explained algorithmically with RL in terms of an actor/critic mechanism operating in parallel with a system for more direct action-value learning.
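
The distinction between the two prediction errors can be made concrete with the textbook temporal-difference forms. The sketch below uses the standard actor/critic and Q-learning definitions, not necessarily the exact variants fitted in the study; the function names and the discount factor `gamma` are assumptions for illustration.

```python
def state_value_prediction_error(reward, v_current, v_next, gamma=0.9):
    """SVPE as in a textbook critic: depends on state values only, not on the action taken."""
    return reward + gamma * v_next - v_current


def action_value_prediction_error(reward, q_current, q_next_all, gamma=0.9):
    """AVPE as in textbook Q-learning: compares against the value of the chosen action."""
    return reward + gamma * max(q_next_all) - q_current


# Same outcome, two different errors: the state was expected to be fairly good
# (V = 0.5), but the particular action chosen was expected to be poor (Q = 0.2).
svpe = state_value_prediction_error(reward=1.0, v_current=0.5, v_next=0.0)
avpe = action_value_prediction_error(reward=1.0, q_current=0.2, q_next_all=[0.0, 0.0])
print(svpe, avpe)
```

Because the two errors diverge whenever the chosen action's value differs from the value of the state as a whole, a task that manipulates this gap can in principle separate the two signals, which is the logic of the paradigm described above.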


Subject(s)
Brain Mapping/methods; Corpus Striatum/physiology; Mental Recall/physiology; Mesencephalon/physiology; Models, Neurological; Nerve Net/physiology; Reinforcement, Psychology; Adaptation, Physiological/physiology; Computer Simulation; Humans; Image Enhancement/methods; Magnetic Resonance Imaging/methods; Neuronal Plasticity/physiology
4.
Hum Brain Mapp ; 35(7): 2924-34, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24038968

ABSTRACT

Intuition and an assumption of basic rationality would suggest that people evaluate a stimulus on the basis of its properties and their underlying utility. However, various findings suggest that evaluations often depend not only on what is being evaluated, but also on contextual factors. Here we demonstrate a further departure from normative decision making: Aesthetic evaluations of abstract fractal art by human subjects were predicted from pre-stimulus patterns of BOLD fMRI signals across a distributed network of frontal regions before the stimuli were presented. This predictive power was dissociated from motor biases in favor of pressing a particular button to indicate one's choice. Our findings suggest that endogenous neural signals present before stimulation can bias decisions at multiple levels of representation when evaluating stimuli.


Subject(s)
Brain Mapping; Brain/blood supply; Brain/physiology; Decision Making/physiology; Esthetics; Judgment/physiology; Adolescent; Adult; Female; Follow-Up Studies; Humans; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Male; Oxygen/blood; Photic Stimulation; Time Factors; Young Adult
5.
Sci Rep ; 13(1): 6486, 2023 Apr 20.
Article in English | MEDLINE | ID: mdl-37081031

ABSTRACT

Heuristics can inform human decision making in complex environments through a reduction of computational requirements (accuracy-resource trade-off) and a robustness to overparameterisation (less-is-more). However, tasks capturing the efficiency of heuristics typically ignore action proficiency in determining rewards. The requisite movement parameterisation in sensorimotor control questions whether heuristics preserve efficiency when actions are nontrivial. We developed a novel action selection-execution task requiring joint optimisation of action selection and spatio-temporal skillful execution. State-appropriate choices could be determined by a simple spatial heuristic, or by more complex planning. Computational models of action selection parsimoniously distinguished human participants who adopted the heuristic from those using a more complex planning strategy. Broader comparative analyses then revealed that participants using the heuristic showed combined decisional (selection) and skill (execution) advantages, consistent with a less-is-more framework. In addition, the skill advantage of the heuristic group was predominantly in the core spatial features that also shaped their decision policy, evidence that the dimensions of information guiding action selection might be yoked to salient features in skill learning.


Subject(s)
Heuristics; Learning; Humans; Reward; Decision Making
6.
Psychol Sci ; 22(9): 1220-6, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21852451

ABSTRACT

Visual pop-out occurs when a unique visual target (e.g., a feature singleton) is present in a set of homogeneous distractors. However, the role of visual awareness in this process remains unclear. In the experiments reported here, we showed that even though subjects were not aware of a suppressed pop-out display, their subsequent performance on an orientation-discrimination task was significantly better at the pop-out location than at a control location. These results indicate that conscious visual awareness of a feature singleton is not necessary for it to attract attention. Furthermore, the subliminal pop-out effect disappeared when subjects diverted their attention toward a rapid sequential visual presentation task while presented with the same subliminal pop-out display. These results suggest that top-down attention is necessary for the subliminal pop-out effect and that the cognitive processes underlying attention and awareness are somewhat independent.


Subject(s)
Attention; Awareness; Adult; Humans; Photic Stimulation; Subliminal Stimulation; Visual Perception
7.
PLoS One ; 13(8): e0203093, 2018.
Article in English | MEDLINE | ID: mdl-30138375

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0186822.].

8.
PLoS One ; 12(10): e0186822, 2017.
Article in English | MEDLINE | ID: mdl-29077746

ABSTRACT

In principle, formal dynamical models of decision making hold the potential to represent fundamental computations underpinning value-based (i.e., preferential) decisions in addition to perceptual decisions. Sequential-sampling models such as the race model and the drift-diffusion model that are grounded in simplicity, analytical tractability, and optimality remain popular, but some of their more recent counterparts have instead been designed with an aim for more feasibility as architectures to be implemented by actual neural systems. Connectionist models are proposed herein at an intermediate level of analysis that bridges mental phenomena and underlying neurophysiological mechanisms. Several such models drawing elements from the established race, drift-diffusion, feedforward-inhibition, divisive-normalization, and competing-accumulator models were tested with respect to fitting empirical data from human participants making choices between foods on the basis of hedonic value rather than a traditional perceptual attribute. Even when considering performance at emulating behavior alone, more neurally plausible models were set apart from more normative race or drift-diffusion models both quantitatively and qualitatively despite remaining parsimonious. To best capture the paradigm, a novel six-parameter computational model was formulated with features including hierarchical levels of competition via mutual inhibition as well as a static approximation of attentional modulation, which promotes "winner-take-all" processing. Moreover, a meta-analysis encompassing several related experiments validated the robustness of model-predicted trends in humans' value-based choices and concomitant reaction times. These findings have yet further implications for analysis of neurophysiological data in accordance with computational modeling, which is also discussed in this new light.
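
For readers unfamiliar with the sequential-sampling family, the basic drift-diffusion model named above can be simulated in a few lines. This is a generic textbook sketch, not the six-parameter connectionist model proposed in the study; the function name `simulate_ddm` and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_ddm(drift, threshold=1.0, noise=1.0, dt=0.001, max_time=5.0, rng=None):
    """Simulate one trial of a basic two-boundary drift-diffusion process.

    Evidence starts at 0 and accumulates until it crosses +threshold (choice 1)
    or -threshold (choice 0). Returns (choice, decision_time); choice is None
    if neither boundary is reached within max_time.
    """
    rng = rng or random.Random()
    evidence = 0.0
    elapsed = 0.0
    while abs(evidence) < threshold and elapsed < max_time:
        # Euler-Maruyama step: deterministic drift plus Gaussian diffusion noise.
        evidence += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        elapsed += dt

    if evidence >= threshold:
        return 1, elapsed
    if evidence <= -threshold:
        return 0, elapsed
    return None, elapsed


# A larger value difference between two options would map to a larger drift.
generator = random.Random(0)
trials = [simulate_ddm(drift=2.0, rng=generator) for _ in range(100)]
upper = sum(1 for choice, _ in trials if choice == 1)
print(upper / len(trials))
```

In a value-based setting the drift is typically tied to the difference in subjective value between options, so easier choices tend to be both faster and more consistent; the connectionist models in the abstract add mechanisms such as mutual inhibition on top of this basic scheme.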


Subject(s)
Attention; Decision Making; Adolescent; Adult; Humans; Reaction Time; Young Adult
9.
Front Psychol ; 8: 2000, 2017.
Article in English | MEDLINE | ID: mdl-29187831

ABSTRACT

Decision making in any brain is imperfect and costly in terms of time and energy. Operating under such constraints, an organism could be in a position to improve performance if an opportunity arose to exploit informative patterns in the environment being searched. Such an improvement of performance could entail both faster and more accurate (i.e., reward-maximizing) decisions. The present study investigated the extent to which human participants could learn to take advantage of immediate patterns in the spatial arrangement of serially presented foods such that a region of space would consistently be associated with greater subjective value. Eye movements leading up to choices demonstrated rapidly induced biases in the selective allocation of visual fixation and attention that were accompanied by both faster and more accurate choices of desired goods as implicit learning occurred. However, for the control condition with its spatially balanced reward environment, these subjects exhibited preexisting lateralized biases for eye and hand movements (i.e., leftward and rightward, respectively) that could act in opposition not only to each other but also to the orienting biases elicited by the experimental manipulation, producing an asymmetry between the left and right hemifields with respect to performance. Potentially owing at least in part to learned cultural conventions (e.g., reading from left to right), the findings herein particularly revealed an intrinsic leftward bias underlying initial saccades in the midst of more immediate feedback-directed processes for which spatial biases can be learned flexibly to optimize oculomotor and manual control in value-based decision making. The present study thus replicates general findings of learned attentional biases in a novel context with inherently rewarding stimuli and goes on to further elucidate the interactions between endogenous and exogenous biases.

10.
J Exp Psychol Hum Percept Perform ; 38(5): 1085-90, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22564160

ABSTRACT

The contents of working memory (WM) have predominantly been viewed as necessarily conscious. However, recent findings suggest otherwise. Here we investigate whether visual WM can represent subliminal stimuli, such that the positions of an invisible moving object can be extrapolated or learned about in terms of their task-relevant predictive power. We presented a moving cue subliminally and measured subjects' performance on an orientation-discrimination task at the naturally anticipated location on the cue's trajectory and at variably predictable off-trajectory locations. Our data show that orientation discriminability at the on-trajectory location was not significantly different from that at a nearby off-trajectory location. However, orientation discriminability at locations near the final position of the cue was significantly better than that at distal locations. This finding suggests that a moving object can still attract attention when presented subliminally. In contrast, the dynamic trajectory of the object and its task-relevant predictive patterns may not be monitored and maintained in visual WM.


Subject(s)
Attention/physiology; Awareness/physiology; Memory, Short-Term/physiology; Visual Perception/physiology; Adult; Cues; Discrimination, Psychological/physiology; Humans; Motion Perception/physiology; Neuropsychological Tests; Pattern Recognition, Visual/physiology; Space Perception/physiology
11.
J Exp Psychol Hum Percept Perform ; 38(2): 267-71, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22250867

ABSTRACT

A retinally stabilized object readily undergoes perceptual fading and disappears from consciousness. This startling phenomenon is commonly believed to arise from local bottom-up sensory adaptation to edge information that occurs early in the visual pathway, such as in the lateral geniculate nucleus of the thalamus or retinal ganglion cells. Here we use random dot stereograms to generate perceivable contours or shapes that are not present on the retina and ask whether perceptual fading occurs for such "cortical" contours. Our results show that perceptual fading occurs for "cortical" contours and that the time a contour requires to fade increases as a function of its size, suggesting that retinal adaptation is not necessary for the phenomenon and that perceptual fading may be based in the cortex.


Subject(s)
Adaptation, Ocular/physiology; Attention/physiology; Depth Perception/physiology; Pattern Recognition, Visual/physiology; Retina/physiology; Adult; Female; Geniculate Bodies/physiology; Humans; Male; Orientation/physiology; Psychophysics; Reaction Time/physiology; Retinal Ganglion Cells/physiology; Size Perception/physiology; Thalamus/physiology; Vision, Binocular/physiology; Visual Cortex/physiology; Visual Pathways/physiology