Results 1 - 7 of 7
1.
PLoS Comput Biol; 20(6): e1012159, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38870125

ABSTRACT

Humans are extremely robust in our ability to perceive and recognize objects: we see faces in tea stains and can recognize friends on dark streets. Yet neurocomputational models of primate object recognition have focused on the initial feed-forward pass of processing through the ventral stream, and less on the top-down feedback that likely underlies robust object perception and recognition. Aligned with the generative approach, we propose that the visual system actively facilitates recognition by reconstructing the object hypothesized to be in the image. Top-down attention then uses this reconstruction as a template to bias feedforward processing to align with the most plausible object hypothesis. Building on auto-encoder neural networks, our model makes detailed hypotheses about the appearance and location of candidate objects in the image by reconstructing a complete object representation from visual input that may be incomplete due to noise and occlusion. The model then leverages the best object reconstruction, as measured by reconstruction error, to direct the bottom-up process of selectively routing low-level features, a top-down biasing that captures a core function of attention. We evaluated our model using the MNIST-C (handwritten digits under corruption) and ImageNet-C (real-world objects under corruption) datasets. Not only did our model achieve superior performance on these challenging tasks, designed to approximate real-world noise and occlusion viewing conditions, but it also better accounted for human behavioral reaction times and error patterns than a standard feedforward convolutional neural network. Our model suggests that a complete understanding of object perception and recognition requires integrating top-down attentional feedback, which we propose takes the form of an object reconstruction.
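As a rough illustration of the reconstruct-then-bias loop described above, the sketch below assumes hypothetical `encode` and `decode` stand-ins for the paper's trained auto-encoder components and an illustrative multiplicative gating rule; it is not the authors' implementation.

```python
# Minimal sketch of reconstruction-guided attention (illustrative only).
# `encode` and `decode` are hypothetical stand-ins for trained auto-encoder
# components; the gating rule is one plausible choice, not the paper's.
import numpy as np

def reconstruction_guided_step(x, encode, decode, n_classes):
    """Pick the class whose reconstruction best explains the (possibly
    corrupted) input, then use that reconstruction as a template to bias
    the features routed on the next feedforward pass."""
    z = encode(x)                                      # latent code for the image
    recons = [decode(z, c) for c in range(n_classes)]  # one hypothesis per class
    errors = [np.mean((x - r) ** 2) for r in recons]   # reconstruction error
    best = int(np.argmin(errors))                      # most plausible hypothesis
    template = recons[best]
    # Top-down bias: multiplicatively gate the input toward the template.
    attended = x * (template / (template.max() + 1e-8))
    return best, attended
```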


Subject(s)
Attention , Neural Networks, Computer , Pattern Recognition, Visual , Humans , Attention/physiology , Pattern Recognition, Visual/physiology , Computational Biology , Models, Neurological , Recognition, Psychology/physiology
2.
J Vis; 23(5): 16, 2023 May 02.
Article in English | MEDLINE | ID: mdl-37212782

ABSTRACT

The visual system uses sequences of selective glimpses of objects to support goal-directed behavior, but how is this attention control learned? Here we present an encoder-decoder model inspired by the interacting bottom-up and top-down visual pathways that make up the recognition-attention system in the brain. At every iteration, a new glimpse is taken from the image and processed through the "what" encoder, a hierarchy of feedforward, recurrent, and capsule layers, to obtain an object-centric (object-file) representation. This representation feeds into the "where" decoder, where the evolving recurrent representation provides top-down attentional modulation to plan subsequent glimpses and to affect routing in the encoder. We demonstrate how the attention mechanism significantly improves the accuracy of classifying highly overlapping digits. In a visual reasoning task requiring comparison of two objects, our model achieves near-perfect accuracy and significantly outperforms larger models in generalizing to unseen stimuli. Our work demonstrates the benefits of object-based attention mechanisms that take sequential glimpses of objects.
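A schematic version of this glimpse loop is sketched below; `what_encoder`, `where_decoder`, and `crop` are hypothetical placeholders for the paper's modules, so read it as pseudocode-made-runnable rather than the published architecture.

```python
# Schematic "what"/"where" glimpse loop. All three callables are
# hypothetical placeholders; only the control flow mirrors the text.
import numpy as np

def glimpse_loop(image, what_encoder, where_decoder, crop, n_glimpses=5):
    loc = np.array([0.5, 0.5])   # first glimpse at the image center (normalized)
    state = None                 # evolving object-centric (object-file) code
    gate = np.ones_like(image)   # top-down modulation, initially uniform
    for _ in range(n_glimpses):
        patch = crop(image * gate, loc)      # glimpse with top-down routing
        state = what_encoder(patch, state)   # update the "what" representation
        loc, gate = where_decoder(state)     # plan the next glimpse + new gate
    return state
```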


Subject(s)
Brain , Visual Perception , Humans , Photic Stimulation/methods , Recognition, Psychology , Problem Solving , Pattern Recognition, Visual
3.
J Neurosci; 37(6): 1453-1467, 2017 Feb 08.
Article in English | MEDLINE | ID: mdl-28039373

ABSTRACT

Modern computational models of attention predict fixations using saliency maps and target maps, which prioritize locations for fixation based on feature contrast and target goals, respectively. But whereas many such models are biologically plausible, none have looked to the oculomotor system for design constraints or parameter specification. Conversely, although most models of saccade programming are tightly coupled to underlying neurophysiology, none have been tested using real-world stimuli and tasks. We combined the strengths of these two approaches in MASC, a model of attention in the superior colliculus (SC) that captures known neurophysiological constraints on saccade programming. We show that MASC predicted the fixation locations of humans freely viewing naturalistic scenes and performing exemplar and categorical search tasks, a breadth achieved by no other existing model. Moreover, it did this as well as or better than its more specialized state-of-the-art competitors. MASC's predictive success stems from its inclusion of high-level but core principles of SC organization: an over-representation of foveal information, size-invariant population codes, cascaded population averaging over distorted visual and motor maps, and competition between motor point images for saccade programming, all of which cause further modulation of priority (attention) after projection of saliency and target maps to the SC. Only by incorporating these organizing brain principles into our models can we fully understand the transformation of complex visual information into the saccade programs underlying movements of overt attention. With MASC, a theoretical footing now exists to generate and test computationally explicit predictions of behavioral and neural responses in visually complex real-world contexts.

SIGNIFICANCE STATEMENT: The superior colliculus (SC) performs a visual-to-motor transformation vital to overt attention, but existing SC models cannot predict saccades to visually complex real-world stimuli. We introduce a brain-inspired SC model that outperforms state-of-the-art image-based competitors in predicting the sequences of fixations made by humans performing a range of everyday tasks (scene viewing and exemplar and categorical search), making clear the value of looking to the brain for model design. This work is significant in that it will drive new research by making computationally explicit predictions of SC neural population activity in response to naturalistic stimuli and tasks. It will also serve as a blueprint for the construction of other brain-inspired models, helping to usher in the next generation of truly intelligent autonomous systems.
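The foveal over-representation central to such models comes from the anisotropic mapping between visual and collicular space. The sketch below uses the log-polar fit of Ottes, Van Gisbergen, and Eggermont (1986) that is standard for SC models; the parameter values are an assumption here, since the abstract does not state them.

```python
# Visual-field-to-SC mapping with the Ottes et al. (1986) parameters
# (assumed, not quoted from the paper): points near the fovea occupy
# disproportionately large stretches of the collicular map.
import numpy as np

A, BU, BV = 3.0, 1.4, 1.8   # deg, mm, mm/rad

def visual_to_sc(R, phi):
    """Map a retinal point (eccentricity R in deg, direction phi in rad)
    to anatomical SC coordinates (u, v) in mm."""
    u = BU * np.log(np.sqrt(R**2 + 2 * A * R * np.cos(phi) + A**2) / A)
    v = BV * np.arctan2(R * np.sin(phi), R * np.cos(phi) + A)
    return u, v

# Along the horizontal meridian, 1 deg of visual space maps to ~0.35 mm
# at 1 deg eccentricity but only ~0.08 mm at 15 deg.
```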


Subject(s)
Eye Movements/physiology , Models, Neurological , Pattern Recognition, Visual/physiology , Photic Stimulation/methods , Superior Colliculi/physiology , Visual Perception/physiology , Female , Forecasting , Humans , Male , Models, Anatomic , Superior Colliculi/anatomy & histology
4.
J Vis; 17(4): 2, 2017 Apr 01.
Article in English | MEDLINE | ID: mdl-28388698

ABSTRACT

Saccades quite systematically undershoot a peripheral visual target by about 10% of its eccentricity, while becoming more variable, mainly in amplitude, as the target becomes more peripheral. This undershoot phenomenon has been interpreted as a strategic adjustment of saccadic gain downstream of the superior colliculus (SC), where saccades are programmed. Here, we investigated whether the eccentricity-related increase in saccadic hypometria and imprecision might instead result from the overrepresentation of space closer to the fovea in the SC and visual-cortical areas. To test this magnification-factor (MF) hypothesis, we analyzed four parametric eye-movement data sets collected while humans made saccades to single eccentric stimuli. We first established that the undershoot phenomenon generalizes to ordinary saccade amplitudes (0.5°-15°) and directions (0°-90°), and that landing-position distributions become not only increasingly elongated but also more skewed toward the fovea as target eccentricity increases. Moreover, we confirmed the MF hypothesis by showing (a) that the linear eccentricity-related increase in undershoot error and negative skewness canceled out when landing positions were log-scaled according to the MF in the monkey SC and (b) that the spread, which is proportional to eccentricity outside an extended (5°) foveal region, became circular and invariant in size in SC space. Yet the eccentricity-related increase in variability, slower near the fovea, yielded progressively larger and more elongated clusters toward foveal and vertical-meridian SC representations. What causes this latter, unexpected pattern remains undetermined. Nevertheless, our findings clearly suggest that the undershoot phenomenon, and the related variability, originate in, or upstream of, the SC, rather than reflecting downstream adaptive strategies.
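For horizontal saccades the collicular mapping reduces to u = B_u ln((R + A)/A), so the MF hypothesis predicts that a visual-space landing spread proportional to eccentricity becomes roughly constant in SC space. The toy simulation below illustrates that prediction using the commonly assumed Ottes et al. (1986) parameters, with the 10% undershoot and eccentricity-proportional spread quoted in the abstract; it is illustrative, not the paper's analysis.

```python
# Toy check of the magnification-factor prediction: landing spread that
# grows with eccentricity in visual space is roughly constant once
# log-scaled into SC space.
import numpy as np

A, BU = 3.0, 1.4   # deg, mm (assumed Ottes et al. 1986 fit)

def to_sc(R):
    return BU * np.log((R + A) / A)

rng = np.random.default_rng(0)
for ecc in (4.0, 8.0, 12.0):
    landings = rng.normal(0.9 * ecc, 0.1 * ecc, 10_000)  # hypometric, noisy
    print(f"target {ecc:4.1f} deg: visual SD {landings.std():.2f} deg, "
          f"SC SD {to_sc(landings).std():.3f} mm")
```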


Subject(s)
Saccades/physiology , Superior Colliculi/physiology , Visual Perception/physiology , Adolescent , Female , Fovea Centralis , Humans , Male , Vision, Binocular/physiology , Young Adult
5.
bioRxiv; 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39005469

ABSTRACT

The brain routes and integrates information from many sources during behavior. A number of models explain this phenomenon within the framework of mixed-selectivity theory, yet it is difficult to compare their predictions to understand how neurons and circuits integrate information. In this work, we apply time-series partial information decomposition (PID) to compare models of integration on a dataset of superior colliculus (SC) recordings collected during a multi-target visual search task. In this task, the SC must integrate target guidance, bottom-up salience, and previous-fixation signals to drive attention. We find evidence that SC neurons integrate these factors in diverse ways, including decision-variable selectivity to expected value, functional specialization to the previous fixation, and code-switching to incorporate new visual input.
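For readers unfamiliar with PID, the sketch below implements the classic Williams-Beer decomposition for two discrete sources and one target; it is a generic textbook construction, not the specific time-series estimator used in this preprint.

```python
# Williams-Beer partial information decomposition for two discrete sources
# (e.g., binned guidance and salience signals) and one discrete target
# (e.g., a binned firing rate). Generic construction, not the paper's
# time-series estimator.
import numpy as np

def pid_williams_beer(s1, s2, t):
    """Return (redundant, unique_1, unique_2, synergy) in bits."""
    s1, s2, t = (np.unique(a, return_inverse=True)[1] for a in (s1, s2, t))
    p = np.zeros((s1.max() + 1, s2.max() + 1, t.max() + 1))
    np.add.at(p, (s1, s2, t), 1.0)       # joint counts p(s1, s2, t)
    p /= p.sum()
    pt = p.sum(axis=(0, 1))              # p(t)

    def spec_info(pst):
        """Specific information I_spec(S; T=t) for each t, from p(s, t)."""
        ps = pst.sum(axis=1)
        out = np.zeros(len(pt))
        for ti in np.nonzero(pt)[0]:
            for si in np.nonzero(pst[:, ti])[0]:
                out[ti] += (pst[si, ti] / pt[ti]) * np.log2(
                    (pst[si, ti] / ps[si]) / pt[ti])
        return out

    sp1 = spec_info(p.sum(axis=1))               # source 1 alone
    sp2 = spec_info(p.sum(axis=0))               # source 2 alone
    sp12 = spec_info(p.reshape(-1, len(pt)))     # joint source (s1, s2)
    i1, i2, i12 = (float((pt * s).sum()) for s in (sp1, sp2, sp12))
    red = float((pt * np.minimum(sp1, sp2)).sum())   # I_min redundancy
    return red, i1 - red, i2 - red, i12 - i1 - i2 + red
```

As a sanity check, the canonical XOR target is pure synergy: `pid_williams_beer([0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0])` returns approximately (0, 0, 0, 1).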

6.
Article in English | MEDLINE | ID: mdl-34164631

ABSTRACT

Understanding how goals control behavior is a question ripe for interrogation by new methods from machine learning. These methods require large, labeled datasets to train models. To annotate a large-scale image dataset with observed search fixations, we collected 16,184 fixations from people searching for either microwaves or clocks in a dataset of 4,366 images (MS-COCO). We then used this behaviorally annotated dataset and the machine-learning method of inverse reinforcement learning (IRL) to learn target-specific reward functions and policies for these two target goals. Finally, we used these learned policies to predict the fixations of 60 new behavioral searchers (clock = 30, microwave = 30) in a disjoint test dataset of kitchen scenes depicting both a microwave and a clock (thus controlling for differences in low-level image contrast). We found that the IRL model predicted behavioral search efficiency and fixation-density maps under multiple metrics. Moreover, reward maps from the IRL model revealed target-specific patterns that suggest not just attention guidance by target features, but also guidance by scene context (e.g., fixations along walls in the search for clocks). Using machine learning and the psychologically meaningful principle of reward, it is possible to learn the visual features used in goal-directed attention control.
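A minimal way to read fixations out of a learned reward map is a winner-take-all rule with inhibition of return, sketched below; the paper's scan paths come from the learned IRL policy, so this greedy readout is only an illustrative stand-in.

```python
# Greedy fixation readout from a (learned) reward map: repeatedly fixate
# the maximum, then suppress a disk around it (inhibition of return).
# Illustrative stand-in for the IRL policy, not the paper's method.
import numpy as np

def scan_path(reward_map, n_fixations=6, ior_radius=3, decay=0.1):
    priority = np.array(reward_map, dtype=float)
    ys, xs = np.indices(priority.shape)
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(priority.argmax(), priority.shape)
        fixations.append((y, x))
        mask = (ys - y) ** 2 + (xs - x) ** 2 <= ior_radius ** 2
        priority[mask] *= decay               # inhibition of return
    return fixations
```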

7.
Philos Trans R Soc Lond B Biol Sci; 368(1628): 20130058, 2013 Oct 19.
Article in English | MEDLINE | ID: mdl-24018720

ABSTRACT

We introduce a model of eye movements during categorical search, the task of finding and recognizing categorically defined targets. It extends a previous model of eye movements during search (the target acquisition model, TAM) by using distances from a support vector machine classification boundary to create probability maps indicating pixel-by-pixel evidence for the target category in search images. Other additions include functionality enabling target-absent searches and a fixation-based blurring of the search images, now based on a mapping between visual and collicular space. We tested this model on images from a previously conducted variable set-size (6/13/20) present/absent search experiment in which participants searched for categorically defined teddy-bear targets among random-category distractors. The model not only captured target-present/absent set-size effects, but also accurately predicted, for all conditions, the numbers of fixations made prior to search judgements. It also predicted the percentages of first eye movements during search landing on targets, a conservative measure of search guidance. Effects of set size on false-negative and false-positive errors were also captured, although error rates in general were overestimated. We conclude that visual features discriminating a target category from non-targets can be learned and used to guide eye movements during categorical search.
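The classifier-to-priority-map step can be sketched as below; the LinearSVC, the logistic squashing, and the patch bookkeeping are assumptions standing in for the model's actual choices, not the published pipeline.

```python
# Turn SVM boundary distances into a pixel-wise target-evidence map.
import numpy as np
from sklearn.svm import LinearSVC

def evidence_map(clf, patch_features, patch_centers, map_shape):
    """clf: a fitted LinearSVC (e.g., clf = LinearSVC().fit(X, y));
    patch_features: (n_patches, n_features) array of patch descriptors;
    patch_centers: list of (row, col) locations, one per patch."""
    d = clf.decision_function(patch_features)   # signed margin distances
    prob = 1.0 / (1.0 + np.exp(-d))             # squash to (0, 1)
    out = np.zeros(map_shape)
    for (r, c), p in zip(patch_centers, prob):
        out[r, c] = p                           # evidence for "target" here
    return out
```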


Subject(s)
Appetitive Behavior/physiology , Eye Movements/physiology , Fixation, Ocular/physiology , Models, Psychological , Recognition, Psychology/physiology , Discrimination Learning/physiology , Humans , Photic Stimulation , Psychomotor Performance/physiology , Retina/metabolism , Vision, Ocular/physiology