1.
Curr Biol ; 34(5): 1098-1106.e5, 2024 03 11.
Article in English | MEDLINE | ID: mdl-38218184

ABSTRACT

Visual shape perception is central to many everyday tasks, from object recognition to grasping and handling tools [1-10]. Yet how shape is encoded in the visual system remains poorly understood. Here, we probed shape representations using visual aftereffects: perceptual distortions that occur following extended exposure to a stimulus [11-17]. Such effects are thought to be caused by adaptation in neural populations that encode both simple, low-level stimulus characteristics [17-20] and more abstract, high-level object features [21-23]. To tease these two contributions apart, we used machine-learning methods to synthesize novel shapes in a multidimensional shape space, derived from a large database of natural shapes [24]. Stimuli were carefully selected such that low-level and high-level adaptation models made distinct predictions about the shapes that observers would perceive following adaptation. We found that adaptation along vector trajectories in the high-level shape space predicted shape aftereffects better than simple low-level processes. Our findings reveal the central role of high-level statistical features in the visual representation of shape. The findings also hint that human vision is attuned to the distribution of shapes experienced in the natural environment.
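A minimal sketch of the high-level adaptation model described above, under assumed details: shapes are points in a learned shape space, and adaptation repels the percept of a test shape away from the adaptor along the adaptor-to-test trajectory. The latent coordinates and gain value are hypothetical illustrations, not the paper's fitted quantities.

```python
import numpy as np

def predicted_percept(test, adaptor, gain=0.2):
    """High-level adaptation model: the percept of a test shape is
    repelled away from the adaptor along the adaptor-to-test
    trajectory in shape space. gain is a hypothetical strength."""
    direction = test - adaptor           # trajectory away from the adaptor
    return test + gain * direction       # percept shifted along it

# Toy 3-D shape space (real learned spaces are higher-dimensional).
adaptor = np.array([1.0, 0.0, 0.0])
test = np.array([0.2, 0.5, 0.1])
print(predicted_percept(test, adaptor))
```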


Subjects
Vision, Ocular; Visual Perception; Humans; Perceptual Distortion; Environment; Pattern Recognition, Visual; Photic Stimulation
2.
Behav Brain Sci ; 46: e386, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38054335

ABSTRACT

Everyone agrees that testing hypotheses is important, but Bowers et al. provide scant details about where hypotheses about perception and brain function should come from. We suggest that the answer lies in considering how information about the outside world could be acquired - that is, learned - over the course of evolution and development. Deep neural networks (DNNs) provide one tool to address this question.


Subjects
Brain; Neural Networks, Computer; Humans; Learning
3.
Vision Res ; 206: 108195, 2023 05.
Article in English | MEDLINE | ID: mdl-36801664

ABSTRACT

Why do we perceive illusory motion in some static images? Several accounts point to eye movements, response latencies to different image elements, or interactions between image patterns and motion energy detectors. Recently, PredNet, a recurrent deep neural network (DNN) based on predictive coding principles, was reported to reproduce the "Rotating Snakes" illusion, suggesting a role for predictive coding. We begin by replicating this finding, then use a series of "in silico" psychophysics and electrophysiology experiments to examine whether PredNet behaves consistently with human observers and non-human primate neural data. A pretrained PredNet predicted illusory motion for all subcomponents of the Rotating Snakes pattern, consistent with human observers. However, unlike in the electrophysiological data, we found no simple response delays in its internal units. PredNet's detection of motion in gradients appeared to depend on contrast, whereas in humans it depends predominantly on luminance. Finally, we examined the robustness of the illusion across ten PredNets of identical architecture, retrained on the same video data. There was large variation across network instances in whether they reproduced the Rotating Snakes illusion, and what motion, if any, they predicted for simplified variants. Unlike human observers, no network predicted motion for greyscale variants of the Rotating Snakes pattern. Our results sound a cautionary note: even when a DNN successfully reproduces some idiosyncrasy of human vision, more detailed investigation can reveal inconsistencies between humans and the network, and between different instances of the same network. These inconsistencies suggest that predictive coding does not reliably give rise to human-like illusory motion.
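One way to quantify the illusory motion a predictive-coding network ascribes to a static image is to compare a repeated input frame against the model's predicted next frame: any displacement between the two is motion the network has "predicted". The sketch below implements such a readout with FFT phase correlation; the demo frames are synthetic, and the paper's actual analysis pipeline may differ.

```python
import numpy as np

def global_shift(frame_a, frame_b):
    """Estimate the global translation between two frames by phase
    correlation: the peak of the inverse FFT of the normalised
    cross-power spectrum marks the (dy, dx) displacement."""
    F = np.fft.fft2(frame_a)
    G = np.fft.fft2(frame_b)
    cross = F * np.conj(G)
    corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-12)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape                    # wrap into [-N/2, N/2)
    if dy > h // 2: dy -= h
    if dx > w // 2: dx -= w
    return dy, dx

# Demo: a static pattern vs. a copy shifted 3 pixels rightward, standing
# in for an input frame and a model's prediction of the next frame.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
print(global_shift(np.roll(frame, 3, axis=1), frame))   # -> (0, 3)
```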


Subjects
Illusions; Motion Perception; Animals; Humans; Illusions/physiology; Motion Perception/physiology; Vision, Ocular; Eye Movements; Neural Networks, Computer
4.
Curr Biol ; 32(21): R1224-R1225, 2022 11 07.
Article in English | MEDLINE | ID: mdl-36347228

ABSTRACT

The discovery of mental rotation was one of the most significant landmarks in experimental psychology, leading to the ongoing assumption that to visually compare objects from different three-dimensional viewpoints, we use explicit internal simulations of object rotations, to 'mentally adjust' one object until it matches the other [1]. These rotations are thought to be performed on three-dimensional representations of the object, by literal analogy to physical rotations. In particular, it is thought that an imagined object is continuously adjusted at a constant three-dimensional angular rotation rate from its initial orientation to the final orientation through all intervening viewpoints [2]. While qualitative theories have tried to account for this phenomenon [3], to date there has been no explicit, image-computable model of the underlying processes. As a result, there is no quantitative account of why some object viewpoints appear more similar to one another than others when the three-dimensional angular difference between them is the same [4,5]. We reasoned that the specific pattern of non-uniformities in the perception of viewpoints can reveal the visual computations underlying mental rotation. We therefore compared human viewpoint perception with a model based on the kind of two-dimensional 'optical flow' computations that are thought to underlie motion perception in biological vision [6], finding that the model reproduces the specific errors that participants make. This suggests that mental rotation involves simulating the two-dimensional retinal image change that would occur when rotating objects. When we compare objects, we do not do so in a distal three-dimensional representation as previously assumed, but by measuring how much the proximal stimulus would change if we watched the object rotate, capturing perspectival appearance changes [7].
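The dissimilarity between two viewpoints can be operationalised as the total two-dimensional optical flow needed to warp one view into the other. A sketch of that idea using OpenCV's Farnebäck flow estimator follows; the synthetic square images and parameter settings are illustrative stand-ins, not the paper's stimuli or model.

```python
import cv2
import numpy as np

def flow_dissimilarity(img_a, img_b):
    """Viewpoint dissimilarity as the mean magnitude of the dense 2-D
    optical flow needed to map one image onto the other."""
    flow = cv2.calcOpticalFlowFarneback(img_a, img_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return np.linalg.norm(flow, axis=-1).mean()

# Synthetic 8-bit 'views': a bright square and a displaced copy, as a
# stand-in for the image change between two object viewpoints.
view_a = np.zeros((128, 128), np.uint8); view_a[40:80, 40:80] = 255
view_b = np.zeros((128, 128), np.uint8); view_b[46:86, 44:84] = 255
print(flow_dissimilarity(view_a, view_b))
```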


Subjects
Motion Perception; Optic Flow; Humans; Pattern Recognition, Visual; Visual Perception
5.
Proc Natl Acad Sci U S A ; 119(27): e2115047119, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35767642

ABSTRACT

Human vision is attuned to the subtle differences between individual faces. Yet we lack a quantitative way of predicting how similar two face images look and whether they appear to show the same person. Principal component-based three-dimensional (3D) morphable models are widely used to generate stimuli in face perception research. These models capture the distribution of real human faces in terms of dimensions of physical shape and texture. How well does a "face space" based on these dimensions capture the similarity relationships humans perceive among faces? To answer this, we designed a behavioral task to collect dissimilarity and same/different identity judgments for 232 pairs of realistic faces. Stimuli sampled geometric relationships in a face space derived from principal components of 3D shape and texture (Basel face model [BFM]). We then compared a wide range of models in their ability to predict the data, including the BFM from which faces were generated, an active appearance model derived from face photographs, and image-computable models of visual perception. Euclidean distance in the BFM explained both dissimilarity and identity judgments surprisingly well. In a comparison against 16 diverse models, BFM distance was competitive with representational distances in state-of-the-art deep neural networks (DNNs), including novel DNNs trained on BFM synthetic identities or BFM latents. Models capturing the distribution of face shape and texture across individuals are not only useful tools for stimulus generation. They also capture important information about how faces are perceived, suggesting that human face representations are tuned to the statistical distribution of faces.
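The core model evaluation reduces to correlating Euclidean distances in a latent face space with behavioural dissimilarity ratings. A compact sketch with placeholder data (random latents and ratings standing in for the BFM coordinates and the 232 measured pairs):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)

# Placeholder latents for 232 face pairs in a BFM-like space (shape and
# texture principal components) and mock behavioural ratings.
n_pairs, n_dims = 232, 20
faces_a = rng.normal(size=(n_pairs, n_dims))
faces_b = rng.normal(size=(n_pairs, n_dims))
human_dissimilarity = rng.random(n_pairs)

# Model prediction: Euclidean distance between each pair's latents.
bfm_distance = np.linalg.norm(faces_a - faces_b, axis=1)

rho, p = spearmanr(bfm_distance, human_dissimilarity)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```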


Subjects
Facial Recognition; Judgment; Visual Perception; Humans; Neural Networks, Computer
6.
Proc Natl Acad Sci U S A ; 118(32)2021 08 10.
Article in English | MEDLINE | ID: mdl-34349023

ABSTRACT

Sitting in a static railway carriage can produce illusory self-motion if the train on an adjoining track moves off. While our visual system registers motion, vestibular signals indicate that we are stationary. The brain is faced with a difficult challenge: is there a single cause of sensations (I am moving) or two causes (I am static, another train is moving)? If a single cause, integrating signals produces a more precise estimate of self-motion, but if not, one cue should be ignored. In many cases, this process of causal inference works without error, but how does the brain achieve it? Electrophysiological recordings show that the macaque medial superior temporal area contains many neurons that encode combinations of vestibular and visual motion cues. Some respond best to vestibular and visual motion in the same direction ("congruent" neurons), while others prefer opposing directions ("opposite" neurons). Congruent neurons could underlie cue integration, but the function of opposite neurons remains a puzzle. Here, we seek to explain this computational arrangement by training a neural network model to solve causal inference for motion estimation. Like biological systems, the model develops congruent and opposite units and recapitulates known behavioral and neurophysiological observations. We show that all units (both congruent and opposite) contribute to motion estimation. Importantly, however, it is the balance between their activity that distinguishes whether visual and vestibular cues should be integrated or separated. This explains the computational purpose of puzzling neural representations and shows how a relatively simple feedforward network can solve causal inference.
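The paper trains a network to solve causal inference, but the normative computation being approximated can be sketched directly, in the style of standard Bayesian causal-inference models of cue combination: fuse the cues when a common cause is likely, fall back on the vestibular cue otherwise. All noise levels and priors below are illustrative assumptions, not the paper's model.

```python
import numpy as np
from scipy.stats import norm

def self_motion_estimate(x_vis, x_vest, sig_vis=1.0, sig_vest=2.0,
                         p_common=0.5):
    """Causal-inference sketch for visual-vestibular self-motion.
    All noise levels and priors are illustrative assumptions."""
    # Posterior probability of one common cause: large cue conflicts
    # make a shared cause unlikely.
    like_c1 = norm.pdf(x_vis - x_vest, 0.0, np.hypot(sig_vis, sig_vest))
    like_c2 = 1.0 / 20.0              # flat likelihood over a 20-unit range
    post_c1 = (p_common * like_c1) / (p_common * like_c1 +
                                      (1 - p_common) * like_c2)

    # Common cause: precision-weighted fusion of the two cues.
    w = sig_vest**2 / (sig_vis**2 + sig_vest**2)
    fused = w * x_vis + (1 - w) * x_vest
    # Separate causes: self-motion is given by the vestibular cue alone.
    segregated = x_vest

    return post_c1 * fused + (1 - post_c1) * segregated   # model average

print(self_motion_estimate(x_vis=2.0, x_vest=1.5))   # small conflict: fuse
print(self_motion_estimate(x_vis=8.0, x_vest=0.0))   # large conflict: discount vision
```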


Assuntos
Percepção de Movimento/fisiologia , Redes Neurais de Computação , Células Receptoras Sensoriais/fisiologia , Animais , Sinais (Psicologia) , Macaca mulatta , Estimulação Luminosa , Lobo Temporal/fisiologia
7.
J Cogn Neurosci ; 33(10): 2044-2064, 2021 09 01.
Article in English | MEDLINE | ID: mdl-34272948

ABSTRACT

Deep neural networks (DNNs) trained on object recognition provide the best current models of high-level visual cortex. What remains unclear is how strongly experimental choices, such as network architecture, training, and fitting to brain data, contribute to the observed similarities. Here, we compare a diverse set of nine DNN architectures on their ability to explain the representational geometry of 62 object images in human inferior temporal cortex (hIT), as measured with fMRI. We compare untrained networks to their task-trained counterparts and assess the effect of cross-validated fitting to hIT, by taking a weighted combination of the principal components of features within each layer and, subsequently, a weighted combination of layers. For each combination of training and fitting, we test all models for their correlation with the hIT representational dissimilarity matrix, using independent images and subjects. Trained models outperform untrained models (accounting for 57% more of the explainable variance), suggesting that structured visual features are important for explaining hIT. Model fitting further improves the alignment of DNN and hIT representations (by 124%), suggesting that the relative prevalence of different features in hIT does not readily emerge from the ImageNet object-recognition task used to train the networks. The same models can also explain the disparate representations in primary visual cortex (V1), where stronger weights are given to earlier layers. In each region, all architectures achieved equivalently high performance once trained and fitted. The models' shared properties (deep feedforward hierarchies of spatially restricted nonlinear filters) seem more important than their differences when modeling human visual representations.
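The fitting procedure amounts to learning a non-negative weighting of model components that best predicts the brain's representational dissimilarities, evaluated on held-out data. A simplified sketch follows, with random placeholder RDMs and a single train/test split over dissimilarity entries rather than the paper's cross-validation over images and subjects:

```python
import numpy as np
from scipy.optimize import nnls
from scipy.stats import pearsonr

rng = np.random.default_rng(2)

# Placeholder lower-triangle RDM vectors: 62 images -> 1,891 pairwise
# dissimilarities, for 8 model layers and a mock hIT target.
n_dissims, n_layers = 1891, 8
layer_rdms = rng.random((n_dissims, n_layers))
hit_rdm = layer_rdms @ rng.random(n_layers) + 0.1 * rng.random(n_dissims)

idx = rng.permutation(n_dissims)
train, test = idx[:1200], idx[1200:]

# Fit non-negative layer weights on training dissimilarities, then
# evaluate the reweighted model on held-out ones.
weights, _ = nnls(layer_rdms[train], hit_rdm[train])
pred = layer_rdms[test] @ weights
print(f"held-out r = {pearsonr(pred, hit_rdm[test])[0]:.2f}")
```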


Subjects
Neural Networks, Computer; Visual Cortex; Humans; Magnetic Resonance Imaging; Temporal Lobe/diagnostic imaging; Visual Cortex/diagnostic imaging; Visual Perception
8.
Nat Hum Behav ; 5(10): 1402-1417, 2021 10.
Article in English | MEDLINE | ID: mdl-33958744

ABSTRACT

Reflectance, lighting and geometry combine in complex ways to create images. How do we disentangle these to perceive individual properties, such as surface glossiness? We suggest that brains disentangle properties by learning to model statistical structure in proximal images. To test this hypothesis, we trained unsupervised generative neural networks on renderings of glossy surfaces and compared their representations with human gloss judgements. The networks spontaneously cluster images according to distal properties such as reflectance and illumination, despite receiving no explicit information about these properties. Intriguingly, the resulting representations also predict the specific patterns of 'successes' and 'errors' in human perception. Linearly decoding specular reflectance from the model's internal code predicts human gloss perception better than ground truth, supervised networks or control models, and it predicts, on an image-by-image basis, illusions of gloss perception caused by interactions between material, shape and lighting. Unsupervised learning may underlie many perceptual dimensions in vision and beyond.
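Linearly decoding a distal property from an unsupervised model's internal code is itself a simple least-squares fit. A sketch with mock latent codes and reflectance values standing in for the networks' representations and the rendering ground truth:

```python
import numpy as np

rng = np.random.default_rng(3)

# Mock latent codes for 500 rendered surfaces, plus ground-truth
# specular reflectance values (placeholders for the real renderings).
n_images, n_latents = 500, 10
codes = rng.normal(size=(n_images, n_latents))
reflectance = codes @ rng.normal(size=n_latents) + 0.3 * rng.normal(size=n_images)

# Linear decoder: least-squares readout of reflectance from the code.
X = np.column_stack([codes, np.ones(n_images)])       # add an intercept
w, *_ = np.linalg.lstsq(X, reflectance, rcond=None)

decoded = X @ w
print(f"decoding accuracy r = {np.corrcoef(decoded, reflectance)[0, 1]:.2f}")
```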


Assuntos
Luz , Propriedades de Superfície , Percepção Visual/fisiologia , Gráficos por Computador , Sensibilidades de Contraste , Percepção de Forma , Humanos , Iluminação/métodos , Ciência dos Materiais , Estimulação Luminosa , Psicofísica/instrumentação , Psicofísica/métodos , Análise e Desempenho de Tarefas
10.
Curr Opin Behav Sci ; 30: 100-108, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31886321

ABSTRACT

Materials with complex appearances, like textiles and foodstuffs, pose challenges for conventional theories of vision. But recent advances in unsupervised deep learning provide a framework for explaining how we learn to see them. We suggest that perception does not involve estimating physical quantities like reflectance or lighting. Instead, representations emerge from learning to encode and predict the visual input as efficiently and accurately as possible. Neural networks can be trained to compress natural images or to predict frames in movies without 'ground truth' data about the outside world. Yet, to succeed, such systems may automatically discover how to disentangle distal causal factors. Such 'statistical appearance models' potentially provide a coherent explanation of both failures and successes in perception.

11.
Front Psychol ; 8: 1726, 2017.
Article in English | MEDLINE | ID: mdl-29062291

ABSTRACT

Recent advances in Deep convolutional Neural Networks (DNNs) have enabled unprecedentedly accurate computational models of brain representations, and present an exciting opportunity to model diverse cognitive functions. State-of-the-art DNNs achieve human-level performance on object categorisation, but it is unclear how well they capture human behavior on complex cognitive tasks. Recent reports suggest that DNNs can explain significant variance in one such task, judging object similarity. Here, we extend these findings by replicating them for a rich set of object images, comparing performance across layers within two DNNs of different depths, and examining how the DNNs' performance compares to that of non-computational "conceptual" models. Human observers performed similarity judgments for a set of 92 images of real-world objects. Representations of the same images were obtained in each of the layers of two DNNs of different depths (8-layer AlexNet and 16-layer VGG-16). To create conceptual models, other human observers generated visual-feature labels (e.g., "eye") and category labels (e.g., "animal") for the same image set. Feature labels were divided into parts, colors, textures and contours, while category labels were divided into subordinate, basic, and superordinate categories. We fitted models derived from the features, categories, and from each layer of each DNN to the similarity judgments, using representational similarity analysis to evaluate model performance. In both DNNs, similarity within the last layer explains most of the explainable variance in human similarity judgments. The last layer outperforms almost all feature-based models. Late and mid-level layers outperform some but not all feature-based models. Importantly, categorical models predict similarity judgments significantly better than any DNN layer. Our results provide further evidence for commonalities between DNNs and brain representations. Models derived from visual features other than object parts perform relatively poorly, perhaps because DNNs more comprehensively capture the colors, textures and contours which matter to human object perception. However, categorical models outperform DNNs, suggesting that further work may be needed to bring high-level semantic representations in DNNs closer to those extracted by humans. Modern DNNs explain similarity judgments remarkably well considering they were not trained on this task, and are promising models for many aspects of human cognition.
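The core of the representational similarity analysis used here: convert a layer's activations into a representational dissimilarity matrix (RDM) and rank-correlate it with the human RDM. A sketch with placeholder activations and judgments in place of the AlexNet/VGG-16 features and the 92-image behavioural data:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(4)

# Placeholder layer activations (92 images x 4096 features) and a mock
# human RDM stored as a lower-triangle vector (92*91/2 entries).
layer_features = rng.normal(size=(92, 4096))
human_rdm = rng.random(92 * 91 // 2)

# Model RDM: pairwise correlation distances between image activations.
model_rdm = pdist(layer_features, metric="correlation")

# RSA model evaluation: rank correlation between the two RDMs.
rho, _ = spearmanr(model_rdm, human_rdm)
print(f"layer-vs-human RDM correlation: rho = {rho:.2f}")
```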

12.
J Exp Psychol Hum Percept Perform ; 43(1): 181-191, 2017 01.
Article in English | MEDLINE | ID: mdl-27808549

ABSTRACT

Adaptation to different visual properties can produce distinct patterns of perceptual aftereffect. Some, such as those following adaptation to color, seem to arise from recalibrative processes. These are associated with a reappraisal of which physical input constitutes a normative value in the environment: in this case, what appears "colorless" and what "colorful." Recalibrative aftereffects can arise from coding schemes in which inputs are referenced against malleable norm values. Other aftereffects seem to arise from contrastive processes. These exaggerate differences between the adaptor and other inputs without changing the adaptor's appearance. There has been conjecture over which process best describes adaptation-induced distortions of spatial vision, such as of apparent shape or facial identity. In 3 experiments, we determined whether recalibrative or contrastive processes underlie the shape aspect ratio aftereffect. We found that adapting to a moderately elongated shape compressed the appearance of narrower shapes and further elongated the appearance of more-elongated shapes (Experiment 1). Adaptation did not change the perceived aspect ratio of the adaptor itself (Experiment 2), and adapting to a circle induced similar bidirectional aftereffects on shapes narrower or wider than circular (Experiment 3). Results could not be explained by adaptation to retinotopically local edge orientation or single linear dimensions of shapes. We conclude that aspect ratio aftereffects are determined by contrastive processes that can exaggerate differences between successive inputs, inconsistent with a norm-referenced representation of aspect ratio. Adaptation might enhance the salience of novel stimuli rather than recalibrate one's sense of what constitutes a "normal" shape.


Subjects
Adaptation, Physiological/physiology; Figural Aftereffect/physiology; Form Perception/physiology; Adult; Humans
13.
Neuron ; 92(2): 280-284, 2016 Oct 19.
Article in English | MEDLINE | ID: mdl-27764662

ABSTRACT

"Grid cells" encode an animal's location and direction of movement in 2D physical environments via regularly repeating receptive fields. Constantinescu et al. (2016) report the first evidence of grid cells for 2D conceptual spaces. The work has exciting implications for mental representation and shows how detailed neural-coding hypotheses can be tested with bulk population-activity measures.


Subjects
Neurons; Orientation; Animals; Grid Cells; Movement
15.
J Vis ; 15(8): 1, 2015.
Article in English | MEDLINE | ID: mdl-26030371

ABSTRACT

After looking at a photograph of someone for a protracted period (adaptation), a previously neutral-looking face can take on an opposite appearance in terms of gender, identity, and other attributes. But what happens to the appearance of other faces? Face aftereffects have repeatedly been ascribed to perceptual renormalization. Renormalization predicts that the adapting face and more extreme versions of it should appear more neutral after adaptation (e.g., if the adaptor was male, it and hyper-masculine faces should look more feminine). Other aftereffects, such as tilt and spatial frequency, are locally repulsive, exaggerating differences between adapting and test stimuli. This predicts that the adapting face should be little changed in appearance after adaptation, while more extreme versions of it should look even more extreme (e.g., if the adaptor was male, it should look unchanged, while hyper-masculine faces should look even more masculine). Existing reports do not provide clear evidence for either pattern. We overcame this by using a spatial comparison task to measure the appearance of stimuli presented in differently adapted retinal locations. In behaviorally matched experiments we compared aftereffect patterns after adapting to tilt, facial identity, and facial gender. In all three experiments the data matched the predictions of a locally repulsive, but not a renormalizing, aftereffect. These data are consistent with the existence of similar encoding strategies for tilt, facial identity, and facial gender.


Subjects
Facial Recognition/physiology; Figural Aftereffect/physiology; Adaptation, Ocular/physiology; Choice Behavior; Female; Humans; Male
16.
Front Psychol ; 6: 157, 2015.
Article in English | MEDLINE | ID: mdl-25745407
17.
J Vis ; 15(1): 15.1.26, 2015 Jan 26.
Article in English | MEDLINE | ID: mdl-25624465

ABSTRACT

Some data have been taken as evidence that after prolonged viewing, near-vertical orientations "normalize" to appear more vertical than they did previously. After almost a century of research, the existence of tilt normalization remains controversial. The most recent evidence for tilt normalization comes from data suggesting a measurable "perceptual drift" of near-vertical adaptors toward vertical, which can be nulled by a slight physical rotation away from vertical (Müller, Schillinger, Do, & Leopold, 2009). We argue that biases in estimates of perceptual stasis could, however, result from the anisotropic organization of orientation-selective neurons in V1, with vertically-selective cells being more narrowly tuned than obliquely-selective cells. We describe a neurophysiologically plausible model that predicts greater sensitivity to orientation displacements toward than away from vertical. We demonstrate the predicted asymmetric pattern of sensitivity in human observers by determining threshold speeds for detecting rotation direction (Experiment 1), and by determining orientation discrimination thresholds for brief static stimuli (Experiment 2). Results imply that data suggesting a perceptual drift toward vertical instead result from greater discrimination sensitivity around cardinal than oblique orientations (the oblique effect), and thus do not constitute evidence for tilt normalization.
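The argument can be illustrated with a toy population model: give vertically tuned units narrower tuning than obliquely tuned units, and the population response changes more for rotations toward vertical than away from it, mimicking an apparent "drift". Tuning widths, spacing, and functional forms below are illustrative choices, not the paper's fitted model.

```python
import numpy as np

def tuning_widths(prefs_deg):
    """Narrower tuning near vertical (90 deg) than near oblique
    orientations; the 20-40 deg range is an illustrative choice."""
    return 40 - 20 * np.abs(np.cos(np.deg2rad(2 * (prefs_deg - 90))))

def population_response(theta, prefs, widths):
    """Gaussian orientation tuning with a 180-deg period."""
    d = (theta - prefs + 90) % 180 - 90
    return np.exp(-0.5 * (d / widths) ** 2)

def sensitivity(theta, dtheta):
    """Size of the population-response change for a small rotation."""
    prefs = np.arange(0.0, 180.0, 2.0)
    widths = tuning_widths(prefs)
    r0 = population_response(theta, prefs, widths)
    r1 = population_response(theta + dtheta, prefs, widths)
    return np.linalg.norm(r1 - r0)

# Near-vertical test orientation: a 1-deg rotation toward vertical
# changes the population response more than one away from it.
print(sensitivity(80, +1.0), sensitivity(80, -1.0))
```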


Subjects
Optical Illusions/physiology; Orientation/physiology; Rotation; Visual Perception/physiology; Anisotropy; Humans; Models, Neurological; Psychophysics; Sensory Thresholds
18.
Iperception ; 6(2): 100-103, 2015 Apr.
Article in English | MEDLINE | ID: mdl-28299168

ABSTRACT

Face aftereffects can help adjudicate between theories of how facial attributes are encoded. O'Neil and colleagues (2014) compared age estimates for faces before and after adapting to young, middle-aged or old faces. They concluded that age aftereffects are best described as simple re-normalisation: for example, after adapting to old faces, all faces look younger than they did initially. Here I argue that this conclusion is not substantiated by the reported data. The authors fit only a linear regression model, which captures the predictions of re-normalisation but not alternative hypotheses, such as local repulsion away from the adapted age. A second concern is that the authors analysed absolute age estimates after adaptation as a function of baseline estimates, so goodness-of-fit measures primarily reflect the physical ages of test faces rather than the impact of adaptation. When data are re-expressed as aftereffects and fit with a nonlinear "locally repulsive" model, this model performs as well as or better than a linear model in all adaptation conditions. The data in O'Neil et al. do not provide strong evidence for either re-normalisation or local repulsion in facial age aftereffects, but are more consistent with local repulsion (and exemplar-based encoding of facial age), contrary to the original report.
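The methodological point is that re-normalisation and local repulsion make different predictions for the aftereffect as a function of adaptor-test distance, so both should be fitted and compared. A toy sketch on synthetic data, with a constant-shift form standing in for re-normalisation and a derivative-of-Gaussian form for local repulsion (functional forms and parameters are illustrative, not those of the commentary):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(5)

def repulsive(d, a, s):
    """Locally repulsive aftereffect: bipolar, derivative-of-Gaussian
    shaped, largest for tests near (but not at) the adapted value."""
    return a * d * np.exp(-d**2 / (2 * s**2))

def renorm(d, c):
    """Re-normalisation: a uniform shift of all tests away from the
    adapted value (constant aftereffect)."""
    return np.full_like(d, c)

# Synthetic aftereffects as a function of test-minus-adaptor distance,
# generated from the repulsive model plus noise.
d = np.linspace(-40, 40, 17)
aftereffect = repulsive(d, 0.3, 15.0) + rng.normal(0, 0.3, d.size)

for name, model, p0 in [("re-normalisation", renorm, [0.1]),
                        ("local repulsion", repulsive, [0.1, 10.0])]:
    popt, _ = curve_fit(model, d, aftereffect, p0=p0)
    sse = np.sum((aftereffect - model(d, *popt)) ** 2)
    print(f"{name}: SSE = {sse:.2f}")
```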

19.
J Vis ; 14(8): 25, 2014 Jul 29.
Article in English | MEDLINE | ID: mdl-25074903

ABSTRACT

Humans are experts at face recognition. The mechanisms underlying this complex capacity are not fully understood. Recently, it has been proposed that face recognition is supported by a coarse-scale analysis of visual information contained in horizontal bands of contrast distributed along the vertical image axis: a biological facial "barcode" (Dakin & Watt, 2009). A critical prediction of the facial barcode hypothesis is that the distribution of image contrast along the vertical axis will be more important for face recognition than image distributions along the horizontal axis. Using a novel paradigm involving dynamic image distortions, a series of experiments is presented examining famous-face recognition impairments caused by selectively disrupting image distributions along the vertical or horizontal image axes. Results show that disrupting the image distribution along the vertical image axis impairs recognition more than matched distortions along the horizontal axis do. Consistent with the facial barcode hypothesis, these results suggest that human face recognition relies disproportionately on appropriately scaled distributions of image contrast along the vertical image axis.
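A simplified reading of the barcode idea: collapse the image across one axis and ask how much contrast survives in the resulting one-dimensional profile. The sketch below illustrates that intuition on a synthetic banded pattern; it is not Dakin and Watt's actual measure.

```python
import numpy as np

def barcode_profile(image, axis=1):
    """Collapse the image across one axis: averaging over columns
    (axis=1) gives the luminance profile down the vertical image axis,
    i.e. the band structure the 'barcode' account emphasises."""
    return image.mean(axis=axis)

# A pattern of horizontal bands (cf. the dark/light bands formed by
# brows, eyes and mouth) carries high contrast in its vertical profile
# and almost none in its horizontal profile.
img = np.tile(np.sin(np.linspace(0, 8 * np.pi, 128))[:, None], (1, 128))
print(barcode_profile(img, axis=1).std())   # vertical profile: high contrast
print(barcode_profile(img, axis=0).std())   # horizontal profile: ~0
```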


Subjects
Face; Pattern Recognition, Visual/physiology; Recognition, Psychology/physiology; Adolescent; Adult; Female; Humans; Male; Young Adult
20.
J Exp Psychol Hum Percept Perform ; 39(3): 616-22, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23528000

ABSTRACT

One of the oldest known visual aftereffects is the shape aftereffect, wherein looking at a particular shape can make subsequent shapes seem distorted in the opposite direction. After viewing a narrow ellipse, for example, a perfect circle can look like a broad ellipse. It is thought that shape aftereffects are determined by the dimensions of successive retinal images. However, perceived shape is invariant for large retinal image changes resulting from different viewing angles; current understanding suggests that shape aftereffects should not be impacted by the operations responsible for this viewpoint invariance. By viewing adaptors from an angle, with subsequent frontoparallel tests, we establish that shape aftereffects are not solely determined by the dimensions of successive retinal images. Moreover, by comparing performance with and without stereo surface slant cues, we show that shape aftereffects reflect a weighted function of retinal image shape and surface slant information, a hallmark of shape constancy operations. Thus our data establish that shape aftereffects can be influenced by perceived shape, as determined by constancy operations, and must therefore involve higher-level neural substrates than previously thought.
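The weighted-function claim can be written down directly: the perceived aspect ratio is modelled as a mixture of the raw retinal aspect ratio and the slant-corrected (distal) one. The weight below is an arbitrary illustration, not the paper's estimate.

```python
import math

def perceived_aspect_ratio(retinal_ar, slant_deg, w=0.6):
    """Shape-constancy sketch: the percept is a weighted average of the
    raw retinal aspect ratio and the slant-corrected (distal) aspect
    ratio. The weight w is an illustrative assumption."""
    distal_ar = retinal_ar / math.cos(math.radians(slant_deg))
    return w * distal_ar + (1 - w) * retinal_ar

# A circle slanted 45 deg projects to an ellipse of aspect ratio
# cos(45 deg) ~ 0.71; a partial-constancy percept falls between the
# retinal (0.71) and distal (1.0) shapes.
print(perceived_aspect_ratio(0.707, 45))   # ~0.88
```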


Subjects
Figural Aftereffect/physiology; Form Perception/physiology; Retina/physiology; Adult; Humans