1.
Vision Res ; 222: 108450, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38964164

ABSTRACT

One well-established characteristic of early visual processing is the contrast sensitivity function (CSF), which describes how sensitivity varies with the spatial frequency (SF) content of the visual input. The CSF prompted the development of a now-standard model of spatial vision, which represents the visual input by activity in orientation- and SF-selective channels that are nonlinearly recombined to predict a perceptual decision. The standard spatial vision model has been extensively tested with sinusoidal gratings at low contrast, because their narrow SF spectra isolate the underlying SF-selective mechanisms. It is less well studied how these mechanisms account for sensitivity to more behaviourally relevant stimuli, such as high-contrast sharp edges (i.e. object boundaries), which abound in the natural environment and have broader SF spectra. Here, we probe sensitivity to edges (2-AFC, edge localization) in the presence of broadband and narrowband noises. We use Cornsweet luminance profiles with peak frequencies at 0.5, 3 and 9 cpd as edge stimuli. To test how well mechanisms underlying sinusoidal contrast sensitivity can account for edge sensitivity, we implement a single-scale and a multi-scale model building on standard spatial vision model components. Both models account for most of the data but also deviate systematically in their predictions, particularly in the presence of pink noise and for the lowest-SF edge. These deviations might indicate a transition from contrast- to luminance-based detection at low SFs; alternatively, they might point to a missing component in current spatial vision models.
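The Cornsweet edge stimulus described in the abstract can be sketched numerically: a sharp central luminance step whose two flanks decay gradually back to the mean. This is a minimal illustration, not the authors' stimulus code; the amplitude and decay parameters are assumptions chosen for display, and the decay constant (which sets the profile's peak spatial frequency) would need tuning to match the 0.5, 3 and 9 cpd conditions.

```python
import numpy as np

def cornsweet_profile(n=1001, amplitude=0.2, decay=0.1):
    """1-D Cornsweet luminance profile: a sharp step at the centre whose
    flanks decay exponentially back to the mean luminance (0.5)."""
    x = np.linspace(-1.0, 1.0, n)  # normalised spatial position
    # sign(x) sets the step polarity; exp(-|x|/decay) is the gradual ramp
    return 0.5 + np.sign(x) * amplitude * np.exp(-np.abs(x) / decay)

profile = cornsweet_profile()
```

Far from the edge the two sides return to the same mean luminance, which is what makes the Cornsweet edge a contrast-defined rather than luminance-defined stimulus.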

2.
J Vis ; 22(8): 5, 2022 07 11.
Article in English | MEDLINE | ID: mdl-35834376

ABSTRACT

Human vision relies on mechanisms that respond to luminance edges in space and time. Most edge models use orientation-selective mechanisms on multiple spatial scales and operate on static inputs, assuming that edge processing occurs within a single fixational instance. Recent studies, however, demonstrate functionally relevant temporal modulations of the sensory input due to fixational eye movements. Here we propose a spatiotemporal model of human edge detection that combines elements of spatial and active vision. The model augments a spatial vision model with temporal filtering and shifts the input images over time, mimicking an active sampling scheme via fixational eye movements. The first model test was White's illusion, a lightness effect that has been shown to depend on edges. By superimposing narrowband noise (1-5 cpd), the model reproduced the spatial-frequency-specific interference with the edges, similar to the psychophysical interference observed in White's effect. Second, we compared the model's edge detection performance on natural images, in the presence and absence of Gaussian white noise, against human-labeled contours for the same (noise-free) images. Notably, the model detects edges robustly against noise in both test cases without relying on orientation-selective processes. By eliminating model components, we demonstrate the relevance of multiscale spatiotemporal filtering and scale-specific normalization for edge detection. The proposed model facilitates efficient edge detection in (artificial) vision systems and challenges the notion that orientation-selective mechanisms are required for edge detection.
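The multiscale filtering and scale-specific normalization components can be illustrated with a much-reduced sketch. This is not the authors' model (it omits the temporal filtering, the fixational-eye-movement sampling, and their specific filter bank); it is a minimal isotropic, non-orientation-selective difference-of-Gaussians pipeline, with the scales and normalization scheme chosen for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_edge_map(img, sigmas=(1.0, 2.0, 4.0)):
    """Isotropic multiscale edge responses: a difference-of-Gaussians
    per scale, normalized scale by scale, then averaged across scales."""
    responses = []
    for sigma in sigmas:
        dog = gaussian_filter(img, sigma) - gaussian_filter(img, 2 * sigma)
        dog /= dog.std() + 1e-8          # scale-specific normalization
        responses.append(np.abs(dog))
    return np.mean(responses, axis=0)

# A vertical luminance step: the edge map should peak near column 32.
step = np.zeros((64, 64))
step[:, 32:] = 1.0
edge_map = multiscale_edge_map(step)
```

Even this crude isotropic filter localizes the step without any orientation-selective mechanism, which is the point the abstract's ablation argument turns on.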


Subjects
Eye Movements; Form Perception; Humans; Noise; Vision, Ocular
3.
PLoS Comput Biol ; 16(7): e1008017, 2020 07.
Article in English | MEDLINE | ID: mdl-32692780

ABSTRACT

Classically, visual processing is described as a cascade of local feedforward computations. Feedforward Convolutional Neural Networks (ffCNNs) have shown how powerful such models can be. However, using visual crowding as a well-controlled challenge, we previously showed that no classic model of vision, including ffCNNs, can explain human global shape processing. Here, we show that Capsule Neural Networks (CapsNets), combining ffCNNs with recurrent grouping and segmentation, solve this challenge. We also show that ffCNNs and standard recurrent CNNs do not, suggesting that the grouping and segmentation capabilities of CapsNets are crucial. Furthermore, we provide psychophysical evidence that grouping and segmentation are implemented recurrently in humans, and show that CapsNets reproduce these results well. We discuss why recurrence seems needed to implement grouping and segmentation efficiently. Together, we provide mutually reinforcing psychophysical and computational evidence that a recurrent grouping and segmentation process is essential to understand the visual system and create better models that harness global shape computations.
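The recurrent grouping step in CapsNets is implemented by routing-by-agreement. Below is a minimal NumPy sketch of the dynamic routing iteration from Sabour et al. (2017), which CapsNets build on; the capsule counts, dimensions, and random prediction vectors are placeholders for illustration, not part of the study.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Nonlinearity that keeps each capsule's output vector norm below 1."""
    sq_norm = (s ** 2).sum(axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def route_by_agreement(u_hat, n_iters=3):
    """Dynamic routing: input capsules vote for output capsules, and the
    coupling coefficients are iteratively sharpened toward outputs that
    agree with the votes -- a recurrent grouping/segmentation step."""
    logits = np.zeros(u_hat.shape[:2])              # (n_in, n_out)
    for _ in range(n_iters):
        c = np.exp(logits)
        c /= c.sum(axis=1, keepdims=True)           # softmax over outputs
        s = (c[..., None] * u_hat).sum(axis=0)      # weighted votes
        v = squash(s)                               # (n_out, dim)
        logits = logits + (u_hat * v[None]).sum(-1) # agreement update
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 4, 6))  # 8 input capsules, 4 outputs, dim 6
v = route_by_agreement(u_hat)
```

The repeated agreement update is what makes the computation recurrent rather than feedforward, which is the distinction the abstract draws between CapsNets and ffCNNs.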


Subjects
Computational Biology; Neural Networks, Computer; Pattern Recognition, Visual; Vision, Ocular; Algorithms; Computer Simulation; Female; Humans; Image Processing, Computer-Assisted/methods; Male; Models, Biological; Normal Distribution; Reproducibility of Results