ABSTRACT
Visual search is a fundamental natural task for humans and other animals. We investigated the decision processes humans use when searching briefly presented displays having well-separated potential target-object locations. Performance was compared with the Bayesian-optimal decision process under the assumption that the information from the different potential target locations is statistically independent. Surprisingly, humans performed slightly better than optimal, despite humans' substantial loss of sensitivity in the fovea ("foveal neglect"), and the implausibility of the human brain replicating the optimal computations. We show that three factors can quantitatively explain these seemingly paradoxical results. Most importantly, simple and fixed heuristic decision rules reach near optimal search performance. Secondly, foveal neglect primarily affects only the central potential target location. Finally, spatially correlated neural noise causes search performance to exceed that predicted for independent noise. These findings have far-reaching implications for understanding visual search tasks and other identification tasks in humans and other animals.
ABSTRACT
Visual detection is a fundamental natural task. Detection becomes more challenging as the similarity between the target and the background in which it is embedded increases, a phenomenon termed 'similarity masking'. To test the hypothesis that V1 contributes to similarity masking, we used voltage sensitive dye imaging (VSDI) to measure V1 population responses while macaque monkeys performed a detection task under varying levels of target-background similarity. Paradoxically, we find that during an initial transient phase, V1 responses to the target are enhanced, rather than suppressed, by target-background similarity. This effect reverses in the second phase of the response, so that in this phase V1 signals are positively correlated with the behavioral effect of similarity. Finally, we show that a simple model with delayed divisive normalization can qualitatively account for our findings. Overall, our results support the hypothesis that a nonlinear gain control mechanism in V1 contributes to perceptual similarity masking.
Subjects
Macaca, Primates, Animals, Perceptual Masking, Voltage-Sensitive Dye Imaging
ABSTRACT
When detecting targets under natural conditions, the visual system almost always faces multiple, simultaneous, dimensions of extrinsic uncertainty. This study focused on the simultaneous uncertainty about target amplitude and background contrast. These dimensions have a large effect on detection and vary greatly in natural scenes. We measured the human performance for detecting a sine-wave target in white noise and natural-scene backgrounds for two levels of prior probability of the target being present. We derived and tested the ideal observer for white-noise backgrounds, a special case of a template-matching observer that dynamically moves its criterion with the background contrast (the DTM observer) and two simpler models with a fixed criterion: the template-matching (TM) observer and the normalized template-matching (NTM) observer that normalizes template response by background contrast. Simulations show that, when the target prior is low, the performance of the NTM observer is near optimal and the TM observer is near chance, suggesting that manipulating the target prior is valuable for distinguishing among models. Surprisingly, we found that the NTM and DTM observers better explain human performance than the TM observer for both target priors in both background types. We argue that the visual system most likely exploits contrast normalization, rather than dynamic criterion adjustment, to deal with simultaneous background contrast and target amplitude uncertainty. Finally, our findings show that the data collected under high levels of uncertainty have a rich structure capable of discriminating between models, providing an alternative approach for studying high dimensions of uncertainty.
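The difference between a fixed-criterion template matcher and a contrast-normalized one can be illustrated with a minimal analytic sketch. It is not the authors' model: it assumes a unit-norm template in white noise of known standard deviation `sigma`, so the template response is Gaussian with mean `amplitude` (target present) or 0 (target absent). The function names `tm_rates` and `ntm_rates` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def tm_rates(amplitude, sigma, criterion):
    """Hit and false-alarm rates for a fixed-criterion template matcher (TM).

    Assumption: the raw template response is N(amplitude, sigma^2) on
    target-present trials and N(0, sigma^2) on target-absent trials.
    """
    hit = 1.0 - NormalDist(amplitude, sigma).cdf(criterion)
    false_alarm = 1.0 - NormalDist(0.0, sigma).cdf(criterion)
    return hit, false_alarm


def ntm_rates(amplitude, sigma, criterion):
    """Same rates after dividing the template response by sigma (NTM).

    The normalized response is N(amplitude / sigma, 1) or N(0, 1), so the
    false-alarm rate no longer depends on background contrast.
    """
    hit = 1.0 - NormalDist(amplitude / sigma, 1.0).cdf(criterion)
    false_alarm = 1.0 - STD_NORMAL.cdf(criterion)
    return hit, false_alarm


if __name__ == "__main__":
    # Same target amplitude and criterion, three background contrasts.
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, tm_rates(1.0, sigma, 1.0), ntm_rates(1.0, sigma, 1.0))
```

Under these assumptions, the TM false-alarm rate climbs as background contrast increases, whereas the NTM false-alarm rate stays fixed. This is one way to see why a single fixed criterion can work after normalization.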
Subjects
Uncertainty, Humans, Probability
ABSTRACT
A number of recent studies have been directed at measuring and modeling detection of targets at specific locations in natural backgrounds, a key subtask of visual search in natural environments. A useful approach is to bin natural background patches into joint histograms with bins along specific background dimensions. By measuring psychometric functions in a sparse subset of these bins, it is possible to estimate how the included dimensions jointly affect detectability over the whole space of natural backgrounds. In previous studies, we found that threshold is proportional to the product of the background luminance, contrast, and similarity, a result predicted by a simple template-matching observer with divisive normalization along each of the dimensions. The measure of similarity was the cosine similarity of the amplitude spectra of the target and background (SA), a phase-invariant measure. Here, we investigated the effect of the cosine similarity of the target and background images (SI|A), a phase-dependent measure. We found that threshold decreases monotonically with SI|A, in agreement with a recent study (Rideaux et al., 2022). In contrast, the template-matching observer predicts threshold to be a U-shaped function of SI|A, reaching a minimum when the target and background are orthogonal (SI|A = 0). Surprisingly, when the template-matching observer includes a small amount of intrinsic position uncertainty (measured in a separate experiment), the pattern of thresholds is explained.
Subjects
Environment, Humans, Uncertainty, Psychometrics
ABSTRACT
Most studies of detection in complex backgrounds have measured and modeled human performance for statistically uniform (stationary) backgrounds. However, natural and medical images have statistical properties that vary over space. We measured detection of various target shapes presented in Gaussian 1/f noise backgrounds that were statistically uniform over space, and in ones that modulated in contrast over space. We find that the pattern of human thresholds is not consistent with the ideal observer but is consistent with a suboptimal observer that performs partial whitening in spatial frequency and whitening (reliability-weighting) in space, and has a small level of intrinsic position uncertainty.
Subjects
Contrast Sensitivity, Computer-Assisted Image Processing, Humans, Visual Perception
ABSTRACT
Binocular stereo cues are important for discriminating 3D surface orientation, especially at near distances. We devised a single-interval task where observers discriminated the slant of a densely textured planar test surface relative to a textured planar surround reference surface. Although surfaces were rendered with correct perspective, the stimuli were designed so that the binocular cues dominated performance. Slant discrimination performance was measured as a function of the reference slant and the level of uncorrelated white noise added to the test-plane images in the left and right eyes. We compared human performance with an approximate ideal observer (planar matching [PM]) and two subideal observers. The PM observer uses the image in one eye and back projection to predict a test image in the other eye for all possible slants, tilts, and distances. The estimated slant, tilt, and distance are determined by the prediction that most closely matches the measured image in the other eye. The first subideal observer (local planar matching [LPM]) applies PM over local neighborhoods and then pools estimates across the test plane. The second subideal observer (local frontoparallel matching [LFM]) uses only location disparity. We find that the ideal observer (PM) and the first subideal observer (LPM) outperform the second subideal observer (LFM), demonstrating the additional benefit of pattern disparities. We also find that all three model observers can account for human performance if two free parameters are included: a fixed small level of internal estimation noise and a fixed overall efficiency scalar on slant discriminability.
Subjects
Cues, Depth Perception, Eye, Humans
ABSTRACT
Can direct stimulation of primate V1 substitute for a visual stimulus and mimic its perceptual effect? To address this question, we developed an optical-genetic toolkit to 'read' neural population responses using widefield calcium imaging, while simultaneously using optogenetics to 'write' neural responses into V1 of behaving macaques. We focused on the phenomenon of visual masking, where detection of a dim target is significantly reduced by a co-localized medium-brightness mask (Cornsweet and Pinsker, 1965; Whittle and Swanston, 1974). Using our toolkit, we tested whether V1 optogenetic stimulation can recapitulate the perceptual masking effect of a visual mask. We find that, similar to a visual mask, low-power optostimulation can significantly reduce visual detection sensitivity, that a sublinear interaction between visual- and optogenetic-evoked V1 responses could account for this perceptual effect, and that these neural and behavioral effects are spatially selective. Our toolkit and results open the door for further exploration of perceptual substitutions by direct stimulation of sensory cortex.
Subjects
Optogenetics/methods, Perceptual Masking/physiology, Photic Stimulation/methods, Visual Perception/physiology, Animals, Macaca mulatta, Male, Neurons/physiology, Proof of Concept Study, Visual Cortex/physiology
ABSTRACT
The human visual system has a high-resolution fovea and a low-resolution periphery. When actively searching for a target, humans perform a covert search during each fixation, and then shift fixation (the fovea) to probable target locations. Previous studies of covert search under carefully controlled conditions provide strong evidence that for simple and small search displays, humans process all potential target locations with the same efficiency that they process those locations when individually cued on each trial. Here, we extend these studies to the case of large displays, in which the target can appear anywhere within the display. These more natural conditions reveal an attentional effect in which sensitivity in the fovea and parafovea is greatly diminished. We show that this "foveal neglect" is the expected consequence of efficiently allocating a fixed total attentional sensitivity gain across the retinotopic map in the visual cortex. We present a formal theory that explains our findings and the previous findings.
Subjects
Visual Cortex, Visual Fields, Attention, Fovea Centralis, Humans, Photic Stimulation, Ocular Vision
ABSTRACT
Univariate and multivariate normal probability distributions are widely used when modeling decisions under uncertainty. Computing the performance of such models requires integrating these distributions over specific domains, which can vary widely across models. Besides some special cases where these integrals are easy to calculate, there exist no general analytical expressions, standard numerical methods, or software for these integrals. Here we present mathematical results and open-source software that provide (a) the probability in any domain of a normal in any dimensions with any parameters; (b) the probability density, cumulative distribution, and inverse cumulative distribution of any function of a normal vector; (c) the classification errors among any number of normal distributions, the Bayes-optimal discriminability index, and relation to the receiver operating characteristic (ROC); (d) dimension reduction and visualizations for such problems; and (e) tests for how reliably these methods may be used on given data. We demonstrate these tools with vision research applications of detecting occluding objects in natural scenes and detecting camouflage.
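As a rough illustration of item (c), the sketch below computes the Bayes error and a corresponding discriminability index for the simplest case only: two equally likely univariate normals. It does not use the software described in the abstract. It relies on `statistics.NormalDist.overlap` from the Python standard library, and the helper names `bayes_error` and `bayes_dprime` are hypothetical. The index is defined here, as an assumption, as the d' of two equal-variance normals that would give the same Bayes error.

```python
from statistics import NormalDist


def bayes_error(dist_a, dist_b):
    """Bayes error rate for two equally likely univariate normals.

    With equal priors, the optimal classifier errs with probability
    0.5 * integral of min(p_a, p_b), i.e. half the overlap coefficient.
    """
    return 0.5 * dist_a.overlap(dist_b)


def bayes_dprime(dist_a, dist_b):
    """Discriminability index implied by the Bayes error (assumed definition).

    For equal-variance normals the Bayes error is Phi(-d'/2), so inverting
    that relation gives d' = -2 * Phi^{-1}(error).
    """
    error = bayes_error(dist_a, dist_b)
    return -2.0 * NormalDist().inv_cdf(error)


if __name__ == "__main__":
    # Equal variances: recovers the classical d' = |mu_a - mu_b| / sigma.
    print(bayes_dprime(NormalDist(0.0, 1.0), NormalDist(2.0, 1.0)))
    # Equal means, unequal variances: classical d' is 0, but the
    # distributions are still partly discriminable.
    print(bayes_dprime(NormalDist(0.0, 1.0), NormalDist(0.0, 3.0)))
```

The second call shows why an error-based index is useful: the two distributions share a mean, so a mean-difference d' would report no discriminability at all.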
Subjects
Software, Bayes Theorem, Humans, Normal Distribution, Probability, Uncertainty
ABSTRACT
Visual systems evolve to process the stimuli that arise in the organism's natural environment, and hence, to fully understand the neural computations in the visual system, it is important to measure behavioral and neural responses to natural visual stimuli. Here, we measured psychometric and neurometric functions in the macaque monkey for detection of a windowed sine-wave target in uniform backgrounds and in natural backgrounds of various contrasts. The neurometric functions were obtained by near-optimal decoding of voltage-sensitive-dye-imaging (VSDI) responses at the retinotopic scale in primary visual cortex (V1). The results were compared with previous human psychophysical measurements made under the same conditions. We found that human and macaque behavioral thresholds followed the generalized Weber's law as a function of contrast, and that both the slopes and the intercepts of the threshold as a function of background contrast match each other up to a single scale factor. We also found that the neurometric thresholds followed the generalized Weber's law with slopes and intercepts matching the behavioral slopes and intercepts up to a single scale factor. We conclude that human and macaque abilities to detect targets in natural backgrounds are affected in the same way by background contrast, that these effects are consistent with population decoding at the retinotopic scale by downstream circuits, and that the macaque monkey is an appropriate animal model for gaining an understanding of the neural mechanisms in humans for detecting targets in natural backgrounds. Finally, we discuss limitations of the current study and potential next steps.
NEW & NOTEWORTHY: We measured macaque detection performance in natural images and compared it to the detection sensitivity of neurophysiological responses recorded in the primary visual cortex (V1), and to the performance of human subjects. We found that 1) human and macaque behavioral performances are in quantitative agreement and 2) are consistent with near-optimal decoding of V1 population responses.
Subjects
Contrast Sensitivity/physiology, Depth Perception/physiology, Psychological Discrimination/physiology, Visual Pattern Recognition/physiology, Perceptual Masking/physiology, Primary Visual Cortex/physiology, Sensory Thresholds/physiology, Animals, Animal Behavior/physiology, Differential Threshold, Humans, Macaca, Species Specificity, Task Performance and Analysis, Voltage-Sensitive Dye Imaging
ABSTRACT
Detection of target objects in the surrounding environment is a common visual task. There is a vast psychophysical and modeling literature concerning the detection of targets in artificial and natural backgrounds. Most studies involve detection of additive targets or of some form of image distortion. Although much has been learned from these studies, the targets that most often occur under natural conditions are neither additive nor distorting; rather, they are opaque targets that occlude the backgrounds behind them. Here, we describe our efforts to measure and model detection of occluding targets in natural backgrounds. To systematically vary the properties of the backgrounds, we used the constrained sampling approach of Sebastian, Abrams, and Geisler (2017). Specifically, millions of calibrated gray-scale natural-image patches were sorted into a 3D histogram along the dimensions of luminance, contrast, and phase-invariant similarity to the target. Eccentricity psychometric functions (accuracy as a function of retinal eccentricity) were measured for four different occluding targets and 15 different combinations of background luminance, contrast, and similarity, with a different randomly sampled background on each trial. The complex pattern of results was consistent across the three subjects, and was largely explained by a principled model observer (with only a single efficiency parameter) that combines three image cues (pattern, silhouette, and edge) and four well-known properties of the human visual system (optical blur, blurring and downsampling by the ganglion cells, divisive normalization, intrinsic position uncertainty). The model also explains the thresholds for additive foveal targets in natural backgrounds reported in Sebastian et al. (2017).
Subjects
Contrast Sensitivity/physiology, Form Perception/physiology, Light, Retina/physiology, Cues, Humans, Psychophysics, Sensory Thresholds
ABSTRACT
A fundamental natural visual task is the identification of specific target objects in the environments that surround us. It has long been known that some properties of the background have strong effects on target visibility. The most well-known properties are the luminance, contrast, and similarity of the background to the target. In previous studies, we found that these properties have highly lawful effects on detection in natural backgrounds. However, there is another important factor affecting detection in natural backgrounds that has received little or no attention in the masking literature, which has been concerned with detection in simpler backgrounds. Namely, in natural backgrounds the properties of the background often vary under the target, and hence some parts of the target are masked more than others. We began studying this factor, which we call the "partial masking factor," by measuring detection thresholds in backgrounds of contrast-modulated white noise that was constructed so that the standard template-matching (TM) observer performs equally well whether or not the noise contrast modulates in the target region. If noise contrast is uniform in the target region, then this TM observer is the Bayesian optimal observer. However, when the noise contrast modulates then the Bayesian optimal observer weights the template at each pixel location by the estimated reliability at that location. We find that human performance for modulated noise backgrounds is predicted by this reliability-weighted TM (RTM) observer. More surprisingly, we find that human performance for natural backgrounds is also predicted by the RTM observer.
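The benefit of reliability weighting can be sketched numerically. This is only an illustration under stated assumptions, not the authors' implementation: pixels carry independent Gaussian noise with a known standard deviation per pixel, the target equals the template, and "reliability" is taken to be inverse noise variance. The function names `tm_dprime` and `rtm_dprime` are hypothetical.

```python
import math


def tm_dprime(template, sigmas):
    """d' of a standard template matcher that weights pixel i by t_i.

    Mean shift of the decision variable: sum(t_i^2).
    Variance of the decision variable:   sum(t_i^2 * sigma_i^2).
    """
    signal = sum(t * t for t in template)
    noise_var = sum((t * s) ** 2 for t, s in zip(template, sigmas))
    return signal / math.sqrt(noise_var)


def rtm_dprime(template, sigmas):
    """d' of a reliability-weighted matcher with weights t_i / sigma_i^2.

    Mean shift and variance both equal sum(t_i^2 / sigma_i^2), so
    d' = sqrt(sum(t_i^2 / sigma_i^2)).
    """
    return math.sqrt(sum((t / s) ** 2 for t, s in zip(template, sigmas)))


if __name__ == "__main__":
    template = [1.0, 1.0, 1.0, 1.0]
    uniform = [2.0, 2.0, 2.0, 2.0]    # same noise level under the whole target
    modulated = [1.0, 1.0, 3.0, 3.0]  # half of the target is masked more
    print(tm_dprime(template, uniform), rtm_dprime(template, uniform))
    print(tm_dprime(template, modulated), rtm_dprime(template, modulated))
```

With uniform noise the two observers coincide, which matches the statement that the TM observer is optimal in that case. With modulated noise the reliability-weighted observer is never worse, which follows from the Cauchy-Schwarz inequality.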
Subjects
Computer-Assisted Image Processing/methods, Neurological Models, Visual Pattern Recognition/physiology, Perceptual Masking/physiology, Artifacts, Bayes Theorem, Humans, Normal Distribution, Photic Stimulation/methods
ABSTRACT
Humans have remarkable scale-invariant visual capabilities. For example, our orientation discrimination sensitivity is largely constant over more than two orders of magnitude of variations in stimulus spatial frequency (SF). Orientation-selective V1 neurons are likely to contribute to orientation discrimination. However, because at any V1 location neurons have a limited range of receptive field (RF) sizes, we predict that at low SFs V1 neurons will carry little orientation information. If this were the case, what could account for the high behavioral sensitivity at low SFs? Using optical imaging in behaving macaques, we show that, as predicted, V1 orientation-tuned responses drop rapidly with decreasing SF. However, we reveal a surprising coarse-scale signal that corresponds to the projection of the luminance layout of low-SF stimuli to V1's retinotopic map. This homeomorphic and distributed representation, which carries high-quality orientation information, is likely to contribute to our striking scale-invariant visual capabilities.
Subjects
Brain Mapping, Contrast Sensitivity/physiology, Neurons/physiology, Orientation, Visual Cortex/physiology, Animals, Psychological Discrimination, Macaca mulatta, Male, Photic Stimulation, Visual Pathways/physiology
ABSTRACT
Humans and other primates sample the visual environment using saccadic eye movements that shift a high-resolution fovea toward regions of interest to create a clear perception of a scene across fixations. Many mammals, however, like mice, lack a fovea, which raises the question of why they make saccades. Here we describe and test the hypothesis that saccades are matched to natural scene statistics and to the receptive field sizes and adaptive properties of neural populations. Specifically, we determined the minimum amplitude of saccades in natural scenes necessary to provide uncorrelated inputs to model neural populations. This analysis predicts the distributions of observed saccade sizes during passive viewing for nonhuman primates, cats, and mice. Furthermore, disrupting the development of receptive field properties by monocular deprivation changed saccade sizes consistent with this hypothesis. Therefore, natural-scene statistics and the neural representation of natural images appear to be critical factors guiding saccadic eye movements.
Subjects
Neurons/physiology, Saccades/physiology, Visual Fields/physiology, Visual Perception/physiology, Animals, Cats, Mice, Photic Stimulation, Primates
ABSTRACT
How do cortical responses to local image elements combine to form a spatial pattern of population activity in primate V1? Here, we used voltage-sensitive dye imaging, which measures summed membrane potential activity, to examine the rules that govern lateral interactions between the representations of two small local-oriented elements in macaque (Macaca mulatta) V1. We find strong subadditive and mostly orientation-independent interactions for nearby elements [2-4 mm interelement cortical distance (IED)] that gradually become linear at larger separations (>6 mm IED). These results are consistent with a population gain control model describing nonlinear V1 population responses to single oriented elements. However, because of the membrane potential-to-spiking accelerating nonlinearity, the model predicts supra-additive lateral interactions of spiking responses for intermediate separations at a range of locations between the two elements, consistent with some prior facilitatory effects observed in electrophysiology and psychophysics. Overall, our results suggest that population-level lateral interactions in V1 are primarily explained by a simple orientation-independent contrast gain control mechanism.
SIGNIFICANCE STATEMENT: Interactions between representations of simple visual elements such as oriented edges in primary visual cortex (V1) are thought to contribute to our ability to easily integrate contours and segment surfaces, but the mechanisms that govern these interactions are primarily unknown. Our study provides novel evidence that lateral interactions at the population level are governed by a simple contrast gain-control mechanism, and we show how this divisive gain-control mechanism can give rise to apparently facilitatory spiking responses.
Subjects
Contrast Sensitivity/physiology, Form Perception/physiology, Photic Stimulation/methods, Visual Cortex/physiology, Visual Pathways/physiology, Action Potentials/physiology, Animals, Macaca mulatta, Male
ABSTRACT
A long-term goal of visual neuroscience is to develop and test quantitative models that account for the moment-by-moment relationship between neural responses in early visual cortex and human performance in natural visual tasks. This review focuses on efforts to address this goal by measuring and perturbing the activity of primary visual cortex (V1) neurons while nonhuman primates perform demanding, well-controlled visual tasks. We start by describing a conceptual approach, the decoder linking model (DLM) framework, in which candidate decoding models take neural responses as input and generate predicted behavior as output. The ultimate goal in this framework is to find the actual decoder, that is, the model that best predicts behavior from neural responses. We discuss key relevant properties of primate V1 and review current literature from the DLM perspective. We conclude by discussing major technological and theoretical advances that are likely to accelerate our understanding of the link between V1 activity and behavior.
Subjects
Animal Behavior/physiology, Neurons/physiology, Visual Cortex/physiology, Visual Perception/physiology, Animals, Psychological Discrimination/physiology, Sensory Feedback/physiology, Neurological Models, Primates/physiology, Visual Pathways/physiology
ABSTRACT
Little is known about distance discrimination in real scenes, especially at long distances. This is not surprising given the logistical difficulties of making such measurements. To circumvent these difficulties, we collected 81 stereo images of outdoor scenes, together with precisely registered range images that provided the ground-truth distance at each pixel location. We then presented the stereo images in the correct viewing geometry and measured the ability of human subjects to discriminate the distance between locations in the scene, as a function of absolute distance (3 m to 30 m) and the angular spacing between the locations being compared (2°, 5°, and 10°). Measurements were made for binocular and monocular viewing. Thresholds for binocular viewing were quite small at all distances (Weber fractions less than 1% at 2° spacing and less than 4% at 10° spacing). Thresholds for monocular viewing were higher than those for binocular viewing out to distances of 15-20 m, beyond which they were the same. Using standard cue-combination analysis, we also estimated what the thresholds would be based on binocular-stereo cues alone. With two exceptions, we show that the entire pattern of results is consistent with what one would expect from classical studies of binocular disparity thresholds and separation/size discrimination thresholds measured with simple laboratory stimuli. The first exception is some deviation from the expected pattern at close distances (especially for monocular viewing). The second exception is that thresholds in natural scenes are lower, presumably because of the rich figural cues contained in natural images.
Subjects
Cues, Distance Perception/physiology, Binocular Vision/physiology, Monocular Vision/physiology, Visual Perception/physiology, Adult, Depth Perception/physiology, Humans, Male, Young Adult
ABSTRACT
An extension of the signal-detection theory framework is described and demonstrated for two-alternative identification tasks. The extended framework assumes that the subject and an arbitrary model (or two subjects, or the same subject on two occasions) are performing the same task with the same stimuli, and that on each trial they both compute values of a decision variable. Thus, their joint performance is described by six fundamental quantities: two levels of intrinsic discriminability (d'), two values of decision criterion, and two decision-variable correlations (DVCs), one for each of the two categories of stimuli. The framework should be widely applicable for testing models and characterizing individual differences in behavioral and neurophysiological studies of perception and cognition. We demonstrate the framework for the well-known task of detecting a Gaussian target in white noise. We find that (a) subjects' DVCs are approximately equal to the square root of their efficiency relative to ideal (in agreement with the prediction of a popular class of models), (b) between-subjects and within-subject (double-pass) DVCs increase with target contrast and are greater for target-present than target-absent trials (rejecting many models),