1.
J Vis ; 24(5): 12, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38787569

ABSTRACT

Materials exhibit an extraordinary range of visual appearances. Characterizing and quantifying appearance is important not only for basic research on perceptual mechanisms but also for computer graphics and a wide range of industrial applications. Although methods exist for capturing and representing the optical properties of materials and how they vary across surfaces (Haindl & Filip, 2013), the representations are typically very high-dimensional, and how they relate to subjective perceptual impressions of material appearance remains poorly understood. Here, we used a data-driven approach to characterize the perceived appearance of 30 samples of wood veneer using a "visual fingerprint" that describes each sample as a multidimensional feature vector, with each dimension capturing a different aspect of the appearance. Fifty-six crowd-sourced participants viewed triplets of movies depicting different wood samples as the samples rotated. Their task was to report which of the two match samples was subjectively more similar to the test sample. In another online experiment, 45 participants rated 10 wood-related appearance characteristics for each of the samples. The results reveal a consistent embedding of the samples across both experiments and a set of nine perceptual dimensions capturing aspects including the roughness, directionality, and spatial scale of the surface patterns. We also show that a weighted linear combination of 11 image statistics, inspired by the rated characteristics, predicts these perceptual dimensions well.
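As an illustration of the kind of analysis described above (not the authors' actual pipeline), the following Python sketch fits a weighted linear combination of image statistics to a perceptual dimension by least squares. The array names and data are hypothetical placeholders.

```python
# Illustrative sketch: predict a perceptual dimension from a weighted
# linear combination of image statistics (placeholder data).
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_stats = 30, 11                            # 30 wood samples, 11 image statistics
image_stats = rng.normal(size=(n_samples, n_stats))    # placeholder feature matrix
perceptual_scores = rng.normal(size=n_samples)         # placeholder perceptual dimension

# Add an intercept column and solve the least-squares problem.
X = np.column_stack([np.ones(n_samples), image_stats])
weights, *_ = np.linalg.lstsq(X, perceptual_scores, rcond=None)

predicted = X @ weights
r = np.corrcoef(predicted, perceptual_scores)[0, 1]
print(f"correlation between predicted and observed dimension: {r:.2f}")
```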


Subject(s)
Wood , Humans , Female , Adult , Male , Young Adult , Surface Properties , Photic Stimulation/methods , Form Perception/physiology , Pattern Recognition, Visual/physiology
2.
Curr Biol ; 34(5): 1098-1106.e5, 2024 03 11.
Article in English | MEDLINE | ID: mdl-38218184

ABSTRACT

Visual shape perception is central to many everyday tasks, from object recognition to grasping and handling tools [1-10]. Yet how shape is encoded in the visual system remains poorly understood. Here, we probed shape representations using visual aftereffects - perceptual distortions that occur following extended exposure to a stimulus [11-17]. Such effects are thought to be caused by adaptation in neural populations that encode both simple, low-level stimulus characteristics [17-20] and more abstract, high-level object features [21-23]. To tease these two contributions apart, we used machine-learning methods to synthesize novel shapes in a multidimensional shape space derived from a large database of natural shapes [24]. Stimuli were carefully selected such that low-level and high-level adaptation models made distinct predictions about the shapes that observers would perceive following adaptation. We found that adaptation along vector trajectories in the high-level shape space predicted shape aftereffects better than simple low-level processes. Our findings reveal the central role of high-level statistical features in the visual representation of shape. The findings also hint that human vision is attuned to the distribution of shapes experienced in the natural environment.


Subject(s)
Vision, Ocular , Visual Perception , Humans , Perceptual Distortion , Environment , Pattern Recognition, Visual , Photic Stimulation
3.
Behav Brain Sci ; 46: e386, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38054335

ABSTRACT

Everyone agrees that testing hypotheses is important, but Bowers et al. provide scant details about where hypotheses about perception and brain function should come from. We suggest that the answer lies in considering how information about the outside world could be acquired - that is, learned - over the course of evolution and development. Deep neural networks (DNNs) provide one tool to address this question.


Subject(s)
Brain , Neural Networks, Computer , Humans , Learning
4.
J Neurosci ; 43(49): 8504-8514, 2023 12 06.
Article in English | MEDLINE | ID: mdl-37848285

ABSTRACT

Selecting suitable grasps on three-dimensional objects is a challenging visuomotor computation, which involves combining information about an object (e.g., its shape, size, and mass) with information about the actor's body (e.g., the optimal grasp aperture and hand posture for comfortable manipulation). Here, we used functional magnetic resonance imaging to investigate brain networks associated with these distinct aspects during grasp planning and execution. Human participants of either sex viewed and then executed preselected grasps on L-shaped objects made of wood and/or brass. By leveraging a computational approach that accurately predicts human grasp locations, we selected grasp points that disentangled the role of multiple grasp-relevant factors, that is, grasp axis, grasp size, and object mass. Representational Similarity Analysis revealed that grasp axis was encoded along dorsal-stream regions during grasp planning. Grasp size was first encoded in ventral-stream areas during grasp planning and then in premotor regions during grasp execution. Object mass was encoded in ventral-stream and (pre)motor regions only during grasp execution. Premotor regions further encoded visual predictions of grasp comfort, whereas the ventral stream encoded grasp comfort during execution, suggesting its involvement in haptic evaluation. These shifts in neural representations thus capture the sensorimotor transformations that allow humans to grasp objects.

SIGNIFICANCE STATEMENT

Grasping requires integrating object properties with constraints on hand and arm postures. Using a computational approach that accurately predicts human grasp locations by combining such constraints, we selected grasps on objects that disentangled the relative contributions of object mass, grasp size, and grasp axis during grasp planning and execution in a neuroimaging study. Our findings reveal a greater role of dorsal-stream visuomotor areas during grasp planning and, surprisingly, increasing ventral-stream engagement during execution. We propose that during planning, visuomotor representations initially encode grasp axis and size. Perceptual representations of object material properties become more relevant as the hand approaches the object and motor programs are refined with estimates of the grip forces required to successfully lift the object.
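A minimal, hypothetical sketch of the Representational Similarity Analysis logic mentioned above (placeholder data, not the study's fMRI pipeline): build a neural RDM from condition-wise activity patterns, build a model RDM from a grasp-relevant factor, and rank-correlate the two.

```python
# Minimal RSA sketch with hypothetical condition-by-voxel activity patterns.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_conditions, n_voxels = 16, 200
patterns = rng.normal(size=(n_conditions, n_voxels))   # placeholder fMRI patterns

# Neural RDM: pairwise correlation distance between condition patterns.
neural_rdm = pdist(patterns, metric="correlation")

# Model RDM: e.g., pairwise differences in grasp size across conditions.
grasp_size = rng.uniform(2, 8, size=n_conditions)       # hypothetical grasp-size factor
model_rdm = pdist(grasp_size[:, None], metric="euclidean")

# Rank-correlate the two RDMs (a common RSA statistic).
rho, p = spearmanr(neural_rdm, model_rdm)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```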


Subject(s)
Brain , Psychomotor Performance , Humans , Brain Mapping/methods , Hand Strength , Hand
5.
Mem Cognit ; 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37668880

ABSTRACT

Many objects and materials in our environment are subject to transformations that alter their shape. For example, branches bend in the wind, ice melts, and paper crumples. Still, we recognize objects and materials across these changes, suggesting we can distinguish an object's original features from those caused by the transformations ("shape scission"). Yet, if we truly understand transformations, we should not only be able to identify their signatures but also be able to actively apply the transformations to new objects (i.e., through imagination or mental simulation). Here, we investigated this ability using a drawing task. On a tablet computer, participants viewed a sample contour and its transformed version, and were asked to apply the same transformation to a test contour by drawing what the transformed test shape should look like. Thus, they had to (i) infer the transformation from the shape differences, (ii) envisage its application to the test shape, and (iii) draw the result. Our findings show that drawings were more similar to the ground-truth transformed test shape than to the original test shape - demonstrating that transformations can be inferred and reproduced from observation. However, this was only observed for relatively simple shapes. The ability was also modulated by transformation type and magnitude but not by the similarity between sample and test shapes. Together, our findings suggest that we can distinguish between representations of original object shapes and their transformations, and can use visual imagery to mentally apply nonrigid transformations to observed objects, showing how we not only perceive but also 'understand' shape.

6.
Curr Biol ; 33(17): R894-R895, 2023 09 11.
Article in English | MEDLINE | ID: mdl-37699342

ABSTRACT

Imagine staring into a clear river, starving, desperately searching for a fish to spear and cook. You see a dark shape lurking beneath the surface. It doesn't resemble any sort of fish you've encountered before - but you're hungry. To catch it, you need to anticipate which way it will move when you lunge for it, to compensate for your own sensory and motor processing delays [1-3]. Yet you know nothing about the behaviour of this creature, and do not know in which direction it will try to escape. What cues do you then use to drive such anticipatory responses? Fortunately, many species [4], including humans, have the remarkable ability to predict the directionality of objects based on their shape - even when they are unfamiliar and we cannot rely on semantic knowledge about their movements [5]. While it is known that such directional inferences can guide attention [5], we do not yet fully understand how such causal inferences are made, or the extent to which they enable anticipatory behaviours. Does the oculomotor system, which moves our eyes to optimise visual input, use directional inferences from shape to anticipate upcoming motion direction? Such anticipation is necessary to stabilise the moving object on the high-resolution fovea of the retina while tracking the shape, a primary goal of the oculomotor system [6], and to guide any future interactions [7,8]. Here, we leveraged a well-known behaviour of the oculomotor system - anticipatory smooth eye movements (ASEM), an increase in eye velocity in the direction of a stimulus' expected motion before the stimulus actually moves [3] - to show that the oculomotor system extracts directional information from shape and uses this inference to predict and anticipate upcoming motion.
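A hedged illustration of how anticipatory smooth eye movements might be quantified from an eye-position trace (synthetic data and an assumed sampling rate; not the study's analysis code): average eye velocity in a short window just before target motion onset.

```python
# Sketch: quantify anticipatory eye velocity before motion onset.
import numpy as np

fs = 1000.0                                        # Hz, assumed eye-tracker sampling rate
t = np.arange(-0.5, 0.5, 1 / fs)                   # time relative to motion onset (s)
eye_x = 0.5 * np.maximum(t + 0.1, 0) ** 2 * 100    # toy horizontal position trace (deg)

velocity = np.gradient(eye_x, 1 / fs)              # deg/s
window = (t >= -0.05) & (t < 0.0)                  # 50 ms before motion onset
asem = velocity[window].mean()
print(f"anticipatory eye velocity: {asem:.2f} deg/s")
```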


Subject(s)
Eye Movements , Retina , Animals , Humans , Cell Movement , Cooking , Cues
7.
J Vis ; 23(7): 8, 2023 07 03.
Article in English | MEDLINE | ID: mdl-37432844

ABSTRACT

When we look at an object, we simultaneously see how glossy or matte it is, how light or dark, and what color. Yet, at each point on the object's surface, both diffuse and specular reflections are mixed in different proportions, resulting in substantial spatial chromatic and luminance variations. To further complicate matters, this pattern changes radically when the object is viewed under different lighting conditions. The purpose of this study was to simultaneously measure our ability to judge color and gloss using an image set capturing diverse object and illuminant properties. Participants adjusted the hue, lightness, chroma, and specular reflectance of a reference object so that it appeared to be made of the same material as a test object. Critically, the two objects were presented under different lighting environments. We found that hue matches were highly accurate, except for under a chromatically atypical illuminant. Chroma and lightness constancy were generally poor, but these failures correlated well with simple image statistics. Gloss constancy was particularly poor, and these failures were only partially explained by reflection contrast. Importantly, across all measures, participants were highly consistent with one another in their deviations from constancy. Although color and gloss constancy hold well in simple conditions, the variety of lighting and shape in the real world presents significant challenges to our visual system's ability to judge intrinsic material properties.


Subject(s)
Lighting , Humans
8.
Curr Biol ; 33(14): R760-R762, 2023 07 24.
Article in English | MEDLINE | ID: mdl-37490860

ABSTRACT

A new study shows how the brain exploits the parts of images where surfaces curve out of view to recover both the three-dimensional shape and material properties of objects. This sheds light on a long-standing 'chicken-and-egg' problem in perception research.


Subject(s)
Form Perception , Visual Perception , Vision, Ocular , Head , Depth Perception
9.
J Vis Exp ; (194), 2023 04 21.
Article in English | MEDLINE | ID: mdl-37154551

ABSTRACT

To grasp an object successfully, we must select appropriate contact regions for our hands on the surface of the object. However, identifying such regions is challenging. This paper describes a workflow to estimate the contact regions from marker-based tracking data. Participants grasp real objects while we track the 3D position of both the objects and the hand, including the fingers' joints. We first determine the joint Euler angles from a selection of tracked markers positioned on the back of the hand. Then, we use state-of-the-art hand mesh reconstruction algorithms to generate a mesh model of the participant's hand in its current pose and 3D position. Using objects that were either 3D printed or 3D scanned - and are, thus, available as both real objects and mesh data - allows the hand and object meshes to be co-registered. In turn, this allows the estimation of approximate contact regions by calculating the intersections between the hand mesh and the co-registered 3D object mesh. The method may be used to estimate where and how humans grasp objects under a variety of conditions. It could therefore be of interest to researchers studying visual and haptic perception, motor control, human-computer interaction in virtual and augmented reality, and robotics.
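A rough sketch of the contact-region idea using the trimesh library: hand-mesh vertices that lie on or inside the co-registered object mesh (within a small tolerance) are treated as contact candidates. File names and the tolerance are placeholders, and a single watertight object mesh is assumed; this is not the published pipeline.

```python
# Sketch: flag hand-mesh vertices near or inside the object surface as contact.
import numpy as np
import trimesh

hand = trimesh.load("hand_mesh_registered.ply")   # hand mesh in object coordinates (placeholder file)
obj = trimesh.load("object_mesh.ply")             # scanned/printed object mesh, assumed watertight

# Signed distance from each hand vertex to the object surface
# (positive inside the object in trimesh's convention).
d = trimesh.proximity.signed_distance(obj, hand.vertices)

tolerance = 2.0                                   # mm, assumed tracking/mesh error
contact_mask = d > -tolerance                     # vertices at or inside the surface
contact_vertices = hand.vertices[contact_mask]
print(f"{contact_mask.sum()} of {len(hand.vertices)} hand vertices flagged as contact")
```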


Subject(s)
Hand , Robotics , Humans , Hand Strength
10.
Front Neurosci ; 16: 1088926, 2022.
Article in English | MEDLINE | ID: mdl-36578823

ABSTRACT

[This corrects the article DOI: 10.3389/fnins.2020.591898.].

11.
Curr Biol ; 32(21): R1224-R1225, 2022 11 07.
Article in English | MEDLINE | ID: mdl-36347228

ABSTRACT

The discovery of mental rotation was one of the most significant landmarks in experimental psychology, leading to the ongoing assumption that to visually compare objects from different three-dimensional viewpoints, we use explicit internal simulations of object rotations, to 'mentally adjust' one object until it matches the other [1]. These rotations are thought to be performed on three-dimensional representations of the object, by literal analogy to physical rotations. In particular, it is thought that an imagined object is continuously adjusted at a constant three-dimensional angular rotation rate from its initial orientation to the final orientation through all intervening viewpoints [2]. While qualitative theories have tried to account for this phenomenon [3], to date there has been no explicit, image-computable model of the underlying processes. As a result, there is no quantitative account of why some object viewpoints appear more similar to one another than others when the three-dimensional angular difference between them is the same [4,5]. We reasoned that the specific pattern of non-uniformities in the perception of viewpoints can reveal the visual computations underlying mental rotation. We therefore compared human viewpoint perception with a model based on the kind of two-dimensional 'optical flow' computations that are thought to underlie motion perception in biological vision [6], finding that the model reproduces the specific errors that participants make. This suggests that mental rotation involves simulating the two-dimensional retinal image change that would occur when rotating objects. When we compare objects, we do not do so in a distal three-dimensional representation as previously assumed, but by measuring how much the proximal stimulus would change if we watched the object rotate, capturing perspectival appearance changes [7].
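A toy sketch of the flow-based idea, assuming OpenCV and two placeholder viewpoint renderings (not the published model): quantify how much the 2D image would change between viewpoints by summing dense optical-flow magnitudes.

```python
# Sketch: dense optical flow between two viewpoints as a proxy for
# how much the proximal (2D) stimulus changes under rotation.
import cv2
import numpy as np

view_a = cv2.imread("object_view_000deg.png", cv2.IMREAD_GRAYSCALE)  # placeholder rendering
view_b = cv2.imread("object_view_030deg.png", cv2.IMREAD_GRAYSCALE)  # placeholder rendering

flow = cv2.calcOpticalFlowFarneback(
    view_a, view_b, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# Mean flow magnitude as a simple measure of 2D image change.
magnitude = np.linalg.norm(flow, axis=2)
print(f"mean 2D image change: {magnitude.mean():.2f} pixels")
```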


Subject(s)
Motion Perception , Optic Flow , Humans , Pattern Recognition, Visual , Visual Perception
12.
J Vis ; 22(7): 6, 2022 06 01.
Article in English | MEDLINE | ID: mdl-35713928

ABSTRACT

Specular highlights are the most important image feature for surface gloss perception. Yet, recognizing whether a bright patch in an image is due to specular reflection or some other cause (e.g., a texture marking) is challenging, and it remains unclear how the visual system reliably identifies highlights. There is currently no image-computable model that emulates human highlight identification, so here we sought to develop a neural network that reproduces observers' characteristic successes and failures. We rendered 179,085 images of glossy, undulating, textured surfaces. Given such images as input, a feedforward convolutional neural network was trained to output an image containing only the specular reflectance component. Participants viewed such images and reported whether or not specific pixels were highlights. The queried pixels were carefully selected to distinguish between ground truth and a simple thresholding of image intensity. The neural network outperformed the simple thresholding model - and ground truth - at predicting human responses. We then used a genetic algorithm to selectively delete connections within the neural network to identify variants of the network that approximated human judgments even more closely. The best resulting network shared 68% of the variance with human judgments - more than the unpruned network. As a first step toward interpreting the network, we then used representational similarity analysis to compare its inner representations to a wide variety of hand-engineered image features. We found that the network learns representations similar not only to directly image-computable predictors but also to more complex predictors such as intrinsic or geometric factors, with some indications that the network had learned photo-geometric constraints. However, the network fails to replicate the human response patterns to violations of photo-geometric constraints (rotated highlights) described by other authors.
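A schematic PyTorch sketch of an image-to-image network of the kind described above: a small fully convolutional net that maps a rendered surface image to a one-channel "specular component" image. Layer sizes, data, and the training step are illustrative assumptions, not the architecture reported in the paper.

```python
# Sketch: tiny image-to-image CNN predicting a specular reflectance map.
import torch
import torch.nn as nn

class SpecularNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):           # x: (batch, 3, H, W) rendered surface image
        return self.body(x)         # (batch, 1, H, W) estimated specular map

model = SpecularNet()
images = torch.rand(4, 3, 128, 128)          # placeholder renderings
specular_maps = torch.rand(4, 1, 128, 128)   # placeholder ground-truth reflectance
loss = nn.functional.mse_loss(model(images), specular_maps)
loss.backward()                               # one illustrative training step
print(f"loss: {loss.item():.4f}")
```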


Subject(s)
Deep Learning , Humans , Judgment , Neural Networks, Computer , Problem Solving , Surface Properties
13.
Brain Sci ; 12(5), 2022 May 20.
Article in English | MEDLINE | ID: mdl-35625053

ABSTRACT

Plants and animals are among the most behaviorally significant superordinate categories for humans. Visually assigning objects to such high-level classes is challenging because highly distinct items must be grouped together (e.g., chimpanzees and geckos) while more similar items must sometimes be separated (e.g., stick insects and twigs). As both animals and plants typically possess complex multi-limbed shapes, the perceptual organization of shape into parts likely plays a crucial role in identifying them. Here, we identify a number of distinctive growth characteristics that affect the spatial arrangement and properties of limbs, yielding useful cues for differentiating plants from animals. We developed a novel algorithm based on shape skeletons to create many novel object pairs that differ in their part structure but are otherwise very similar. We found that particular part organizations cause stimuli to look systematically more like plants or animals. We then generated a further 110 sequences of shapes morphing from animal- to plant-like appearance by modifying three aspects of part structure: sprouting parts, curvedness of parts, and symmetry of part pairs. We found that all three parameters correlated strongly with human animal/plant judgments. Together, our findings suggest that subtle changes in the properties and organization of parts can provide powerful cues in superordinate categorization.

14.
Elife ; 11, 2022 05 10.
Article in English | MEDLINE | ID: mdl-35536739

ABSTRACT

Humans have the amazing ability to learn new visual concepts from just a single exemplar. How we achieve this remains mysterious. State-of-the-art theories suggest observers rely on internal 'generative models', which not only describe observed objects but can also synthesize novel variations. However, compelling evidence for generative models in human one-shot learning remains sparse. In most studies, participants merely compare candidate objects created by the experimenters, rather than generating their own ideas. Here, we overcame this key limitation by presenting participants with 2D 'Exemplar' shapes and asking them to draw their own 'Variations' belonging to the same class. The drawings reveal that participants inferred - and synthesized - genuinely novel categories that were far more varied than mere copies. Yet, there was striking agreement between participants about which shape features were most distinctive, and these tended to be preserved in the drawn Variations. Indeed, swapping distinctive parts caused objects to swap apparent category. Our findings suggest that internal generative models are key to how humans generalize from single exemplars. When observers see a novel object for the first time, they identify its most distinctive features and infer a generative model of its shape, allowing them to mentally synthesize plausible variants.


Subject(s)
Generalization, Psychological , Learning , Humans , Pattern Recognition, Visual
15.
J Vis ; 22(4): 4, 2022 03 02.
Article in English | MEDLINE | ID: mdl-35266961

ABSTRACT

Distinguishing mirror from glass is a challenging visual inference, because both materials derive their appearance from their surroundings, yet we rarely experience difficulties in telling them apart. Very few studies have investigated how the visual system distinguishes reflections from refractions, and to date there is no image-computable model that emulates human judgments. Here we sought to develop a deep neural network that reproduces the patterns of visual judgments human observers make. To do this, we trained thousands of convolutional neural networks on more than 750,000 simulated mirror and glass objects, and compared their performance with human judgments, as well as with alternative classifiers based on "hand-engineered" image features. For randomly chosen images, all classifiers and humans performed with high accuracy, and therefore correlated highly with one another. However, to assess how similar models are to humans, it is not sufficient to compare accuracy or correlation on random images. A good model should also predict the characteristic errors that humans make. We therefore painstakingly assembled a diagnostic image set for which humans make systematic errors, allowing us to isolate signatures of human-like performance. A large-scale, systematic search through feedforward neural architectures revealed that relatively shallow (three-layer) networks predicted human judgments better than any other models we tested. This is the first image-computable model that emulates human errors and succeeds in distinguishing mirror from glass, and it hints that mid-level visual processing might be particularly important for the task.
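A hedged PyTorch sketch of a shallow (three-layer) convolutional classifier of the general kind described above; the exact architecture, image size, and training data are assumptions, not the networks reported in the paper.

```python
# Sketch: shallow three-layer CNN classifying mirror vs. glass renderings.
import torch
import torch.nn as nn

class ShallowMirrorGlassNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(32, 2)   # two classes: mirror, glass

    def forward(self, x):                    # x: (batch, 3, H, W)
        h = self.pool(self.features(x)).flatten(1)
        return self.classifier(h)            # class logits

model = ShallowMirrorGlassNet()
images = torch.rand(8, 3, 128, 128)          # placeholder renderings
labels = torch.randint(0, 2, (8,))           # 0 = mirror, 1 = glass (placeholder)
loss = nn.functional.cross_entropy(model(images), labels)
loss.backward()                              # one illustrative training step
print(f"loss: {loss.item():.3f}")
```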


Subject(s)
Neural Networks, Computer , Visual Perception , Humans
16.
Curr Biol ; 32(6): R272-R273, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35349812

ABSTRACT

Recent research has uncovered a surprising new role of colour in the perception of three-dimensional shape. The brain is exquisitely sensitive to visual patterns emerging from the way different wavelengths interact with surfaces.


Subject(s)
Color Perception , Visual Perception , Brain
17.
J Vis ; 22(4): 17, 2022 03 02.
Article in English | MEDLINE | ID: mdl-35353153

ABSTRACT

Color constancy is our ability to perceive constant colors across varying illuminations. Here, we trained deep neural networks to be color constant and evaluated their performance with varying cues. Inputs to the networks consisted of two-dimensional images of simulated cone excitations derived from three-dimensional (3D) rendered scenes of 2,115 different 3D shapes, with spectral reflectances of 1,600 different Munsell chips, illuminated under 278 different natural illuminations. The models were trained to classify the reflectance of the objects. Testing was done with four new illuminations with equally spaced CIEL*a*b* chromaticities, two along the daylight locus and two orthogonal to it. High levels of color constancy were achieved with different deep neural networks, and constancy was higher along the daylight locus. When gradually removing cues from the scene, constancy decreased. Both ResNets and classical ConvNets of varying degrees of complexity performed well. However, DeepCC, our simplest sequential convolutional network, represented colors along the three color dimensions of human color vision, while ResNets showed a more complex representation.
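An illustrative colorimetric sketch of the input representation mentioned above (cone excitations from illuminant and surface reflectance): the excitation of each cone class is the wavelength-wise product of illuminant, reflectance, and cone sensitivity, summed over wavelength. All spectra here are synthetic placeholders, not the rendered stimuli used in the study.

```python
# Sketch: simulated cone excitations for one surface patch under one illuminant.
import numpy as np

wavelengths = np.arange(400, 701, 10)                    # nm
illuminant = np.ones_like(wavelengths, dtype=float)      # flat placeholder illuminant
reflectance = np.linspace(0.2, 0.8, wavelengths.size)    # placeholder Munsell-like chip

# Crude Gaussian stand-ins for L, M, S cone sensitivities.
def gaussian(peak, width=40.0):
    return np.exp(-0.5 * ((wavelengths - peak) / width) ** 2)

cone_sensitivities = np.stack([gaussian(565), gaussian(535), gaussian(445)])  # L, M, S

# Discrete approximation of the integral over wavelength.
lms = cone_sensitivities @ (illuminant * reflectance)
print("simulated cone excitations (L, M, S):", np.round(lms, 3))
```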


Subject(s)
Color Perception , Color Vision , Humans , Lighting , Photic Stimulation , Retinal Cone Photoreceptor Cells
18.
iScience ; 25(3): 103970, 2022 Mar 18.
Article in English | MEDLINE | ID: mdl-35281732

ABSTRACT

Many natural materials have complex, multi-scale structures. Consequently, the inferred identity of a surface can vary with the assumed spatial scale of the scene: a plowed field seen from afar can resemble corduroy seen up close. We investigated this 'material-scale ambiguity' using 87 photographs of diverse materials (e.g., water, sand, stone, metal, and wood). Across two experiments, separate groups of participants (N = 72 adults) provided judgements of the material category depicted in each image, either with or without manipulations of apparent distance (by verbal instructions, or adding objects of familiar size). Our results demonstrate that these manipulations can cause identical images to be assigned to completely different material categories, depending on the assumed scale. Under challenging conditions, therefore, the categorization of materials is susceptible to simple manipulations of apparent distance, revealing a striking example of top-down effects in the interpretation of image features.

19.
Acta Psychol (Amst) ; 221: 103457, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34883348

ABSTRACT

How the perception of material properties guides grasping is not well explored during early childhood. We therefore investigated infants', 3-year-old children's, and adults' unimanual grasping behavior and reaching kinematics for objects of different rigidity, using a 3D motion capture system. In Experiment 1, 11-month-old infants (and, for comparison, adults), and in Experiment 2, 3-year-old children, were encouraged to lift relatively heavy objects by one of two handles differing in rigidity, after visual (Condition 1) or visual-haptic exploration (Condition 2). Experiment 1 revealed that 11-month-olds, after visual object exploration, showed no significant material preference, and thus did not take the material into account to facilitate grasping. After visual-haptic object exploration, and when grasping the contralateral handles, infants showed an unexpected preference for the soft handles, which made it harder to lift the object. In contrast, adults generally grasped the rigid handle in both conditions, exploiting their knowledge about efficient and functional grasping. Reaching kinematics were barely affected by rigidity, but rather by condition and age. Experiment 2 revealed that 3-year-olds no longer exhibit a preference for grasping soft handles, but still show no adult-like preference for rigid handles in either condition. This suggests that material rigidity plays a minor role in infants' grasping behavior when only visual material information is available. Also, 3-year-olds seem to be at an intermediate stage in the development from (1) preferring the pleasant sensation of a soft fabric to (2) preferring the efficient rigid handle.


Subject(s)
Hand Strength , Stereognosis , Adult , Biomechanical Phenomena , Child, Preschool , Humans , Infant , Psychomotor Performance , Visual Perception
20.
J Vis ; 21(12): 14, 2021 11 01.
Article in English | MEDLINE | ID: mdl-34817568

ABSTRACT

The visual computations underlying human gloss perception remain poorly understood, and to date there is no image-computable model that reproduces human gloss judgments independent of shape and viewing conditions. Such a model could provide a powerful platform for testing hypotheses about the detailed workings of surface perception. Here, we made use of recent developments in artificial neural networks to test how well we could recreate human responses in a high-gloss versus low-gloss discrimination task. We rendered >70,000 scenes depicting familiar objects made of either mirror-like or near-matte textured materials. We trained numerous classifiers to distinguish the two materials in our images - ranging from linear classifiers using simple pixel statistics to convolutional neural networks (CNNs) with up to 12 layers - and compared their classifications with human judgments. To determine which classifiers made the same kinds of errors as humans, we painstakingly identified a set of 60 images for which human judgments are consistently decoupled from ground truth. We then conducted a Bayesian hyperparameter search to identify which out of several thousand CNNs most resembled humans. We found that, although architecture has only a relatively weak effect, high correlations with humans are somewhat more typical in networks of shallower to intermediate depths (three to five layers). We also trained deep convolutional generative adversarial networks (DCGANs) of different depths to recreate images based on our high- and low-gloss database. Human observers' responses show that a two-layer DCGAN can recreate gloss recognizably. Together, our results indicate that human gloss classification can best be explained by computations resembling early to mid-level vision.
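A minimal sketch of the evaluation logic described above, under the assumption that model and human responses are summarized as "proportion glossy" per image: score a classifier not only on ground-truth accuracy but on how well it reproduces human responses for a diagnostic image set where humans err systematically. All arrays are placeholders.

```python
# Sketch: compare a model's gloss responses with ground truth and with humans.
import numpy as np

rng = np.random.default_rng(2)
n_diagnostic = 60
ground_truth = rng.integers(0, 2, n_diagnostic)           # 1 = high gloss (placeholder)
human_p_gloss = rng.uniform(0, 1, n_diagnostic)           # proportion "glossy" responses (placeholder)
model_p_gloss = np.clip(human_p_gloss + rng.normal(0, 0.2, n_diagnostic), 0, 1)

accuracy = np.mean((model_p_gloss > 0.5) == ground_truth)
human_agreement = np.corrcoef(model_p_gloss, human_p_gloss)[0, 1]
print(f"ground-truth accuracy: {accuracy:.2f}, correlation with humans: {human_agreement:.2f}")
```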


Subject(s)
Neural Networks, Computer , Perception , Bayes Theorem , Humans , Visual Perception