1 - 8 of 8
1.
PLoS Comput Biol; 20(5): e1012058, 2024 May.
Article En | MEDLINE | ID: mdl-38709818

A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., the z- and w-latents of StyleGAN, respectively) and the language-contrastive representations of latent diffusion networks (i.e., the CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that the feature-disentangled w representations outperform both the z and CLIP representations in explaining neural responses. Further, w-latent features were found to lie at the higher end of the complexity gradient, indicating that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
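To make the encoding analysis concrete, here is a minimal sketch of a mass univariate encoding model in Python, assuming per-stimulus latents (e.g., StyleGAN w-latents) and MUA responses are already available as arrays; all shapes and variable names are illustrative placeholders, not the study's actual data.

```python
# Minimal sketch of a mass univariate encoding analysis: one regularized
# linear model per recording site, scored by cross-validated correlation.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
latents = rng.standard_normal((200, 512))   # stimuli x latent features (placeholder)
mua = rng.standard_normal((200, 96))        # stimuli x recording sites (placeholder)

scores = np.empty(mua.shape[1])
for site in range(mua.shape[1]):
    model = RidgeCV(alphas=np.logspace(-3, 3, 7))
    pred = cross_val_predict(model, latents, mua[:, site], cv=5)
    scores[site] = np.corrcoef(pred, mua[:, site])[0, 1]  # encoding performance per site
```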


Models, Neurological; Visual Cortex; Visual Perception; Animals; Visual Perception/physiology; Visual Cortex/physiology; Macaca mulatta; Computational Biology; Neural Networks, Computer; Photic Stimulation; Male; Neurons/physiology; Brain/physiology
2.
J Neural Eng; 21(2), 2024 Apr 10.
Article En | MEDLINE | ID: mdl-38502957

Objective. The enabling technology of visual prosthetics for the blind is making rapid progress. However, there are still uncertainties regarding the functional outcomes, which can depend on many design choices in the development. In visual prostheses with a head-mounted camera, a particularly challenging question is how to deal with the gaze-locked visual percept associated with spatial updating conflicts in the brain. The current study investigates a recently proposed compensation strategy based on gaze-contingent image processing with eye-tracking. Gaze-contingent processing is expected to reinforce natural-like visual scanning and to re-establish spatial updating based on eye movements. The beneficial effects remain to be investigated for daily-life activities in complex visual environments. Approach. The current study evaluates the benefits of gaze-contingent processing versus gaze-locked and gaze-ignored simulations in the context of mobility, scene recognition, and visual search, using a virtual reality simulated prosthetic vision paradigm with sighted subjects. Main results. Compared to gaze-locked vision, gaze-contingent processing was consistently found to improve speed in all experimental tasks, as well as the subjective quality of vision. Similar or further improvements were found in a control condition that ignores gaze-dependent effects, a simulation that is unattainable in clinical reality. Significance. Our results suggest that gaze-locked vision and spatial updating conflicts can be debilitating for complex visually guided activities of daily living such as mobility and orientation. Therefore, for prospective users of head-steered prostheses with an unimpaired oculomotor system, the inclusion of a compensatory eye-tracking system is strongly endorsed.
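As a rough illustration of the compensation strategy (not the study's actual implementation), gaze-contingent processing can be approximated by sampling the head-mounted camera frame around the current gaze estimate; the frame, gaze coordinates, and window size below are assumptions for the sketch.

```python
# Toy gaze-contingent sampling: crop the camera frame around the gaze point
# so the rendered percept follows the eyes instead of being gaze-locked.
import numpy as np

def gaze_contingent_crop(frame: np.ndarray, gaze_xy: tuple, size: int = 128) -> np.ndarray:
    """Crop a square window centered on the gaze position (pixel coordinates)."""
    h, w = frame.shape[:2]
    x = int(np.clip(gaze_xy[0] - size // 2, 0, w - size))
    y = int(np.clip(gaze_xy[1] - size // 2, 0, h - size))
    return frame[y:y + size, x:x + size]

frame = np.zeros((480, 640), dtype=np.uint8)        # stand-in camera frame
patch = gaze_contingent_crop(frame, gaze_xy=(320, 240))
```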


Activities of Daily Living; Vision, Ocular; Humans; Prospective Studies; Eye Movements; Computer Simulation
3.
Elife; 13, 2024 Feb 22.
Article En | MEDLINE | ID: mdl-38386406

Blindness affects millions of people around the world. A promising solution for restoring a form of vision to some individuals is the cortical visual prosthesis, which bypasses part of the impaired visual pathway by converting camera input to electrical stimulation of the visual system. The artificially induced visual percept (a pattern of localized light flashes, or 'phosphenes') has limited resolution, and a great portion of the field's research is devoted to optimizing the efficacy, efficiency, and practical usefulness of the encoding of visual information. A commonly exploited method is non-invasive functional evaluation in sighted subjects or with computational models by using simulated prosthetic vision (SPV) pipelines. An important challenge in this approach is to balance perceptual realism, biological plausibility, and real-time performance in the simulation of cortical prosthetic vision. We present a biologically plausible, PyTorch-based phosphene simulator that runs in real time and uses differentiable operations to allow for gradient-based computational optimization of phosphene encoding models. The simulator integrates a wide range of clinical results with neurophysiological evidence in humans and non-human primates. The pipeline includes a model of the retinotopic organization and cortical magnification of the visual cortex, and incorporates the quantitative effects of stimulation parameters and temporal dynamics on phosphene characteristics. Our results demonstrate the simulator's suitability both for computational applications, such as end-to-end deep learning-based prosthetic vision optimization, and for behavioral experiments. The modular and open-source software provides a flexible simulation framework for computational, clinical, and behavioral neuroscientists working on visual neuroprosthetics.
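A toy example of the core idea, differentiable phosphene rendering, is sketched below; the published simulator additionally models retinotopy, cortical magnification, and temporal dynamics, so the grid, phosphene width, and Gaussian shape here are simplifying assumptions.

```python
# Toy differentiable phosphene renderer: a sum of Gaussian blobs whose
# brightness depends on stimulation amplitude, so gradients can flow from an
# image-level loss back to the stimulation parameters.
import torch

def render_phosphenes(amplitudes: torch.Tensor, centers: torch.Tensor,
                      sigma: float = 0.02, res: int = 256) -> torch.Tensor:
    """Render phosphenes on a res x res canvas; differentiable w.r.t. amplitudes."""
    lin = torch.linspace(0.0, 1.0, res)
    yy, xx = torch.meshgrid(lin, lin, indexing="ij")
    d2 = (xx[None] - centers[:, 0, None, None]) ** 2 + \
         (yy[None] - centers[:, 1, None, None]) ** 2
    blobs = torch.exp(-d2 / (2 * sigma ** 2))            # (n_phosphenes, res, res)
    return (amplitudes[:, None, None] * blobs).sum(0)

centers = torch.rand(100, 2)                             # placeholder phosphene map
amps = torch.rand(100, requires_grad=True)               # stimulation amplitudes
image = render_phosphenes(amps, centers)
image.sum().backward()                                   # e.g. inside an end-to-end loss
```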


Phosphenes; Visual Prosthesis; Animals; Humans; Computer Simulation; Software; Blindness/therapy
4.
Front Neurosci; 16: 940972, 2022.
Article En | MEDLINE | ID: mdl-36452333

Reconstructing complex and dynamic visual perception from brain activity remains a major challenge in machine learning applications to neuroscience. Here, we present a new method for reconstructing naturalistic images and videos from very large single-participant functional magnetic resonance imaging (fMRI) data, leveraging the recent success of image-to-image transformation networks. This is achieved by exploiting spatial information obtained from retinotopic mappings across the visual system. More specifically, we first determine which position each voxel in a particular region of interest represents in the visual field, based on its corresponding receptive field location. Then, the 2D image representation of the brain activity on the visual field is passed to a fully convolutional image-to-image network trained to recover the original stimuli using a VGG feature loss with an adversarial regularizer. In our experiments, we show that our method offers a significant improvement over existing video reconstruction techniques.
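A simplified sketch of the receptive-field projection step might look as follows, with placeholder pRF coordinates and voxel responses; the study's actual mapping and resolution may differ.

```python
# Scatter voxel responses onto a 2D visual-field grid according to each
# voxel's receptive field location, producing the "brain image" that the
# image-to-image network would take as input.
import numpy as np

def project_to_visual_field(responses, prf_x, prf_y, res=64):
    """Accumulate voxel responses on a res x res visual-field grid."""
    grid = np.zeros((res, res))
    counts = np.zeros((res, res))
    ix = np.clip(((prf_x + 1) / 2 * res).astype(int), 0, res - 1)  # map [-1, 1] to pixels
    iy = np.clip(((prf_y + 1) / 2 * res).astype(int), 0, res - 1)
    np.add.at(grid, (iy, ix), responses)
    np.add.at(counts, (iy, ix), 1)
    return grid / np.maximum(counts, 1)                  # average overlapping voxels

n_vox = 5000
rng = np.random.default_rng(1)
img = project_to_visual_field(rng.standard_normal(n_vox),
                              rng.uniform(-1, 1, n_vox), rng.uniform(-1, 1, n_vox))
```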

5.
J Vis; 22(2): 1, 2022 Feb 1.
Article En | MEDLINE | ID: mdl-35103758

Neuroprosthetic implants are a promising technology for restoring some form of vision in people with visual impairments via electrical neurostimulation in the visual pathway. Although an artificially generated prosthetic percept is relatively limited compared with normal vision, it may provide some elementary perception of the surroundings, re-enabling daily living functionality. For mobility in particular, various studies have investigated the benefits of visual neuroprosthetics in a simulated prosthetic vision paradigm, with varying outcomes. The previous literature suggests that scene simplification via image processing, and particularly contour extraction, may potentially improve mobility performance in a virtual environment. In the current simulation study with sighted participants, we explore both the theoretically attainable benefits of strict scene simplification in an indoor environment, by controlling the environmental complexity, and the practically achieved improvement with a deep learning-based surface boundary detection implementation compared with traditional edge detection. A simulated electrode resolution of 26 × 26 was found to provide sufficient information for mobility in a simple environment. Our results suggest that, for a lower number of implanted electrodes, the removal of background textures and within-surface gradients may be beneficial in theory. However, the deep learning-based implementation for surface boundary detection did not improve mobility performance in the current study. Furthermore, our findings indicate that, for a greater number of electrodes, the removal of within-surface gradients and background textures may degrade, rather than improve, mobility. Therefore, finding a balanced amount of scene simplification requires a careful tradeoff between informativeness and interpretability that may depend on the number of implanted electrodes.
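For illustration only, the reduction of a preprocessed edge image to the reported 26 × 26 simulated electrode resolution could be sketched as below; the edge map, pooling, and threshold are assumptions, and the study's simulation pipeline is more elaborate.

```python
# Average-pool a (placeholder) binary edge map down to a 26 x 26 grid of
# phosphene activations, mimicking a fixed simulated electrode resolution.
import numpy as np

def to_phosphene_grid(edge_map: np.ndarray, n: int = 26) -> np.ndarray:
    """Pool an edge map into an n x n grid and threshold into on/off phosphenes."""
    h, w = edge_map.shape
    ys = np.linspace(0, h, n + 1).astype(int)
    xs = np.linspace(0, w, n + 1).astype(int)
    grid = np.array([[edge_map[ys[i]:ys[i+1], xs[j]:xs[j+1]].mean()
                      for j in range(n)] for i in range(n)])
    return (grid > 0.1).astype(float)                    # arbitrary activation threshold

edges = (np.random.default_rng(2).random((480, 640)) > 0.95).astype(float)
grid = to_phosphene_grid(edges)
```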


Form Perception; Phosphenes; Feasibility Studies; Humans; Vision Disorders; Vision, Ocular
6.
Sci Rep; 12(1): 141, 2022 Jan 7.
Article En | MEDLINE | ID: mdl-34997012

Neural decoding can be conceptualized as the problem of mapping brain responses back to sensory stimuli via a feature space. We introduce (i) a novel experimental paradigm that uses well-controlled yet highly naturalistic stimuli with a priori known feature representations and (ii) an implementation thereof for HYPerrealistic reconstruction of PERception (HYPER) of faces from brain recordings. To this end, we embrace the use of generative adversarial networks (GANs) at the earliest step of our neural decoding pipeline by acquiring fMRI data as participants perceive face images synthesized by the generator network of a GAN. We show that the latent vectors used for generation effectively capture the same defining stimulus properties as the fMRI measurements. As such, these latents (conditioned on the GAN) serve as the intermediate feature representations underlying the perceived images: they can be predicted in neural decoding and used to re-generate the originally perceived stimuli, leading to the most accurate reconstructions of perception to date.
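Schematically, the decoding step can be sketched as a linear map from fMRI responses to the known GAN latents, with decoded latents then passed to the pretrained generator; the array sizes, regularization, and the `generator` variable below are illustrative assumptions rather than the study's exact configuration.

```python
# Schematic HYPER-style decoding: fit a regularized linear map from fMRI
# responses to the ground-truth latents of the GAN-generated stimuli, then
# predict latents for held-out trials.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
fmri = rng.standard_normal((175, 4000))      # trials x voxels (placeholder sizes)
latents = rng.standard_normal((175, 512))    # known latents of the perceived stimuli

decoder = Ridge(alpha=1e3).fit(fmri[:150], latents[:150])
decoded = decoder.predict(fmri[150:])        # latents for held-out trials
# reconstruction = generator(decoded)        # hypothetical: regenerate the faces
```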


Brain Mapping; Brain/diagnostic imaging; Image Interpretation, Computer-Assisted; Magnetic Resonance Imaging; Neural Networks, Computer; Adult; Brain/physiology; Face; Humans; Male; Photic Stimulation; Predictive Value of Tests; Recognition, Psychology; Visual Perception
7.
Sci Rep; 8(1): 3439, 2018 Feb 21.
Article En | MEDLINE | ID: mdl-29467495

The complexity of sensory stimuli has an important role in perception and cognition. However, its neural representation is not well understood. Here, we characterize the representations of naturalistic visual and auditory stimulus complexity in early and associative visual and auditory cortices. This is realized by means of encoding and decoding analyses of two fMRI datasets in the visual and auditory modalities. Our results implicate most early and some associative sensory areas in representing the complexity of naturalistic sensory stimuli. For example, parahippocampal place area, which was previously shown to represent scene features, is shown to also represent scene complexity. Similarly, posterior regions of superior temporal gyrus and superior temporal sulcus, which were previously shown to represent syntactic (language) complexity, are shown to also represent music (auditory) complexity. Furthermore, our results suggest the existence of gradients in sensitivity to naturalistic sensory stimulus complexity in these areas.
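As a minimal illustration of the encoding logic (not the study's actual complexity measures or models), an objective complexity score such as compressibility can be related to voxel responses:

```python
# Relate an objective stimulus complexity proxy (compressed size of the pixel
# buffer) to the response of a single voxel across stimuli.
import zlib
import numpy as np
from scipy.stats import pearsonr

def complexity(image: np.ndarray) -> float:
    """Kolmogorov-style proxy: length of the zlib-compressed pixel buffer."""
    return len(zlib.compress(image.astype(np.uint8).tobytes()))

rng = np.random.default_rng(4)
images = rng.integers(0, 256, (60, 128, 128))   # placeholder stimuli
scores = np.array([complexity(im) for im in images])
voxel = rng.standard_normal(60)                 # placeholder response of one voxel
r, p = pearsonr(scores, voxel)                  # this voxel's sensitivity to complexity
```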


Auditory Cortex/physiology; Visual Cortex/physiology; Acoustic Stimulation; Adult; Auditory Perception; Brain Mapping; Female; Humans; Language; Magnetic Resonance Imaging; Male; Music; Photic Stimulation; Visual Perception; Young Adult
8.
Front Hum Neurosci; 10: 112, 2016.
Article En | MEDLINE | ID: mdl-27047359

The relationship between liking and stimulus complexity is commonly reported to follow an inverted U-curve. However, large individual differences in participants' complexity preferences have been observed since the earliest studies on the topic. The common use of across-participant analysis methods that ignore these large individual differences in aesthetic preferences gives an impression of high agreement between individuals. In this study, we collected ratings of liking and perceived complexity from 30 participants for a set of digitally generated grayscale images. In addition, we calculated an objective measure of complexity for each image. Our results reveal that the inverted U-curve relationship between liking and stimulus complexity arises as the combination of different individual liking functions. Specifically, after automatically clustering the participants based on their liking ratings, we found that one group of participants gave increasingly lower liking ratings to increasingly complex stimuli, while a second group gave increasingly higher liking ratings to increasingly complex stimuli. Based on our findings, we call for a focus on individual differences in aesthetic preferences, the adoption of alternative analysis methods that account for these differences, and a re-evaluation of established rules of human aesthetic preferences.
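A sketch of the clustering idea, on synthetic stand-in data rather than the study's ratings, might look like this:

```python
# Cluster participants by their liking-rating profiles, then summarize each
# cluster's liking as a function of stimulus complexity.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
complexity = np.linspace(0, 1, 40)                       # per-image complexity
liking = np.vstack([s * complexity + rng.normal(0, 0.1, 40)
                    for s in rng.choice([-1.0, 1.0], 30)])  # 30 synthetic participants

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(liking)
for k in range(2):
    slope = np.polyfit(complexity, liking[labels == k].mean(0), 1)[0]
    print(f"cluster {k}: mean slope of liking vs. complexity = {slope:+.2f}")
```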

...