Results 1 - 20 of 63
1.
Nat Methods ; 18(9): 1112-1116, 2021 09.
Article in English | MEDLINE | ID: mdl-34462591

ABSTRACT

Optogenetic methods have been widely used in rodent brains, but remain relatively under-developed for nonhuman primates such as rhesus macaques, an animal model with a large brain expressing sophisticated sensory, motor and cognitive behaviors. To address challenges in behavioral optogenetics in large brains, we developed Opto-Array, a chronically implantable array of light-emitting diodes for high-throughput optogenetic perturbation. We demonstrated that optogenetic silencing in the macaque primary visual cortex with the help of the Opto-Array results in reliable retinotopic visual deficits in a luminance discrimination task. We separately confirmed that Opto-Array illumination results in local neural silencing, and that behavioral effects are not due to tissue heating. These results demonstrate the effectiveness of the Opto-Array for behavioral optogenetic applications in large brains.


Subjects
Brain/physiology; Optogenetics/methods; Prostheses and Implants; Animals; Behavior, Animal; Electronics/methods; Fiber Optic Technology; Macaca mulatta; Male; Visual Cortex
2.
PLoS Comput Biol ; 19(12): e1011713, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38079444

ABSTRACT

A core problem in visual object learning is using a finite number of images of a new object to accurately identify that object in future, novel images. One longstanding, conceptual hypothesis asserts that this core problem is solved by adult brains through two connected mechanisms: 1) the re-representation of incoming retinal images as points in a fixed, multidimensional neural space, and 2) the optimization of linear decision boundaries in that space, via simple plasticity rules applied to a single downstream layer. Though this scheme is biologically plausible, the extent to which it explains learning behavior in humans has been unclear, in part because of a historical lack of image-computable models of the putative neural space, and in part because of a lack of measurements of human learning behaviors in difficult, naturalistic settings. Here, we addressed these gaps by 1) drawing from contemporary, image-computable models of the primate ventral visual stream to create a large set of testable learning models (n = 2,408 models), and 2) using online psychophysics to measure human learning trajectories over a varied set of tasks involving novel 3D objects (n = 371,000 trials), which we then used to develop (and publicly release) empirical benchmarks for comparing learning models to humans. We evaluated each learning model on these benchmarks, and found those based on deep, high-level representations from neural networks were surprisingly aligned with human behavior. While no tested model explained the entirety of replicable human behavior, these results establish that rudimentary plasticity rules, when combined with appropriate visual representations, have high explanatory power in predicting human behavior with respect to this core object learning problem.


Subjects
Neural Networks, Computer; Pattern Recognition, Visual; Adult; Animals; Humans; Primates; Brain; Spatial Learning; Models, Neurological; Visual Perception
3.
Proc Natl Acad Sci U S A ; 118(3)2021 01 19.
Article in English | MEDLINE | ID: mdl-33431673

ABSTRACT

Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.


Subjects
Nerve Net/physiology; Neural Networks, Computer; Neurons/physiology; Pattern Recognition, Visual/physiology; Visual Cortex/physiology; Animals; Child; Datasets as Topic; Humans; Macaca/physiology; Nerve Net/anatomy & histology; Unsupervised Machine Learning; Visual Cortex/anatomy & histology
4.
Behav Brain Sci ; 46: e390, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38054303

ABSTRACT

In the target article, Bowers et al. dispute deep artificial neural network (ANN) models as the currently leading models of human vision without producing alternatives. They eschew the use of public benchmarking platforms to compare vision models with the brain and behavior, and they advocate for a fragmented, phenomenon-specific modeling approach. These are unconstructive to scientific progress. We outline how the Brain-Score community is moving forward to add new model-to-human comparisons to its community-transparent suite of benchmarks.


Subjects
Brain; Neural Networks, Computer; Humans
5.
Neural Comput ; 34(8): 1652-1675, 2022 07 14.
Article in English | MEDLINE | ID: mdl-35798321

ABSTRACT

The computational role of the abundant feedback connections in the ventral visual stream, which enables humans and nonhuman primates to effortlessly recognize objects across a multitude of viewing conditions, is unclear. Prior studies have augmented feedforward convolutional neural networks (CNNs) with recurrent connections to study their role in visual processing; however, often these recurrent networks are optimized directly on neural data or the comparative metrics used are undefined for standard feedforward networks that lack these connections. In this work, we develop task-optimized convolutional recurrent (ConvRNN) network models that more correctly mimic the timing and gross neuroanatomy of the ventral pathway. Properly chosen intermediate-depth ConvRNN circuit architectures, which incorporate mechanisms of feedforward bypassing and recurrent gating, can achieve high performance on a core recognition task, comparable to that of much deeper feedforward networks. We then develop methods that allow us to compare both CNNs and ConvRNNs to finely grained measurements of primate categorization behavior and neural response trajectories across thousands of stimuli. We find that high-performing ConvRNNs provide a better match to these data than feedforward networks of any depth, predicting the precise timings at which each stimulus is behaviorally decoded from neural activation patterns. Moreover, these ConvRNN circuits consistently produce quantitatively accurate predictions of neural dynamics from V4 and IT across the entire stimulus presentation. In fact, we find that the highest-performing ConvRNNs, which best match neural and behavioral data, also achieve a strong Pareto trade-off between task performance and overall network size. Taken together, our results suggest the functional purpose of recurrence in the ventral pathway is to fit a high-performing network in cortex, attaining computational power through temporal rather than spatial complexity.


Subjects
Task Performance and Analysis; Visual Perception; Animals; Humans; Macaca mulatta/physiology; Neural Networks, Computer; Pattern Recognition, Visual/physiology; Recognition, Psychology/physiology; Visual Pathways/physiology; Visual Perception/physiology
6.
J Neurosci ; 38(33): 7255-7269, 2018 08 15.
Article in English | MEDLINE | ID: mdl-30006365

ABSTRACT

Primates, including humans, can typically recognize objects in visual images at a glance despite naturally occurring identity-preserving image transformations (e.g., changes in viewpoint). A primary neuroscience goal is to uncover neuron-level mechanistic models that quantitatively explain this behavior by predicting primate performance for each and every image. Here, we applied this stringent behavioral prediction test to the leading mechanistic models of primate vision (specifically, deep, convolutional, artificial neural networks; ANNs) by directly comparing their behavioral signatures against those of humans and rhesus macaque monkeys. Using high-throughput data collection systems for human and monkey psychophysics, we collected more than one million behavioral trials from 1472 anonymous humans and five male macaque monkeys for 2400 images over 276 binary object discrimination tasks. Consistent with previous work, we observed that state-of-the-art deep, feedforward convolutional ANNs trained for visual categorization (termed DCNNIC models) accurately predicted primate patterns of object-level confusion. However, when we examined behavioral performance for individual images within each object discrimination task, we found that all tested DCNNIC models were significantly nonpredictive of primate performance and that this prediction failure was not accounted for by simple image attributes nor rescued by simple model modifications. These results show that current DCNNIC models cannot account for the image-level behavioral patterns of primates and that new ANN models are needed to more precisely capture the neural mechanisms underlying primate object vision. 
To this end, large-scale, high-resolution primate behavioral benchmarks such as those obtained here could serve as direct guides for discovering such models. SIGNIFICANCE STATEMENT: Recently, specific feedforward deep convolutional artificial neural network (ANN) models have dramatically advanced our quantitative understanding of the neural mechanisms underlying primate core object recognition. In this work, we tested the limits of those ANNs by systematically comparing the behavioral responses of these models with the behavioral responses of humans and monkeys at the resolution of individual images. Using these high-resolution metrics, we found that all tested ANN models significantly diverged from primate behavior. Going forward, these high-resolution, large-scale primate behavioral benchmarks could serve as direct guides for discovering better ANN models of the primate visual system.


Subjects
Macaca mulatta/physiology; Neural Networks, Computer; Pattern Recognition, Visual/physiology; Recognition, Psychology/physiology; Animals; Discrimination, Psychological/physiology; Humans; Male; Models, Neurological; Psychophysics; Species Specificity
8.
Proc Natl Acad Sci U S A ; 112(21): 6730-5, 2015 May 26.
Article in English | MEDLINE | ID: mdl-25953336

ABSTRACT

Neurons that respond more to images of faces over nonface objects were identified in the inferior temporal (IT) cortex of primates three decades ago. Although it is hypothesized that perceptual discrimination between faces depends on the neural activity of IT subregions enriched with "face neurons," such a causal link has not been directly established. Here, using optogenetic and pharmacological methods, we reversibly suppressed the neural activity in small subregions of IT cortex of macaque monkeys performing a facial gender-discrimination task. Each type of intervention independently demonstrated that suppression of IT subregions enriched in face neurons induced a contralateral deficit in face gender-discrimination behavior. The same neural suppression of other IT subregions produced no detectable change in behavior. These results establish a causal link between the neural activity in IT face neuron subregions and face gender-discrimination behavior. Also, the demonstration that brief neural suppression of specific spatial subregions of IT induces behavioral effects opens the door for applying the technical advantages of optogenetics to a systematic attack on the causal relationship between IT cortex and high-level visual perception.


Subjects
Face/anatomy & histology; Macaca mulatta/anatomy & histology; Macaca mulatta/physiology; Sex Characteristics; Temporal Lobe/cytology; Temporal Lobe/physiology; Animals; Electrophysiological Phenomena; Female; GABA-A Receptor Agonists/administration & dosage; Male; Muscimol/administration & dosage; Optogenetics; Temporal Lobe/drug effects; Visual Perception/drug effects; Visual Perception/physiology
9.
J Neurosci ; 36(50): 12729-12745, 2016 12 14.
Article in English | MEDLINE | ID: mdl-27810930

ABSTRACT

While early cortical visual areas contain fine scale spatial organization of neuronal properties, such as orientation preference, the spatial organization of higher-level visual areas is less well understood. The fMRI demonstration of face-preferring regions in human ventral cortex and monkey inferior temporal cortex ("face patches") raises the question of how neural selectivity for faces is organized. Here, we targeted hundreds of spatially registered neural recordings to the largest fMRI-identified face-preferring region in monkeys, the middle face patch (MFP), and show that the MFP contains a graded enrichment of face-preferring neurons. At its center, as much as 93% of the sites we sampled responded twice as strongly to faces than to nonface objects. We estimate the maximum neurophysiological size of the MFP to be ∼6 mm in diameter, consistent with its previously reported size under fMRI. Importantly, face selectivity in the MFP varied strongly even between neighboring sites. Additionally, extremely face-selective sites were ∼40 times more likely to be present inside the MFP than outside. These results provide the first direct quantification of the size and neural composition of the MFP by showing that the cortical tissue localized to the fMRI defined region consists of a very high fraction of face-preferring sites near its center, and a monotonic decrease in that fraction along any radial spatial axis. SIGNIFICANCE STATEMENT: The underlying organization of neurons that give rise to the large spatial regions of activity observed with fMRI is not well understood. Neurophysiological studies that have targeted the fMRI identified face patches in monkeys have provided evidence for both large-scale clustering and a heterogeneous spatial organization. Here we used a novel x-ray imaging system to spatially map the responses of hundreds of sites in and around the middle face patch. 
We observed that face-selective signal localized to the middle face patch was characterized by a gradual spatial enrichment. Furthermore, strongly face-selective sites were ∼40 times more likely to be found inside the patch than outside of the patch.


Subjects
Face; Recognition, Psychology/physiology; Temporal Lobe/physiology; Animals; Brain Mapping; Electrophysiological Phenomena/physiology; Female; Macaca mulatta; Magnetic Resonance Imaging; Male; Models, Neurological; Neurons/physiology; Photic Stimulation; Visual Cortex/physiology
10.
Proc Natl Acad Sci U S A ; 111(23): 8619-24, 2014 Jun 10.
Article in English | MEDLINE | ID: mdl-24812127

ABSTRACT

The ventral visual stream underlies key human visual object recognition abilities. However, neural encoding in the higher areas of the ventral stream remains poorly understood. Here, we describe a modeling approach that yields a quantitatively accurate model of inferior temporal (IT) cortex, the highest ventral cortical area. Using high-throughput computational techniques, we discovered that, within a class of biologically plausible hierarchical neural network models, there is a strong correlation between a model's categorization performance and its ability to predict individual IT neural unit response data. To pursue this idea, we then identified a high-performing neural network that matches human performance on a range of recognition tasks. Critically, even though we did not constrain this model to match neural data, its top output layer turns out to be highly predictive of IT spiking responses to complex naturalistic images at both the single site and population levels. Moreover, the model's intermediate layers are highly predictive of neural responses in the V4 cortex, a midlevel visual area that provides the dominant cortical input to IT. These results show that performance optimization--applied in a biologically appropriate model class--can be used to build quantitative predictive models of neural processing.


Subjects
Macaca mulatta/physiology; Models, Neurological; Neural Networks, Computer; Visual Cortex/physiology; Algorithms; Animals; Humans; Nerve Net/physiology; Photic Stimulation/methods; Psychomotor Performance/physiology; Recognition, Psychology/physiology; Visual Pathways/physiology; Visual Perception/physiology
11.
J Neurosci ; 35(35): 12127-36, 2015 Sep 02.
Article in English | MEDLINE | ID: mdl-26338324

ABSTRACT

Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize "pooled human" object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT: To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys.


Subjects
Learning/physiology; Pattern Recognition, Visual/physiology; Recognition, Psychology/physiology; Animals; Humans; Macaca mulatta; Male; Photic Stimulation; Psychophysics; Species Specificity
12.
J Neurosci ; 35(39): 13402-18, 2015 Sep 30.
Article in English | MEDLINE | ID: mdl-26424887

ABSTRACT

To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates exactly predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT ("face patches") did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. Significance statement: We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior.
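
To illustrate what a "simple learned weighted sum of firing rates" can look like, here is a minimal Python sketch. It is not the decoder used in the study: the synthetic population, the number of sites, the noise level, and the class-mean-difference rule for choosing weights are all assumptions made for illustration.

```python
import random

def fit_weighted_sum(rates, labels):
    """Learn weights as the difference of class-mean firing rates.

    This prototype-style rule is an assumed stand-in for a learned linear readout.
    """
    n_sites = len(rates[0])
    pos = [r for r, y in zip(rates, labels) if y == 1]
    neg = [r for r, y in zip(rates, labels) if y == -1]
    mean_pos = [sum(r[i] for r in pos) / len(pos) for i in range(n_sites)]
    mean_neg = [sum(r[i] for r in neg) / len(neg) for i in range(n_sites)]
    w = [p - q for p, q in zip(mean_pos, mean_neg)]
    # The bias places the decision boundary midway between the two class means.
    b = -sum(wi * (p + q) / 2 for wi, p, q in zip(w, mean_pos, mean_neg))
    return w, b

def decide(w, b, r):
    """Binary decision from a weighted sum of firing rates."""
    return 1 if sum(wi * ri for wi, ri in zip(w, r)) + b > 0 else -1

def simulate(rng, label, n_sites=8):
    """Hypothetical mean firing rates: even sites prefer object +1, odd sites object -1."""
    rates = []
    for i in range(n_sites):
        pref = 3.0 if (i % 2 == 0) == (label == 1) else -3.0
        rates.append(10.0 + pref + rng.gauss(0.0, 1.0))
    return rates

rng = random.Random(0)
train_labels = [1, -1] * 50
train_rates = [simulate(rng, y) for y in train_labels]
w, b = fit_weighted_sum(train_rates, train_labels)

test_labels = [1, -1] * 50
test_rates = [simulate(rng, y) for y in test_labels]
correct = sum(decide(w, b, r) == y for r, y in zip(test_rates, test_labels))
accuracy = correct / len(test_labels)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setting the readout is learned once from labeled responses and then applied to new responses with a single dot product, which is the general shape of the linking hypothesis the abstract describes.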


Subjects
Learning/physiology; Neurons/physiology; Recognition, Psychology/physiology; Temporal Lobe/physiology; Algorithms; Animals; Computer Simulation; Evoked Potentials, Visual/physiology; Humans; Macaca mulatta; Psychomotor Performance/physiology; Species Specificity; Temporal Lobe/cytology; Visual Fields/physiology; Visual Pathways/physiology; Visual Perception/physiology
13.
Nat Methods ; 15(10): 772-773, 2018 10.
Article in English | MEDLINE | ID: mdl-30275586
14.
PLoS Comput Biol ; 10(12): e1003963, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25521294

ABSTRACT

The primate visual system achieves remarkable visual object recognition performance even in brief presentations, and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. To accurately produce such a comparison, a major difficulty has been a unifying metric that accounts for experimental limitations, such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations, such as the complexity of the decoding classifier and the number of classifier training examples. In this work, we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of "kernel analysis" that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT, and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds.


Subjects
Models, Neurological; Nerve Net/physiology; Neural Networks, Computer; Pattern Recognition, Visual/physiology; Temporal Lobe/physiology; Algorithms; Animals; Macaca mulatta; Male
15.
J Neurosci ; 33(38): 15207-19, 2013 Sep 18.
Article in English | MEDLINE | ID: mdl-24048850

ABSTRACT

Maps obtained by functional magnetic resonance imaging (fMRI) are thought to reflect the underlying spatial layout of neural activity. However, previous studies have not been able to directly compare fMRI maps to high-resolution neurophysiological maps, particularly in higher level visual areas. Here, we used a novel stereo microfocal x-ray system to localize thousands of neural recordings across monkey inferior temporal cortex (IT), construct large-scale maps of neuronal object selectivity at subvoxel resolution, and compare those neurophysiology maps with fMRI maps from the same subjects. While neurophysiology maps contained reliable structure at the sub-millimeter scale, fMRI maps of object selectivity contained information at larger scales (>2.5 mm) and were only partly correlated with raw neurophysiology maps collected in the same subjects. However, spatial smoothing of neurophysiology maps more than doubled that correlation, while a variety of alternative transforms led to no significant improvement. Furthermore, raw spiking signals, once spatially smoothed, were as predictive of fMRI maps as local field potential signals. Thus, fMRI of the inferior temporal lobe reflects a spatially low-passed version of neurophysiology signals. These findings strongly validate the widespread use of fMRI for detecting large (>2.5 mm) neuronal domains of object selectivity but show that a complete understanding of even the most pure domains (e.g., faces vs nonface objects) requires investigation at fine scales that can currently only be obtained with invasive neurophysiological methods.


Subjects
Action Potentials/physiology; Brain Mapping; Magnetic Resonance Imaging; Neurons/physiology; Temporal Lobe; Animals; Humans; Image Processing, Computer-Assisted; Macaca mulatta; Male; Oxygen/blood; Statistics, Nonparametric; Temporal Lobe/blood supply; Temporal Lobe/cytology; Temporal Lobe/physiology
16.
PLoS Comput Biol ; 9(8): e1003167, 2013.
Article in English | MEDLINE | ID: mdl-23950700

ABSTRACT

The anterior inferotemporal cortex (IT) is the highest stage along the hierarchy of visual areas that, in primates, processes visual objects. Although several lines of evidence suggest that IT primarily represents visual shape information, some recent studies have argued that neuronal ensembles in IT code the semantic membership of visual objects (i.e., represent conceptual classes such as animate and inanimate objects). In this study, we investigated to what extent semantic, rather than purely visual information, is represented in IT by performing a multivariate analysis of IT responses to a set of visual objects. By relying on a variety of machine-learning approaches (including a cutting-edge clustering algorithm that has been recently developed in the domain of statistical physics), we found that, in most instances, IT representation of visual objects is accounted for by their similarity at the level of shape or, more surprisingly, low-level visual properties. Only in a few cases we observed IT representations of semantic classes that were not explainable by the visual similarity of their members. Overall, these findings reassert the primary function of IT as a conveyor of explicit visual shape information, and reveal that low-level visual properties are represented in IT to a greater extent than previously appreciated. In addition, our work demonstrates how combining a variety of state-of-the-art multivariate approaches, and carefully estimating the contribution of shape similarity to the representation of object categories, can substantially advance our understanding of neuronal coding of visual objects in cortex.


Subjects
Models, Neurological; Neurons/physiology; Temporal Lobe/physiology; Vision, Ocular/physiology; Algorithms; Animals; Cluster Analysis; Computational Biology; Discriminant Analysis; Haplorhini; Multivariate Analysis; Neurons/cytology; Semantics; Temporal Lobe/cytology
17.
Annu Rev Vis Sci ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38950431

ABSTRACT

Inferences made about objects via vision, such as rapid and accurate categorization, are core to primate cognition despite the algorithmic challenge posed by varying viewpoints and scenes. Until recently, the brain mechanisms that support these capabilities were deeply mysterious. However, over the past decade, this scientific mystery has been illuminated by the discovery and development of brain-inspired, image-computable, artificial neural network (ANN) systems that rival primates in these behavioral feats. Apart from fundamentally changing the landscape of artificial intelligence, modified versions of these ANN systems are the current leading scientific hypotheses of an integrated set of mechanisms in the primate ventral visual stream that support core object recognition. What separates brain-mapped versions of these systems from prior conceptual models is that they are sensory computable, mechanistic, anatomically referenced, and testable (SMART). In this article, we review and provide perspective on the brain mechanisms addressed by the current leading SMART models. We review their empirical brain and behavioral alignment successes and failures, discuss the next frontiers for an even more accurate mechanistic understanding, and outline the likely applications.

18.
Neuron ; 112(14): 2435-2451.e7, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-38733985

ABSTRACT

A key feature of cortical systems is functional organization: the arrangement of functionally distinct neurons in characteristic spatial patterns. However, the principles underlying the emergence of functional organization in the cortex are poorly understood. Here, we develop the topographic deep artificial neural network (TDANN), the first model to predict several aspects of the functional organization of multiple cortical areas in the primate visual system. We analyze the factors driving the TDANN's success and find that it balances two objectives: learning a task-general sensory representation and maximizing the spatial smoothness of responses according to a metric that scales with cortical surface area. In turn, the representations learned by the TDANN are more brain-like than in spatially unconstrained models. Finally, we provide evidence that the TDANN's functional organization balances performance with between-area connection length. Our results offer a unified principle for understanding the functional organization of the primate ventral visual system.


Subjects
Neural Networks, Computer , Visual Cortex , Visual Cortex/physiology , Animals , Models, Neurological , Visual Pathways/physiology , Neurons/physiology
19.
ArXiv ; 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38259351

ABSTRACT

Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted in roughly equal numbers in each of the conceptions and motivated to overcome what might be a false dichotomy between them and engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.

20.
J Neurosci ; 32(47): 16666-82, 2012 Nov 21.
Article in English | MEDLINE | ID: mdl-23175821

ABSTRACT

Functional magnetic resonance imaging (fMRI) has revealed multiple subregions in monkey inferior temporal cortex (IT) that are selective for images of faces over other objects. The earliest of these subregions, the posterior lateral face patch (PL), has not been studied previously at the neurophysiological level. Perhaps not surprisingly, we found that PL contains a high concentration of "face-selective" cells when tested with standard image sets comparable to those used previously to define the region at the level of fMRI. However, we here report that several different image sets and analytical approaches converge to show that nearly all face-selective PL cells are driven by the presence of a single eye in the context of a face outline. Most strikingly, images containing only an eye, even when incorrectly positioned in an outline, drove neurons nearly as well as full-face images, and face images lacking only this feature led to longer latency responses. Thus, bottom-up face processing is relatively local and linearly integrates features, consistent with parts-based models, grounding investigation of how the presence of a face is first inferred in the IT face processing hierarchy.


Subjects
Eye , Face , Recognition, Psychology/physiology , Visual Perception/physiology , Algorithms , Animals , Electrodes , Female , Image Processing, Computer-Assisted , Linear Models , Macaca mulatta , Magnetic Resonance Imaging , Male , Neurons , Normal Distribution , Photic Stimulation , Retina/physiology , Visual Fields