ABSTRACT
We investigate the task of retrieving information from compositional distributed representations formed by hyperdimensional computing/vector symbolic architectures and present novel techniques that achieve new information rate bounds. First, we provide an overview of the decoding techniques that can be used to approach the retrieval task. The techniques are categorized into four groups. We then evaluate the considered techniques in several settings that involve, for example, inclusion of external noise and storage elements with reduced precision. In particular, we find that the decoding techniques from the sparse coding and compressed sensing literature (rarely used for hyperdimensional computing/vector symbolic architectures) are also well suited for decoding information from the compositional distributed representations. Combining these decoding techniques with interference cancellation ideas from communications improves previously reported bounds (Hersche et al., 2021) of the information rate of the distributed representations from 1.20 to 1.40 bits per dimension for smaller codebooks and from 0.60 to 1.26 bits per dimension for larger codebooks.
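The core retrieval loop alluded to above can be sketched in a few lines. The following is a minimal illustration, not the authors' exact pipeline: key-value pairs are superposed into one bipolar vector, and decoding combines nearest-neighbor matching against the codebook with successive interference cancellation borrowed from communications, subtracting each decoded pair before recovering the next. It assumes the decoder knows the keys and the codebook; all sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, M = 1024, 32, 8   # dimension, codebook size, stored pairs (hypothetical)

codebook = rng.choice([-1, 1], size=(K, D))   # random bipolar codevectors
keys = rng.choice([-1, 1], size=(M, D))       # random key vectors
values = rng.integers(0, K, size=M)           # indices of the stored codevectors

# Encode: superpose the key-bound values into one composite vector.
composite = np.sum(keys * codebook[values], axis=0).astype(float)

# Decode with successive interference cancellation: recover each value by
# unbinding its key, taking the best codebook match, then subtracting the
# explained component before decoding the next pair.
residual = composite.copy()
decoded = []
for m in range(M):
    unbound = keys[m] * residual              # unbinding = rebinding for bipolar vectors
    idx = int(np.argmax(codebook @ unbound))  # nearest codevector (matched filter)
    decoded.append(idx)
    residual -= keys[m] * codebook[idx]       # cancel the decoded pair's crosstalk

print("all recovered:", decoded == values.tolist())
```

Each cancellation step removes the decoded pair's interference from the residual, which is what pushes the achievable information rate above plain matched-filter decoding.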
ABSTRACT
We describe a stochastic, dynamical system capable of inference and learning in a probabilistic latent variable model. The most challenging problem in such models, sampling the posterior distribution over latent variables, is proposed to be solved by harnessing natural sources of stochasticity inherent in electronic and neural systems. We demonstrate this idea for a sparse coding model by deriving a continuous-time equation for inferring its latent variables via Langevin dynamics. The model parameters are learned by simultaneously evolving according to another continuous-time equation, thus bypassing the need for digital accumulators or a global clock. Moreover, we show that Langevin dynamics lead to an efficient procedure for sampling from the posterior distribution in the L0 sparse regime, where latent variables are encouraged to be exactly zero rather than merely having a small L1 norm. This allows the model to properly incorporate the notion of sparsity rather than resorting to a relaxed version of sparsity to make optimization tractable. Simulations of the proposed dynamical system on both synthetic and natural image data sets demonstrate that the model is capable of probabilistically correct inference, enabling learning of the dictionary as well as parameters of the prior.
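As a rough illustration of the inference dynamics described above, here is a discretized Langevin sampler for a sparse coding model. It uses an L1 penalty as a smooth stand-in for the paper's L0 regime (whose mechanism is more involved), so it sketches the general idea rather than the paper's equation; all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
D, N = 64, 128                                 # data dim, number of latents (hypothetical)
Phi = rng.normal(size=(D, N)) / np.sqrt(D)     # dictionary
a_true = rng.normal(size=N) * (rng.random(N) < 0.1)   # sparse ground truth
x = Phi @ a_true                               # synthetic data

lam, sigma2, dt, T = 0.5, 0.1, 1e-3, 1.0       # penalty, noise var, step, temperature

def grad_energy(a):
    # Energy = reconstruction error + L1 sparsity penalty (a smooth stand-in
    # for the paper's L0 regime).
    return -Phi.T @ (x - Phi @ a) / sigma2 + lam * np.sign(a)

# Discretized Langevin dynamics: gradient descent on the energy plus injected
# Gaussian noise, so a(t) samples the posterior rather than collapsing to its mode.
a = np.zeros(N)
samples = []
for t in range(5000):
    a += -dt * grad_energy(a) + np.sqrt(2 * T * dt) * rng.normal(size=N)
    if t > 2500:                               # discard burn-in
        samples.append(a.copy())
```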
Subject(s)
Algorithms, Learning
ABSTRACT
This article reviews recent progress in the development of the computing framework Vector Symbolic Architectures (also known as Hyperdimensional Computing). This framework is well suited for implementation in stochastic, emerging hardware, and it naturally expresses the types of cognitive operations required for Artificial Intelligence (AI). We demonstrate in this article that the field-like algebraic structure of Vector Symbolic Architectures offers simple but powerful operations on high-dimensional vectors that can support all data structures and manipulations relevant to modern computing. In addition, we illustrate the distinguishing feature of Vector Symbolic Architectures, "computing in superposition," which sets it apart from conventional computing and opens the door to efficient solutions of the difficult combinatorial search problems inherent in AI applications. We sketch ways of demonstrating that Vector Symbolic Architectures are computationally universal. We see them acting as a framework for computing with distributed representations that can play the role of an abstraction layer for emerging computing hardware. This article serves as a reference for computer architects by illustrating the philosophy behind Vector Symbolic Architectures, techniques of distributed computing with them, and their relevance to emerging computing hardware, such as neuromorphic computing.
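To make the algebra concrete, here is a minimal sketch of the two core operations, binding (component-wise multiplication) and superposition (addition), used to store and query a tiny record; the roles, fillers, and dimension are illustrative assumptions, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 10000
vec = lambda: rng.choice([-1, 1], size=D)   # random bipolar hypervector

# Roles and fillers for a simple record (hypothetical example data).
color, shape = vec(), vec()
red, blue, square, circle = vec(), vec(), vec(), vec()

# Binding (component-wise multiply) and bundling (addition) form the record.
record = color * red + shape * square

# Unbinding a role recovers a noisy copy of its filler; clean it up by
# comparing against the item memory of known vectors.
noisy = color * record
items = {"red": red, "blue": blue, "square": square, "circle": circle}
best = max(items, key=lambda k: items[k] @ noisy)
print(best)   # -> "red" with overwhelming probability at this dimension
```

Because random high-dimensional vectors are nearly orthogonal, the unbound query lands far closer to the correct filler than to any other item in memory, which is what makes computing in superposition reliable.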
ABSTRACT
We describe the design and performance of a high-fidelity wearable head-, body-, and eye-tracking system that offers significant improvement over previous such devices. This device's sensors include a binocular eye tracker, an RGB-D scene camera, a high-frame-rate scene camera, and two visual odometry sensors, for a total of ten cameras, which we synchronize and record from with a data rate of over 700 MB/s. The sensors are operated by a mini-PC optimized for fast data collection, and powered by a small battery pack. The device records a subject's eye, head, and body positions, simultaneously with RGB and depth data from the subject's visual environment, measured with high spatial and temporal resolution. The headset weighs only 1.4 kg, and the backpack with batteries 3.9 kg. The device can be comfortably worn by the subject, allowing a high degree of mobility. Together, this system overcomes many limitations of previous such systems, allowing high-fidelity characterization of the dynamics of natural vision.
ABSTRACT
We develop theoretical foundations of resonator networks, a new type of recurrent neural network introduced in Frady, Kent, Olshausen, and Sommer (2020), a companion article in this issue, to solve a high-dimensional vector factorization problem arising in Vector Symbolic Architectures. Given a composite vector formed by the Hadamard product of a discrete set of high-dimensional vectors, a resonator network can efficiently decompose the composite into these factors. We compare the performance of resonator networks against optimization-based methods, including Alternating Least Squares and several gradient-based algorithms, showing that resonator networks are superior in several important ways. This advantage is achieved by leveraging a combination of nonlinear dynamics and searching in superposition, by which estimates of the correct solution are formed from a weighted superposition of all possible solutions. While the alternative methods also search in superposition, the dynamics of resonator networks allow them to strike a more effective balance between exploring the solution space and exploiting local information to drive the network toward probable solutions. Resonator networks are not guaranteed to converge, but within a particular regime they almost always do. In exchange for relaxing the guarantee of global convergence, resonator networks are dramatically more effective at finding factorizations than all alternative approaches considered.
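A compact sketch of the resonator dynamics analyzed here, with hypothetical sizes: each factor estimate is initialized to the superposition of its codebook and refined by unbinding the other estimates from the composite and projecting through its own codebook.

```python
import numpy as np

rng = np.random.default_rng(3)
D, K, F = 2000, 30, 3     # dimension, codevectors per factor, factors (hypothetical)
sign = lambda v: np.where(v >= 0, 1, -1)   # sign nonlinearity without zeros

books = [rng.choice([-1, 1], size=(K, D)) for _ in range(F)]
truth = [int(rng.integers(K)) for _ in range(F)]
c = np.prod([books[f][truth[f]] for f in range(F)], axis=0)   # composite vector

# Each estimate starts as a superposition of all candidates for its factor.
est = [sign(b.sum(axis=0)) for b in books]
for _ in range(100):
    for f in range(F):
        # Unbind the other factors' current estimates, then pattern-complete
        # by projecting onto this factor's codebook.
        others = np.prod([est[g] for g in range(F) if g != f], axis=0)
        est[f] = sign(books[f].T @ (books[f] @ (c * others)))

found = [int(np.argmax(b @ e)) for b, e in zip(books, est)]
print(found == truth)     # True once the network has converged
```

The superposed initialization is what lets the network search all K^F factorizations at once instead of enumerating them.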
Subject(s)
Brain/physiology, Cognition/physiology, Neural Networks (Computer), Animals, Humans
ABSTRACT
The ability to encode and manipulate data structures with distributed neural representations could qualitatively enhance the capabilities of traditional neural networks by supporting rule-based symbolic reasoning, a central property of cognition. Here we show how this may be accomplished within the framework of Vector Symbolic Architectures (VSAs) (Plate, 1991; Gayler, 1998; Kanerva, 1996), whereby data structures are encoded by combining high-dimensional vectors with operations that together form an algebra on the space of distributed representations. In particular, we propose an efficient solution to a hard combinatorial search problem that arises when decoding elements of a VSA data structure: the factorization of products of multiple codevectors. Our proposed algorithm, called a resonator network, is a new type of recurrent neural network that interleaves VSA multiplication operations and pattern completion. We show in two examples, parsing of a tree-like data structure and parsing of a visual scene, how the factorization problem arises and how the resonator network can solve it. More broadly, resonator networks open the possibility of applying VSAs to myriad artificial intelligence problems in real-world domains. The companion article in this issue (Kent, Frady, Sommer, & Olshausen, 2020) presents a rigorous analysis and evaluation of the performance of resonator networks, showing that they outperform alternative approaches.
Subject(s)
Brain/physiology, Cognition/physiology, Neural Networks (Computer), Animals, Humans
ABSTRACT
A mathematical model and a possible neural mechanism are proposed to account for how fixational drift motion in the retina confers a benefit for the discrimination of high-acuity targets. We show that by simultaneously estimating object shape and eye motion, neurons in visual cortex can compute a higher-quality representation of an object by averaging out non-uniformities in the retinal sampling lattice. The model proposes that this is accomplished by two separate populations of cortical neurons, one providing a representation of object shape and another representing eye position or motion, which are coupled through specific multiplicative connections. Combined with recent experimental findings, our model suggests that the visual system may utilize principles not unlike those used in computational imaging for achieving "super-resolution" via camera motion.
Subject(s)
Theoretical Models, Motion Perception/physiology, Retina/physiology, Visual Acuity/physiology, Humans, Neurons/physiology, Visual Cortex/physiology
ABSTRACT
We investigate how the population nonlinearities resulting from lateral inhibition and thresholding in sparse coding networks influence neural response selectivity and robustness. We show that when compared to pointwise nonlinear models, such population nonlinearities improve the selectivity to a preferred stimulus and protect against adversarial perturbations of the input. These findings are predicted from the geometry of the single-neuron iso-response surface, which provides new insight into the relationship between selectivity and adversarial robustness. Inhibitory lateral connections curve the iso-response surface outward in the direction of selectivity. Since adversarial perturbations are orthogonal to the iso-response surface, adversarial attacks tend to be aligned with directions of selectivity. Consequently, the network is less easily fooled by perceptually irrelevant perturbations to the input. Together, these findings point to benefits of integrating computational principles found in biological vision systems into artificial neural networks.
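The kind of network referred to above can be illustrated with the locally competitive algorithm (LCA), a standard sparse coding dynamics in which thresholding and lateral inhibition jointly shape the population nonlinearity; this is a generic sketch with hypothetical parameters, not the paper's specific model.

```python
import numpy as np

rng = np.random.default_rng(4)
D, N = 64, 128                               # input dim, number of units (hypothetical)
Phi = rng.normal(size=(D, N))
Phi /= np.linalg.norm(Phi, axis=0)           # unit-norm dictionary
x = rng.normal(size=D)                       # stand-in input

lam, dt = 0.2, 0.1
G = Phi.T @ Phi - np.eye(N)                  # lateral inhibition weights

# LCA dynamics: membrane potentials u receive feedforward drive and lateral
# inhibition from the thresholded rates a, giving a population (not pointwise)
# nonlinearity.
u = np.zeros(N)
for _ in range(500):
    a = np.sign(u) * np.maximum(np.abs(u) - lam, 0)   # soft threshold
    u += dt * (Phi.T @ x - u - G @ a)

print("active units:", np.count_nonzero(a))
```

The inhibition term `G @ a` is what bends the iso-response surface: a unit's response depends on the whole population state, not on its feedforward input alone.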
Subject(s)
Neural Networks (Computer), Unsupervised Machine Learning, Visual Perception/physiology, Humans, Neurological Models, Neurons/physiology, Nonlinear Dynamics, Stochastic Processes
ABSTRACT
We statistically characterize the population spiking activity obtained from simultaneous recordings of neurons across all layers of a cortical microcolumn. Three types of models are compared: an Ising model, which captures pairwise correlations between units; a Restricted Boltzmann Machine (RBM), which allows for modeling of higher-order correlations; and a semi-Restricted Boltzmann Machine, which combines the Ising and RBM models. Model parameters were estimated in a fast and efficient manner using minimum probability flow, and log-likelihoods were compared using annealed importance sampling. The higher-order models reveal localized activity patterns which reflect the laminar organization of neurons within a cortical column. The higher-order models also outperformed the Ising model in log-likelihood: on populations of 20 cells, the RBM had 10% higher log-likelihood (relative to an independent model) than a pairwise model, increasing to a 45% gain in a larger network with 100 spatiotemporal elements, consisting of 10 neurons over 10 time steps. We further removed the need to model stimulus-induced correlations by incorporating a peri-stimulus time histogram term, in which case the higher-order models continued to perform best. These results demonstrate the importance of higher-order interactions to describe the structure of correlated activity in cortical networks. Boltzmann Machines with hidden units provide a succinct and effective way to capture these dependencies without increasing the difficulty of model estimation and evaluation.
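For reference, the minimum probability flow objective for the pairwise Ising case can be written down compactly. The sketch below, with stand-in data and {-1, +1} units, just evaluates the objective; minimizing it over J and h fits the model without ever computing the partition function. The optimizer is omitted and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n, T = 10, 500
X = rng.choice([-1, 1], size=(T, n))      # stand-in binary spike patterns

def energy(S, J, h):
    # Ising energy of each pattern: E(s) = -1/2 s^T J s - h^T s
    return -0.5 * np.einsum('ti,ij,tj->t', S, J, S) - S @ h

def mpf_objective(J, h):
    # Minimum probability flow: penalize probability flowing from each data
    # pattern to its single-spin-flip neighbors, avoiding the partition function.
    E0 = energy(X, J, h)
    total = 0.0
    for i in range(n):
        Xf = X.copy()
        Xf[:, i] *= -1                    # flip spin i in every pattern
        total += np.exp(0.5 * (E0 - energy(Xf, J, h))).mean()
    return total

J0, h0 = np.zeros((n, n)), np.zeros(n)
print(mpf_objective(J0, h0))              # equals n at the all-zero parameters
```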
Subject(s)
Action Potentials/physiology, Neurological Models, Neurons/physiology, Visual Cortex/physiology, Algorithms, Animals, Cats, Cluster Analysis, Photic Stimulation
ABSTRACT
We propose a normative model for spatial representation in the hippocampal formation that combines optimality principles, such as maximizing coding range and spatial information per neuron, with an algebraic framework for computing with distributed representations. Spatial position is encoded in a residue number system, with individual residues represented by high-dimensional, complex-valued vectors. These are composed into a single vector representing position by a similarity-preserving, conjunctive vector-binding operation. Self-consistency between the representations of the overall position and of the individual residues is enforced by a modular attractor network whose modules correspond to the grid cell modules in entorhinal cortex. The vector binding operation can also associate different contexts to spatial representations, yielding a model for entorhinal cortex and hippocampus. We show that the model achieves normative desiderata including superlinear scaling of patterns with dimension, robust error correction, and hexagonal, carry-free encoding of spatial position. These properties in turn enable robust path integration and association with sensory inputs. More generally, the model formalizes how compositional computations could occur in the hippocampal formation and leads to testable experimental predictions.
ABSTRACT
A prominent approach to solving combinatorial optimization problems on parallel hardware is Ising machines, i.e., hardware implementations of networks of interacting binary spin variables. Most Ising machines leverage second-order interactions, although important classes of optimization problems, such as satisfiability problems, map more seamlessly to Ising networks with higher-order interactions. Here, we demonstrate that higher-order Ising machines can solve satisfiability problems more resource-efficiently, in terms of the number of spin variables and their connections, than traditional second-order Ising machines. Further, on a benchmark dataset of Boolean k-satisfiability problems, our results show that higher-order Ising machines implemented with coupled oscillators rapidly find better solutions than second-order Ising machines, improving the current state of the art for Ising machines.
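The mapping from satisfiability to higher-order interactions is direct: a k-literal clause becomes a single k-th order energy term that equals 1 when the clause is violated and 0 otherwise, so one spin per variable and one interaction per clause suffice. A minimal sketch with a hypothetical instance follows.

```python
import numpy as np

# Hypothetical 3-SAT instance: each clause lists (variable index, sign),
# sign=+1 for a positive literal, -1 for a negated one.
clauses = [[(0, +1), (1, -1), (2, +1)],
           [(1, +1), (2, -1), (3, +1)]]

def energy(s, clauses):
    # Higher-order Ising energy: each clause contributes one k-th order term.
    # A literal (i, sign) is false when sign * s[i] == -1, so the product is
    # 1 exactly when every literal is false, i.e., the clause is violated.
    E = 0.0
    for clause in clauses:
        E += np.prod([(1 - sign * s[i]) / 2 for i, sign in clause])
    return E

s = np.array([1, -1, 1, 1])   # spin assignment in {-1, +1}
print(energy(s, clauses))     # 0.0 -> all clauses satisfied
```

Expanding the product shows the clause contributes interactions up to order k; a second-order machine would instead need auxiliary spins to quadratize each such term.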
ABSTRACT
We introduce Residue Hyperdimensional Computing, a computing framework that unifies residue number systems with an algebra defined over random, high-dimensional vectors. We show how residue numbers can be represented as high-dimensional vectors in a manner that allows algebraic operations to be performed with component-wise, parallelizable operations on the vector elements. The resulting framework, when combined with an efficient method for factorizing high-dimensional vectors, can represent and operate on numerical values over a large dynamic range using vastly fewer resources than previous methods, and it exhibits impressive robustness to noise. We demonstrate the potential for this framework to solve computationally difficult problems in visual perception and combinatorial optimization, showing improvement over baseline methods. More broadly, the framework provides a possible account for the computational operations of grid cells in the brain, and it suggests new machine learning architectures for representing and manipulating numerical data.
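The representation at the heart of the framework can be sketched with complex phasor vectors: each modulus gets a random base vector whose phases are multiples of 2π/m, so exponentiation by an integer wraps around modulo m, and binding the per-modulus encodings yields a single vector for the number. This is an illustrative reading of the construction with hypothetical sizes, not a reproduction of the paper's code.

```python
import numpy as np

rng = np.random.default_rng(6)
D = 1000
moduli = [3, 5, 7]   # pairwise coprime; dynamic range = 3 * 5 * 7 = 105

# One random base phasor vector per modulus, with phases restricted to
# multiples of 2*pi/m so that integer powers cycle with period m.
bases = [np.exp(2j * np.pi * rng.integers(m, size=D) / m) for m in moduli]

def encode(x):
    # Bind the per-modulus residue encodings (component-wise product).
    return np.prod([b ** x for b in bases], axis=0)

# Similarity of encode(x) and encode(y) is high iff x == y modulo every
# modulus, i.e., iff x == y within the dynamic range.
v = encode(17)
sims = [np.real(np.vdot(encode(y), v)) / D for y in range(105)]
print(int(np.argmax(sims)))   # -> 17
```

Decoding by exhaustive similarity search, as done here, is exactly what the efficient factorization method mentioned in the abstract replaces with something far cheaper.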
ABSTRACT
We present a model of intermediate-level visual representation that is based on learning invariances from movies of the natural environment. The model is composed of two stages of processing: an early feature representation layer and a second layer in which invariances are explicitly represented. Invariances are learned as the result of factoring apart the temporally stable and dynamic components embedded in the early feature representation. The structure contained in these components is made explicit in the activities of second-layer units that capture invariances in both form and motion. When trained on natural movies, the first layer produces a factorization, or separation, of image content into a temporally persistent part representing local edge structure and a dynamic part representing local motion structure, consistent with known response properties in early visual cortex (area V1). This factorization linearizes statistical dependencies among the first-layer units, making them learnable by the second layer. The second-layer units are split into two populations according to the factorization in the first layer. The form-selective units receive their input from the temporally persistent part (local edge structure) and after training result in a diverse set of higher-order shape features consisting of extended contours, multiscale edges, textures, and texture boundaries. The motion-selective units receive their input from the dynamic part (local motion structure) and after training result in a representation of image translation over different spatial scales and directions, in addition to more complex deformations. These representations provide a rich description of dynamic natural images and testable hypotheses regarding intermediate-level representation in visual cortex.
Subject(s)
Form Perception, Learning/physiology, Motion (Physics), Visual Cortex/physiology, Humans, Neurological Models, Neurons/physiology, Visual Pattern Recognition/physiology, Photic Stimulation/methods
ABSTRACT
Operations on high-dimensional, fixed-width vectors can be used to distribute information from several vectors over a single vector of the same width. For example, a set of key-value pairs can be encoded into a single vector with multiplication and addition of the corresponding key and value vectors: the keys are bound to their values with component-wise multiplication, and the key-value pairs are combined into a single superposition vector with component-wise addition. The superposition vector is, thus, a memory which can then be queried for the value of any of the keys, but the result of the query is approximate. The exact vector is retrieved from a codebook (a.k.a. item memory), which contains vectors defined in the system. To perform these operations, the item memory vectors and the superposition vector must be the same width. Increasing the capacity of the memory requires increasing the width of the superposition and item memory vectors. In this article, we demonstrate that in a regime where many (e.g., 1,000 or more) key-value pairs are stored, an associative memory which maps key vectors to value vectors requires less memory and less computing to obtain the same reliability of storage as a superposition vector. These advantages are obtained because the number of storage locations in an associative memory can be increased without increasing the width of the vectors in the item memory. An associative memory would not replace a superposition vector as a medium of storage, but could augment it, because data recalled from an associative memory could be used in algorithms that use a superposition vector. This would be analogous to how human working memory (which stores about seven items) uses information recalled from long-term memory (which is much larger than the working memory). We demonstrate the advantages of an associative memory experimentally using the storage of large finite-state automata, which could model the storage and recall of state-dependent behavior by brains.
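The alternative argued for above can be sketched with the simplest heteroassociative memory, an outer-product weight matrix mapping keys to values; the article's actual design may well differ, and the sizes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
D, M = 1024, 50                     # vector width, stored pairs (hypothetical)
keys = rng.choice([-1, 1], size=(M, D))
vals = rng.choice([-1, 1], size=(M, D))

# Heteroassociative memory: store the pairs in a weight matrix via outer
# products, so capacity grows with the number of storage locations (matrix
# entries) rather than with the vector width D.
W = vals.T @ keys / D

recalled = np.sign(W @ keys[0])     # query with a key, threshold the output
print(np.mean(recalled == vals[0])) # fraction of correctly recalled bits
```

Capacity here scales with the number of matrix entries rather than the vector width, which is the scaling advantage over packing everything into a single superposition vector.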
ABSTRACT
Exponential growth in data generation and large-scale data science has created an unprecedented need for inexpensive, low-power, low-latency, high-density information storage. This need has motivated significant research into multi-level memory devices that are capable of storing multiple bits of information per device. The memory state of these devices is intrinsically analog. Furthermore, much of the data they will store, along with the subsequent operations on the majority of this data, are all intrinsically analog-valued. Ironically, though, in the current storage paradigm, both the devices and data are quantized for use with digital systems and digital error-correcting codes. Here, we recast the storage problem as a communication problem. This allows us to use ideas from analog coding and show, using phase-change memory as a prototypical multi-level storage technology, that analog-valued emerging memory devices can achieve higher capacities when paired with analog codes. Further, we show that storing analog signals directly through joint coding can achieve low distortion with reduced coding complexity. Specifically, by jointly optimizing for signal statistics, device statistics, and a distortion metric, we demonstrate that single-symbol analog codings can perform comparably to digital codings with asymptotically large code lengths. These results show that end-to-end analog memory systems have the potential not only to reach higher storage capacities than discrete systems but also to significantly lower coding complexity, leading to faster and more energy-efficient data storage.
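The flavor of single-symbol analog coding is easy to convey in the idealized Gaussian case: store each analog value directly as one device level and decode with the linear MMSE estimator. Real phase-change device statistics are more complicated, so this is a toy stand-in with hypothetical noise levels, not the paper's optimized coding.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 100000
sigma = 0.3                        # stand-in device write/read noise (hypothetical)

x = rng.normal(size=n)             # analog-valued data with unit power

# Single-symbol analog coding: one stored level per datum, read back through
# additive Gaussian noise and decoded with the linear MMSE estimator for a
# Gaussian source.
y = x + sigma * rng.normal(size=n)
x_hat = y / (1 + sigma**2)

mse = np.mean((x - x_hat) ** 2)
print(mse, sigma**2 / (1 + sigma**2))   # empirical vs theoretical distortion
```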
ABSTRACT
We can claim that we know what the visual system does once we can predict neural responses to arbitrary stimuli, including those seen in nature. In the early visual system, models based on one or more linear receptive fields hold promise to achieve this goal as long as the models include nonlinear mechanisms that control responsiveness, based on stimulus context and history, and take into account the nonlinearity of spike generation. These linear and nonlinear mechanisms might be the only essential determinants of the response, or alternatively, there may be additional fundamental determinants yet to be identified. Research is progressing with the goals of defining a single "standard model" for each stage of the visual pathway and testing the predictive power of these models on the responses to movies of natural scenes. These predictive models represent, at a given stage of the visual pathway, a compact description of visual computation. They would be an invaluable guide for understanding the underlying biophysical and anatomical mechanisms and relating neural responses to visual perception.
Subject(s)
Visual Cortex/growth & development, Visual Pathways/growth & development, Visual Perception/physiology, Animals, Humans, Photic Stimulation/methods, Visual Cortex/cytology, Visual Pathways/cytology
ABSTRACT
Several theoretical, computational, and experimental studies suggest that neurons encode sensory information using a small number of active neurons at any given point in time. This strategy, referred to as 'sparse coding', could confer several advantages. First, it allows for increased storage capacity in associative memories; second, it makes the structure in natural signals explicit; third, it represents complex data in a way that is easier to read out at subsequent levels of processing; and fourth, it saves energy. Recent physiological recordings from sensory neurons have indicated that sparse coding could be a ubiquitous strategy employed in several different modalities across different organisms.