ABSTRACT
Computing, since its inception, has been processor-centric, with memory separated from compute. Inspired by the organic brain and optimized for inorganic silicon, NorthPole is a neural inference architecture that blurs this boundary by eliminating off-chip memory, intertwining compute with memory on-chip, and appearing externally as an active memory chip. NorthPole is a low-precision, massively parallel, densely interconnected, energy-efficient, and spatial computing architecture with a co-optimized, high-utilization programming model. On the ResNet50 benchmark image classification network, relative to a graphics processing unit (GPU) that uses a comparable 12-nanometer technology process, NorthPole achieves a 25 times higher energy metric of frames per second (FPS) per watt, a 5 times higher space metric of FPS per transistor, and a 22 times lower time metric of latency. Similar results are reported for the Yolo-v4 detection network. NorthPole outperforms all prevalent architectures, even those that use more-advanced technology processes.
ABSTRACT
Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low-precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware's underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer.
ABSTRACT
Stochastic neural networks such as Restricted Boltzmann Machines (RBMs) have been successfully used in applications ranging from speech recognition to image classification, and are particularly interesting because of their potential for generative tasks. Inference and learning in these algorithms use a Markov Chain Monte Carlo procedure called Gibbs sampling, where a logistic function forms the kernel of this sampler. On the other side of the spectrum, neuromorphic systems have shown great promise for low-power and parallelized cognitive computing, but lack well-suited applications and automation procedures. In this work, we propose a systematic method for bridging the RBM algorithm and digital neuromorphic systems, with a generative pattern completion task as proof of concept. For this, we first propose a method of producing the Gibbs sampler using bio-inspired digital noisy integrate-and-fire neurons. Next, we describe the process of mapping generative RBMs trained offline onto the IBM TrueNorth neurosynaptic processor, a low-power digital neuromorphic VLSI substrate. Mapping these algorithms onto neuromorphic hardware presents unique challenges in network connectivity and weight and bias quantization, which, in turn, require architectural and design strategies for the physical realization. Generative performance is analyzed to validate the neuromorphic requirements and to best select the neuron parameters for the model. Lastly, we describe a design automation procedure that achieves optimal resource usage, accounting for the novel hardware adaptations. This work represents the first implementation of generative RBM inference on a neuromorphic VLSI substrate.
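As an illustration of the Gibbs sampling procedure mentioned in this abstract, the following is a minimal software sketch of block Gibbs sampling in a binary RBM, with a logistic function giving each unit's firing probability. It is not the neuromorphic implementation described by the authors: the function names, the clamping scheme used for pattern completion, and the list-based weight layout are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function (the sampler's kernel)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_layer(inputs, weights, biases, rng):
    """Sample binary units given the states of the opposite layer.

    weights[i][j] connects input unit i to output unit j (assumed layout).
    """
    states = []
    for j, bias in enumerate(biases):
        activation = bias + sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        states.append(1 if rng.random() < sigmoid(activation) else 0)
    return states


def transpose(matrix):
    """Swap rows and columns so the same weights serve both directions."""
    return [list(row) for row in zip(*matrix)]


def gibbs_step(visible, weights, visible_bias, hidden_bias, rng):
    """One block Gibbs update: sample hidden from visible, then visible from hidden."""
    hidden = sample_layer(visible, weights, hidden_bias, rng)
    new_visible = sample_layer(hidden, transpose(weights), visible_bias, rng)
    return new_visible, hidden


def complete_pattern(visible, known_mask, weights, visible_bias, hidden_bias, steps, rng):
    """Pattern completion sketch: known visible units stay clamped, the rest are resampled."""
    v = list(visible)
    for _ in range(steps):
        sampled, _ = gibbs_step(v, weights, visible_bias, hidden_bias, rng)
        v = [v[i] if known_mask[i] else sampled[i] for i in range(len(v))]
    return v
```

With trained weights, `complete_pattern` would be called with the observed part of an input marked in `known_mask`; the weights in any quick experiment are placeholders, not values from the paper.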
Subjects
Algorithms; Neural Networks, Computer; Action Potentials; Markov Chains; Models, Neurological; Neurons/physiology
ABSTRACT
Inspired by the brain's structure, we have developed an efficient, scalable, and flexible non-von Neumann architecture that leverages contemporary silicon technology. To demonstrate, we built a 5.4-billion-transistor chip with 4096 neurosynaptic cores interconnected via an intrachip network that integrates 1 million programmable spiking neurons and 256 million configurable synapses. Chips can be tiled in two dimensions via an interchip communication interface, seamlessly scaling the architecture to a cortexlike sheet of arbitrary size. The architecture is well suited to many applications that use complex neural networks in real time, for example, multiobject detection and classification. With 400-pixel-by-240-pixel video input at 30 frames per second, the chip consumes 63 milliwatts.
Subjects
Brain-Computer Interfaces; Brain; Computer Simulation; Neural Networks, Computer; Neurons; Software; Synapses
ABSTRACT
We present a biomimetic system that captures essential functional properties of the glomerular layer of the mammalian olfactory bulb, specifically including its capacity to decorrelate similar odor representations without foreknowledge of the statistical distributions of analyte features. Our system is based on a digital neuromorphic chip consisting of 256 leaky-integrate-and-fire neurons, 1024 × 256 crossbar synapses, and address-event representation communication circuits. The neural circuits configured in the chip reflect established connections among mitral cells, periglomerular cells, external tufted cells, and superficial short-axon cells within the olfactory bulb, and accept input from convergent sets of sensors configured as olfactory sensory neurons. This configuration generates functional transformations comparable to those observed in the glomerular layer of the mammalian olfactory bulb. Our circuits, consuming only 45 pJ of active power per spike with a power supply of 0.85 V, can be used as the first stage of processing in low-power artificial chemical sensing devices inspired by natural olfactory systems.
ABSTRACT
Understanding the network structure of white matter communication pathways is essential for unraveling the mysteries of the brain's function, organization, and evolution. To this end, we derive a unique network incorporating 410 anatomical tracing studies of the macaque brain from the Collation of Connectivity data on the Macaque brain (CoCoMac) neuroinformatic database. Our network consists of 383 hierarchically organized regions spanning cortex, thalamus, and basal ganglia; models the presence of 6,602 directed long-distance connections; is three times larger than any previously derived brain network; and contains subnetworks corresponding to classic corticocortical, corticosubcortical, and subcortico-subcortical fiber systems. We found that the empirical degree distribution of the network is consistent with the hypothesis of the maximum entropy exponential distribution and discovered two remarkable bridges between the brain's structure and function via network-theoretical analysis. First, prefrontal cortex contains a disproportionate share of topologically central regions. Second, there exists a tightly integrated core circuit, spanning parts of premotor cortex, prefrontal cortex, temporal lobe, parietal lobe, thalamus, basal ganglia, cingulate cortex, insula, and visual cortex, that includes much of the task-positive and task-negative networks and might play a special role in higher cognition and consciousness.
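The degree-distribution claim in this abstract can be made concrete with a small sketch. The exponential distribution is the maximum entropy distribution on the non-negative reals for a fixed mean, and its maximum likelihood rate is the reciprocal of the sample mean. The code below computes total (in plus out) degrees from a directed edge list and estimates that rate. The edge list and function names are hypothetical, and the abstract does not say which statistical test the authors used, so this shows only the fitting step.

```python
from collections import Counter


def total_degrees(edges):
    """Return in-degree plus out-degree for every node in a directed edge list."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return dict(degree)


def exponential_rate_mle(values):
    """Maximum likelihood rate of an exponential distribution: 1 / sample mean."""
    values = list(values)
    return len(values) / sum(values)


# Hypothetical toy network, not data from CoCoMac.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
toy_degrees = total_degrees(toy_edges)
toy_rate = exponential_rate_mle(toy_degrees.values())
```

A goodness-of-fit check against the fitted exponential would follow in a fuller analysis.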
Subjects
Brain/physiology; Models, Neurological; Nerve Net/physiology; Neural Pathways/physiology; Animals; Brain/anatomy & histology; Brain Mapping; Databases, Factual; Macaca; Neural Pathways/anatomy & histology
ABSTRACT
Volumetric, slice-based, 3-D atlases are invaluable tools for understanding complex cortical convolutions. We present a simple scheme to convert a slice-based atlas to a conceptual surface atlas that is easier to visualize and understand. The key idea is to unfold each slice into a one-dimensional vector and concatenate a succession of these vectors, while maintaining as much spatial contiguity as possible, into a 2-D matrix. We illustrate our methodology using a coronal slice-based atlas of the rhesus monkey cortex. The conceptual surface-based atlases provide a useful complement to slice-based atlases for the purposes of indexing and browsing.
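The unfold-and-concatenate idea can be sketched in a few lines. In the example below, each slice is assumed to be already unfolded into an ordered list of region labels along the cortical contour; the rows are then stacked into a 2-D matrix. Centre alignment with padding is an assumption used here as a crude stand-in for the contiguity-preserving placement the abstract mentions, and the labels are hypothetical.

```python
def unfold_slices(slices, fill=None):
    """Stack per-slice 1-D label vectors into a 2-D matrix, centre-aligned.

    Shorter rows are padded with `fill` on both sides so that neighbouring
    slices stay roughly lined up (an assumed heuristic, not the paper's method).
    """
    width = max(len(s) for s in slices)
    matrix = []
    for s in slices:
        pad_total = width - len(s)
        left = pad_total // 2
        right = pad_total - left
        matrix.append([fill] * left + list(s) + [fill] * right)
    return matrix


# Hypothetical region labels for three consecutive coronal slices.
toy_slices = [["A", "B"], ["A", "B", "C", "C"], ["C"]]
surface = unfold_slices(toy_slices, fill=".")
for row in surface:
    print(" ".join(row))
```

Each row of the result corresponds to one slice, so browsing down the matrix follows the slice order of the original atlas.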
Subjects
Brain Mapping/methods; Cerebral Cortex/anatomy & histology; Cerebral Cortex/physiology; Pattern Recognition, Automated; Algorithms; Animals; Brain; Computer Graphics; Databases, Factual; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional; Macaca mulatta; Software; User-Computer Interface
ABSTRACT
Estimating the complete set of white matter fascicles (the projectome) from diffusion data requires evaluating an enormous number of potential pathways; consequently, most algorithms use computationally efficient greedy methods to search for pathways. The limitation of this approach is that critical global parameters, such as data prediction error and white matter volume conservation, are not taken into account. We describe BlueMatter, a parallel algorithm for global projectome evaluation, which uniquely accounts for global prediction error and volume conservation. Leveraging the BlueGene/L supercomputing architecture, BlueMatter explores a massive database of 180 billion candidate fascicles. The candidates are derived from several sources, including atlases and multiple tractography algorithms. Using BlueMatter, we created the highest-resolution, volume-conserved projectome of the human brain.