Results 1 - 3 of 3

1.
Sensors (Basel) ; 20(12)2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32560238

ABSTRACT

Neuromorphic vision sensors, also known as Dynamic Vision Sensors (DVS), take inspiration from the mammalian retina: they detect changes in luminosity and provide a stream of events with high temporal resolution. This continuous stream of events can be used to extract spatio-temporal patterns from a scene. A time-surface represents the spatio-temporal context within a given spatial radius around an incoming event at a specific point in time. Time-surfaces can be organized hierarchically to extract features from input events using the Hierarchy Of Time-Surfaces algorithm, hereinafter HOTS. HOTS can be organized in consecutive layers to extract combinations of features, in a similar way to some deep-learning algorithms. This work introduces a novel FPGA architecture for accelerating HOTS networks. The architecture is based mainly on block-RAM memory and the non-restoring square root algorithm, requiring only basic components and making it suitable for low-power, low-latency embedded applications. The presented architecture has been tested on a Zynq 7100 platform at 100 MHz. The results show latencies in the range of 1 µs to 6.7 µs, with a maximum dynamic power consumption of 77 mW. The system was tested with a gesture recognition dataset, obtaining an accuracy loss of only 1.2% at 16-bit precision with respect to the original software HOTS implementation.
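To illustrate the core operation the abstract refers to, the sketch below computes a single time-surface around an incoming event, following the general HOTS formulation (exponential decay of the elapsed time since the last event at each neighbouring pixel). The variable names, the decay constant, and the omission of border handling are illustrative assumptions, not details of the paper's FPGA implementation.

```python
# Minimal time-surface sketch (assumed exponential-decay form; not the FPGA design).
import numpy as np

def time_surface(last_times, x, y, t, radius=2, tau=50e-3):
    """Return the (2*radius+1)^2 time-surface centred on event (x, y, t).

    last_times : 2-D array holding the timestamp of the most recent event
                 at every pixel (-inf where no event has occurred yet).
    Border handling is omitted for brevity.
    """
    patch = last_times[y - radius:y + radius + 1, x - radius:x + radius + 1]
    # Exponential decay of elapsed time: recent activity -> values near 1,
    # old or absent activity -> values near 0.
    return np.exp(-(t - patch) / tau)

# Toy usage: update the timestamp map event by event and compute the surface.
H, W = 128, 128
last_times = np.full((H, W), -np.inf)
for (x, y, t) in [(10, 10, 0.001), (11, 10, 0.002), (10, 11, 0.003)]:
    last_times[y, x] = t
    ts = time_surface(last_times, x, y, t)
```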

2.
IEEE Trans Biomed Circuits Syst ; 13(1): 159-169, 2019 02.
Article in English | MEDLINE | ID: mdl-30418884

ABSTRACT

Deep learning algorithms have become state-of-the-art methods in multiple fields, including computer vision, speech recognition, natural language processing, and audio recognition. In computer vision, convolutional neural networks (CNNs) stand out. This kind of network is expensive in terms of computational resources because of the large number of operations required to process a frame. In recent years, several frame-based chip solutions for deploying CNNs in real time have been developed. Despite the good power and accuracy results of these solutions, the number of operations remains high owing to the complexity of current network models. However, the number of operations can be reduced by using computer vision techniques other than frame-based ones, e.g., neuromorphic event-based techniques. There exist several neuromorphic vision sensors whose pixels detect changes in luminosity. Inspired by the leaky integrate-and-fire (LIF) neuron, we propose in this manuscript an event-based field-programmable gate array (FPGA) multiconvolution system. Its main novelty is a memory arbiter for efficient memory access that allows row-by-row kernel processing. The system is able to convolve 64 filters across multiple kernel sizes, from 1×1 to 7×7, with latencies of 1.3 µs and 9.01 µs, respectively, generating a continuous flow of output events. The proposed architecture is well suited to spike-based CNNs.
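The sketch below shows the general principle behind event-driven, integrate-and-fire convolution: each input event adds the kernel to a map of membrane potentials around its address, and any pixel that crosses a threshold emits an output event and resets. The kernel, threshold, and reset scheme are illustrative assumptions and do not reproduce the paper's memory-arbiter hardware.

```python
# Minimal event-driven convolution sketch with integrate-and-fire output
# (assumed threshold/reset behaviour; not the paper's FPGA architecture).
import numpy as np

def convolve_event(potentials, kernel, x, y, threshold=1.0):
    """Accumulate one input event into the membrane-potential map and
    return the output events (pixels that crossed the threshold)."""
    kh, kw = kernel.shape
    oy, ox = kh // 2, kw // 2
    out_events = []
    for dy in range(kh):
        for dx in range(kw):
            yy, xx = y + dy - oy, x + dx - ox
            if 0 <= yy < potentials.shape[0] and 0 <= xx < potentials.shape[1]:
                potentials[yy, xx] += kernel[dy, dx]
                if potentials[yy, xx] >= threshold:
                    out_events.append((xx, yy))
                    potentials[yy, xx] = 0.0  # reset after firing
    return out_events

# Toy usage: feed a few events through a 3x3 kernel.
potentials = np.zeros((128, 128))
kernel = np.full((3, 3), 0.4)
for (x, y) in [(64, 64), (64, 64), (65, 64)]:
    spikes = convolve_event(potentials, kernel, x, y)
```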


Subjects
Algorithms; Neurons/physiology; Membrane Potentials/physiology; Neural Networks, Computer
3.
IEEE Trans Neural Netw Learn Syst ; 30(3): 644-656, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30047912

ABSTRACT

Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though graphical processing units are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1×1 to 7×7 . NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq field-programmable gate array (FPGA) platform and presented the results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Postsynthesis simulations using Mentor Modelsim in a 28-nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the multiply-accumulate units, and achieves a power efficiency of over 3 TOp/s/W in a core area of 6.3 mm2. As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real-time interactive demonstrations.
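The sketch below illustrates the zero-skipping idea that sparsity-aware accelerators of this kind exploit: activations are stored as a binary sparsity map plus a list of non-zero values, and multiply-accumulates are issued only for the non-zero entries. The encoding and function names here are illustrative assumptions, not NullHop's exact compression format.

```python
# Minimal zero-skipping sketch (assumed encoding; not NullHop's exact scheme).
import numpy as np

def compress(activations):
    """Encode a 1-D activation vector as (sparsity_map, nonzero_values)."""
    mask = activations != 0
    return mask, activations[mask]

def sparse_dot(mask, nonzero_values, weights):
    """Dot product that multiplies only where the sparsity map is set."""
    return float(np.dot(nonzero_values, weights[mask]))

# Toy usage: a mostly-zero (post-ReLU-like) activation vector.
acts = np.array([0.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.3])
w = np.arange(8, dtype=float)
mask, nz = compress(acts)
assert np.isclose(sparse_dot(mask, nz, w), np.dot(acts, w))
```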
