Search | BVS Regional Portal

Efficient SNN multi-cores MAC array acceleration on SpiNNaker 2.

Huang, Jiaxin; Kelber, Florian; Vogginger, Bernhard; Liu, Chen; Kreutz, Felix; Gerhards, Pascal; Scholz, Daniel; Knobloch, Klaus; Mayr, Christian G.

Front Neurosci ; 17: 1223262, 2023.

Article in English | MEDLINE | ID: mdl-37609449

ABSTRACT

The potential for low-energy operation of spiking neural networks (SNNs) has attracted the attention of the AI community. CPU-only SNN processing inevitably leads to long execution times for large models and massive datasets. This study introduces the MAC array, a parallel architecture on each processing element (PE) of SpiNNaker 2, into the computational process of SNN inference. Building on prior work on single-core optimization algorithms, we investigate parallel acceleration algorithms that coordinate multi-core MAC arrays. The proposed Echelon Reorder model information densification algorithm, together with the adapted multi-core two-stage splitting and authorization deployment strategies, achieves efficient spatio-temporal load balancing and optimization performance. We evaluate performance by benchmarking a wide range of constructed SNN models to study how strongly different factors influence the results. We also benchmark two actual SNN models (a gesture recognition model from a real-world application and a balanced random cortex-like network from neuroscience) on the neuromorphic multi-core hardware SpiNNaker 2. With mixed processors, the echelon optimization algorithm achieves a memory footprint of 74.28% and 85.78% of that of the original MAC calculation on these two models, respectively. The execution time of the echelon algorithms, using either only MAC or mixed processors, is ≤ 24.56% of the serial ARM baseline. Accelerating SNN inference with the algorithms in this study is essentially a general sparse matrix-matrix multiplication (SpGEMM) problem. This article explicitly extends the application field of SpGEMM to SNNs, developing novel SpGEMM optimization algorithms suited to the characteristics of SNNs and the MAC array.
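To illustrate why SNN inference can be viewed as a sparse matrix-matrix multiplication, the following minimal Python sketch multiplies a binary spike matrix (time steps by presynaptic neurons) with a sparse weight matrix (presynaptic by postsynaptic neurons), visiting only spiking neurons and nonzero synapses. It is not the paper's Echelon Reorder algorithm and does not model the SpiNNaker 2 MAC array; the function names, matrix layout, and example values are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic sparse product, NOT the paper's algorithm.
# All names and values below are hypothetical.

def dense_matmul(spikes, weights):
    """Reference product of a (T x N_pre) spike matrix and an (N_pre x N_post) weight matrix."""
    n_post = len(weights[0])
    out = []
    for row in spikes:
        acc = [0] * n_post
        for pre, s in enumerate(row):
            for post in range(n_post):
                acc[post] += s * weights[pre][post]
        out.append(acc)
    return out


def sparse_spike_matmul(spikes, weights):
    """Same product, skipping silent neurons and zero-valued synapses.

    Returns the result matrix and the number of multiply-accumulates performed.
    """
    n_post = len(weights[0])
    # Compress each weight row to (column, value) pairs, dropping zero synapses.
    sparse_rows = [[(j, w) for j, w in enumerate(row) if w != 0] for row in weights]
    out = []
    macs = 0
    for row in spikes:
        acc = [0] * n_post
        for pre, s in enumerate(row):
            if s == 0:
                continue  # no spike from this presynaptic neuron at this time step
            for post, w in sparse_rows[pre]:
                acc[post] += s * w
                macs += 1
        out.append(acc)
    return out, macs


# Three time steps, four presynaptic neurons, three postsynaptic neurons.
spikes = [
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
weights = [
    [2, 0, -1],
    [0, 3, 0],
    [5, 0, 0],
    [0, 0, 4],
]
currents, macs = sparse_spike_matmul(spikes, weights)
print(currents, macs)
```

In this toy case the sparse routine performs 4 multiply-accumulates where the dense reference performs 36, which is the kind of saving that motivates treating the workload as SpGEMM.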

A 16-Channel Fully Configurable Neural SoC With 1.52 µW/Ch Signal Acquisition, 2.79 µW/Ch Real-Time Spike Classifier, and 1.79 TOPS/W Deep Neural Network Accelerator in 22 nm FDSOI.

Zeinolabedin, Seyed Mohammad Ali; Schuffny, Franz Marcus; George, Richard; Kelber, Florian; Bauer, Heiner; Scholze, Stefan; Hanzsche, Stefan; Stolba, Marco; Dixius, Andreas; Ellguth, Georg; Walter, Dennis; Hoppner, Sebastian; Mayr, Christian.

IEEE Trans Biomed Circuits Syst ; 16(1): 94-107, 2022 02.

Article in English | MEDLINE | ID: mdl-35025750

ABSTRACT

With the advent of high-density micro-electrode arrays, developing neural probes that satisfy real-time and stringent power-efficiency requirements has become more challenging. A smart neural probe is an essential device for future neuroscientific research and medical applications. To realize such devices, we present a 22 nm FDSOI SoC with complex on-chip real-time data processing and training for neural signal analysis. It consists of a digitally-assisted 16-channel analog front-end with 1.52 µW/Ch, dedicated bio-processing accelerators for spike detection and classification with 2.79 µW/Ch, and a 125 MHz RISC-V CPU, utilizing adaptive body biasing at 0.5 V with a supporting 1.79 TOPS/W MAC array. The proposed SoC serves as a proof of concept for the high-level integration of various on-chip accelerators to satisfy the neural probe requirements of modern applications.
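As a rough back-of-the-envelope reading of the figures quoted above, the short Python sketch below scales the per-channel power numbers to all 16 channels and converts the MAC array efficiency into energy per operation. It assumes that all channels are active at once and that power scales linearly with channel count, which the abstract does not state; the CPU and other blocks are not included, and the variable names are hypothetical.

```python
# Figures taken from the abstract; the linear scaling is an assumption.
CHANNELS = 16
AFE_UW_PER_CH = 1.52         # analog front-end, microwatts per channel
CLASSIFIER_UW_PER_CH = 2.79  # spike detection and classification, microwatts per channel
MAC_TOPS_PER_W = 1.79        # MAC array efficiency

# Total power if all 16 channels run simultaneously (assumed linear scaling).
afe_total_uw = CHANNELS * AFE_UW_PER_CH
classifier_total_uw = CHANNELS * CLASSIFIER_UW_PER_CH

# 1 TOPS/W equals 1e12 operations per joule, i.e. 1 pJ per operation,
# so the energy per operation in picojoules is the reciprocal of TOPS/W.
energy_per_op_pj = 1.0 / MAC_TOPS_PER_W

print(f"Front-end total:  {afe_total_uw:.2f} uW")
print(f"Classifier total: {classifier_total_uw:.2f} uW")
print(f"MAC energy:       {energy_per_op_pj:.3f} pJ/op")
```

Under these assumptions the front-end and classifier together stay below 70 µW for all 16 channels, and each MAC array operation costs roughly 0.56 pJ.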

Subjects

Neural Networks, Computer; Signal Processing, Computer-Assisted; Cyclic N-Oxides; Electrodes
