Results 1 - 7 of 7
1.
Nat Commun; 15(1): 4464, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796464

ABSTRACT

By mimicking the neurons and synapses of the human brain and employing spiking neural networks on neuromorphic chips, neuromorphic computing offers a promising route to energy-efficient machine intelligence. How to borrow high-level dynamic mechanisms of the brain to help neuromorphic computing achieve energy advantages is a fundamental question. This work presents an application-oriented, algorithm-software-hardware co-designed neuromorphic system that addresses it. First, we design and fabricate an asynchronous chip called "Speck", a sensing-computing neuromorphic system on chip. With a low processor resting power of 0.42 mW, Speck satisfies the hardware requirement of dynamic computing: no input consumes no energy. Second, we uncover the "dynamic imbalance" in spiking neural networks and develop an attention-based framework that meets the algorithmic requirement of dynamic computing: varied inputs consume energy with large variance. Together, we demonstrate a neuromorphic system with real-time power as low as 0.70 mW. This work exhibits the promising potential of neuromorphic computing with its asynchronous, event-driven, sparse, and dynamic nature.
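The "no input, no energy" claim rests on event-driven computation: synaptic work is only done for input spikes that actually arrive. Below is a minimal NumPy sketch of that principle under our own assumptions (the weights, the LIF parameters, and the SynOp count as an energy proxy are all illustrative; this is not the Speck toolchain).

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.5, size=(64, 128))   # hypothetical synaptic weights

def lif_step(spikes_in, v, threshold=1.0, leak=0.9):
    """One LIF timestep; synaptic work scales with the number of input spikes."""
    active = np.flatnonzero(spikes_in)          # event-driven: touch only active synapses
    v = leak * v + W[:, active].sum(axis=1)     # accumulate only spiking columns
    spikes_out = (v >= threshold).astype(np.float32)
    v = np.where(spikes_out > 0, 0.0, v)        # reset fired neurons
    return spikes_out, v, active.size * W.shape[0]  # rough SynOp count as energy proxy

v = np.zeros(64)
silent = np.zeros(128)                       # no input events
_, v, synops = lif_step(silent, v)
print("SynOps on silent input:", synops)     # 0 -> near-zero dynamic power
busy = (rng.random(128) < 0.2).astype(np.float32)
_, v, synops = lif_step(busy, v)
print("SynOps on 20% active input:", synops) # work grows with input activity
```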


Subject(s)
Algorithms, Neurons, Humans, Neurons/physiology, Neurological Models, Action Potentials/physiology, Synapses/physiology, Brain/physiology, Software
2.
Neural Netw; 166: 410-423, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37549609

ABSTRACT

Event-based vision, a new visual paradigm with bio-inspired dynamic perception and microsecond-level temporal resolution, has prominent advantages in many specific visual scenarios and has gained much research interest. Spiking neural networks (SNNs) are naturally suited to processing event streams due to their temporal information processing capability and event-driven nature. However, existing SNN works neglect the fact that input event streams are spatially sparse and temporally non-uniform, and treat these varied inputs equally, which limits the effectiveness and efficiency of existing SNNs. In this paper, we propose the feature Refine-and-Mask SNN (RM-SNN), which can self-adapt to regulate the spiking response in a data-dependent way. We use the Refine-and-Mask (RM) module to refine all features and mask the unimportant ones to optimize the membrane potential of spiking neurons, which in turn reduces spiking activity. Motivated by the fact that not all events in spatio-temporal streams are task-relevant, we apply the RM module in both the temporal and channel dimensions. Extensive experiments on seven event-based benchmarks (DVS128 Gesture, DVS128 Gait, CIFAR10-DVS, N-Caltech101, DailyAction-DVS, UCF101-DVS, and HMDB51-DVS) demonstrate that, under multi-scale constraints on the input time window, RM-SNN can significantly reduce the network's average spiking activity rate while improving task performance. In addition, by visualizing spiking responses, we analyze why sparser spiking activity can perform better.
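As a rough illustration of the Refine-and-Mask idea, the sketch below rescales features by a data-dependent importance score and hard-masks low-scoring channels, which is one plausible way to reduce downstream spiking activity. The score function and keep ratio are our own assumptions, not the authors' implementation.

```python
import numpy as np

def refine_and_mask(x, keep_ratio=0.5):
    """x: (T, C) event features. Refine all features, mask low-scoring channels."""
    scores = np.abs(x).mean(axis=0)                 # per-channel importance (assumed proxy)
    k = max(1, int(keep_ratio * x.shape[1]))
    keep = np.argsort(scores)[-k:]                  # indices of top-k channels
    mask = np.zeros(x.shape[1]); mask[keep] = 1.0
    refined = x * scores / (scores.max() + 1e-8)    # soft reweighting ("refine")
    return refined * mask                           # hard gating ("mask")

rng = np.random.default_rng(1)
x = rng.random((16, 32)) * (rng.random((16, 32)) < 0.3)   # sparse event tensor
gated = refine_and_mask(x, keep_ratio=0.25)
print("nonzero fraction before/after:",
      (x != 0).mean().round(3), (gated != 0).mean().round(3))
```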


Sujet(s)
, Perception du temps , Potentiels d'action/physiologie , , Neurones/physiologie
3.
IEEE Trans Neural Netw Learn Syst; 34(5): 2205-2219, 2023 May.
Article in English | MEDLINE | ID: mdl-34534089

ABSTRACT

Recurrent neural networks (RNNs) are powerful for tasks over sequential data, such as natural language processing and video recognition. However, because modern RNNs have complex topologies and expensive space/computation complexity, compressing them has become a hot and promising topic in recent years. Among the many compression methods, tensor decomposition, e.g., tensor train (TT), block term (BT), tensor ring (TR), and hierarchical Tucker (HT), appears to be the most promising approach because it can deliver a very high compression ratio. Nevertheless, none of these tensor decomposition formats provides both space and computation efficiency. In this article, we compress RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition, which is derived from Kronecker tensor (KT) decomposition, by proposing two fast algorithms for multiplying the input by the tensor-decomposed weight. Our experiments on the UCF11, YouTube Celebrities Face, UCF50, TIMIT, TED-LIUM, and Spiking Heidelberg Digits datasets verify that the proposed KCP-RNNs achieve accuracy comparable to those in other tensor-decomposed formats, and that a compression ratio of up to 278,219× can be obtained with low-rank KCP. More importantly, KCP-RNNs are efficient in both space and computation complexity compared with other tensor-decomposed models. In addition, we find that KCP has the best potential for parallel computing to accelerate the calculations in neural networks.
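To see why a Kronecker-factorized weight admits fast multiplication, note that (A ⊗ B)x can be computed as A X Bᵀ on a folded input without ever materializing the Kronecker product; a rank-R KCP weight is a sum of R such terms. The following sketch verifies this identity on toy sizes (all shapes and ranks are illustrative, not the paper's).

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p, q, R = 8, 8, 16, 16, 3          # full weight would be (m*p) x (n*q)
A = [rng.normal(size=(m, n)) for _ in range(R)]
B = [rng.normal(size=(p, q)) for _ in range(R)]

def kcp_matvec(x):
    """y = (sum_r A_r kron B_r) @ x, computed factor by factor."""
    X = x.reshape(n, q)                   # row-major fold of the input vector
    Y = sum(Ar @ X @ Br.T for Ar, Br in zip(A, B))
    return Y.reshape(-1)

x = rng.normal(size=n * q)
full = sum(np.kron(Ar, Br) for Ar, Br in zip(A, B))  # dense reference
assert np.allclose(kcp_matvec(x), full @ x)
# storage: R*(m*n + p*q) numbers instead of m*p*n*q for the dense weight
print("params:", R * (m * n + p * q), "vs dense:", m * p * n * q)
```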

4.
Neural Netw; 144: 320-333, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34547670

ABSTRACT

Deep neural network (DNN) compression has become a hot topic in deep learning research, since modern DNNs have grown too large to deploy on resource-constrained platforms such as embedded devices. Among the various compression methods, tensor decomposition is a relatively simple and efficient strategy owing to its solid mathematical foundations and regular data structure. Generally, there are two common approaches: tensorizing neural weights into higher-order tensors for better decomposition, and directly mapping an efficient tensor structure to a neural architecture with nonlinear activation functions. However, considerable accuracy loss remains a fly in the ointment for the tensorizing approach, especially for convolutional neural networks (CNNs), while studies of the mapping approach are comparatively limited and the corresponding compression ratios are not substantial. Therefore, in this work, by examining multiple types of tensor decomposition, we observe that the tensor train (TT) format, with its specific and efficient sequenced contractions, has the potential to combine the tensorizing and mapping approaches. We then propose a novel nonlinear tensor train (NTT) format, which embeds extra nonlinear activation functions in the sequenced contractions and convolutions on top of the normal TT decomposition, to compensate for the accuracy loss that normal TT cannot avoid. Beyond shrinking the space complexity of the original weight matrices and convolutional kernels, we prove that NTT also affords efficient inference time. Extensive experiments and discussions demonstrate that compressed DNNs in our NTT format can largely maintain accuracy on the MNIST, UCF11, and CIFAR-10 datasets, and that the accuracy loss caused by normal TT can be compensated significantly on large-scale datasets such as ImageNet.
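A minimal sketch of the general idea, under our own assumptions: a TT-matrix is applied as a chain of small contractions, and the "nonlinear TT" twist is to fire an activation after each contraction. Core shapes and the activation placement below are illustrative, not the authors' NTT construction.

```python
import numpy as np

def tt_layer(cores, x, act=None):
    """cores[k]: (r_{k-1}, m_k, n_k, r_k) with r_0 = r_d = 1; x: (n_1, ..., n_d).
    Contracts one core at a time; `act` (if given) fires after each contraction,
    which is the nonlinear twist on an otherwise linear TT matvec."""
    t = x.reshape((1,) + x.shape + (1,))            # (rank, n_1..n_d, out-dims)
    for G in cores:
        t = np.einsum('rmns,rn...o->s...om', G, t)  # absorb one input mode
        t = t.reshape(t.shape[:-2] + (-1,))         # fold new output mode in
        if act is not None:
            t = act(t)
    return t.reshape(-1)

rng = np.random.default_rng(3)
shapes = [(1, 4, 3, 2), (2, 4, 3, 2), (2, 4, 3, 1)]   # tiny 3-core TT-matrix
cores = [rng.normal(size=s) for s in shapes]
x = rng.normal(size=(3, 3, 3))

# sanity check: with no activation this equals the dense TT-matrix product
W = np.einsum('xaiy,ybjz,zckw->abcijk', *cores).reshape(64, 27)
assert np.allclose(tt_layer(cores, x), W @ x.reshape(-1))
y = tt_layer(cores, x, act=np.tanh)               # the NTT-style forward pass
print(y.shape)                                    # (64,)
```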


Subject(s)
Data Compression, Algorithms, Physical Phenomena
5.
Neural Netw; 141: 420-432, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34146969

ABSTRACT

Relying on the rapidly increasing capacity of computing clusters and hardware, convolutional neural networks (CNNs) have been successfully applied in various fields and have achieved state-of-the-art results. Despite these exciting developments, training and inference for a large-scale CNN model still involve a huge memory cost, which makes such models hard to deploy on resource-limited portable devices. To address this problem, we establish a training framework for three-dimensional convolutional neural networks (3DCNNs), named QTTNet, that combines tensor train (TT) decomposition and data quantization for further shrinking the model size and decreasing the memory and time cost. Through this framework, we can fully exploit the superiority of TT in reducing the number of trainable parameters and the advantage of quantization in decreasing the bit-width of data, compressing 3DCNN models greatly with little accuracy degradation. In addition, because all parameters in the inference process, including TT-cores, activations, and batch normalizations, are quantized to low bit-widths, the proposed method naturally has an advantage in memory and time cost. Experimental results on compressing 3DCNNs for 3D object and video recognition on the ModelNet40, UCF11, and UCF50 datasets verify the effectiveness of the proposed method. The best compression ratio we obtained is nearly 180×, with performance competitive with other state-of-the-art studies. Moreover, the total size in bytes of our QTTNet models on the ModelNet40 and UCF11 datasets can be 1000× lower than typical baselines such as MVCNN.
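The two savings QTTNet stacks can be illustrated independently: TT factorization cuts the parameter count, and uniform low-bit quantization cuts the bits per parameter. The toy sketch below shows a symmetric 8-bit quantizer applied to a hypothetical TT-core and a rough storage accounting; none of this is the QTTNet code.

```python
import numpy as np

def quantize(t, bits=8):
    """Symmetric uniform quantizer: returns int codes and the scale."""
    scale = np.abs(t).max() / (2 ** (bits - 1) - 1) + 1e-12
    q = np.clip(np.round(t / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(4)
core = rng.normal(0, 0.1, size=(4, 8, 8, 4))      # one hypothetical TT-core
q, s = quantize(core, bits=8)
print("max abs quantization error:", np.abs(core - dequantize(q, s)).max())

# rough storage accounting for a (4096 x 4096) layer, fp32-dense vs 8-bit TT
dense_bits = 4096 * 4096 * 32
tt_params = sum(np.prod(sh) for sh in
                [(1, 8, 8, 4), (4, 8, 8, 4), (4, 8, 8, 4), (4, 8, 8, 1)])
print("compression:", dense_bits / (tt_params * 8))  # TT factors + 8-bit codes
```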


Sujet(s)
, Compression de données , Imagerie tridimensionnelle
6.
Neural Netw; 132: 309-320, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32977276

ABSTRACT

Deep neural networks (DNNs) have recently enabled impressive breakthroughs in various artificial intelligence (AI) applications due to their capability of learning high-level features from big data. However, DNNs' demand for computational resources, especially storage, keeps growing as ever larger models are required for increasingly complicated applications. To address this problem, several tensor decomposition methods, including tensor train (TT) and tensor ring (TR), have been applied to compress DNNs and have shown considerable compression effectiveness. In this work, we introduce hierarchical Tucker (HT), a classical but rarely used tensor decomposition method, and investigate its capability for neural network compression. We convert weight matrices and convolutional kernels to both HT and TT formats for a comparative study, since TT is the most widely used decomposition method and a variant of HT. We further discover, both theoretically and experimentally, that the HT format performs better at compressing weight matrices, while the TT format is better suited to compressing convolutional kernels. Based on this observation, we propose a hybrid tensor decomposition strategy that combines TT and HT to compress the convolutional and fully connected parts separately, attaining better accuracy than using only the TT or HT format on convolutional neural networks (CNNs). Our work illuminates the prospects of hybrid tensor decomposition for neural network compression.
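Since HT is the less familiar format here, the sketch below spells out a minimal hierarchical Tucker decomposition of an order-4 tensor with the balanced dimension tree {{1,2},{3,4}}, built from truncated SVDs of matricizations. Ranks are chosen exact so the round trip reconstructs the tensor; the whole construction is our own illustration, not the paper's implementation.

```python
import numpy as np

def left_sv(M, r):
    """Top-r left singular vectors of M."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def ht_decompose(X, r_top, r_leaf):
    """Minimal HT for an order-4 tensor, dimension tree {{1,2},{3,4}}."""
    n1, n2, n3, n4 = X.shape
    M12 = X.reshape(n1 * n2, n3 * n4)
    U12, U34 = left_sv(M12, r_top), left_sv(M12.T, r_top)
    B_root = U12.T @ M12 @ U34                           # root transfer matrix
    U1 = left_sv(X.reshape(n1, -1), r_leaf)              # leaf frames
    U2 = left_sv(X.transpose(1, 0, 2, 3).reshape(n2, -1), r_leaf)
    U3 = left_sv(X.transpose(2, 0, 1, 3).reshape(n3, -1), r_leaf)
    U4 = left_sv(X.transpose(3, 0, 1, 2).reshape(n4, -1), r_leaf)
    # project each U12/U34 column onto the leaf frames -> transfer tensors
    B12 = np.einsum('ija,ip,jq->pqa', U12.reshape(n1, n2, r_top), U1, U2)
    B34 = np.einsum('kla,ks,lt->sta', U34.reshape(n3, n4, r_top), U3, U4)
    return U1, U2, U3, U4, B12, B34, B_root

def ht_reconstruct(U1, U2, U3, U4, B12, B34, B_root):
    return np.einsum('ip,jq,pqa,ab,stb,ks,lt->ijkl',
                     U1, U2, B12, B_root, B34, U3, U4)

X = np.random.default_rng(5).normal(size=(4, 4, 4, 4))
Xh = ht_reconstruct(*ht_decompose(X, r_top=16, r_leaf=4))
print("max reconstruction error:", np.abs(X - Xh).max())  # ~1e-12 at full rank
```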


Subject(s)
Data Compression/methods, Deep Learning, Algorithms, Artificial Intelligence
7.
Neural Netw; 131: 215-230, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32805632

ABSTRACT

Three-dimensional convolutional neural networks (3DCNNs) have been applied to many tasks, e.g., video and 3D point cloud recognition. However, due to the higher dimension of their convolutional kernels, the space complexity of 3DCNNs is generally larger than that of traditional two-dimensional convolutional neural networks (2DCNNs). To miniaturize 3DCNNs for deployment in constrained environments such as embedded devices, neural network compression is a promising approach. In this work, we adopt tensor train (TT) decomposition, a straightforward and simple in-situ training compression method, to shrink 3DCNN models. By proposing a tensorization of 3D convolutional kernels in TT format, we investigate how to select appropriate TT ranks to achieve a higher compression ratio. We also discuss the redundancy of 3D convolutional kernels with respect to compression, the core significance and future directions of this work, and the theoretical computational complexity versus the practical execution time of convolution in TT format. In light of multiple comparative experiments on the VIVA Challenge, UCF11, UCF101, and ModelNet40 datasets, we conclude that TT decomposition can compress 3DCNNs by around one hundred times without significant accuracy loss, which will enable their application in a wide range of real-world scenarios.
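As a concrete picture of the rank-selection trade-off, the sketch below runs plain TT-SVD on a toy 3D convolution kernel and reports compression ratio versus reconstruction error for several maximum ranks. The kernel is random (a real kernel has more exploitable structure), and all shapes are our own assumptions.

```python
import numpy as np

def tt_svd(T, max_rank):
    """Plain TT-SVD: returns cores G_k of shape (r_{k-1}, n_k, r_k)."""
    dims, cores, r = T.shape, [], 1
    M = np.asarray(T)
    for n in dims[:-1]:
        U, S, Vt = np.linalg.svd(M.reshape(r * n, -1), full_matrices=False)
        rank = min(max_rank, len(S))
        cores.append(U[:, :rank].reshape(r, n, rank))
        M = S[:rank, None] * Vt[:rank]     # carry the remainder forward
        r = rank
    cores.append(M.reshape(r, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=([-1], [0]))
    return T.squeeze(axis=(0, -1))

K = np.random.default_rng(6).normal(size=(16, 8, 3, 3, 3))  # toy 3D conv kernel
for rmax in (2, 4, 8, 48):                                  # 48 -> full ranks
    cores = tt_svd(K, rmax)
    params = sum(c.size for c in cores)
    err = np.linalg.norm(K - tt_reconstruct(cores)) / np.linalg.norm(K)
    print(f"max rank {rmax:2d}: {K.size / params:5.2f}x compression, rel err {err:.3f}")
```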


Subject(s)
Data Compression/methods