Results 1 - 20 of 64
1.
Sci Adv ; 10(30): eadm8430, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39058783

ABSTRACT

Advances in artificial intelligence enable neural networks to learn a wide variety of tasks, yet our understanding of the learning dynamics of these networks remains limited. Here, we study the temporal dynamics during learning of Hebbian feedforward neural networks in tasks of continual familiarity detection. Drawing inspiration from network neuroscience, we examine the network's dynamic reconfiguration, focusing on how network modules evolve throughout learning. Through a comprehensive assessment involving metrics like network accuracy, modular flexibility, and distribution entropy across diverse learning modes, our approach reveals various previously unknown patterns of network reconfiguration. We find that the emergence of network modularity is a salient predictor of performance and that modularization strengthens with increasing flexibility throughout learning. These insights not only elucidate the nuanced interplay of network modularity, accuracy, and learning dynamics but also bridge our understanding of learning in artificial and biological agents.
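The continual familiarity-detection setting studied above can be illustrated with a minimal Hebbian sketch (this is not the paper's actual network; the pattern size, learning rate, and simple outer-product rule are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def hebbian_familiarity(patterns, probes, lr=0.1):
    """Train a Hebbian weight matrix on `patterns`, then score `probes`.

    After presenting each pattern x, weights update as W += lr * outer(x, x);
    the familiarity of a probe is its energy p @ W @ p, which is larger for
    previously stored items than for novel ones.
    """
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for x in patterns:
        W += lr * np.outer(x, x)      # Hebbian outer-product update
    return np.array([p @ W @ p for p in probes])

# toy data: random +/-1 patterns; stored items should score higher than novel
stored = rng.choice([-1.0, 1.0], size=(20, 64))
novel = rng.choice([-1.0, 1.0], size=(20, 64))
scores_old = hebbian_familiarity(stored, stored)
scores_new = hebbian_familiarity(stored, novel)
print(scores_old.mean() > scores_new.mean())
```

The separation between old and new scores is the signal a familiarity readout thresholds; the paper's analysis concerns how network modules reorganize while such learning unfolds.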


Subject(s)
Neural Networks, Computer; Humans; Learning/physiology; Artificial Intelligence; Recognition, Psychology/physiology; Algorithms
2.
Neural Netw ; 178: 106493, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38970946

ABSTRACT

Visual object tracking, which is primarily based on visible light image sequences, encounters numerous challenges in complicated scenarios, such as low light conditions, high dynamic ranges, and background clutter. To address these challenges, incorporating the advantages of multiple visual modalities is a promising solution for achieving reliable object tracking. However, the existing approaches usually integrate multimodal inputs through adaptive local feature interactions, which cannot leverage the full potential of visual cues, thus resulting in insufficient feature modeling. In this study, we propose a novel multimodal hybrid tracker (MMHT) that utilizes frame-event-based data for reliable single object tracking. The MMHT model employs a hybrid backbone consisting of an artificial neural network (ANN) and a spiking neural network (SNN) to extract dominant features from different visual modalities and then uses a unified encoder to align the features across different domains. Moreover, we propose an enhanced transformer-based module to fuse multimodal features using attention mechanisms. With these methods, the MMHT model can effectively construct a multiscale and multidimensional visual feature space and achieve discriminative feature modeling. Extensive experiments demonstrate that the MMHT model exhibits competitive performance in comparison with that of other state-of-the-art methods. Overall, our results highlight the effectiveness of the MMHT model in terms of addressing the challenges faced in visual object tracking tasks.
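The transformer-based fusion module described above relies on attention across modalities. A generic single-head scaled dot-product cross-attention sketch (the token counts, feature width, and single-head form are assumptions, not the MMHT design):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, key_feats):
    """Fuse one modality's tokens (queries) with another's (keys/values)
    via scaled dot-product attention, the generic mechanism behind
    transformer-based multimodal fusion."""
    d = query_feats.shape[-1]
    scores = query_feats @ key_feats.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ key_feats

rng = np.random.default_rng(8)
frame = rng.standard_normal((5, 32))   # frame-branch tokens (assumed shape)
event = rng.standard_normal((7, 32))   # event-branch tokens (assumed shape)
fused = cross_attention(frame, event)
print(fused.shape)
```

Each fused row is a frame token enriched with event-branch context; stacking such blocks in both directions gives a bidirectional fusion module.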


Subject(s)
Neural Networks, Computer; Humans; Algorithms; Image Processing, Computer-Assisted/methods
3.
IEEE Trans Image Process ; 33: 4274-4287, 2024.
Article in English | MEDLINE | ID: mdl-39042526

ABSTRACT

Recent advances in bio-inspired vision with event cameras and associated spiking neural networks (SNNs) have provided promising solutions for low-power neuromorphic tasks. However, as research on event cameras is still in its infancy, the amount of labeled event-stream data is much smaller than that of RGB databases. The traditional method of converting static images into event streams by simulation to increase the sample size cannot reproduce characteristics of event cameras such as high temporal resolution. To take advantage of both the rich knowledge in labeled RGB images and the features of the event camera, we propose a transfer learning method from the RGB to the event domain in this paper. Specifically, we first introduce a transfer learning framework named R2ETL (RGB to Event Transfer Learning), including a novel encoding alignment module and a feature alignment module. Then, we introduce the temporal centered kernel alignment (TCKA) loss function to improve the efficiency of transfer learning. It aligns the distribution of temporal neuron states by adding a temporal learning constraint. Finally, we theoretically analyze the amount of data required by a deep neuromorphic model to prove the necessity of our method. Numerous experiments demonstrate that our proposed framework outperforms state-of-the-art SNN and artificial neural network (ANN) models trained on event streams, including N-MNIST, CIFAR10-DVS, and N-Caltech101. This indicates that the R2ETL framework is able to leverage the knowledge of labeled RGB images to help train SNNs on event streams.
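The TCKA loss builds on centered kernel alignment. A minimal sketch of standard linear CKA between two feature matrices (the temporal constraint that distinguishes TCKA is not shown; sizes are illustrative):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two activation matrices.

    X, Y: (n_samples, n_features) features, e.g. from two domains or layers.
    Returns a similarity in [0, 1]; 1 means identical representations up to
    rotation and isotropic scaling.
    """
    X = X - X.mean(axis=0)            # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 16))
cka_self = linear_cka(A, A)                              # identical -> 1.0
cka_cross = linear_cka(A, rng.standard_normal((50, 16))) # unrelated -> low
print(round(cka_self, 6), cka_cross < 0.5)
```

Maximizing such an alignment between RGB-domain and event-domain features is the kind of distribution-matching objective a transfer loss of this family optimizes.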

4.
Article in English | MEDLINE | ID: mdl-38833393

ABSTRACT

Sensory information recognition is primarily processed through the ventral and dorsal visual pathways in the primate brain visual system, which exhibits layered feature representations bearing a strong resemblance to convolutional neural networks (CNNs), encompassing reconstruction and classification. However, existing studies often treat these pathways as distinct entities, focusing individually on pattern reconstruction or classification tasks and overlooking spiking, a key feature of biological neurons, which are the fundamental units for neural computation of visual sensory information. Addressing these limitations, we introduce a unified framework for sensory information recognition with augmented spikes. By integrating pattern reconstruction and classification within a single framework, our approach not only accurately reconstructs multimodal sensory information but also provides precise classification through definitive labeling. Experimental evaluations conducted on various datasets including video scenes, static images, dynamic auditory scenes, and functional magnetic resonance imaging (fMRI) brain activities demonstrate that our framework delivers state-of-the-art pattern reconstruction quality and classification accuracy. The proposed framework enhances the biological realism of multimodal pattern recognition models, offering insights into how the primate brain visual system effectively accomplishes the reconstruction and classification tasks through the integration of ventral and dorsal pathways.

5.
Natl Sci Rev ; 11(5): nwae102, 2024 May.
Article in English | MEDLINE | ID: mdl-38689713

ABSTRACT

Spiking neural networks (SNNs) are gaining increasing attention for their biological plausibility and potential for improved computational efficiency. To match the high spatial-temporal dynamics in SNNs, neuromorphic chips are highly desired to execute SNNs in hardware-based neuron and synapse circuits directly. This paper presents a large-scale neuromorphic chip named Darwin3 with a novel instruction set architecture, which comprises 10 primary instructions and a few extended instructions. It supports flexible neuron model programming and local learning rule designs. The Darwin3 chip architecture is designed in a mesh of computing nodes with an innovative routing algorithm. We used a compression mechanism to represent synaptic connections, significantly reducing memory usage. The Darwin3 chip supports up to 2.35 million neurons, making it the largest of its kind on the neuron scale. The experimental results showed that the code density was improved by up to 28.3× in Darwin3, and that the neuron core fan-in and fan-out were improved by up to 4096× and 3072× by connection compression compared to the physical memory depth. Our Darwin3 chip also provided memory saving between 6.8× and 200.8× when mapping convolutional spiking neural networks onto the chip, demonstrating state-of-the-art performance in accuracy and latency compared to other neuromorphic chips.
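The connection-compression idea can be illustrated with a generic CSR-style sparse encoding of a synaptic weight matrix (illustrative only; the abstract does not describe Darwin3's actual compression mechanism, and the density here is an assumption):

```python
import numpy as np

def to_csr(dense):
    """Compress a sparse synaptic weight matrix into CSR-style arrays.

    Instead of storing every (pre, post) weight, keep only nonzero weights
    plus their column indices and per-row offsets -- the generic idea behind
    connection compression in neuromorphic memory layouts.
    """
    indptr, indices, data = [0], [], []
    for row in dense:
        nz = np.nonzero(row)[0]
        indices.extend(nz.tolist())
        data.extend(row[nz].tolist())
        indptr.append(len(indices))
    return np.array(indptr), np.array(indices), np.array(data)

rng = np.random.default_rng(2)
# 256x256 weights at ~5% connectivity (assumed density for illustration)
W = rng.standard_normal((256, 256)) * (rng.random((256, 256)) < 0.05)
indptr, indices, data = to_csr(W)
dense_words = W.size
csr_words = indptr.size + indices.size + data.size
print(csr_words < dense_words / 5)   # far fewer stored values at 5% density
```

The sparser the connectivity, the larger the saving, which is why compressed connectivity lets a fixed physical memory support far larger fan-in and fan-out than its raw depth.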

6.
Commun Biol ; 7(1): 487, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649503

ABSTRACT

Semantic satiation, the loss of meaning of a word or phrase after it is repeated many times, is a well-known psychological phenomenon. However, the microscopic neural computational principles underlying it remain unknown. In this study, we use a deep learning model of continuous coupled neural networks to investigate the mechanism underlying semantic satiation and precisely describe this process in terms of neuronal components. Our results suggest that, from a mesoscopic perspective, semantic satiation may be a bottom-up process. Unlike existing macroscopic psychological studies, which suggest that semantic satiation is a top-down process, our simulations use an experimental paradigm similar to classical psychology experiments and observe similar results. Satiation of semantic objectives, similar to the learning process of our network model used for object recognition, relies on continuous learning and switching between objects. The underlying neural coupling strengthens or weakens satiation. Taken together, both neural and network mechanisms play a role in controlling semantic satiation.


Subject(s)
Deep Learning; Semantics; Humans; Neural Networks, Computer; Models, Neurological
7.
Article in English | MEDLINE | ID: mdl-38498737

ABSTRACT

Spiking Neural Networks (SNNs) have attracted significant attention for their energy-efficient and brain-inspired event-driven properties. Recent advancements, notably Spiking-YOLO, have enabled SNNs to undertake advanced object detection tasks. Nevertheless, these methods often suffer from increased latency and diminished detection accuracy, rendering them less suitable for latency-sensitive mobile platforms. Additionally, the conversion of artificial neural networks (ANNs) to SNNs frequently compromises the integrity of the ANNs' structure, resulting in poor feature representation and heightened conversion errors. To address the issues of high latency and low detection accuracy, we introduce two solutions: timestep compression and spike-time-dependent integrated (STDI) coding. Timestep compression effectively reduces the number of timesteps required in the ANN-to-SNN conversion by condensing information. The STDI coding employs a time-varying threshold to augment information capacity. Furthermore, we have developed an SNN-based spatial pyramid pooling (SPP) structure, optimized to preserve the network's structural efficacy during conversion. Utilizing these approaches, we present the ultralow latency and highly accurate object detection model, SUHD. SUHD exhibits exceptional performance on challenging datasets like PASCAL VOC and MS COCO, achieving a remarkable reduction of approximately 750 times in timesteps and a 30% enhancement in mean average precision (mAP) compared to Spiking-YOLO on MS COCO. To the best of our knowledge, SUHD is currently the deepest spike-based object detection model, achieving ultralow timesteps for lossless conversion.
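A toy flavor of time-varying-threshold spike coding, where earlier spikes carry larger weight and a single train can therefore encode a value in few timesteps (this is an illustrative binary-expansion scheme, not the paper's exact STDI formulation):

```python
import numpy as np

def tv_threshold_encode(value, T=8, v0=1.0, decay=0.5):
    """Greedy spike encoding against a decaying threshold v0 * decay**t:
    a spike at step t contributes v0 * decay**t to the decoded value, so
    early spikes encode large magnitudes. Illustrative only."""
    spikes = np.zeros(T)
    residual = value
    for t in range(T):
        thr = v0 * decay ** t
        if residual >= thr:
            spikes[t] = 1
            residual -= thr
    return spikes

def tv_threshold_decode(spikes, v0=1.0, decay=0.5):
    weights = v0 * decay ** np.arange(len(spikes))
    return float(spikes @ weights)

x = 0.8125                             # exactly 0.5 + 0.25 + 0.0625
code = tv_threshold_encode(x)
print(tv_threshold_decode(code) == x)  # lossless for representable values
```

Packing more information per spike in this way is what allows conversion schemes to cut the timestep count drastically relative to plain rate coding.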

8.
Neural Netw ; 172: 106092, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38211460

ABSTRACT

Spiking neural networks (SNNs) are considered an attractive option for edge-side applications due to their sparse, asynchronous, and event-driven characteristics. However, applying SNNs to object detection tasks faces challenges in achieving both good detection accuracy and high detection speed. To overcome these challenges, we propose an end-to-end Trainable Spiking-YOLO (Tr-Spiking-YOLO) for low-latency and high-performance object detection. We evaluate our model not only on the frame-based PASCAL VOC dataset but also on the event-based GEN1 Automotive Detection dataset, and investigate the impact of different decoding methods on detection performance. The experimental results show that our model achieves competitive or better performance in terms of accuracy, latency, and energy consumption compared to similar artificial neural network (ANN) and conversion-based SNN object detection models. Furthermore, when deployed on an edge device, our model achieves a processing speed of approximately 14 to 39 FPS while maintaining a desirable mean Average Precision (mAP), making it capable of real-time detection on resource-constrained platforms.
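The decoding methods whose impact the abstract investigates can be sketched as different ways of reading a prediction out of a spike tensor over time (the shapes and method names below are assumptions, not the paper's decoders):

```python
import numpy as np

def decode_outputs(spike_out, method="rate"):
    """Decode a detection-head output from a spike tensor of shape (T, D).

    'rate' averages spikes over time; 'accumulate' reads the final
    accumulated count normalized by T. Over the full window these coincide,
    but truncating T trades accuracy for latency.
    """
    if method == "rate":
        return spike_out.mean(axis=0)
    if method == "accumulate":
        return spike_out.cumsum(axis=0)[-1] / len(spike_out)
    raise ValueError(f"unknown decoding method: {method}")

rng = np.random.default_rng(7)
spikes = (rng.random((16, 4)) < 0.3).astype(float)   # toy (T=16, D=4) output
same = np.allclose(decode_outputs(spikes, "rate"),
                   decode_outputs(spikes, "accumulate"))
print(same)
```

Real decoders differ in how they weight late versus early timesteps, which is precisely why decoding choice affects both latency and mAP.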


Subject(s)
Neural Networks, Computer
9.
Article in English | MEDLINE | ID: mdl-38100345

ABSTRACT

Spiking neural networks (SNNs) operating with asynchronous discrete events show higher energy efficiency with sparse computation. A popular approach for implementing deep SNNs is artificial neural network (ANN)-SNN conversion, combining the efficient training of ANNs with the efficient inference of SNNs. However, the accuracy loss is usually nonnegligible, especially with few time steps, which greatly restricts the application of SNNs on latency-sensitive edge devices. In this article, we first identify that such performance degradation stems from the misrepresentation of the negative or overflow residual membrane potential in SNNs. Inspired by this, we decompose the conversion error into three parts: quantization error, clipping error, and residual membrane potential representation error. With these insights, we propose a two-stage conversion algorithm to minimize those errors, respectively. In addition, we show that each stage achieves significant performance gains in a complementary manner. By evaluating on challenging datasets including CIFAR-10, CIFAR-100, and ImageNet, the proposed method demonstrates state-of-the-art performance in terms of accuracy, latency, and energy preservation. Furthermore, our method is evaluated on a more challenging object detection task, revealing notable gains in regression performance under ultralow latency compared with existing spike-based detection algorithms. Code will be available at: https://github.com/Windere/snn-cvt-dual-phase.
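The quantization and clipping errors in the decomposition above can be demonstrated with a soft-reset integrate-and-fire neuron whose firing rate approximates a clipped ReLU (a standard textbook construction used to explain conversion error, not the paper's two-stage algorithm):

```python
import numpy as np

def if_rate(x, T, v_th=1.0):
    """Simulate an integrate-and-fire neuron for T steps with constant input x.

    With soft reset, the firing rate approximates ReLU(x) clipped at v_th,
    quantized in steps of v_th / T -- the quantization and clipping error
    sources the conversion-error decomposition names.
    """
    v, spikes = 0.0, 0
    for _ in range(T):
        v += x
        if v >= v_th:
            spikes += 1
            v -= v_th                 # soft reset keeps residual potential
    return spikes * v_th / T

xs = np.linspace(-0.5, 1.5, 41)
def max_err(T):
    # worst-case gap between the SNN rate and the clipped-ReLU target
    return max(abs(if_rate(x, T) - min(max(x, 0.0), 1.0)) for x in xs)

print(max_err(128) < max_err(8))      # more timesteps -> smaller error
```

The residual-potential error the paper highlights is what remains even as T grows, which is why it requires a dedicated correction stage rather than just longer simulation.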

10.
Patterns (N Y) ; 4(10): 100831, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37876899

ABSTRACT

Networks of spiking neurons underpin the extraordinary information-processing capabilities of the brain and have become pillar models in neuromorphic artificial intelligence. Despite extensive research on spiking neural networks (SNNs), most studies are established on deterministic models, overlooking the inherent non-deterministic, noisy nature of neural computations. This study introduces the noisy SNN (NSNN) and the noise-driven learning (NDL) rule by incorporating noisy neuronal dynamics to exploit the computational advantages of noisy neural processing. The NSNN provides a theoretical framework that yields scalable, flexible, and reliable computation and learning. We demonstrate that this framework leads to spiking neural models with competitive performance, improved robustness against challenging perturbations compared with deterministic SNNs, and better reproducing probabilistic computation in neural coding. Generally, this study offers a powerful and easy-to-use tool for machine learning, neuromorphic intelligence practitioners, and computational neuroscience researchers.
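A minimal noisy leaky integrate-and-fire sketch showing how membrane noise makes firing probabilistic near threshold (the parameters and Gaussian noise model are illustrative assumptions, not the NSNN formulation):

```python
import numpy as np

rng = np.random.default_rng(3)

def noisy_lif(inputs, tau=10.0, v_th=1.0, sigma=0.2):
    """Leaky integrate-and-fire neuron with additive Gaussian membrane noise.

    A deterministic LIF with subthreshold drive would never fire; noise turns
    threshold crossing into a probabilistic event, the core ingredient of
    noisy spiking neural models. Returns the binary spike train.
    """
    v, spikes = 0.0, []
    for i in inputs:
        v += (-v + i) / tau + sigma * rng.standard_normal()
        if v >= v_th:
            spikes.append(1)
            v = 0.0                   # hard reset after a spike
        else:
            spikes.append(0)
    return np.array(spikes)

drive = np.full(1000, 0.9)            # mean drive below threshold
train = noisy_lif(drive)
rate = train.mean()
print(0.0 < rate < 1.0)               # noise induces irregular, nonzero firing
```

Averaging many such stochastic trains yields smooth firing probabilities, which is what lets noise-driven learning rules compute usable gradients through spiking.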

11.
Article in English | MEDLINE | ID: mdl-37651489

ABSTRACT

Traditional spiking learning algorithms aim to train neurons to spike at a specific time or at a particular frequency, which requires precise time and frequency labels during training. In reality, however, usually only aggregated labels of sequential patterns are provided. Aggregate-label (AL) learning was proposed to discover these predictive features in distracting background streams using only aggregated spikes. It has achieved much success recently, but it is still computationally intensive and has limited use in deep networks. To address these issues, we propose an event-driven spiking aggregate learning algorithm (SALA) in this article. Specifically, to reduce the computational complexity, we improve the conventional spike-threshold-surface (STS) calculation in AL learning by analytically calculating voltage peak values in spiking neurons. We then extend the algorithm to multilayer networks using an event-driven strategy with aggregated spikes. We conduct comprehensive experiments on various tasks including temporal clue recognition, segmented and continuous speech recognition, and neuromorphic image classification. The experimental results demonstrate that the new STS method significantly improves the efficiency of AL learning, and that the proposed algorithm outperforms conventional spiking algorithms on various temporal clue recognition tasks.

12.
Neural Netw ; 166: 174-187, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37494763

ABSTRACT

Experience replay (ER) is a widely-adopted neuroscience-inspired method to perform lifelong learning. Nonetheless, existing ER-based approaches consider very coarse memory modules with simple memory and rehearsal mechanisms that cannot fully exploit the potential of memory replay. Evidence from neuroscience has provided fine-grained memory and rehearsal mechanisms, such as the dual-store memory system consisting of PFC-HC circuits. However, the computational abstraction of these processes is still very challenging. To address these problems, we introduce the Dual-Memory (Dual-MEM) model emulating the memorization, consolidation, and rehearsal process in the PFC-HC dual-store memory circuit. Dual-MEM maintains an incrementally updated short-term memory to benefit current-task learning. At the end of the current task, short-term memories will be consolidated into long-term ones for future rehearsal to alleviate forgetting. For the Dual-MEM optimization, we propose two learning policies that emulate different memory retrieval strategies: Direct Retrieval Learning and Mixup Retrieval Learning. Extensive evaluations on eight benchmarks demonstrate that Dual-MEM delivers compelling performance while maintaining high learning and memory utilization efficiencies under the challenging experience-once setting.
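The memorize-consolidate-rehearse loop described above can be sketched with a toy dual-store buffer (the capacities and random subsampling policy are assumptions for illustration; Dual-MEM's actual retrieval policies are learned):

```python
import random

random.seed(0)

class DualMemory:
    """Toy dual-store memory: a short-term buffer for the current task is
    consolidated into a capacity-limited long-term store at task boundaries,
    and long-term items are mixed into later batches for rehearsal."""

    def __init__(self, lt_capacity=50):
        self.short_term, self.long_term = [], []
        self.lt_capacity = lt_capacity

    def observe(self, sample):
        self.short_term.append(sample)          # incremental short-term update

    def consolidate(self):
        # move current-task memories into the long-term store, then subsample
        self.long_term.extend(self.short_term)
        if len(self.long_term) > self.lt_capacity:
            self.long_term = random.sample(self.long_term, self.lt_capacity)
        self.short_term = []

    def rehearse(self, k=8):
        # mix long-term memories into the current batch to reduce forgetting
        pool = self.long_term or self.short_term
        return random.sample(pool, min(k, len(pool)))

mem = DualMemory()
for task in range(3):
    for i in range(100):
        mem.observe((task, i))
    mem.consolidate()
print(len(mem.long_term), len(mem.short_term))
```

Under the experience-once setting, each sample passes through `observe` exactly once, so whatever survives consolidation is all that rehearsal can ever draw on.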


Subject(s)
Learning; Memory, Short-Term; Education, Continuing; Concept Formation
13.
Comput Biol Med ; 163: 107114, 2023 09.
Article in English | MEDLINE | ID: mdl-37329620

ABSTRACT

To navigate in space, it is important to predict headings in real time from neural responses to vestibular and visual signals, and the ventral intraparietal area (VIP) is one of the critical brain areas involved. However, how heading perception is represented in VIP at the population level remains unexplored, and there are no commonly used methods suitable for decoding headings from population responses in VIP, given the large spatiotemporal dynamics and heterogeneity of the neural responses. Here, responses were recorded from 210 VIP neurons in three rhesus monkeys while they performed a heading perception task. By modelling both types of dynamics specifically and separately with sparse representations, we built a sequential sparse autoencoder (SSAE) to decode the headings from the recorded population responses and sought to maximize decoding performance. The SSAE relies on a three-layer sparse autoencoder to extract temporal and spatial heading features from the dataset via unsupervised learning, and on a softmax classifier to decode the headings. Compared with other population decoding methods, the SSAE achieves a leading accuracy of 96.8% ± 2.1% and offers robustness and a low storage and computing burden for real-time prediction. Therefore, our SSAE model performs well in learning neurobiologically plausible features comprising dynamic navigational information.


Subject(s)
Eye Movements; Motion Perception; Animals; Parietal Lobe/physiology; Motion Perception/physiology; Photic Stimulation/methods; Brain; Macaca mulatta
14.
Front Neurosci ; 17: 1204334, 2023.
Article in English | MEDLINE | ID: mdl-37260839

ABSTRACT

[This corrects the article DOI: 10.3389/fnins.2023.1123698.].

15.
Article in English | MEDLINE | ID: mdl-37022405

ABSTRACT

The temporal credit assignment (TCA) problem, which aims to detect predictive features hidden in distracting background streams, remains a core challenge in biological and machine learning. Aggregate-label (AL) learning has been proposed to resolve this problem by matching spikes with delayed feedback. However, existing AL learning algorithms only consider the information of a single timestep, which is inconsistent with real situations. Meanwhile, there is no quantitative evaluation method for TCA problems. To address these limitations, we propose a novel attention-based TCA (ATCA) algorithm and a minimum editing distance (MED)-based quantitative evaluation method. Specifically, we define a loss function based on the attention mechanism to deal with the information contained within spike clusters and use MED to evaluate the similarity between the spike train and the target clue flow. Experimental results on musical instrument recognition (MedleyDB), speech recognition (TIDIGITS), and gesture recognition (DVS128-Gesture) show that the ATCA algorithm reaches the state-of-the-art (SOTA) level compared with other AL learning algorithms.
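The MED evaluation above is a minimum editing (Levenshtein) distance over clue sequences; a standard dynamic-programming sketch, here applied to a decoded clue sequence versus a target clue flow:

```python
def min_edit_distance(a, b):
    """Classic Levenshtein distance with unit insert/delete/substitute cost,
    here measuring how far a decoded clue sequence is from the target flow."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                  # delete everything
    for j in range(n + 1):
        dp[0][j] = j                  # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # delete a[i-1]
                           dp[i][j - 1] + 1,        # insert b[j-1]
                           dp[i - 1][j - 1] + cost) # substitute / match
    return dp[m][n]

# decoded clue sequence vs. ground-truth clue flow: one spurious repeat
print(min_edit_distance(["A", "B", "B", "C"], ["A", "B", "C"]))  # 1
```

A distance of zero means the network recovered the clue flow exactly; each unit of distance counts one missed, spurious, or misidentified clue.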

16.
Article in English | MEDLINE | ID: mdl-37030679

ABSTRACT

A large quantity of labeled data is required to train high-performance deep spiking neural networks (SNNs), but obtaining labeled data is expensive. Active learning has been proposed to reduce the quantity of labeled data required by deep learning models. However, conventional active learning methods in SNNs are not as effective as those in conventional artificial neural networks (ANNs) because of differences in feature representation and information transmission. To address this issue, we propose an effective active learning method for deep SNN models in this article. Specifically, a loss prediction module, ActiveLossNet, is proposed to extract features and select valuable samples for deep SNNs. Then, we derive the corresponding active learning algorithm for deep SNN models. Comprehensive experiments are conducted on CIFAR-10, MNIST, Fashion-MNIST, and SVHN with different SNN frameworks, including the seven-layer CIFARNet and the 20-layer ResNet-18. The comparison results demonstrate that the proposed active learning algorithm outperforms random selection and conventional ANN active learning methods. In addition, our method converges faster than conventional active learning methods.
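The selection rule a loss-prediction module enables can be sketched as picking the unlabeled samples with the highest predicted loss (ActiveLossNet itself is not reproduced here; the pool, losses, and budget are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def select_by_predicted_loss(pool_ids, predicted_losses, budget):
    """Pick the `budget` unlabeled samples whose predicted loss is highest,
    i.e. the samples the model is expected to learn the most from."""
    order = np.argsort(predicted_losses)[::-1]     # hardest samples first
    return [pool_ids[i] for i in order[:budget]]

pool = list(range(10))
losses = rng.random(10)               # stand-in for loss-module predictions
chosen = select_by_predicted_loss(pool, losses, budget=3)
print(len(chosen))
```

In a full loop, the chosen samples are sent for labeling, the model and loss module are retrained, and the cycle repeats until the labeling budget is exhausted.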

17.
Front Neurosci ; 17: 1123698, 2023.
Article in English | MEDLINE | ID: mdl-36875665

ABSTRACT

Event cameras are asynchronous and neuromorphically inspired visual sensors, which have shown great potential in object tracking because they can easily detect moving objects. Since event cameras output discrete events, they are inherently suitable to coordinate with Spiking Neural Networks (SNNs), which have unique event-driven computation characteristics and energy-efficient computing. In this paper, we tackle the problem of event-based object tracking with a novel architecture built on a discriminatively trained SNN, called the Spiking Convolutional Tracking Network (SCTN). Taking a segment of events as input, SCTN not only better exploits implicit associations among events than event-wise processing, but also fully utilizes precise temporal information and maintains the sparse representation in segments instead of frames. To make SCTN more suitable for object tracking, we propose a new loss function that introduces an exponential Intersection over Union (IoU) in the voltage domain. To the best of our knowledge, this is the first tracking network directly trained as an SNN. In addition, we present a new event-based tracking dataset, dubbed DVSOT21. In contrast to other competing trackers, experimental results on DVSOT21 demonstrate that our method achieves competitive performance with very low energy consumption compared to ANN-based trackers. With its lower energy consumption, tracking on neuromorphic hardware will reveal its advantage.

18.
Cereb Cortex ; 33(11): 6772-6784, 2023 05 24.
Article in English | MEDLINE | ID: mdl-36734278

ABSTRACT

Gaze change can misalign spatial reference frames encoding visual and vestibular signals in cortex, which may affect the heading discrimination. Here, by systematically manipulating the eye-in-head and head-on-body positions to change the gaze direction of subjects, the performance of heading discrimination was tested with visual, vestibular, and combined stimuli in a reaction-time task in which the reaction time is under the control of subjects. We found the gaze change induced substantial biases in perceived heading, increased the threshold of discrimination and reaction time of subjects in all stimulus conditions. For the visual stimulus, the gaze effects were induced by changing the eye-in-world position, and the perceived heading was biased in the opposite direction of gaze. In contrast, the vestibular gaze effects were induced by changing the eye-in-head position, and the perceived heading was biased in the same direction of gaze. Although the bias was reduced when the visual and vestibular stimuli were combined, integration of the 2 signals substantially deviated from predictions of an extended diffusion model that accumulates evidence optimally over time and across sensory modalities. These findings reveal diverse gaze effects on the heading discrimination and emphasize that the transformation of spatial reference frames may underlie the effects.


Subject(s)
Motion Perception; Vestibule, Labyrinth; Humans; Reaction Time; Cerebral Cortex; Bias; Visual Perception; Photic Stimulation
19.
IEEE Trans Cybern ; 53(11): 7187-7198, 2023 Nov.
Article in English | MEDLINE | ID: mdl-36063509

ABSTRACT

As third-generation neural networks, spiking neural networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, deep spiking reinforcement learning (DSRL), that is, reinforcement learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and the nondifferentiable property of the spiking function. To address these issues, we propose a deep spiking Q-network (DSQN) in this article. Specifically, we propose a directly trained DSRL architecture based on leaky integrate-and-fire (LIF) neurons and the deep Q-network (DQN). Then, we adapt a direct spiking learning algorithm for the DSQN. We further demonstrate the advantages of using LIF neurons in the DSQN theoretically. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, generalization, and energy efficiency. To the best of our knowledge, our work is the first to achieve state-of-the-art performance on multiple Atari games with a directly trained SNN.
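A minimal sketch of the spiking Q-network idea: an LIF hidden layer driven for T timesteps, with Q-values read from a non-spiking accumulator output layer (the weights, sizes, timestep count, and readout choice are illustrative assumptions, not the DSQN architecture):

```python
import numpy as np

rng = np.random.default_rng(5)

def lif_layer(x, w, T=8, tau=2.0, v_th=1.0):
    """Hidden LIF layer: leak, integrate input current, spike, reset.

    Returns the (T, hidden) binary spike trains over T timesteps."""
    v = np.zeros(w.shape[1])
    spikes = []
    for _ in range(T):
        v = v * (1 - 1 / tau) + x @ w  # leaky integration of input current
        s = (v >= v_th).astype(float)
        v = v * (1 - s)                # reset only the neurons that spiked
        spikes.append(s)
    return np.stack(spikes)

def q_values(x, w_hid, w_out, T=8):
    """Read Q-values from the time-averaged drive to a non-spiking output
    layer, a common readout choice in directly trained spiking RL."""
    s = lif_layer(x, w_hid, T=T)
    return (s @ w_out).sum(axis=0) / T

obs = rng.random(4)                    # toy 4-dimensional observation
w1, w2 = rng.standard_normal((4, 16)), rng.standard_normal((16, 2))
q = q_values(obs, w1, w2)
action = int(np.argmax(q))             # greedy action selection
print(q.shape, action)
```

Direct training then backpropagates through these dynamics using a surrogate gradient for the nondifferentiable spike threshold.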

20.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9040-9053, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35298385

ABSTRACT

Neural architecture search (NAS) has attracted much attention in recent years. It automates the neural network construction for different tasks, which is traditionally addressed manually. In the literature, evolutionary optimization (EO) has been proposed for NAS due to its strong global search capability. However, despite the success enjoyed by EO, it is worth noting that existing EO algorithms for NAS are often very computationally expensive, which makes these algorithms unpractical in reality. Keeping this in mind, in this article, we propose an efficient memetic algorithm (MA) for automated convolutional neural network (CNN) architecture search. In contrast to existing EO algorithms for CNN architecture design, a new cell-based architecture search space, and new global and local search operators are proposed for CNN architecture search. To further improve the efficiency of our proposed algorithm, we develop a one-epoch-based performance estimation strategy without any pretrained models to evaluate each found architecture on the training datasets. To investigate the performance of the proposed method, comprehensive empirical studies are conducted against 34 state-of-the-art peer algorithms, including manual algorithms, reinforcement learning (RL) algorithms, gradient-based algorithms, and evolutionary algorithms (EAs), on widely used CIFAR10 and CIFAR100 datasets. The obtained results confirmed the efficacy of the proposed approach for automated CNN architecture design.
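The global-plus-local search pattern of a memetic algorithm can be sketched on a toy architecture encoding (the integer genome and fitness function below merely stand in for the cell-based search space and one-epoch performance estimation; nothing here reproduces the paper's operators):

```python
import random

random.seed(6)

def fitness(arch):
    """Stand-in for one-epoch performance estimation: count genes matching
    a hand-picked target encoding (hypothetical, for illustration only)."""
    target = [3, 1, 2, 0, 1, 3]
    return sum(a == t for a, t in zip(arch, target))

def mutate(arch):
    child = arch[:]
    child[random.randrange(len(child))] = random.randrange(4)
    return child

def local_search(arch, steps=5):
    # memetic refinement: keep single-gene mutations that do not hurt fitness
    for _ in range(steps):
        cand = mutate(arch)
        if fitness(cand) >= fitness(arch):
            arch = cand
    return arch

# global evolutionary loop: mutate the best, refine locally, replace the worst
pop = [[random.randrange(4) for _ in range(6)] for _ in range(10)]
for _ in range(30):
    pop.sort(key=fitness, reverse=True)
    pop[-1] = local_search(mutate(pop[0]))
best = max(pop, key=fitness)
print(fitness(best))
```

The expensive part in real NAS is `fitness`; the one-epoch estimation strategy in the abstract exists precisely to make each such call cheap enough for an evolutionary loop.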
