Results 1 - 9 of 9
1.
Nano Lett ; 24(19): 5862-5869, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38709809

ABSTRACT

Dynamic vision perception and processing (DVPP) is in high demand from booming edge artificial intelligence. However, existing imaging systems suffer from low efficiency or poor compatibility with advanced machine vision techniques. Here, we propose a reconfigurable bipolar image sensor (RBIS) for in-sensor DVPP based on a two-dimensional WSe2/GeSe heterostructure device. Owing to its gate-tunable and reversible built-in electric field, the photoresponse is bipolar, i.e., either positive or negative. High-efficiency DVPP incorporating a front-end RBIS and a back-end CNN is then demonstrated. It achieves a recognition accuracy above 94.9% on the derived DVS128 dataset while requiring far fewer neural network parameters than a system without RBIS. Moreover, we demonstrate an optimized device with a vertically stacked structure and stable nonvolatile bipolarity, which enables more efficient DVPP hardware. Our work demonstrates the potential of DVPP devices with a simple structure, high efficiency, and outputs compatible with advanced algorithms.
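
As a conceptual illustration of what bipolar in-sensor processing buys, the sketch below emulates a pixel whose gate-programmed responsivity is positive for the current exposure and negative for the previous one, so the sensor itself outputs a signed temporal-difference frame for a back-end CNN. This is a minimal sketch under assumed behavior, not the authors' implementation.

    # Minimal sketch (assumed behavior, not the authors' code): a bipolar
    # pixel gated positive for the current frame and negative for the
    # previous one yields a signed difference frame at the sensor itself.
    import numpy as np

    def bipolar_difference(frame_t, frame_prev):
        """Emulate in-sensor frame differencing via reversible photoresponse."""
        positive_phase = +1.0 * frame_t      # gate set to positive response
        negative_phase = -1.0 * frame_prev   # gate reversed for the prior frame
        return positive_phase + negative_phase  # signed motion map for a CNN

    motion_map = bipolar_difference(np.random.rand(128, 128),
                                    np.random.rand(128, 128))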

2.
Data Brief ; 54: 110340, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38550235

ABSTRACT

The featured dataset, the Event-based Dataset of Assembly Tasks (EDAT24), showcases a selection of manufacturing primitive tasks (idle, pick, place, and screw), which are basic actions performed by human operators in any manufacturing assembly. The data were captured using a DAVIS240C event camera, an asynchronous vision sensor that registers events when changes in light intensity value occur. Events are a lightweight data format for conveying visual information and are well-suited for real-time detection and analysis of human motion. Each manufacturing primitive has 100 recorded samples of DAVIS240C data, including events and greyscale frames, for a total of 400 samples. In the dataset, the user interacts with objects from the open-source CT-Benchmark in front of the static DAVIS event camera. All data are made available in raw form (.aedat) and in pre-processed form (.npy). Custom-built Python code is made available together with the dataset to aid researchers to add new manufacturing primitives or extend the dataset with more samples.
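
A hypothetical sketch of consuming the pre-processed (.npy) form of the dataset is given below; the real field layout and file names are defined by the dataset's own Python code, so the (N, 4) event layout, the synthetic stand-in data, and the commented-out file name are all assumptions.

    # Assumed event layout per row: timestamp, x, y, polarity.
    import numpy as np

    # events = np.load("pick_sample_001.npy")  # real usage (file name assumed)
    events = np.random.rand(1000, 4)           # synthetic stand-in data
    x = (events[:, 1] * 240).astype(int)       # DAVIS240C resolution: 240 x 180
    y = (events[:, 2] * 180).astype(int)
    p = np.where(events[:, 3] > 0.5, 1, -1)    # polarity: ON (+1) / OFF (-1)

    # Accumulate signed events into a count frame for downstream analysis.
    frame = np.zeros((180, 240))
    np.add.at(frame, (y, x), p)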

3.
Neural Netw ; 172: 106092, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38211460

ABSTRACT

Spiking neural networks (SNNs) are considered an attractive option for edge-side applications due to their sparse, asynchronous, and event-driven characteristics. However, applying SNNs to object detection faces challenges in achieving both good detection accuracy and high detection speed. To overcome these challenges, we propose an end-to-end Trainable Spiking-YOLO (Tr-Spiking-YOLO) for low-latency and high-performance object detection. We evaluate our model not only on the frame-based PASCAL VOC dataset but also on the event-based GEN1 Automotive Detection dataset, and investigate the impact of different decoding methods on detection performance. The experimental results show that our model achieves competitive or better performance in terms of accuracy, latency, and energy consumption compared to similar artificial neural network (ANN) and conversion-based SNN object detection models. Furthermore, when deployed on an edge device, our model achieves a processing speed of approximately 14 to 39 FPS while maintaining a desirable mean Average Precision (mAP), making it capable of real-time detection on resource-constrained platforms.
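
Since the abstract highlights the impact of decoding methods, a minimal rate-decoding sketch may clarify the idea: binary spikes are averaged over T timesteps into continuous confidences. This is one common decoder for SNN outputs, assumed here for illustration, not necessarily the scheme the paper settles on.

    # Illustrative rate decoding (a common SNN decoder, assumed here;
    # the paper itself compares several decoding methods).
    import numpy as np

    def rate_decode(spike_trains):
        """Average binary spikes of shape (T, num_outputs) into rates in [0, 1]."""
        return spike_trains.mean(axis=0)

    T, num_outputs = 8, 20
    rates = rate_decode(np.random.randint(0, 2, size=(T, num_outputs)))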


Subject(s)
Neural Networks, Computer
4.
Biomimetics (Basel) ; 9(1)2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38248596

ABSTRACT

Visual perception equips unmanned aerial vehicles (UAVs) with increasingly comprehensive and instant environmental awareness, rendering it a crucial technology in intelligent UAV obstacle avoidance. However, the rapid movements of UAVs cause significant changes in the field of view, impairing algorithms' ability to accurately extract the visual features of collisions. As a result, algorithms suffer from a high rate of false alarms and delayed warnings. During the study of visual field angle curves of different orders, it was found that the peak times of the curves of higher-order information on the angular size of looming objects are linearly related to the time to collision (TTC) and occur before collision. This discovery implies that encoding higher-order information on the angular size could resolve the issue of response lag. Furthermore, the fact that the image of a looming object, unlike background interference, satisfies several looming visual cues simultaneously implies that integrating various field-of-view characteristics is likely to enhance the model's resistance to motion interference. Therefore, this paper presents a concise A-LGMD model for detecting looming objects. The model is based on image angular acceleration and addresses imprecise feature extraction and insufficient time-series modeling to enable rapid, precise detection of looming objects during rapid UAV self-motion. It draws inspiration from the lobula giant movement detector (LGMD), which is highly sensitive to acceleration information. In the proposed model, higher-order information on the angular size is abstracted by the network and fused with multiple visual field angle characteristics to promote a selective response to looming objects. Experiments on synthetic and real-world datasets reveal that the model can efficiently detect the angular acceleration of an image, filter out insignificant background motion, and provide early warnings. These findings indicate that the model holds significant potential for embedded collision detection systems on micro or small UAVs.
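
The looming geometry behind these claims can be made concrete. For an object of half-size $R$ approaching at constant speed $v$, with $\tau$ the remaining time to collision, the standard relations (textbook looming geometry, not equations quoted from this paper) are

    \theta(\tau) = 2\arctan\frac{R}{v\tau}, \qquad
    \tau \approx \frac{\theta}{\dot{\theta}} \quad (\text{small } \theta), \qquad
    \ddot{\theta} \ \text{peaks at} \ \tau^{*} = \frac{R}{\sqrt{3}\,v},

so the second-order (angular acceleration) curve peaks strictly before contact, at a time proportional to $R/v$, consistent with the linear TTC relationship described above.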

5.
Micromachines (Basel) ; 15(4)2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38675238

ABSTRACT

For processing streaming events from a Dynamic Vision Sensor (DVS) camera, two types of neural networks can be considered. The first is spiking neural networks, where simple spike-based computation suits low-power consumption, but the discontinuity of spikes can complicate training in hardware. The other is digital Complementary Metal Oxide Semiconductor (CMOS)-based neural networks, which can be trained directly using the normal backpropagation algorithm. However, their hardware and energy overhead can be significantly large, because all streaming events must be accumulated and converted into histogram data, which requires a large amount of memory such as SRAM. In this paper, to combine spike-based operation with the normal backpropagation algorithm, memristor-CMOS hybrid circuits are proposed for implementing event-driven neural networks in hardware. The proposed hybrid circuits are composed of input neurons, synaptic crossbars, hidden/output neurons, and a neural network controller. First, the input neurons preprocess the DVS camera's events, converting them to histogram data using very simple memristor-based latches. After preprocessing, the converted histogram data are delivered to an ANN implemented using synaptic memristor crossbars. The memristor crossbars can perform low-power Multiply-Accumulate (MAC) calculations according to the memristor's current-voltage relationship. The hidden and output neurons convert the crossbar's column currents to output voltages according to the Rectified Linear Unit (ReLU) activation function. The neural network controller adjusts the MAC calculation frequency according to the workload of the event computation and can automatically disable the MAC calculation clock to minimize unnecessary power consumption. The proposed hybrid circuits have been verified by circuit simulation on several event-based datasets such as POKER-DVS and MNIST-DVS. The simulation results indicate that the performance of the proposed neural network degrades by as little as 0.5% while saving as much as 79% in power consumption on POKER-DVS. On MNIST-DVS, the recognition rate of the proposed scheme is 0.75% lower than the conventional one; despite this small loss, power consumption is reduced by as much as 75%.
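
A behavioral sketch of the described pipeline (software emulation with arbitrary layer sizes, not a circuit model) is shown below: events are binned into a histogram, and one crossbar layer is emulated as a matrix-vector multiply whose column currents pass through a ReLU.

    # Behavioral sketch (assumed sizes): event histogram -> crossbar MAC -> ReLU.
    import numpy as np

    def events_to_histogram(xs, ys, width=32, height=32):
        """Accumulate DVS events into a per-pixel count histogram."""
        hist = np.zeros(height * width)
        np.add.at(hist, ys * width + xs, 1)
        return hist

    def crossbar_layer(conductances, voltages):
        """Column current = sum_i G_ij * V_i (Ohm/Kirchhoff), then ReLU."""
        return np.maximum(conductances.T @ voltages, 0.0)

    xs = np.random.randint(0, 32, 500)
    ys = np.random.randint(0, 32, 500)
    out = crossbar_layer(np.random.rand(32 * 32, 10), events_to_histogram(xs, ys))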

6.
Neural Netw ; 179: 106502, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38996688

ABSTRACT

There are primarily two classes of bio-inspired looming perception visual systems. The first employs hierarchical neural networks inspired by well-acknowledged anatomical pathways responsible for looming perception, and the second maps nonlinear relationships between physical stimulus attributes and neuronal activity. However, even with multi-layered structures, the former class is sometimes fragile in looming selectivity, i.e., the ability to discriminate well between approaching movements and other categories of movement, while the latter class leaves open questions about how to encode visual movements so as to indicate physical attributes like angular velocity/size. Beyond these, we propose a novel looming perception model based on a dynamic neural field (DNF). The DNF is a brain-inspired framework that incorporates both lateral excitation and inhibition within the field through instant feedback, and it can be built easily to reproduce the looming sensitivity observed in biological visual systems. To achieve looming perception with computational efficiency, we introduce a single-field DNF with adaptive lateral interactions and a dynamic activation threshold. The former mechanism creates antagonism to translating motion, and the latter suppresses excitation during receding. Accordingly, the proposed model exhibits its strongest response to objects signaling approach over other types of external stimuli. The effectiveness of the proposed model is supported by relevant mathematical analysis and an ablation study. The computational efficiency and robustness of the model are verified through systematic experiments, including on-line collision-detection tasks on micro-mobile robots, achieving a success rate of 93% in comparison with state-of-the-art methods. The results demonstrate its superiority over model-based methods for looming perception.
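
For readers unfamiliar with DNFs, the standard Amari field equation, which a single-field model of this kind presumably builds on (the specific adaptive terms are the paper's own contribution), reads

    \tau_u \frac{\partial u(x,t)}{\partial t}
      = -u(x,t) + \int w(x - x')\, f\big(u(x',t)\big)\, dx' + S(x,t) + h,

where $u$ is the field activation, $w$ a lateral-interaction kernel with excitatory center and inhibitory surround, $f$ a thresholded firing-rate nonlinearity, $S$ the feed-forward visual input, and $h$ the resting level. In this notation, adaptive lateral interactions and a dynamic activation threshold correspond to making $w$ and the threshold inside $f$ stimulus-dependent.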

7.
Article in English | MEDLINE | ID: mdl-38600805

ABSTRACT

In the era of the Internet of Things and the rapid progress of artificial intelligence, there is a growing demand for advanced dynamic vision systems. Vision systems are no longer confined to static object detection and recognition; the detection and recognition of moving objects are becoming increasingly important. To meet the requirements of more precise and efficient dynamic vision, the development of adaptive multimodal motion detection devices becomes imperative. Inspired by the varied response rates in biological vision, we introduce the concept of critical flicker fusion frequency (cFFF) and develop an organic optoelectronic synaptic transistor with adjustable cFFF. In situ Kelvin probe force microscopy analysis reveals that light signal recognition in this device originates from charge transfer in the poly[(2,6-(4,8-bis(5-(2-ethylhexyl)thiophen-2-yl)benzo[1,2-b:4,5-b']dithiophene)-co-(1,3-di(5-thiophene-2-yl)-5,7-bis(2-ethylhexyl)-benzo[1,2-c:4,5-c']dithiophene-4,8-dione)] (PBDB-T)/pentacene heterojunction, which can be effectively modulated by the gate voltage. Building upon this, we implement different cFFF values within a single device to facilitate the detection and recognition of objects moving at different speeds. This approach allows resources to be allocated adaptively during dynamic detection, reducing power consumption. Our research holds great potential for enhancing the capabilities of dynamic vision systems.
