1.
Sensors (Basel) ; 23(14)2023 Jul 17.
Article in English | MEDLINE | ID: mdl-37514745

ABSTRACT

Vision-based tactile sensors (VBTSs) have become the de facto method for giving robots the ability to obtain tactile feedback from their environment. Unlike other solutions to tactile sensing, VBTSs offer high spatial resolution feedback without compromising on instrumentation costs or incurring additional maintenance expenses. However, the conventional cameras used in VBTSs have a fixed update rate and output redundant data, leading to computational overhead. In this work, we present a neuromorphic vision-based tactile sensor (N-VBTS) that employs observations from an event-based camera for contact angle prediction. In particular, we design and develop a novel graph neural network, dubbed TactiGraph, that operates asynchronously on graphs constructed from raw N-VBTS streams, exploiting their spatiotemporal correlations to perform predictions. While conventional VBTSs rely on an internal illumination source, TactiGraph performs efficiently both with and without one, further reducing instrumentation costs. Rigorous experimental results show that TactiGraph achieves a mean absolute error of 0.62° in predicting the contact angle and is faster and more efficient than both conventional VBTS and other N-VBTS approaches. Specifically, the N-VBTS requires only 5.5% of the computing time needed by the VBTS when both are tested on the same scenario.
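A minimal sketch of the kind of preprocessing such an approach implies: turning a raw event stream into a spatiotemporal graph that a graph neural network can consume. The k-nearest-neighbour construction, the time scaling factor, and the feature layout below are illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch: build a spatiotemporal graph from raw events (x, y, t, polarity).
# The k-NN construction and time_scale are assumptions for illustration only.
import numpy as np

def events_to_graph(events, k=8, time_scale=1e-3):
    """events: (N, 4) array of (x, y, t, polarity)."""
    xy = events[:, :2].astype(np.float64)
    t = events[:, 2:3] * time_scale                 # bring time into a comparable range
    pos = np.hstack([xy, t])                        # node coordinates in (x, y, t) space

    # Brute-force k nearest neighbours in spatiotemporal space (fine for small N).
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]

    src = np.repeat(np.arange(len(events)), k)
    dst = nbrs.ravel()
    edge_index = np.stack([src, dst])               # (2, N*k) edge list
    node_feat = events[:, 3:4].astype(np.float64)   # polarity as node feature
    return node_feat, pos, edge_index
```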

2.
Sensors (Basel) ; 23(6)2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36991602

ABSTRACT

Video deblurring aims at removing the motion blur caused by the movement of objects or camera shake. Traditional video deblurring methods have mainly focused on frame-based deblurring, which takes only blurry frames as the input to produce sharp frames. However, frame-based deblurring shows poor picture quality in challenging cases of video restoration where severely blurred frames are provided as the input. To overcome this issue, recent studies have begun to explore the event-based approach, which uses the event sequence captured by an event camera for motion deblurring. Event cameras have several advantages compared to conventional frame cameras. Among these advantages, event cameras have a low latency in imaging data acquisition (0.001 ms for event cameras vs. 10 ms for frame cameras). Hence, event data can be acquired with a very fine temporal resolution (down to one microsecond), which means that the event sequence contains more accurate motion information than video frames. Additionally, event data can be acquired with less motion blur. Due to these advantages, the use of event data is highly beneficial for improving the quality of deblurred frames. Accordingly, the results of event-based video deblurring are superior to those of frame-based deblurring methods, even for severely blurred video frames. However, the direct use of event data can often generate visual artifacts in the final output frame (e.g., image noise and incorrect textures), because event data intrinsically contain insufficient texture and event noise. To tackle this issue in event-based deblurring, we propose a two-stage coarse-refinement network that adds a frame-based refinement stage, which utilizes all the available frames with more abundant textures to further improve the picture quality of the first-stage coarse output. Specifically, a coarse intermediate frame is estimated by performing event-based video deblurring in the first-stage network. A residual hint attention (RHA) module is also proposed to extract useful attention information from the coarse output and all the available frames. This module connects the first and second stages and effectively guides the frame-based refinement of the coarse output. The final deblurred frame is then obtained by refining the coarse output using the residual hint attention and all the available frame information in the second-stage network. We validated the deblurring performance of the proposed network on the GoPro synthetic dataset (33 videos and 4702 frames) and the HQF real dataset (11 videos and 2212 frames). Compared to the state-of-the-art method (D2Net), we achieved a performance improvement of 1 dB in PSNR and 0.05 in SSIM on the GoPro dataset, and an improvement of 1.7 dB in PSNR and 0.03 in SSIM on the HQF dataset.
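A toy PyTorch sketch of the coarse-to-refinement structure described above: a first stage estimates a coarse sharp frame from the blurry frame plus an event representation, and a second stage refines it under the guidance of an attention map computed from the coarse output and the available frames. Layer sizes, the attention formulation, and module names are placeholders, not the authors' RHA architecture.

```python
# Toy two-stage coarse-to-refinement sketch with a residual-hint-style attention
# map. All layer choices are illustrative assumptions, not the paper's network.
import torch
import torch.nn as nn

class CoarseEventDeblur(nn.Module):
    """Stage 1: coarse sharp frame from a blurry frame + an event voxel grid."""
    def __init__(self, event_bins=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + event_bins, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, blurry, event_voxels):
        return self.net(torch.cat([blurry, event_voxels], dim=1))

class FrameRefine(nn.Module):
    """Stage 2: refine the coarse output using neighbouring frames, guided by an
    attention map computed from the coarse output and those frames."""
    def __init__(self, n_frames=3):
        super().__init__()
        in_ch = 3 * (1 + n_frames)
        self.attn = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
        self.refine = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, coarse, frames):              # frames: (B, 3*n_frames, H, W)
        x = torch.cat([coarse, frames], dim=1)
        return coarse + self.attn(x) * self.refine(x)   # attention-gated residual
```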

3.
Sensors (Basel) ; 22(20)2022 Oct 19.
Article in English | MEDLINE | ID: mdl-36298314

ABSTRACT

Computer vision techniques can monitor the rotational speed of rotating equipment or machines to understand their working condition and prevent failures. Such techniques are highly precise, contactless, and potentially suitable for applications without massive setup changes. However, traditional vision sensors collect a significant amount of data to process when measuring the rotation of high-speed systems, and they are susceptible to motion blur. This work proposes a new method for measuring the rotational speed of high-speed systems by processing event-based data from a neuromorphic sensor. This sensor produces event-based data and is designed to work with high temporal resolution and high dynamic range. The main advantages of the Event-based Angular Speed Measurement (EB-ASM) method are its high dynamic range, the absence of motion blur, and the possibility of measuring multiple rotations simultaneously with a single device. The proposed method uses the time difference between spikes within a kernel or window selected in the sensor's field of view. It is evaluated in two experimental scenarios by measuring the rotational speed of a fan and of a computer numerical control (CNC) router spindle, and the results are compared with measurements from a calibrated digital photo-tachometer. Based on the performed tests, the EB-ASM can measure rotational speed with a mean absolute error of less than 0.2% in both scenarios.
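A minimal sketch of the time-difference idea: events from a small window on the sensor fire in bursts once per revolution (e.g. each time a blade or marker passes), so the interval between bursts gives the rotation period. The window selection and the burst-grouping gap below are illustrative assumptions, not the paper's implementation.

```python
# Sketch: rotational speed from event timestamps inside a selected window/kernel.
import numpy as np

def rpm_from_events(timestamps_us, min_gap_us=500):
    """timestamps_us: sorted event times (µs) from the selected window."""
    t = np.asarray(timestamps_us, dtype=np.float64)
    # Group events separated by less than min_gap_us into one burst (one passage).
    burst_starts = t[np.r_[True, np.diff(t) > min_gap_us]]
    periods_us = np.diff(burst_starts)          # one period per revolution
    if len(periods_us) == 0:
        return None
    return 60.0 / (periods_us.mean() * 1e-6)    # revolutions per minute

# Example: bursts every 20,000 µs -> ~3,000 RPM
print(rpm_from_events([0, 3, 20000, 20004, 40001, 40003]))
```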


Subject(s)
Movement, Motion (Physics), Physiological Monitoring
4.
Sensors (Basel) ; 22(15)2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35957244

ABSTRACT

We present a new solution to tracking and mapping with an event camera. The motion of the camera contains both rotational and translational displacements in the plane, and these displacements occur in an arbitrarily structured environment. As a result, the image matching can no longer be represented by a low-dimensional homographic warping, which complicates an application of the commonly used Image of Warped Events (IWE). We introduce a new solution to this problem by performing contrast maximization in 3D. The 3D location of the rays cast for each event is smoothly varied as a function of a continuous-time motion parametrization, and the optimal parameters are found by maximizing the contrast in a volumetric ray density field. Our method thus performs joint optimization over motion and structure. The practical validity of our approach is supported by an application to AGV motion estimation and 3D reconstruction with a single vehicle-mounted event camera. The method approaches the performance obtained with regular cameras and eventually outperforms them in challenging visual conditions.
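For readers unfamiliar with contrast maximization, here is the principle in its simplest 2D form: warp events by a candidate motion, accumulate them into an image of warped events, and score the candidate by the image's variance (contrast). The paper generalizes this to a volumetric ray-density field; the flow-only version below is an illustrative baseline, not the authors' 3D formulation.

```python
# Sketch: 2D contrast maximization over an image of warped events.
import numpy as np

def contrast_of_warp(events, flow, shape=(128, 128)):
    """events: (N, 3) array of (x, y, t); flow: (vx, vy) in pixels per second."""
    x, y, t = events[:, 0], events[:, 1], events[:, 2]
    xw = np.round(x - flow[0] * t).astype(int)            # warp back to t = 0
    yw = np.round(y - flow[1] * t).astype(int)
    ok = (xw >= 0) & (xw < shape[1]) & (yw >= 0) & (yw < shape[0])
    img = np.zeros(shape)
    np.add.at(img, (yw[ok], xw[ok]), 1.0)                 # image of warped events
    return img.var()                                       # higher = sharper alignment

def best_flow(events, candidates):
    """Grid search over candidate flows; the maximizer aligns the events best."""
    scores = [contrast_of_warp(events, c) for c in candidates]
    return candidates[int(np.argmax(scores))]
```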


Subject(s)
Algorithms, Rotation
5.
Sensors (Basel) ; 21(18)2021 Sep 13.
Article in English | MEDLINE | ID: mdl-34577349

ABSTRACT

Dynamic Vision Sensors differ from conventional cameras in that only intensity changes of individual pixels are perceived and transmitted as an asynchronous stream instead of entire frames. The technology promises, among other things, high temporal resolution as well as low latency and data rates. While such sensors currently enjoy much scientific attention, there are only few publications on practical applications. One field of application that has hardly been considered so far, yet potentially fits the sensor principle well due to its special properties, is automatic visual inspection. In this paper, we evaluate current state-of-the-art processing algorithms in this new application domain. We further propose an algorithmic approach for identifying ideal time windows within an event stream for object classification. For the evaluation of our method, we acquire two novel datasets that contain typical visual inspection scenarios, i.e., the inspection of objects on a conveyor belt and during free fall. The success of our algorithmic extension for data processing is demonstrated on these new datasets by showing that the classification accuracy of current algorithms is substantially increased. By making our new datasets publicly available, we intend to stimulate further research on the application of Dynamic Vision Sensors in machine vision.
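A hypothetical illustration of selecting a promising time window from an event stream before classification: slide a fixed-length window over the stream and keep the one with the highest event density, a simple proxy for "the object is fully in view". The scoring heuristic and parameters are assumptions for illustration, not the paper's algorithm.

```python
# Sketch: pick the densest fixed-length time window from an event stream.
import numpy as np

def best_time_window(timestamps_us, win_us=10_000, step_us=1_000):
    t = np.sort(np.asarray(timestamps_us, dtype=np.float64))
    starts = np.arange(t[0], t[-1] - win_us, step_us)
    if len(starts) == 0:                      # stream shorter than one window
        return t[0], t[-1]
    counts = [np.count_nonzero((t >= s) & (t < s + win_us)) for s in starts]
    s = starts[int(np.argmax(counts))]
    return s, s + win_us                      # classify only events inside this window
```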

6.
Sensors (Basel) ; 20(6)2020 Mar 13.
Article in English | MEDLINE | ID: mdl-32183052

ABSTRACT

Unsupervised feature extraction algorithms form one of the most important building blocks in machine learning systems. These algorithms are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, since they were not designed for this purpose, such algorithms typically require significant simplification during implementation to meet hardware constraints, creating trade-offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate the useful intermediary signals that become valuable in the context of neuromorphic hardware limitations. In this work, a novel event-based feature extraction method is proposed that focuses on these issues. The algorithm operates via simple adaptive selection thresholds, which allow a simpler implementation of network homeostasis than previous works at the cost of a small amount of information loss in the form of missed events that fall outside the selection thresholds. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals that indicate network weight convergence without the need to access network weights. A novel heuristic method for network size selection is proposed that makes use of noise events and their feature representations. The use of selection thresholds is shown to produce network activation patterns that predict classification accuracy, allowing rapid evaluation and optimization of system parameters without the need to run back-end classifiers. The feature extraction method is tested on both the N-MNIST (Neuromorphic-MNIST) benchmarking dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested, with the results quantifying the performance gains at each processing stage.
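A minimal sketch of feature learning gated by an adaptive selection threshold: an event's local patch updates its best-matching feature only if the match distance falls below that neuron's threshold, and the threshold then adapts so that a target fraction of events is accepted (a simple homeostasis mechanism that never reads the weights). All names and update rules here are illustrative assumptions, not the paper's exact method.

```python
# Sketch: adaptive selection thresholds for event-based feature extraction.
import numpy as np

rng = np.random.default_rng(0)
n_features, patch_dim = 16, 25            # e.g. 5x5 event-surface patches
W = rng.random((n_features, patch_dim))   # feature weights
thresholds = np.full(n_features, 2.0)     # per-neuron selection thresholds
eta_w, eta_th, target_rate = 0.05, 0.01, 0.2

def process_event(patch):
    d = np.linalg.norm(W - patch, axis=1)         # distance to every feature
    j = int(np.argmin(d))
    accepted = d[j] < thresholds[j]
    if accepted:
        W[j] += eta_w * (patch - W[j])            # move winner toward the input
    # Homeostasis: missed events grow the threshold, accepted events shrink it,
    # so the acceptance rate drifts toward target_rate.
    thresholds[j] += eta_th * ((0.0 if accepted else 1.0) - target_rate)
    return j if accepted else None                # None = missed event
```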

7.
Sensors (Basel) ; 16(11)2016 Nov 08.
Article in English | MEDLINE | ID: mdl-27834800

ABSTRACT

Vision-based sensing algorithms are computationally demanding due to the large amount of data acquired and processed. Visual sensors deliver a great deal of information, much of which is redundant and provides no additional information. A Selective Change Driven (SCD) sensing system is based on a sensor that delivers, ordered by the magnitude of their change, only those pixels that have changed most since the last read-out. This allows the information stream to be adjusted to the available computational capacity. Following this strategy, a new SCD processing architecture for high-speed motion analysis, based on processing pixels instead of full frames, has been developed and implemented in a Field Programmable Gate Array (FPGA). The programmable device controls the data stream, delivering a new object-distance calculation for every new pixel. The acquisition, processing and delivery of a new object distance takes just 1.7 µs. Obtaining a similar result with a conventional frame-based camera would require a device working at roughly 500 kfps, which is far from practical or even feasible. This system, built with the recently developed 64 × 64 CMOS SCD sensor, shows the potential of the SCD approach when combined with a hardware processing system.
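A back-of-the-envelope check of the frame-rate equivalence quoted above: if a full distance update is delivered every 1.7 µs, a frame-based system would need a new frame at roughly the same interval to match that latency.

```python
# Equivalent frame rate for a 1.7 µs per-update pipeline.
update_interval_s = 1.7e-6
equivalent_fps = 1.0 / update_interval_s
print(f"{equivalent_fps:,.0f} frames per second")   # ~588,000 fps, i.e. on the order of 500 kfps
```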

8.
Bioinspir Biomim ; 19(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38373337

ABSTRACT

Neuromorphic event-based cameras communicate transients in luminance instead of frames, providing visual information with fine temporal resolution, high dynamic range and high signal-to-noise ratio. Enriching event data with color information allows the reconstruction of colorful frame-like intensity maps, supporting improved performance and visually appealing results in various computer vision tasks. In this work, we simulated a biologically inspired color fusion system featuring a three-stage convolutional neural network for reconstructing color intensity maps from event data and sparse color cues. While current approaches to color fusion use full RGB frames at high resolution, our design uses event data together with quantized color cues of low spatial and tonal resolution, yielding a high-performing small model for efficient colorful image reconstruction. The proposed model outperforms existing coloring schemes in terms of the SSIM, LPIPS, PSNR, and CIEDE2000 metrics. We demonstrate that auxiliary, limited color information can be used in conjunction with event data to successfully reconstruct both color and intensity frames, paving the way for more efficient hardware designs.


Subject(s)
Benchmarking, Neural Networks (Computer), Equipment Design, Signal-to-Noise Ratio, Image Processing (Computer-Assisted)
9.
Neural Netw ; 166: 410-423, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37549609

ABSTRACT

Event-based vision, a new visual paradigm with bio-inspired dynamic perception and µs-level temporal resolution, has prominent advantages in many specific visual scenarios and has gained much research interest. Spiking neural networks (SNNs) are naturally suited to dealing with event streams due to their temporal information processing capability and event-driven nature. However, existing SNN works neglect the fact that the input event streams are spatially sparse and temporally non-uniform, and simply treat these varying inputs equally. This situation limits the effectiveness and efficiency of existing SNNs. In this paper, we propose the feature Refine-and-Mask SNN (RM-SNN), which has the ability to self-adaptively regulate the spiking response in a data-dependent way. We use the Refine-and-Mask (RM) module to refine all features and mask the unimportant ones in order to optimize the membrane potential of spiking neurons, which in turn reduces spiking activity. Inspired by the fact that not all events in spatio-temporal streams are task-relevant, we apply the RM module in both the temporal and channel dimensions. Extensive experiments on seven event-based benchmarks (DVS128 Gesture, DVS128 Gait, CIFAR10-DVS, N-Caltech101, DailyAction-DVS, UCF101-DVS, and HMDB51-DVS) demonstrate that, under multi-scale constraints on the input time window, RM-SNN can significantly reduce the network's average spiking activity rate while improving task performance. In addition, by visualizing spiking responses, we analyze why sparser spiking activity can be better. Code.


Subject(s)
Neural Networks (Computer), Time Perception, Action Potentials/physiology, Recognition (Psychology), Neurons/physiology
10.
J Imaging ; 8(8)2022 Jul 27.
Article in English | MEDLINE | ID: mdl-36005453

ABSTRACT

Event-based vision is an emerging field of computer vision that offers unique properties, such as asynchronous visual output, high temporal resolution, and data generation driven by brightness changes. These properties can enable robust, high-temporal-resolution object detection and tracking when combined with frame-based vision. In this paper, we present a hybrid, high-temporal-resolution object detection and tracking approach that combines learned and classical methods using synchronized images and event data. Off-the-shelf frame-based object detectors are used for initial object detection and classification. Then, event masks, generated per detection, are used to enable inter-frame tracking at varying temporal resolutions using the event data. Detections are associated across time using a simple, low-cost association metric. Moreover, we collect and label a traffic dataset using the hybrid sensor DAVIS 240c. This dataset is utilized for quantitative evaluation using state-of-the-art detection and tracking metrics. We provide ground-truth bounding boxes and object IDs for each vehicle annotation. Further, we generate high-temporal-resolution ground-truth data to analyze tracking performance at different temporal rates. Our approach shows promising results, with minimal performance deterioration at higher temporal resolutions (48-384 Hz) compared with the baseline frame-based performance at 24 Hz.
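An illustrative sketch of the hybrid idea: a frame-based detector supplies a bounding box at frame time, and between frames the box is nudged to follow the centroid of the events falling inside it, producing a high-rate trajectory. The centroid-shift update and the slice length are assumptions for illustration, not the paper's association metric or event-mask mechanism.

```python
# Sketch: high-rate inter-frame box updates driven by events inside the box.
import numpy as np

def track_between_frames(box, events, slice_us=2_604):     # ~384 Hz slices
    """box: (x0, y0, x1, y1); events: (N, 3) array of (x, y, t_us), sorted by t."""
    boxes = [np.asarray(box, dtype=np.float64)]
    t0, t1 = events[0, 2], events[-1, 2]
    for s in np.arange(t0, t1, slice_us):
        b = boxes[-1].copy()
        e = events[(events[:, 2] >= s) & (events[:, 2] < s + slice_us)]
        inside = e[(e[:, 0] >= b[0]) & (e[:, 0] <= b[2]) &
                   (e[:, 1] >= b[1]) & (e[:, 1] <= b[3])]
        if len(inside) > 0:
            # Shift the box toward the centroid of the events it contains.
            shift = inside[:, :2].mean(axis=0) - [(b[0] + b[2]) / 2, (b[1] + b[3]) / 2]
            b[[0, 2]] += shift[0]
            b[[1, 3]] += shift[1]
        boxes.append(b)
    return boxes   # high-rate box trajectory between two frame-based detections
```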

11.
Sensors (Basel) ; 11(11): 11000-20, 2011.
Article in English | MEDLINE | ID: mdl-22346684

ABSTRACT

Selective Change Driven (SCD) vision is a biologically inspired strategy for acquiring, transmitting and processing images that significantly speeds up image sensing. SCD vision is based on a new CMOS image sensor that delivers, ordered by the absolute magnitude of their change, the pixels that have changed since the last time they were read out. Moreover, as part of this biomimetic approach, the traditional full-frame processing hardware and programming methodology have to be replaced by a new processing paradigm based on pixel processing in a data-flow manner instead of full-frame image processing.


Subject(s)
Artificial Intelligence, Biomimetics, Image Processing (Computer-Assisted)/methods, Algorithms, Computer Simulation, Motion (Physics)
12.
Front Neurosci ; 14: 439, 2020.
Article in English | MEDLINE | ID: mdl-32431592

ABSTRACT

Spiking neural networks (SNNs) are potentially highly efficient models for inference on fully parallel neuromorphic hardware, but existing training methods that convert conventional artificial neural networks (ANNs) into SNNs are unable to exploit these advantages. Although ANN-to-SNN conversion has achieved state-of-the-art accuracy for static image classification tasks, the following subtle but important difference in the way SNNs and ANNs integrate information over time makes the direct application of conversion techniques for sequence processing tasks challenging. Whereas all connections in SNNs have a certain propagation delay larger than zero, ANNs assign different roles to feed-forward connections, which immediately update all neurons within the same time step, and recurrent connections, which have to be rolled out in time and are typically assigned a delay of one time step. Here, we present a novel method to obtain highly accurate SNNs for sequence processing by modifying the ANN training before conversion, such that delays induced by ANN rollouts match the propagation delays in the targeted SNN implementation. Our method builds on the recently introduced framework of streaming rollouts, which aims for fully parallel model execution of ANNs and inherently allows for temporal integration by merging paths of different delays between input and output of the network. The resulting networks achieve state-of-the-art accuracy for multiple event-based benchmark datasets, including N-MNIST, CIFAR10-DVS, N-CARS, and DvsGesture, and through the use of spatio-temporal shortcut connections yield low-latency approximate network responses that improve over time as more of the input sequence is processed. In addition, our converted SNNs are consistently more energy-efficient than their corresponding ANNs.
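A small numpy sketch of the delay-matching idea: in a streaming rollout, the activation of layer l at step t+1 is computed from layer l-1 at step t, so every connection carries a one-step delay, mirroring spike propagation in the target SNN; the output then improves step by step as more input arrives. Layer sizes and the ReLU nonlinearity are illustrative assumptions, not the paper's networks.

```python
# Sketch: ANN rollout where every connection has a one-step delay.
import numpy as np

rng = np.random.default_rng(0)
sizes = [16, 32, 10]                               # input -> hidden -> output
Ws = [rng.standard_normal((sizes[i + 1], sizes[i])) * 0.1 for i in range(2)]
relu = lambda v: np.maximum(v, 0.0)

def streaming_rollout(inputs):                     # inputs: (T, 16) event-frame sequence
    T = len(inputs)
    acts = [np.zeros((T + 1, s)) for s in sizes]   # one activation buffer per layer
    for t in range(T):
        acts[0][t + 1] = inputs[t]                 # feed the input at step t
        for l in range(1, len(sizes)):
            # Delay of one step on every connection: read step t, write step t+1.
            acts[l][t + 1] = relu(Ws[l - 1] @ acts[l - 1][t])
    return acts[-1]                                # approximate output at every step

out = streaming_rollout(rng.random((5, 16)))
```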

13.
Front Neurosci ; 14: 551, 2020.
Article in English | MEDLINE | ID: mdl-32655350

ABSTRACT

In this work, we present a neuromorphic architecture for head pose estimation and scene representation for the humanoid iCub robot. The spiking neuronal network is fully realized in Intel's neuromorphic research chip, Loihi, and precisely integrates the issued motor commands to estimate the iCub's head pose in a neuronal path-integration process. The neuromorphic vision system of the iCub is used to correct for drift in the pose estimation. Positions of objects in front of the robot are memorized using on-chip synaptic plasticity. We present real-time robotic experiments using 2 degrees of freedom (DoF) of the robot's head and show precise path integration, visual reset, and object position learning on-chip. We discuss the requirements for integrating the robotic system and neuromorphic hardware with current technologies.

14.
Front Neurorobot ; 13: 82, 2019.
Article in English | MEDLINE | ID: mdl-31649524

ABSTRACT

Object tracking based on the event-based camera or dynamic vision sensor (DVS) remains a challenging task due to noise events, rapid changes in event-stream shape, clutter from complex background textures, and occlusion. To address these challenges, this paper presents a robust event-stream object tracking method based on a correlation filter mechanism and convolutional neural network (CNN) representation. In the proposed method, rate coding is used to encode the event-stream object. Feature representations from hierarchical convolutional layers of a pre-trained CNN are used to represent the appearance of the rate-encoded event-stream object. Results show that the proposed method not only achieves good tracking performance in many complicated scenes with noise events, complex background textures, occlusion, and intersecting trajectories, but is also robust to variable scale, variable pose, and non-rigid deformations. In addition, the correlation filter-based method has the advantage of high speed. The proposed approach will promote the potential applications of these event-based vision sensors in autonomous driving, robotics and many other high-speed scenarios.
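A simple sketch of the rate-coding step: event counts per pixel over a short window are accumulated into a 2D map, which can then be fed to a pre-trained CNN or correlation filter much like an ordinary image. The window length, resolution, and normalization are illustrative choices, not the paper's exact settings.

```python
# Sketch: rate-code an event slice into a normalized 2D firing-rate map.
import numpy as np

def rate_code(events, shape=(180, 240), win_us=10_000, t0=0):
    """events: (N, 3) array of (x, y, t_us); coordinates assumed within `shape`."""
    e = events[(events[:, 2] >= t0) & (events[:, 2] < t0 + win_us)]
    img = np.zeros(shape, dtype=np.float64)
    np.add.at(img, (e[:, 1].astype(int), e[:, 0].astype(int)), 1.0)  # count per pixel
    if img.max() > 0:
        img /= img.max()            # normalize the firing-rate map to [0, 1]
    return img
```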

15.
Front Neurosci ; 12: 1047, 2018.
Article in English | MEDLINE | ID: mdl-30705618

ABSTRACT

In this work, we investigate event-based feature extraction through a rigorous framework of testing. We test a hardware-efficient variant of Spike Timing Dependent Plasticity (STDP) on a range of spatio-temporal kernels with different surface decaying methods, decay functions, receptive field sizes, feature numbers, and back-end classifiers. This detailed investigation can provide helpful insights and rules of thumb for performance vs. complexity trade-offs in more generalized networks, especially in the context of hardware implementation, where design choices can incur significant resource costs. The investigation is performed using a new dataset consisting of model airplanes being dropped free-hand close to the sensor. The target objects exhibit a wide range of relative orientations and velocities. This range of target velocities, analyzed in multiple configurations, allows a rigorous comparison of time-based decaying surfaces (time surfaces) vs. event index-based decaying surfaces (index surfaces), which are used to perform unsupervised feature extraction, followed by target detection and recognition. We examine each processing stage by comparison to the use of raw events, as well as a range of alternative layer structures and the use of random features. By comparing results from a linear classifier and an ELM classifier, we evaluate how each element of the system affects accuracy. To generate time and index surfaces, the most commonly used kernels, namely event-binning kernels and linearly and exponentially decaying kernels, are investigated. Index surfaces were found to outperform time surfaces in recognition when invariance to target velocity was made a requirement. In the investigation of network structure, larger networks of neurons with large receptive field sizes were found to perform best. We find that a small number of event-based feature extractors can project the complex spatio-temporal event patterns of the dataset to an almost linearly separable representation in feature space, with the best-performing linear classifier achieving 98.75% recognition accuracy using only 25 feature-extracting neurons.
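To make the comparison concrete, here is a sketch of the two decaying surfaces: a time surface decays with the wall-clock time elapsed since each pixel last fired, while an index surface decays with the number of events that have arrived since then, which makes it largely invariant to how fast the target moves. The exponential kernel and decay constants below are illustrative choices among the kernels the study compares.

```python
# Sketch: exponentially decaying time surface vs. index surface.
import numpy as np

def surfaces(events, shape=(128, 128), tau_us=50_000.0, tau_events=500.0):
    """events: (N, 3) array of (x, y, t_us), sorted by t."""
    last_t = np.full(shape, -np.inf)        # last firing time per pixel
    last_i = np.full(shape, -np.inf)        # event index at last firing per pixel
    for i, (x, y, t) in enumerate(events):
        last_t[int(y), int(x)] = t
        last_i[int(y), int(x)] = i
    t_now, i_now = events[-1, 2], len(events) - 1
    time_surface = np.exp(-(t_now - last_t) / tau_us)        # decays with elapsed time
    index_surface = np.exp(-(i_now - last_i) / tau_events)   # decays with event count
    return time_surface, index_surface
```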

16.
Front Neurorobot ; 12: 4, 2018.
Article in English | MEDLINE | ID: mdl-29515386

ABSTRACT

In order to safely navigate and orient in their local surroundings, autonomous systems need to rapidly extract and persistently track visual features from the environment. While there are many algorithms tackling these tasks for traditional frame-based cameras, they have to deal with the fact that conventional cameras sample their environment at a fixed frequency. Most prominently, the same features have to be found in consecutive frames and corresponding features then need to be matched using elaborate techniques, as any information between the two frames is lost. We introduce a novel method to detect and track line structures in data streams of event-based silicon retinae [also known as dynamic vision sensors (DVS)]. In contrast to conventional cameras, these biologically inspired sensors generate a quasi-continuous stream of vision information, analogous to the information stream created by the ganglion cells in mammalian retinae. All pixels of a DVS operate asynchronously without a periodic sampling rate and emit a so-called DVS address event as soon as they perceive a luminance change exceeding an adjustable threshold. We use the high temporal resolution of the DVS to track features continuously through time instead of only at fixed points in time. The focus of this work lies on tracking lines in a mostly static environment observed by a moving camera, a typical setting in mobile robotics. Since DVS events are mostly generated at object boundaries and edges, which in man-made environments often form lines, lines were chosen as the feature to track. Our method is based on detecting planes of DVS address events in x-y-t space and tracing these planes through time. It is robust against noise and runs in real time on a standard computer, making it suitable for low-latency robotics. The efficacy and performance are evaluated on real-world datasets showing artificial structures in an office building, using event data for tracking and frame data for ground-truth estimation from a DAVIS240C sensor.
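The geometric core is that a line moving through the image sweeps out a plane in (x, y, t) space, so a local cluster of DVS address events can be summarized by a least-squares plane fit. The sketch below shows only that fit; the clustering of events into candidate lines and the tracing of planes through time are omitted, and the interface is an illustrative assumption.

```python
# Sketch: fit a plane to a cluster of DVS address events in (x, y, t) space.
import numpy as np

def fit_event_plane(events):
    """events: (N, 3) array of (x, y, t) from one candidate line; returns (normal, d)."""
    pts = np.asarray(events, dtype=np.float64)
    centroid = pts.mean(axis=0)
    # The plane normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    d = -normal @ centroid
    return normal, d          # points p on the plane satisfy normal @ p + d ≈ 0

def residual(events, normal, d):
    """Mean absolute distance of events from the fitted plane."""
    return np.abs(np.asarray(events, dtype=np.float64) @ normal + d).mean()
```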

17.
Front Neurosci ; 11: 309, 2017.
Article in English | MEDLINE | ID: mdl-28611582

ABSTRACT

Neuromorphic vision research requires high-quality and appropriately challenging event-stream datasets to support continuous improvement of algorithms and methods. However, creating event-stream datasets is a time-consuming task that requires recording with neuromorphic cameras, and only a limited number of event-stream datasets are currently available. In this work, by utilizing the popular computer vision dataset CIFAR-10, we converted 10,000 frame-based images into 10,000 event streams using a dynamic vision sensor (DVS), providing an event-stream dataset of intermediate difficulty in 10 different classes, named "CIFAR10-DVS." The conversion to an event-stream dataset was implemented by a repeated closed-loop smooth (RCLS) movement of the frame-based images. Unlike conversions that move the camera over static images, moving the images themselves is more realistic with respect to practical applications. The repeated closed-loop image movement generates rich local intensity changes in continuous time, which are quantized by each pixel of the DVS camera to generate events. Furthermore, a performance benchmark for event-driven object classification is provided based on state-of-the-art classification algorithms. This work provides a large event-stream dataset and an initial benchmark for comparison, which may boost algorithm developments in event-driven pattern recognition and object classification.

18.
Neural Netw ; 66: 91-106, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25828960

ABSTRACT

This paper introduces an event-based luminance-free method to detect and match corner events from the output of asynchronous event-based neuromorphic retinas. The method relies on the use of space-time properties of moving edges. Asynchronous event-based neuromorphic retinas are composed of autonomous pixels, each of them asynchronously generating "spiking" events that encode relative changes in pixels' illumination at high temporal resolutions. Corner events are defined as the spatiotemporal locations where the aperture problem can be solved using the intersection of several geometric constraints in events' spatiotemporal spaces. A regularization process provides the required constraints, i.e. the motion attributes of the edges with respect to their spatiotemporal locations using local geometric properties of visual events. Experimental results are presented on several real scenes showing the stability and robustness of the detection and matching.


Subject(s)
Image Processing (Computer-Assisted)/methods, Models (Neurological), Visual Fields, Image Processing (Computer-Assisted)/instrumentation
19.
Neural Netw ; 45: 27-38, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23545156

ABSTRACT

This paper presents a novel N-ocular 3D reconstruction algorithm for event-based vision data from bio-inspired artificial retina sensors. Artificial retinas capture visual information asynchronously and encode it into streams of asynchronous spike-like pulse signals carrying information on, e.g., temporal contrast events in the scene. The precise times of occurrence of these visual features are implicitly encoded in the spike timings. Due to the high temporal resolution of the asynchronous visual information acquisition, the output of these sensors is ideally suited for dynamic 3D reconstruction. The presented technique takes full benefit of the event-driven operation, i.e., events are processed individually at the moment they arrive. This strategy allows us to preserve the original dynamics of the scene, thus allowing for more robust 3D reconstructions. As opposed to existing techniques, this algorithm is based on geometric and time constraints alone, making it particularly simple to implement and largely linear.


Subject(s)
Imaging (Three-Dimensional), Models (Neurological), Neurons/cytology, Neurons/physiology, Retina/cytology, Vision (Ocular)/physiology, Algorithms, Computer Simulation, Humans, Models (Statistical), Visual Perception