Results 1 - 20 of 20
1.
Light Sci Appl ; 12(1): 95, 2023 Apr 18.
Article in English | MEDLINE | ID: mdl-37072383

ABSTRACT

Electronic nose (e-nose) technology for selectively identifying a target gas through chemoresistive sensors has gained much attention for various applications, such as smart factories and personal health monitoring. To overcome the cross-reactivity of chemoresistive sensors to various gas species, we propose a novel sensing strategy based on a single micro-LED (µLED)-embedded photoactivated (µLP) gas sensor, which uses time-variant illumination to identify the species and concentrations of various target gases. A fast-changing pseudorandom voltage input is applied to the µLED to generate forced transient sensor responses. A deep neural network then analyzes the resulting complex transient signals for gas detection and concentration estimation. The proposed sensor system achieves high classification (~96.99%) and quantification (mean absolute percentage error ~31.99%) accuracies for various toxic gases (methanol, ethanol, acetone, and nitrogen dioxide) with a single gas sensor consuming 0.53 mW. The proposed method may significantly improve the efficiency of e-nose technology in terms of cost, space, and power consumption.
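
The pseudorandom drive idea can be illustrated with a linear-feedback shift register (LFSR), a standard generator of repeatable pseudorandom binary sequences. This is a generic sketch, not the paper's actual drive waveform; `lfsr_bits` is a hypothetical helper:

```python
def lfsr_bits(seed, n):
    """4-bit Fibonacci LFSR (polynomial x^4 + x^3 + 1): a cheap way to
    produce a repeatable pseudorandom binary drive signal."""
    state = seed & 0xF
    out = []
    for _ in range(n):
        out.append((state >> 3) & 1)            # emit the MSB
        fb = ((state >> 3) ^ (state >> 2)) & 1  # XOR the two top bits
        state = ((state << 1) | fb) & 0xF       # shift left, feed back
    return out

# A maximal-length 4-bit LFSR cycles through all 15 non-zero states.
drive = lfsr_bits(seed=0b1001, n=30)
```

A maximal-length register gives a balanced, repeatable sequence (eight ones per 15-step period here), which is what makes the forced transient responses informative yet reproducible.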

2.
ACS Nano ; 17(1): 539-551, 2023 01 10.
Article in English | MEDLINE | ID: mdl-36534781

ABSTRACT

As interest in air-quality monitoring for environmental pollution and industrial safety grows, demand for gas sensors is increasing rapidly. Among the various gas sensor types, the semiconductor metal oxide (SMO) sensor offers high sensitivity, low cost, mass producibility, and small size, but suffers from poor selectivity. To solve this problem, electronic nose (e-nose) systems using a gas sensor array and pattern recognition are widely used. However, as the number of sensors in an e-nose system increases, so does its total power consumption. In this study, an ultra-low-power e-nose system was developed using ultraviolet (UV) micro-LED (µLED) gas sensors and a convolutional neural network (CNN). A monolithic photoactivated gas sensor was fabricated by depositing a nanocolumnar In2O3 film coated with plasmonic metal nanoparticles (NPs) directly on the µLED. The e-nose system consists of two µLED sensors with silver and gold NP coatings, and its total power consumption was measured as 0.38 mW, one-hundredth that of a conventional heater-based e-nose system. Responses to various target gases measured by the multi-µLED gas sensors were analyzed by pattern recognition and used as training data for the CNN. As a result, a real-time, highly selective e-nose system was developed with a gas classification accuracy of 99.32% and a mean absolute gas concentration regression error of 13.82% for five different gases (air, ethanol, NO2, acetone, methanol). The µLED-based e-nose system can run stably on a battery for long periods and is expected to be widely used in environmental Internet of Things (IoT) applications.


Subject(s)
Deep Learning; Electronic Nose; Neural Networks, Computer; Silver; Gases
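
The convolution-plus-pooling feature extraction a CNN applies to sensor transients can be sketched in miniature. This is a toy illustration with assumed names (`conv1d`, `max_pool`) and made-up response values, not the paper's network:

```python
def conv1d(signal, kernel):
    """Valid-mode 1D convolution (cross-correlation): the core operation a
    CNN applies along a sensor's transient response."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(x, size):
    """Non-overlapping max pooling: downsample the feature map, keeping the
    strongest activation in each window."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

# A difference kernel responds most strongly where the response rises.
response = [0, 0, 1, 3, 3, 3]                 # toy transient (arbitrary units)
features = max_pool(conv1d(response, [-1, 1]), 2)
```

Stacking such layers (with learned kernels) turns raw transients into a compact pattern the classifier can separate by gas species.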
3.
Adv Sci (Weinh) ; 9(18): e2106017, 2022 06.
Article in English | MEDLINE | ID: mdl-35426489

ABSTRACT

A neuromorphic module of an electronic nose (E-nose) is demonstrated by hybridizing a chemoresistive gas sensor made of a semiconductor metal oxide (SMO) with a single-transistor neuron (1T-neuron) made of a metal-oxide-semiconductor field-effect transistor (MOSFET). By mimicking a biological olfactory neuron, it simultaneously detects a gas and encodes spike signals for in-sensor neuromorphic functioning. It identifies an odor source by analyzing the complicated mixed signals with a spiking neural network (SNN). The proposed E-nose does not require the conversion circuits that are essential for processing sensory signals between the sensor array and processors in a conventional bulky E-nose. It also needs no central processing unit (CPU) or memory, which are required for von Neumann computing. The spike transmission of the biological olfactory system, known to be a main factor in reducing power consumption, is realized with the SNN, saving power compared to a conventional E-nose with a deep neural network (DNN). The proposed neuromorphic E-nose is therefore promising for the Internet of Things (IoT), which demands highly scalable and energy-efficient systems. As a practical example, it is employed as an electronic sommelier, classifying different types of wine.


Subject(s)
Neural Networks, Computer; Smell; Electronic Nose; Neurons/physiology; Oxides
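
The spike-encoding principle behind such a neuromorphic front end is commonly modeled as a leaky integrate-and-fire (LIF) neuron. The sketch below is a textbook LIF model, not the 1T-neuron circuit itself; `lif_spikes` and its parameters are illustrative:

```python
def lif_spikes(current, threshold=1.0, leak=0.5):
    """Leaky integrate-and-fire neuron: the membrane potential integrates the
    input current, decays by `leak` each step, and emits a spike (then resets)
    when it crosses `threshold`."""
    v, spikes = 0.0, []
    for i in current:
        v = v * (1.0 - leak) + i
        if v >= threshold:
            spikes.append(1)
            v = 0.0            # reset after spiking
        else:
            spikes.append(0)
    return spikes
```

A stronger sensor current yields a higher spike rate, so gas concentration is naturally encoded in spike timing rather than in an analog value needing conversion circuits.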
4.
IEEE Trans Pattern Anal Mach Intell ; 44(2): 834-847, 2022 Feb.
Article in English | MEDLINE | ID: mdl-32750773

ABSTRACT

Omni-directional images are becoming more prevalent for understanding the scene in all directions around a camera, as they provide a much wider field of view (FoV) than conventional images. In this work, we present a novel representation for omni-directional images and show how to apply CNNs to it. The proposed representation uses a spherical polyhedron to reduce the distortion inevitably introduced when sampling pixels on a non-Euclidean spherical surface around the camera center. To apply the convolution operation on our representation, we stack the neighboring pixels on top of each pixel and multiply them with trainable parameters. This approach lets us apply the same CNN architectures used on conventional Euclidean 2D images to our representation in a straightforward manner. Beyond previous work, we also compare different kernel designs applicable to our method. We show that our method outperforms other state-of-the-art omni-directional image representations on the monocular depth estimation task. In addition, we propose a novel method for fitting bounding ellipses of arbitrary orientation using object detection networks and apply it to an omni-directional real-world human detection dataset.
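
The "stack neighbors, multiply by shared weights" operation generalizes convolution to an irregular vertex grid such as a spherical polyhedron. Below is a minimal sketch of that idea on a generic graph with fixed-size neighbor lists; the function name and the tiny example graph are assumptions, not the paper's actual kernel layout:

```python
def graph_conv(values, neighbors, weights):
    """Convolution on an irregular grid: for each vertex, stack its own value
    and the values of its fixed-size neighbor list, then take a dot product
    with shared trainable weights -- the same operation at every vertex,
    exactly as a CNN shares kernels across pixels."""
    out = []
    for v, nbrs in enumerate(neighbors):
        stacked = [values[v]] + [values[n] for n in nbrs]
        out.append(sum(w * x for w, x in zip(weights, stacked)))
    return out

# 4 vertices on a ring, each with two neighbors; weights [self, nbr, nbr].
values = [1, 2, 3, 4]
neighbors = [[1, 3], [0, 2], [1, 3], [2, 0]]
smoothed = graph_conv(values, neighbors, [1.0, 0.5, 0.5])
```

Because every vertex sees the same number of stacked inputs, standard CNN machinery (weight sharing, backpropagation) applies unchanged.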

5.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6890-6909, 2022 10.
Article in English | MEDLINE | ID: mdl-34260349

ABSTRACT

An event camera reports per-pixel intensity differences as an asynchronous stream of events with low latency, high dynamic range (HDR), and low power consumption. This sparse/dense event stream limits the direct use of well-known computer vision algorithms with event cameras. Vision tasks that are sensitive to image quality issues such as spatial resolution and blur, e.g., object detection, would further benefit from higher-resolution image reconstruction. Moreover, despite recent advances in event camera hardware, the majority of commercially available event cameras still have relatively low spatial resolutions compared to conventional cameras. We propose an end-to-end recurrent network to reconstruct high-resolution, HDR, and temporally consistent grayscale or color frames directly from the event stream, and extend it to generate temporally consistent videos. We evaluate our algorithm on real-world and simulated sequences and verify that it reconstructs fine details of the scene, outperforming previous methods in quantitative quality measures. We further investigate how to (1) combine active pixel sensor frames (produced by an event camera) and events in a complementary setting and (2) reconstruct images iteratively to achieve even higher image quality and resolution.


Subject(s)
Algorithms; Image Processing, Computer-Assisted; Image Processing, Computer-Assisted/methods; Software
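
The raw input such networks consume can be illustrated by the simplest event-to-image conversion: accumulating signed event polarities per pixel. This naive accumulation is only a stand-in for the learned recurrent reconstruction; `accumulate_events` is a hypothetical helper:

```python
def accumulate_events(events, width, height):
    """Integrate a stream of (x, y, polarity) events into a brightness-change
    image: each event adds its +1/-1 polarity at its pixel. Learned methods
    replace this fixed integration with a recurrent network."""
    frame = [[0] * width for _ in range(height)]
    for x, y, p in events:
        frame[y][x] += p
    return frame
```

Even this crude integral shows why events are attractive: the per-pixel sums track log-brightness change with microsecond-level timing, while the reconstruction network supplies the missing absolute intensity and spatial detail.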
6.
IEEE Trans Pattern Anal Mach Intell ; 44(6): 3048-3068, 2022 06.
Article in English | MEDLINE | ID: mdl-33513099

ABSTRACT

In recent years, deep neural models have been successful in almost every field, even solving the most complex problems. However, these models are huge, with millions (and even billions) of parameters, demanding heavy computation power, and they cannot be deployed on edge devices. Besides, the performance boost depends heavily on abundant labeled data. To achieve faster speeds and to handle the problems caused by the lack of labeled data, knowledge distillation (KD) has been proposed to transfer information learned by one model to another. KD is often characterized by the so-called 'Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. This paper surveys KD and S-T learning, which have been actively studied in recent years. First, we explain what KD is and how/why it works. Then, we provide a comprehensive survey of recent KD methods together with the S-T frameworks typically used for vision tasks. In general, we investigate the fundamental questions that have been driving this research area and thoroughly generalize the research progress and technical details. Additionally, we systematically analyze the research status of KD in vision applications. Finally, we discuss the potential and open challenges of existing methods and prospect future directions of KD and S-T learning.


Subject(s)
Algorithms; Neural Networks, Computer; Humans; Intelligence; Knowledge; Students
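
The classic distillation objective the survey covers can be written in a few lines: the student is trained to match the teacher's temperature-softened output distribution. This is the standard Hinton-style soft-target term in minimal form, with illustrative function names:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution,
    exposing the teacher's 'dark knowledge' about non-target classes."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target distillation term: cross-entropy of the student's softened
    outputs against the teacher's softened outputs."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In practice this term is combined with the usual hard-label cross-entropy, weighted by a hyperparameter, and the gradient is typically scaled by T² to keep magnitudes comparable.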
7.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 8874-8895, 2022 12.
Article in English | MEDLINE | ID: mdl-34714739

ABSTRACT

High dynamic range (HDR) imaging is a technique that captures an extensive dynamic range of exposures, which is important in image processing, computer graphics, and computer vision. In recent years, there has been significant advancement in HDR imaging using deep learning (DL). This study conducts a comprehensive survey and analysis of recent developments in deep HDR imaging methodologies. We hierarchically and structurally group existing deep HDR imaging methods into five categories based on (1) the number/domain of input exposures, (2) the number of learning tasks, (3) novel sensor data, (4) novel learning strategies, and (5) applications. Importantly, we provide a constructive discussion of each category's potential and challenges. Moreover, we review some crucial aspects of deep HDR imaging, such as datasets and evaluation metrics. Finally, we highlight open problems and point out future research directions.


Subject(s)
Deep Learning; Algorithms; Image Processing, Computer-Assisted/methods; Diagnostic Imaging; Computer Graphics
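
The classical multi-exposure baseline that deep HDR methods build on can be sketched as a weighted merge: each 8-bit sample is weighted by a hat function (down-weighting under- and over-exposed pixels), divided by its exposure time, and averaged. This is a simplified Debevec-style merge with an assumed linear camera response, not any surveyed method:

```python
def merge_hdr(exposures, times):
    """Merge aligned LDR exposures into one radiance estimate per pixel.
    `exposures` is a list of pixel lists (one per shot), `times` the
    corresponding exposure times; assumes a linear response curve."""
    def weight(z):               # hat weight on [0, 255]
        return min(z, 255 - z)
    merged = []
    for pix in zip(*exposures):  # same pixel across all shots
        num = sum(weight(z) * (z / t) for z, t in zip(pix, times))
        den = sum(weight(z) for z in pix)
        merged.append(num / den if den else 0.0)
    return merged
```

Deep methods replace the fixed weights and the linearity assumption with learned functions, which is what lets them handle motion, noise, and even single-exposure inputs.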
8.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 7657-7673, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34543191

ABSTRACT

Event cameras sense brightness changes at each pixel and yield asynchronous event streams instead of intensity images. They have distinct advantages over conventional cameras, such as high dynamic range (HDR) and no motion blur. To use event cameras with existing image-based algorithms, a few methods have been proposed to reconstruct images from event streams. However, the output images have low resolution (LR) and are unrealistic; such low-quality outputs impede the broader application of event cameras where high-quality, high-resolution (HR) images are needed. In this work, we consider the problem of reconstructing and super-resolving images from LR events when no ground truth (GT) HR images and no degradation models are available. We propose a novel end-to-end joint framework for single-image reconstruction and super-resolution from LR event data. Our method is primarily unsupervised, to handle the absence of GT images, and deploys adversarial learning. To train our framework, we constructed an open dataset including simulated events and real-world images. Using this dataset boosts the network performance, and the network architectures and the loss functions in each phase help improve the quality of the resulting images. Various experiments show that our method surpasses state-of-the-art LR image reconstruction methods on real-world and synthetic datasets. Experiments on super-resolution (SR) image reconstruction also substantiate the effectiveness of the proposed method. We further extend our method to the more challenging problems of HDR and sharp image reconstruction and color events. In addition, we demonstrate that the reconstruction and super-resolution results serve as intermediate representations of events for high-level tasks such as semantic segmentation, object recognition, and detection. Finally, we examine how events affect the outputs of the three phases and analyze our method's efficacy through an ablation study.

9.
IEEE Trans Image Process ; 30: 7541-7553, 2021.
Article in English | MEDLINE | ID: mdl-34449361

ABSTRACT

Recent advances in deep neural networks (DNNs) have facilitated high-end applications, including holistic scene understanding (HSU), in which many tasks run in parallel with the same visual input. Following this trend, various methods have been proposed to use DNNs to perform multiple vision tasks. However, these methods are task-specific and less effective when considering multiple HSU tasks. End-to-end demonstrations of adversarial examples, which generate one-to-many heterogeneous adversarial examples in parallel from the same input, are scarce. Additionally, one-to-many mapping of adversarial examples for HSU usually requires joint representation learning and flexible constraints on magnitude, which can render the prevalent attack methods ineffective. In this paper, we propose PSAT-GAN, an end-to-end framework that follows the pipeline of HSU. It is based on a mixture of generative models and an adversarial classifier that employs partial weight sharing to learn a one-to-many mapping of adversarial examples in parallel, each of which is effective for its corresponding task in HSU attacks. PSAT-GAN is further enhanced by applying novel adversarial and soft-constraint losses to generate effective perturbations and avoid studying transferability. Experimental results indicate that our method is efficient in generating both universal and image-dependent adversarial examples to fool HSU tasks under either targeted or non-targeted settings.
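
For context, the simplest gradient-based attack that PSAT-GAN is compared against conceptually is the Fast Gradient Sign Method (FGSM): perturb each input dimension by a small step in the loss-increasing direction. This is the standard single-step baseline, not the paper's generative method; `fgsm` is an illustrative helper taking a precomputed gradient:

```python
def fgsm(x, grad, eps):
    """Fast Gradient Sign Method: one-step adversarial perturbation. Each
    input dimension moves by eps in the sign direction of the loss gradient,
    bounding the perturbation's max-norm by eps."""
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]
```

Generative attacks like the one described above amortize this optimization: instead of computing a gradient per input and per task, a trained generator emits task-specific perturbations for all outputs of the pipeline in one forward pass.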

10.
Sensors (Basel) ; 18(7)2018 Jul 19.
Article in English | MEDLINE | ID: mdl-30029509

ABSTRACT

A low-cost inertial measurement unit (IMU) and a rolling-shutter camera form a conventional device configuration for localizing a mobile platform due to their complementary properties and low cost. This paper proposes a new calibration method that jointly estimates the calibration and noise parameters of a low-cost IMU and a rolling-shutter camera for effective sensor fusion, in which accurate sensor calibration is critical. Based on gray-box system identification, the proposed method estimates the unknown noise density so that we can minimize the calibration error and its covariance using the unscented Kalman filter. We then refine the estimated calibration parameters with the estimated noise density in a batch manner. Experimental results on synthetic and real data demonstrate the accuracy and stability of the proposed method and show that it provides consistent results even when the noise density of the IMU is unknown. Furthermore, a real experiment using a commercial smartphone validates the performance of the proposed calibration method on off-the-shelf devices.
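
The filtering machinery involved reduces, in one dimension, to the scalar Kalman predict/update cycle; the unscented filter used in the paper generalizes this to nonlinear models via sigma points. A minimal sketch, with an assumed constant-state model:

```python
def kalman_step(x, P, z, Q, R):
    """One predict/update cycle of a scalar Kalman filter for a constant
    state: Q is process-noise variance, R measurement-noise variance."""
    # Predict: the state model is identity, so only uncertainty grows.
    P = P + Q
    # Update: the Kalman gain blends prediction and measurement z.
    K = P / (P + R)
    x = x + K * (z - x)
    P = (1 - K) * P
    return x, P

# Fusing repeated noisy measurements of a constant bias shrinks P.
x, P = 0.0, 1.0
for z in [1.0, 1.0, 1.0]:
    x, P = kalman_step(x, P, z, Q=0.0, R=1.0)
```

Note how mis-specified Q and R distort the gain K; that sensitivity is exactly why the paper estimates the noise densities jointly with the calibration parameters.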

11.
IEEE Trans Image Process ; 27(8): 3739-3752, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29698206

ABSTRACT

Person re-identification is the problem of recognizing people across different images or videos with non-overlapping views. Although significant progress has been made in person re-identification over the last decade, it remains a challenging task because the appearance of a person can differ dramatically across camera viewpoints and poses. In this paper, we propose a novel framework for person re-identification, called pose-aware multi-shot matching, that analyzes camera viewpoints and person poses. It robustly estimates individual poses and efficiently performs multi-shot matching based on the pose information. Experimental results on public person re-identification datasets show that the proposed methods outperform the current state of the art and are promising for person re-identification under diverse viewpoints and pose variances.

12.
IEEE Trans Image Process ; 27(5): 2314-2325, 2018 May.
Article in English | MEDLINE | ID: mdl-29470169

ABSTRACT

This paper addresses the multi-attributed graph matching problem, which considers multiple attributes jointly while preserving the characteristics of each attribute. Since most conventional graph matching algorithms integrate multiple attributes into a single unified attribute in an oversimplified manner, the information from the individual attributes is often not fully utilized. To solve this problem, we propose a novel multi-layer graph structure that preserves the characteristics of each attribute in separate layers, together with a multi-attributed graph matching algorithm based on random-walk centrality over this structure. We compare the proposed algorithm with other state-of-the-art graph matching algorithms based on a single-layer structure using synthetic and real datasets, and demonstrate the superior performance of the proposed multi-layer graph structure and matching algorithm.
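
Random-walk centrality on a single-layer graph can be computed by power iteration on the walk's transition probabilities: a node's score is the long-run fraction of time a random walker spends there. This is a generic sketch on a plain adjacency matrix, not the paper's multi-layer formulation:

```python
def random_walk_centrality(adj, steps=500):
    """Stationary distribution of a simple random walk on an undirected,
    connected, non-bipartite graph, via power iteration: at each step the
    walker's mass at node i is split evenly among i's neighbors."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    score = [1.0 / n] * n
    for _ in range(steps):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                if adj[i][j]:
                    nxt[j] += score[i] / deg[i]
        score = nxt
    return score

# Triangle 0-1-2 plus pendant node 3 attached to node 2.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]
scores = random_walk_centrality(adj)
```

For an undirected graph the stationary score is proportional to node degree (here 2:2:3:1 over a degree sum of 8); the multi-layer extension changes the transition structure so each attribute layer keeps its own dynamics.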

13.
IEEE Trans Pattern Anal Mach Intell ; 40(3): 595-610, 2018 03.
Article in English | MEDLINE | ID: mdl-28410099

ABSTRACT

Online multi-object tracking aims to estimate the tracks of multiple objects instantly with each incoming frame and the information provided up to that moment. It remains a difficult problem in complex scenes because of the large ambiguity in associating multiple objects across consecutive frames and the low discriminability between object appearances. In this paper, we propose a robust online multi-object tracking method that handles these difficulties effectively. We first define a tracklet confidence using the detectability and continuity of a tracklet, and decompose the multi-object tracking problem into small subproblems based on this confidence. We then solve the online multi-object tracking problem by associating tracklets and detections in different ways according to their confidence values. With this strategy, tracklets grow sequentially with online-provided detections, and fragmented tracklets are linked with others without any iterative, expensive association steps. For more reliable association between tracklets and detections, we also propose a deep appearance learning method that learns a discriminative appearance model from large training datasets, since conventional appearance learning methods do not provide representations rich enough to distinguish multiple objects with large appearance variations. In addition, we combine online transfer learning, adapting the pre-trained deep model during online tracking to improve appearance discriminability. Experiments on challenging public datasets show distinct performance improvements over other state-of-the-art batch and online tracking methods, and demonstrate the effectiveness and usefulness of the proposed methods for online multi-object tracking.
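
The core association step in online tracking can be illustrated with the simplest greedy matcher: repeatedly commit the cheapest tracklet-detection pair whose cost is below a gate. This is a generic stand-in for the paper's confidence-ordered association, with assumed names:

```python
def greedy_associate(cost, max_cost):
    """Greedy one-to-one tracklet-detection association. `cost[t][d]` is the
    matching cost (e.g. appearance + motion distance); pairs above `max_cost`
    are gated out, and each tracklet/detection is used at most once."""
    pairs, used_t, used_d = [], set(), set()
    candidates = sorted((c, t, d)
                        for t, row in enumerate(cost)
                        for d, c in enumerate(row))
    for c, t, d in candidates:
        if c <= max_cost and t not in used_t and d not in used_d:
            pairs.append((t, d))
            used_t.add(t)
            used_d.add(d)
    return sorted(pairs)
```

Detections left unmatched would seed new tracklets, and tracklets unmatched for several frames would be terminated; confidence-based methods refine this by letting high-confidence tracklets claim detections first.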

14.
J Integr Neurosci ; 16(3): 255-273, 2017.
Article in English | MEDLINE | ID: mdl-28891514

ABSTRACT

Due to the recent explosion of various forms of 3D content, evaluating such content from a neuroscience perspective is quite interesting. However, investigations of cortical oscillatory responses in stereoscopic depth perception are rare. We therefore investigated spatio-temporal and spatio-temporo-spectral features at four stereoscopic depths within the comfort zone. We adopted simultaneous EEG/MEG acquisition to collect the oscillatory responses of eight participants, defined subject-specific retinal disparities, and designed a single-trial stereoscopic viewing paradigm. In the group analysis, we observed that, as depth increased from Level 1 to Level 3, there was a time-locked increase in the N200 component in MEG and the P300 component in EEG, in the occipital and parietal areas, respectively. In addition, initial alpha and beta event-related desynchronizations (ERD) were observed at approximately 500 to 1000 ms, while theta, alpha, and beta event-related synchronizations (ERS) appeared at approximately 1000 to 2000 ms. Interestingly, the increase in cognitive responses, including N200, P300, and alpha ERD, saturated even though depth increased only within the comfort zone. Meanwhile, the magnitude of low-beta ERD in the dorsal pathway decreased as depth increased. From these findings, we conclude that cognitive responses are likely to saturate within the visual comfort zone, while perceptual load may increase with depth.


Subject(s)
Brain/physiology; Cognition/physiology; Depth Perception/physiology; Electroencephalography; Magnetoencephalography; Cortical Synchronization; Electroencephalography/methods; Evoked Potentials; Female; Humans; Magnetoencephalography/methods; Male; Motion Perception/physiology; Multimodal Imaging/methods; Photic Stimulation/methods; Signal Processing, Computer-Assisted; Young Adult
15.
Appl Ergon ; 62: 158-167, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28411726

ABSTRACT

Recent advances in three-dimensional (3D) video technology have extended the range of our experience, bringing various 3D applications into everyday life. Nevertheless, the so-called visual discomfort (VD) problem inevitably degrades the quality of experience in stereoscopic 3D (S3D) displays. Meanwhile, electroencephalography (EEG) has been regarded as one of the most promising brain imaging modalities in cognitive neuroscience. To facilitate comfortable use of S3D displays, we propose a new wellness platform based on EEG. We first identify EEG features that can serve as an index of VD perception in practical S3D video systems. We then develop a framework that automatically detects severe VD perception from these EEG features during S3D video viewing, capitalizing on machine-learning-based brain-computer interface technology. The proposed platform can cooperate with advanced S3D video systems whose stereo baseline is adjustable, so that the optimal S3D content can be reconstructed according to a viewer's sensation of VD. Applications of the proposed platform to various S3D industries are suggested, and further technical challenges are discussed for follow-up research.


Subject(s)
Depth Perception; Imaging, Three-Dimensional/adverse effects; Vision Disorders/physiopathology; Visual Perception/physiology; Adult; Consumer Behavior; Electroencephalography; Humans; Imaging, Three-Dimensional/instrumentation; Male; Signal Processing, Computer-Assisted; Support Vector Machine; Video Recording; Vision Disorders/etiology; Young Adult
16.
Neurosignals ; 24(1): 102-112, 2016.
Article in English | MEDLINE | ID: mdl-27771723

ABSTRACT

BACKGROUND/AIMS: In exploring human factors, stereoscopic 3D images have been used to investigate the neural responses associated with excessive depth, texture complexity, and other factors. However, the cortical oscillation associated with the complexity of stereoscopic images has rarely been studied. Here, we demonstrate that the oscillatory responses to three differently shaped 3D images (circle, star, and bat) increase as the complexity of the image increases. METHODS: We recorded simultaneous EEG/MEG for the three stimuli. Spatio-temporal and spatio-spectro-temporal features were investigated by a non-parametric permutation test. RESULTS: N300 and alpha inhibition increased in the ventral area as the shape complexity of the stereoscopic image increased. CONCLUSION: The relative disparity in complex stereoscopic images appears to increase cognitive processing (N300) and cortical load (alpha inhibition) in the ventral area.

17.
IEEE Trans Pattern Anal Mach Intell ; 38(5): 903-17, 2016 May.
Article in English | MEDLINE | ID: mdl-26336117

ABSTRACT

A robust algorithm is proposed for tracking a target object in dynamic conditions including motion blurs, illumination changes, pose variations, and occlusions. To cope with these challenging factors, multiple trackers based on different feature representations are integrated within a probabilistic framework. Each view of the proposed multiview (multi-channel) feature learning algorithm is concerned with one particular feature representation of a target object from which a tracker is developed with different levels of reliability. With the multiple trackers, the proposed algorithm exploits tracker interaction and selection for robust tracking performance. In the tracker interaction, a transition probability matrix is used to estimate dependencies between trackers. Multiple trackers communicate with each other by sharing information of sample distributions. The tracker selection process determines the most reliable tracker with the highest probability. To account for object appearance changes, the transition probability matrix and tracker probability are updated in a recursive Bayesian framework by reflecting the tracker reliability measured by a robust tracker likelihood function that learns to account for both transient and stable appearance changes. Experimental results on benchmark datasets demonstrate that the proposed interacting multiview algorithm performs robustly and favorably against state-of-the-art methods in terms of several quantitative metrics.
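
The interaction-and-selection idea resembles an HMM forward step: tracker probabilities are propagated through a transition matrix, reweighted by each tracker's current likelihood, and the most reliable tracker is selected. The sketch below is a generic forward step under that interpretation, not the paper's exact update; `tracker_select` and its arguments are illustrative:

```python
def tracker_select(prob, trans, likelihood):
    """One forward step of tracker selection: propagate tracker reliability
    probabilities `prob` through transition matrix `trans` (trans[i][j] is
    the probability of switching from tracker i to j), reweight by each
    tracker's observation `likelihood`, renormalize, and pick the best."""
    n = len(prob)
    pred = [sum(trans[i][j] * prob[i] for i in range(n)) for j in range(n)]
    post = [pred[j] * likelihood[j] for j in range(n)]
    s = sum(post)
    post = [p / s for p in post]
    return post, max(range(n), key=post.__getitem__)

# Two trackers that rarely switch; tracker 0 currently explains the data best.
post, best = tracker_select([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [0.9, 0.1])
```

Recursing this update is what lets the ensemble shift reliability toward whichever feature representation (color, texture, etc.) currently tracks the target well.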

18.
IEEE Trans Med Imaging ; 34(11): 2379-93, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26011864

ABSTRACT

Recent achievements in learning-based classification have led to noticeable performance improvements in automatic polyp detection, where building large, high-quality datasets is crucial for learning a reliable detector. In practice, however, this is challenging due to the diversity of polyp types, expensive inspection, and labor-intensive labeling. For this reason, polyp datasets tend to be imbalanced, i.e., the number of non-polyp samples is much larger than that of polyp samples, and learning from such imbalanced datasets yields a detector biased toward the non-polyp class. In this paper, we propose a data-sampling-based boosting framework to learn an unbiased polyp detector from imbalanced datasets. In our learning scheme, we learn multiple weak classifiers on datasets rebalanced by up/down sampling and combine them into a polyp detector. In addition, to enhance discriminability between polyps and non-polyps with similar appearances, we propose an effective feature learning method based on partial least squares analysis and use it to learn compact and discriminative features. Experimental results on challenging datasets show clear performance improvements over other detectors. We further demonstrate the effectiveness and usefulness of the proposed methods with extensive evaluation.


Subject(s)
Colonic Polyps/diagnosis; Colonoscopy/methods; Image Interpretation, Computer-Assisted/methods; Machine Learning; Algorithms; Colon/pathology; Colonic Polyps/pathology; Humans; Least-Squares Analysis
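
The up-sampling half of the rebalancing scheme can be sketched in a few lines: the minority class is resampled with replacement until it matches the majority class size. This is a generic illustration (a hypothetical `rebalance` helper on plain lists), not the paper's boosting-integrated sampler:

```python
import random

def rebalance(pos, neg, seed=0):
    """Up-sample the minority class (with replacement) to match the majority
    class size, so a classifier trained on the result sees both classes
    equally often."""
    rng = random.Random(seed)   # fixed seed for reproducibility
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    upsampled = minority + [rng.choice(minority)
                            for _ in range(len(majority) - len(minority))]
    return upsampled, majority
```

In a boosting framework, each weak learner can instead see a differently rebalanced resample (up-sampling polyps or down-sampling non-polyps), so the ensemble averages out the sampling noise.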
19.
IEEE Trans Image Process ; 23(7): 2820-33, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24801247

ABSTRACT

In this paper, we consider the multi-object tracking problem in complex scenes. Unlike batch tracking systems that use detections from the entire sequence, we propose a novel online multi-object tracking system that builds tracks sequentially from online-provided detections. To track objects robustly even under frequent occlusions, the proposed system consists of three main parts: 1) visual tracking with a novel data association using a track existence probability, which associates online detections with the corresponding tracks under partial occlusions; 2) track management, which associates terminated tracks to link tracks fragmented by long-term occlusions; and 3) online model learning, which generates discriminative appearance models for successful associations in the other two parts. Experimental results on challenging public datasets show clear performance improvements of the proposed system over other state-of-the-art tracking systems. Furthermore, extensive analysis of the three main parts demonstrates the effectiveness and usefulness of each component for multi-object tracking.

20.
IEEE Trans Pattern Anal Mach Intell ; 28(4): 650-6, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16566513

ABSTRACT

We present a new window-based method for correspondence search using varying support-weights. We adjust the support-weights of the pixels in a given support window based on color similarity and geometric proximity to reduce the image ambiguity. Our method outperforms other local methods on standard stereo benchmarks.


Subject(s)
Algorithms; Colorimetry/methods; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Photogrammetry/methods; Artificial Intelligence; Biomimetics/methods; Feedback; Humans; Signal Processing, Computer-Assisted
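
The adaptive support-weight idea can be sketched directly: a window pixel's influence on the aggregated matching cost decays exponentially with both its color difference from the window center and its spatial distance. This is a simplified grayscale version with assumed parameter values (`gamma_c`, `gamma_p`), not the paper's full CIELab formulation:

```python
import math

def support_weight(color_diff, dist, gamma_c=10.0, gamma_p=10.0):
    """Support weight of a window pixel: large when the pixel is similar in
    color and spatially close to the window's center pixel."""
    return math.exp(-(color_diff / gamma_c + dist / gamma_p))

def window_cost(pixels):
    """Weighted aggregation of per-pixel matching costs; `pixels` holds
    (color_diff_to_center, distance_to_center, raw_cost) triples."""
    num = sum(support_weight(c, d) * e for c, d, e in pixels)
    den = sum(support_weight(c, d) for c, d, _ in pixels)
    return num / den
```

Because dissimilar pixels get near-zero weight, the window effectively adapts its shape to the object containing the center pixel, which is what reduces ambiguity near depth discontinuities.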