Results 1 - 20 of 23
1.
Sensors (Basel) ; 24(10)2024 May 18.
Article in English | MEDLINE | ID: mdl-38794068

ABSTRACT

Most facial analysis methods perform well in standardized testing but not in real-world testing. The main reason is that training models cannot easily learn various human features and background noise, especially for facial landmark detection and head pose estimation tasks with limited and noisy training datasets. To alleviate the gap between standardized and real-world testing, we propose a pseudo-labeling technique using a face recognition dataset consisting of various people and background noise. The use of our pseudo-labeled training dataset can help to overcome the lack of diversity among the people in the dataset. Our integrated framework is constructed using complementary multitask learning methods to extract robust features for each task. Furthermore, introducing pseudo-labeling and multitask learning improves the face recognition performance by enabling the learning of pose-invariant features. Our method achieves state-of-the-art (SOTA) or near-SOTA performance on the AFLW2000-3D and BIWI datasets for facial landmark detection and head pose estimation, with competitive face verification performance on the IJB-C test dataset for face recognition. We demonstrate this through a novel testing methodology that categorizes cases as soft, medium, and hard based on the pose values of IJB-C. The proposed method achieves stable performance even when the dataset lacks diverse facial identities.
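As an illustration of the confidence-thresholded pseudo-labeling idea (a generic sketch, not the authors' pipeline; the threshold value and the three-class toy probabilities are invented for the example):

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Assign a pseudo-label to each unlabeled sample whose top class
    probability exceeds the threshold; return -1 for rejected samples."""
    probs = np.asarray(probs)
    top = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    return np.where(top >= threshold, labels, -1)

# Example: predictions from a model pretrained on the labeled set.
probs = np.array([[0.95, 0.03, 0.02],   # confident -> keep label 0
                  [0.40, 0.35, 0.25],   # uncertain -> reject (-1)
                  [0.05, 0.92, 0.03]])  # confident -> keep label 1
print(pseudo_label(probs))  # [ 0 -1  1]
```

Rejected samples (label -1) would simply be excluded from the pseudo-labeled training set.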


Subject(s)
Automated Facial Recognition , Face , Head , Humans , Face/anatomy & histology , Face/diagnostic imaging , Head/diagnostic imaging , Automated Facial Recognition/methods , Algorithms , Machine Learning , Facial Recognition , Databases, Factual , Image Processing, Computer-Assisted/methods
2.
Sensors (Basel) ; 24(7)2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38610439

ABSTRACT

Video-based person re-identification (ReID) aims to exploit relevant features from spatial and temporal knowledge. Widely used methods include the part- and attention-based approaches for suppressing irrelevant spatial-temporal features. However, it is still challenging to overcome inconsistencies across video frames due to occlusion and imperfect detection. These mismatches make temporal processing ineffective and create an imbalance of crucial spatial information. To address these problems, we propose the Spatiotemporal Multi-Granularity Aggregation (ST-MGA) method, which is specifically designed to accumulate relevant features with spatiotemporally consistent cues. The proposed framework consists of three main stages: extraction, which extracts spatiotemporally consistent partial information; augmentation, which augments the partial information with different granularity levels; and aggregation, which effectively aggregates the augmented spatiotemporal information. We first introduce the consistent part-attention (CPA) module, which extracts spatiotemporally consistent and well-aligned attentive parts. Sub-parts derived from CPA provide temporally consistent semantic information, solving misalignment problems in videos due to occlusion or inaccurate detection, and maximize the efficiency of aggregation through uniform partial information. To enhance the diversity of spatial and temporal cues, we introduce the Multi-Attention Part Augmentation (MA-PA) block, which incorporates fine parts at various granular levels, and the Long-/Short-term Temporal Augmentation (LS-TA) block, designed to capture both long- and short-term temporal relations. Using densely separated part cues, ST-MGA fully exploits and aggregates the spatiotemporal multi-granular patterns by comparing relations between parts and scales. In the experiments, the proposed ST-MGA renders state-of-the-art performance on several video-based ReID benchmarks (i.e., MARS, DukeMTMC-VideoReID, and LS-VID).
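A common building block behind part-based video ReID is stripe pooling at several granularities; the following is a generic sketch of that idea only (the CPA, MA-PA, and LS-TA modules above are far more involved, and the feature map here is random toy data):

```python
import numpy as np

def split_parts(feat_map, num_parts):
    """Average-pool horizontal stripes of an H x W x C feature map into
    part features; varying num_parts yields multiple granularity levels."""
    h = feat_map.shape[0]
    bounds = np.linspace(0, h, num_parts + 1).astype(int)
    return np.stack([feat_map[a:b].mean(axis=(0, 1))
                     for a, b in zip(bounds[:-1], bounds[1:])])

fm = np.random.default_rng(1).random((12, 4, 8))  # toy H x W x C feature map
for g in (1, 2, 4):  # three granularity levels
    print(split_parts(fm, g).shape)  # (g, 8)
```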

3.
Molecules ; 28(20)2023 Oct 10.
Article in English | MEDLINE | ID: mdl-37894485

ABSTRACT

Lowering blood cholesterol levels is crucial for reducing the risk of cardiovascular disease in patients with familial hypercholesterolemia. To develop Perilla frutescens (L.) Britt. leaves as a functional food with a cholesterol-lowering effect, in this study we collected P. frutescens (L.) Britt. leaves from different regions of China and the Republic of Korea. On the basis of the extraction yield (all components; g/kg), we selected P. frutescens (L.) Britt. leaves from Hebei Province, China, with an extract yield of 60.9 g/kg. After evaluating different concentrations of ethanol/water solvent for P. frutescens (L.) Britt. leaves, with luteolin 7-glucuronide as the indicator component, we selected a 30% ethanol/water solvent with a high luteolin 7-glucuronide content of 0.548 mg/g in P. frutescens (L.) Britt. leaves. Subsequently, we evaluated the cholesterol-lowering effects of P. frutescens (L.) Britt. leaf extract and luteolin 7-glucuronide by measuring total cholesterol in HepG2 cells. The 30% ethanol extract significantly lowered cholesterol levels by downregulating 3-hydroxy-3-methyl-glutaryl-coenzyme A reductase expression. This suggests that P. frutescens (L.) Britt. leaves have significant health benefits and can be explored as a potentially promising food additive for the prevention of hypercholesterolemia-related diseases.


Subject(s)
Perilla frutescens , Humans , Glucuronides , Luteolin , Plant Extracts/pharmacology , Solvents , Ethanol , Cholesterol , Water , Plant Leaves
4.
Sensors (Basel) ; 21(22)2021 Nov 11.
Article in English | MEDLINE | ID: mdl-34833572

ABSTRACT

In recent times, as interest in stress control has increased, many studies on stress recognition have been conducted. Several studies have been based on physiological signals, but the disadvantage of this strategy is that it requires physiological-signal-acquisition devices. Another strategy employs facial-image-based stress-recognition methods, which do not require devices, but predominantly use handcrafted features. However, such features have low discriminating power. We propose a deep-learning-based stress-recognition method using facial images to address these challenges. Given that deep-learning methods require extensive data, we constructed a large-capacity image database for stress recognition. Furthermore, we used temporal attention, which assigns a high weight to frames that are highly related to stress, as well as spatial attention, which assigns a high weight to regions that are highly related to stress. By adding a network that inputs the facial landmark information closely related to stress, we supplemented the network that receives only facial images as the input. Experimental results on our newly constructed database indicated that the proposed method outperforms contemporary deep-learning-based recognition methods.
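The temporal-attention step can be pictured as a softmax weighting over per-frame features; this is a minimal generic sketch, not the paper's network (the relevance scores would come from a learned sub-network, here they are hand-picked):

```python
import numpy as np

def temporal_attention(frame_feats, scores):
    """Aggregate per-frame features with softmax attention weights so that
    frames scored as stress-relevant contribute more to the clip feature."""
    scores = np.asarray(scores, dtype=float)
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ np.asarray(frame_feats, dtype=float)

feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 frames, 2-D features
clip = temporal_attention(feats, scores=[0.1, 2.0, 0.1])  # frame 1 dominates
print(clip.shape)  # (2,)
```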


Subject(s)
Deep Learning , Facial Recognition , Databases, Factual , Face , Facial Expression
5.
Sensors (Basel) ; 21(22)2021 Nov 17.
Article in English | MEDLINE | ID: mdl-34833717

ABSTRACT

Multi-person pose estimation has been gaining considerable interest due to its use in several real-world applications, such as activity recognition, motion capture, and augmented reality. Although the improvement of the accuracy and speed of multi-person pose estimation techniques has been recently studied, limitations still exist in balancing these two aspects. In this paper, a novel knowledge-distilled lightweight top-down pose network (KDLPN) is proposed that balances computational complexity and accuracy. For the first time in multi-person pose estimation, a network is presented that reduces computational complexity by applying a "Pelee" structure and by shuffling pixels in the dense upsampling convolution layer to reduce the number of channels. Furthermore, to prevent performance degradation caused by the reduced computational complexity, knowledge distillation is applied by establishing the pose estimation network as a teacher network. The method's performance is evaluated on the MSCOCO dataset. Experimental results demonstrate that our KDLPN reduces the parameters required by state-of-the-art methods by 95% with minimal performance degradation. Moreover, our method is compared with other pose estimation methods to substantiate the importance of computational complexity reduction and its effectiveness.
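Knowledge distillation of this kind typically minimizes the KL divergence between temperature-softened teacher and student outputs; a minimal sketch of that standard loss follows (the temperature T=4 is an illustrative choice, not a value from the paper):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over a logit vector."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions;
    the usual T^2 factor keeps gradient magnitudes comparable across T."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

t = [2.0, 1.0, 0.1]
print(distillation_loss(t, t))                    # 0.0 when student matches teacher
print(distillation_loss([0.0, 0.0, 0.0], t) > 0)  # True for a mismatched student
```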


Subject(s)
Posture , Humans
6.
Sensors (Basel) ; 18(1)2018 Jan 20.
Article in English | MEDLINE | ID: mdl-29361699

ABSTRACT

The decision tree is one of the most effective tools for deriving meaningful outcomes from image data acquired by visual sensors. Owing to its reliability, superior generalization ability, and easy implementation, the tree model has been widely used in various applications. However, in image classification problems, conventional tree methods use only a few sparse attributes as the splitting criterion. Consequently, they suffer from several drawbacks in terms of performance and environmental sensitivity. To overcome these limitations, this paper introduces a new tree induction algorithm that classifies images on the basis of local area learning. To train our predictive model, we extract a random local area within the image and use it as a feature for classification. In addition, the self-organizing map, which is a clustering technique, is used for node learning. We also adopt a random sampled optimization technique to search for the optimal node. Finally, each trained node stores the weights that represent the training data and class probabilities. Thus, a recursively trained tree classifies the data hierarchically based on the local similarity at each node. The proposed tree is a type of predictive model that offers benefits in terms of the image's semantic energy conservation compared with conventional tree methods. Consequently, it exhibits improved performance under various conditions, such as noise and illumination changes. Moreover, the proposed algorithm can improve the generalization ability owing to its randomness. In addition, it can be easily applied to ensemble techniques. To evaluate the performance of the proposed algorithm, we perform quantitative and qualitative comparisons with various tree-based methods using four image datasets. The results show that our algorithm not only achieves a lower classification error than the conventional methods but also exhibits stable performance even under unfavorable conditions such as noise and illumination changes.
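The node-level idea (split on the similarity of a random local area to stored weights, rather than on a single attribute) can be sketched as follows; the prototypes here simply stand in for weights that the paper learns with a self-organizing map:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_local_area(image, top, left, size):
    """Crop a local area of the image used as the splitting feature at a node."""
    return image[top:top + size, left:left + size]

def route(patch, node_weights):
    """Send the sample to the child whose stored prototype is closer, i.e.
    split on local similarity rather than on a single sparse attribute."""
    d = [np.linalg.norm(patch - w) for w in node_weights]
    return int(np.argmin(d))

image = rng.random((8, 8))
patch = extract_local_area(image, top=2, left=3, size=4)
prototypes = [np.zeros((4, 4)), np.ones((4, 4))]  # toy node weights
print(route(patch, prototypes))  # 0 or 1, whichever prototype is nearer
```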

7.
Psychiatry Clin Neurosci ; 71(10): 725-732, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28547882

ABSTRACT

AIM: The current cut-off score of the Korean version of the Childhood Autism Rating Scale (K-CARS) does not seem to be sensitive enough to precisely diagnose high-functioning autism. The aim of this study was to identify the optimal cut-off score of K-CARS for diagnosing high-functioning individuals with autism spectrum disorders (ASD). METHODS: A total of 329 participants were assessed by the Korean versions of the Autism Diagnostic Interview - Revised (K-ADI-R), Autism Diagnostic Observation Schedule (K-ADOS), and K-CARS. IQ and Social Maturity Scale scores were also obtained. RESULTS: The true positive and false negative rates of K-CARS were 77.2% and 22.8%, respectively. Verbal IQ (VIQ) and Social Quotient (SQ) were significant predictors of misclassification. The false negative rate increased to 36.0% from 19.8% when VIQ was >69.5, and the rate increased to 44.1% for participants with VIQ > 69.5 and SQ > 75.5. In addition, if SQ was >83.5, the false negative rate increased to 46.7%, even if the participant's VIQ was ≤69.5. Optimal cut-off scores were 28.5 (for VIQ ≤ 69.5 and SQ ≤ 75.5), 24.25 (for VIQ > 69.5 and SQ > 75.5), and 24.5 (for SQ > 83.5), respectively. CONCLUSION: The likelihood of a false negative error increases when K-CARS is used to diagnose high-functioning autism and Asperger's syndrome. For subjects with ASD and substantial verbal ability, the cut-off score for K-CARS should be re-adjusted and/or supplementary diagnostic tools might be needed to enhance diagnostic accuracy for ASD.
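The false-negative rate at a given cut-off is simply the fraction of diagnosed cases scoring below it. A sketch with invented scores (the 30-point value reflects the commonly used CARS cut-off, and 24.25 is one of the optimal cut-offs reported above; the individual scores are illustrative only):

```python
def false_negative_rate(scores, cutoff):
    """Fraction of true ASD cases that a rating-scale cut-off would miss:
    a case is flagged only when its score meets or exceeds the cut-off."""
    missed = sum(1 for s in scores if s < cutoff)
    return missed / len(scores)

# Hypothetical K-CARS scores for diagnosed high-functioning cases.
scores = [33.0, 29.5, 27.0, 25.5, 24.0]
print(false_negative_rate(scores, cutoff=30.0))   # 0.8 at the standard cut-off
print(false_negative_rate(scores, cutoff=24.25))  # 0.2 with a lowered cut-off
```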


Subject(s)
Autism Spectrum Disorder/diagnosis , Psychiatric Status Rating Scales/standards , Adolescent , Child , Child, Preschool , False Negative Reactions , Female , Humans , Korea , Male , Young Adult
8.
Sensors (Basel) ; 17(7)2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28671565

ABSTRACT

Studies on depth images containing three-dimensional information have been performed for many practical applications. However, the depth images acquired from depth sensors have inherent problems, such as missing values and noisy boundaries. These problems significantly affect the performance of applications that use a depth image as their input. This paper describes a depth enhancement algorithm based on a combination of color and depth information. To fill depth holes and recover object shapes, asynchronous cellular automata with neighborhood distance maps are used. Image segmentation and a weighted linear combination of spatial filtering algorithms are applied to extract object regions and fill disocclusion in the object regions. Experimental results on both real-world and public datasets show that the proposed method enhances the quality of the depth image with low computational complexity, outperforming conventional methods on a number of metrics. Furthermore, to verify the performance of the proposed method, we present stereoscopic images generated by the enhanced depth image to illustrate the improvement in quality.
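The hole-filling step can be approximated by an iterative neighbor-averaging rule; this is a strong simplification of the asynchronous cellular automata with neighborhood distance maps used in the paper:

```python
import numpy as np

def fill_depth_holes(depth, hole=0.0, iters=10):
    """Iteratively replace hole pixels with the mean of their valid
    4-neighbours, a simplified stand-in for the cellular-automata fill."""
    d = np.asarray(depth, dtype=float).copy()
    for _ in range(iters):
        holes = np.argwhere(d == hole)
        if holes.size == 0:
            break
        for y, x in holes:
            vals = [d[ny, nx]
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                    if 0 <= ny < d.shape[0] and 0 <= nx < d.shape[1]
                    and d[ny, nx] != hole]
            if vals:
                d[y, x] = np.mean(vals)
    return d

depth = np.array([[1.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0],   # centre pixel is a missing value
                  [1.0, 1.0, 1.0]])
print(fill_depth_holes(depth)[1, 1])  # 1.0
```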

9.
Sensors (Basel) ; 17(1)2017 Jan 17.
Article in English | MEDLINE | ID: mdl-28106716

ABSTRACT

Research on hand gestures has attracted many image processing-related studies, as a gesture intuitively conveys a human's motional intention. Various sensors have been used to exploit the advantages of different modalities for the extraction of important information conveyed by the hand gesture of a user. Although many works have explored the benefits of thermal information from thermal cameras, most have focused on face recognition or human body detection rather than hand gesture recognition. Additionally, the majority of works that take advantage of multiple modalities (e.g., the combination of a thermal sensor and a visual sensor) usually adopt simple fusion approaches between the two modalities. As both thermal sensors and visual sensors have their own shortcomings and strengths, we propose a novel joint filter-based hand gesture recognition method to simultaneously exploit the strengths and compensate for the shortcomings of each. Our study is motivated by the investigation of the mutual supplementation between thermal and visual information at a low feature level for the consistent representation of a hand under varying lighting conditions. Accordingly, our proposed method leverages the thermal sensor's stability against luminance and the visual sensor's textural detail, while complementing the low resolution and halo effect of thermal sensors and the weakness of visual sensors against illumination. A conventional region tracking method and a deep convolutional neural network are leveraged to track the trajectory of a hand gesture and to recognize the hand gesture, respectively. Our experimental results show stable hand gesture recognition under varying lighting conditions based on the contribution of the joint kernels of spatial adjacency and thermal range similarity.
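The joint kernel of spatial adjacency and thermal range similarity mentioned above resembles a joint bilateral weight; here is a 1-D generic sketch with invented temperatures and sigma values:

```python
import numpy as np

def joint_kernel(positions, thermal, center, sigma_s=1.0, sigma_t=5.0):
    """Weights combining spatial adjacency with thermal range similarity,
    in the spirit of a joint bilateral filter guided by the thermal image."""
    positions = np.asarray(positions, dtype=float)
    thermal = np.asarray(thermal, dtype=float)
    spatial = np.exp(-(positions - positions[center]) ** 2 / (2 * sigma_s ** 2))
    range_t = np.exp(-(thermal - thermal[center]) ** 2 / (2 * sigma_t ** 2))
    w = spatial * range_t
    return w / w.sum()

# 1-D toy signal: pixels 0-2 belong to a warm hand, 3-4 to cool background.
pos = [0, 1, 2, 3, 4]
temp = [36.0, 36.2, 35.9, 22.0, 21.5]
w = joint_kernel(pos, temp, center=1)
print(w.argmax())   # 1: the centre pixel gets the largest weight
print(w[2] > w[3])  # True: the warm neighbour outweighs the cool one
```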


Subject(s)
Gestures , Hand , Humans , Image Processing, Computer-Assisted , Motion , Pattern Recognition, Automated , Photic Stimulation
10.
Sensors (Basel) ; 15(1): 1537-63, 2015 Jan 14.
Article in English | MEDLINE | ID: mdl-25594594

ABSTRACT

In order to develop security systems for identity authentication, face recognition (FR) technology has been applied. One of the main problems of applying FR technology is that the systems are especially vulnerable to attacks with spoofing faces (e.g., 2D pictures). To defend against these attacks and to enhance the reliability of FR systems, many anti-spoofing approaches have been developed recently. In this paper, we propose a method for face liveness detection using the effect of defocus. From two images sequentially taken at different focuses, three features are extracted: focus, power histogram, and gradient location and orientation histogram (GLOH). Afterwards, we detect forged faces through a feature-level fusion approach. For reliable performance verification, we develop two databases with a handheld digital camera and a webcam. The proposed method achieves a 3.29% half total error rate (HTER) at a given depth of field (DoF) and can be extended to camera-equipped devices, like smartphones.
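Comparing sharpness across the two focus settings requires a focus measure; the variance of the Laplacian is one common choice and is shown here as a generic stand-in (the paper's exact focus feature may differ):

```python
import numpy as np

def focus_measure(img):
    """Variance of a discrete Laplacian: a simple sharpness score that is
    higher for in-focus regions (one of several usable focus measures)."""
    img = np.asarray(img, dtype=float)
    lap = (-4 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())

sharp = np.zeros((8, 8)); sharp[:, 4:] = 1.0               # crisp step edge
blurred = np.linspace(0, 1, 8)[None, :].repeat(8, axis=0)  # smooth ramp
print(focus_measure(sharp) > focus_measure(blurred))  # True
```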

11.
Sensors (Basel) ; 15(1): 1022-46, 2015 Jan 08.
Article in English | MEDLINE | ID: mdl-25580901

ABSTRACT

Vision-based hand gesture interactions are natural and intuitive when interacting with computers, since we naturally exploit gestures to communicate with other people. However, it is agreed that users suffer from discomfort and fatigue when using gesture-controlled interfaces, due to the lack of physical feedback. To solve the problem, we propose a novel complete solution of a hand gesture control system employing immersive tactile feedback to the user's hand. For this goal, we first developed a fast and accurate hand-tracking algorithm with a Kinect sensor using the proposed MLBP (modified local binary pattern) that can efficiently analyze 3D shapes in depth images. The superiority of our tracking method was verified in terms of tracking accuracy and speed by comparing with existing methods, Natural Interaction Technology for End-user (NITE), 3D Hand Tracker and CamShift. As the second step, a new tactile feedback technology with a piezoelectric actuator has been developed and integrated into the developed hand tracking algorithm, including the DTW (dynamic time warping) gesture recognition algorithm for a complete solution of an immersive gesture control system. The quantitative and qualitative evaluations of the integrated system were conducted with human subjects, and the results demonstrate that our gesture control with tactile feedback is a promising technology compared to a vision-based gesture control system that has typically no feedback for the user's gesture inputs. Our study provides researchers and designers with informative guidelines to develop more natural gesture control systems or immersive user interfaces with haptic feedback.
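The DTW gesture matching step mentioned above can be sketched with the classic dynamic-programming recurrence (1-D toy sequences; real hand trajectories would be 2-D or 3-D):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping between two 1-D sequences, the matching
    step used to compare a tracked gesture trajectory with templates."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

template = [0, 1, 2, 1, 0]
same_speed = [0, 1, 2, 1, 0]
slower = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
print(dtw_distance(template, same_speed))  # 0.0
print(dtw_distance(template, slower))      # 0.0: warping absorbs the tempo change
```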


Subject(s)
Air , Feedback, Physiological , Gestures , Hand/physiology , Photography/instrumentation , Touch/physiology , Adult , Algorithms , Female , Humans , Male , Signal Processing, Computer-Assisted , Surveys and Questionnaires
12.
Sensors (Basel) ; 14(6): 10412-31, 2014 Jun 13.
Article in English | MEDLINE | ID: mdl-24932864

ABSTRACT

In this paper, we propose a new haptic-assisted virtual cane system operated by a simple finger-pointing gesture. The system is developed in two stages: development of a visual information delivery assistant (VIDA) with a stereo camera, and the addition of a tactile feedback interface with dual actuators for guidance and distance feedback. In the first stage, the user's pointing finger is automatically detected using color and disparity data from stereo images, and a 3D pointing direction of the finger is then estimated from its geometric and textural features. Finally, any object within the estimated pointing trajectory in 3D space is detected, and its distance is estimated in real time. In the second stage, identifiable tactile signals are designed through a series of identification experiments, and an identifiable tactile feedback interface is developed and integrated into the VIDA system. Our approach differs in that navigation guidance is provided by a simple finger-pointing gesture and the tactile distance feedback is fully identifiable to blind users.
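Estimating the distance to an object along the pointing direction reduces to projecting the object onto the pointing ray; a geometric sketch under the simplifying assumption that an obstacle point in 3D is already known (the paper instead detects objects within the pointing trajectory from stereo data):

```python
import numpy as np

def pointing_distance(finger_tip, finger_dir, obstacle):
    """Distance along the pointing ray to the point nearest an obstacle;
    returns None when the obstacle lies behind the pointing finger."""
    d = np.asarray(finger_dir, dtype=float)
    d = d / np.linalg.norm(d)  # unit pointing direction
    t = float(np.dot(np.asarray(obstacle, dtype=float) - finger_tip, d))
    return t if t > 0 else None

tip = np.array([0.0, 0.0, 0.0])
direction = [0.0, 0.0, 1.0]  # pointing straight ahead
print(pointing_distance(tip, direction, [0.0, 0.0, 2.5]))  # 2.5
```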


Subject(s)
Blindness/rehabilitation , Canes , Imaging, Three-Dimensional/instrumentation , Physical Stimulation/instrumentation , Self-Help Devices , Touch , User-Computer Interface , Equipment Design , Equipment Failure Analysis , Feedback, Physiological , Humans , Orientation , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted/instrumentation , Transducers
13.
Sensors (Basel) ; 14(12): 22471-99, 2014 Nov 27.
Article in English | MEDLINE | ID: mdl-25436651

ABSTRACT

A light field camera is a sensor that can record the directions as well as the colors of incident rays. This camera is widely utilized in applications ranging from 3D reconstruction to face and iris recognition. In this paper, we suggest a novel approach for defending against spoofing face attacks, such as printed 2D facial photos (hereinafter 2D photos) and HD tablet images, using the light field camera. By viewing the raw light field photograph from different standpoints, we extract two special features which cannot be obtained from a conventional camera. To verify the performance, we compose light field photograph databases and conduct experiments. Our proposed method achieves at least 94.78% and up to 99.36% accuracy under different types of spoofing attacks.

14.
Sensors (Basel) ; 14(4): 6279-301, 2014 Mar 31.
Article in English | MEDLINE | ID: mdl-24691101

ABSTRACT

In this paper, a novel nonintrusive three-dimensional (3D) face modeling system for random-profile-based 3D face recognition is presented. Although recent two-dimensional (2D) face recognition systems can achieve a reliable recognition rate under certain conditions, their performance is limited by internal and external changes, such as illumination and pose variation. To address these issues, 3D face recognition, which uses 3D face data, has recently received much attention. However, the performance of 3D face recognition highly depends on the precision of the acquired 3D face data, while also requiring more computational power and storage capacity than 2D face recognition systems. In this paper, we present a nonintrusive 3D face modeling system composed of a stereo vision system and an invisible near-infrared line laser, which can be directly applied to profile-based 3D face recognition. We further propose a novel random-profile-based 3D face recognition method that is memory-efficient and pose-invariant. The experimental results demonstrate that the reconstructed 3D face data consist of more than 50 k 3D points and that the method achieves a reliable recognition rate under pose variation.


Subject(s)
Face/anatomy & histology , Pattern Recognition, Automated/methods , Algorithms , Humans , Image Processing, Computer-Assisted , Models, Theoretical , Online Systems
15.
J Nanosci Nanotechnol ; 13(1): 294-9, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23646729

ABSTRACT

High-efficiency blue organic light emitting diodes (OLEDs), based on 2-methyl-9,10-di(2-naphthyl)anthracene (MADN) doped with 4,4'-bis(9-ethyl-3-carbazovinylene)-1,1'-biphenyl (BCzVBi), were fabricated using two different electron transport layers (ETLs): tris(8-hydroxyquinolino)-aluminum (Alq3) and 4,7-diphenyl-1,10-phenanthroline (Bphen). The Bphen ETL favored efficient hole-electron recombination in the emissive layer of the BCzVBi-doped blue OLEDs, leading to a high luminous efficiency of 8.34 cd/A at 100 mA/cm2 and a quantum efficiency of 5.73% at 100 cd/m2. The maximum luminance of the blue OLEDs with Bphen and Alq3 ETLs was 10,670 cd/m2, and the CIExy coordinates of the blue OLEDs were (0.180, 0.279) and (0.155, 0.212) at 100 cd/m2.
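The units above are related by a simple identity: luminous efficiency in cd/A is luminance divided by current density. A quick consistency check (the 8340 cd/m2 figure is back-computed from the reported 8.34 cd/A at 100 mA/cm2, not a value from the paper):

```python
def luminous_efficiency(luminance_cd_m2, current_density_mA_cm2):
    """Luminous efficiency in cd/A = luminance (cd/m2) / current density
    (A/m2); note that 1 mA/cm2 equals 10 A/m2."""
    return luminance_cd_m2 / (current_density_mA_cm2 * 10.0)

print(luminous_efficiency(8340.0, 100.0))  # 8.34 cd/A
```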


Subject(s)
Lighting/instrumentation , Semiconductors , Color , Electron Transport , Equipment Design , Equipment Failure Analysis , Fluorescence
16.
Sensors (Basel) ; 13(10): 12804-29, 2013 Sep 25.
Article in English | MEDLINE | ID: mdl-24072025

ABSTRACT

This paper presents a novel three-dimensional (3D) multi-spectrum sensor system, which combines a 3D depth sensor and multiple optical sensors for different wavelengths. Various image sensors, such as visible, infrared (IR) and 3D sensors, have been introduced into the commercial market. Since each sensor has its own advantages under various environmental conditions, the performance of an application depends highly on selecting the correct sensor or combination of sensors. In this paper, a sensor system, which we will refer to as a 3D multi-spectrum sensor system, comprising three types of sensors (visible, thermal-IR and time-of-flight (ToF)) is proposed. Since the proposed system integrates information from each sensor into one calibrated framework, the optimal sensor combination for an application can be easily selected, taking into account all combinations of sensor information. To demonstrate the effectiveness of the proposed system, a face recognition system under light and pose variation is designed. With the proposed sensor system, the optimal sensor combination, which provides new effectively fused features for a face recognition system, is obtained.


Subject(s)
Biometry/instrumentation , Colorimetry/instrumentation , Face/anatomy & histology , Image Interpretation, Computer-Assisted/instrumentation , Imaging, Three-Dimensional/instrumentation , Pattern Recognition, Automated/methods , Transducers , Equipment Design , Equipment Failure Analysis , Humans
17.
Foods ; 12(6)2023 Mar 16.
Article in English | MEDLINE | ID: mdl-36981186

ABSTRACT

In this study, we developed a novel offline high-performance liquid chromatography (HPLC) method based on 2,2'-azobis(2-amidinopropane) dihydrochloride (AAPH) radicals for antioxidant screening in 20 polyphenolic compounds and used the Trolox equivalent antioxidant capacity assay to evaluate their antioxidant activity. Compared to the existing offline HPLC methods based on 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) and 2,2-diphenyl-1-picrylhydrazyl (DPPH), the offline HPLC method based on the AAPH radical is more sensitive. Additionally, we applied this method to Lepechinia meyenii (Walp.) Epling extract and screened out seven known antioxidants: caffeic acid, hesperidin, rosmarinic acid, diosmin, methyl rosmarinate, diosmetin, and n-butyl rosmarinate. Therefore, this study provides new insights into the screening of antioxidants in natural extracts.
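Trolox equivalent antioxidant capacity is read off a Trolox standard curve; a sketch with invented calibration data (the concentrations and inhibition percentages below are illustrative only, not values from the study):

```python
import numpy as np

# Hypothetical Trolox standard curve: radical inhibition vs concentration.
conc = np.array([0.0, 50.0, 100.0, 200.0, 400.0])    # µM Trolox
inhib = np.array([0.0, 9.8, 20.1, 40.3, 79.9])       # % radical inhibition

# Fit a straight line to the standard curve.
slope, intercept = np.polyfit(conc, inhib, 1)

# Express a sample's measured inhibition as a Trolox-equivalent concentration.
sample_inhibition = 35.0
te = (sample_inhibition - intercept) / slope
print(round(te))  # 175 (µM Trolox equivalents)
```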

18.
Sensors (Basel) ; 12(10): 12870-89, 2012 Sep 25.
Article in English | MEDLINE | ID: mdl-23201976

ABSTRACT

In this paper, we focus on the accuracy of 3D face modeling techniques that use corresponding features in multiple views, which are quite sensitive to feature extraction errors. To solve the problem, we adopt a statistical model-based 3D face modeling approach in a mirror system consisting of two mirrors and a camera. The overall procedure of our 3D facial modeling method has two primary steps: 3D facial shape estimation using a multiple 3D face deformable model, and texture mapping using seamless cloning, which is a type of gradient-domain blending. To evaluate our method's performance, we generate 3D faces of 30 individuals and then carry out two tests: an accuracy test and a robustness test. Our method not only shows highly accurate 3D face shape results when compared with the ground truth but is also robust to feature extraction errors. Moreover, 3D face rendering results intuitively show that our method is more robust to feature extraction errors than other 3D face modeling methods. An additional contribution of our method is that a wide range of face textures can be acquired by the mirror system. Using this texture map, we generate realistic 3D faces for individuals at the end of the paper.


Subject(s)
Algorithms , Face , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Computer Simulation , Humans , Imaging, Three-Dimensional/statistics & numerical data , Anatomic Models , Pattern Recognition, Automated/statistics & numerical data , Somatotypes
19.
IEEE Trans Image Process ; 31: 1176-1189, 2022.
Article in English | MEDLINE | ID: mdl-34995189

ABSTRACT

Blurring in videos is a frequent phenomenon in real-world video data owing to camera shake or object movement at different scene depths. Hence, video deblurring is an ill-posed problem that requires understanding of geometric and temporal information. Traditional model-based optimization methods first define a degradation model and then solve an optimization problem to recover the latent frames, using a variational model for additional external information, such as optical flow, segmentation, depth, or camera movement. Recent deep-learning-based approaches learn from numerous training pairs of blurred and clean latent frames, exploiting the powerful representation ability of deep convolutional neural networks. Although deep models have achieved remarkable performance without an explicit model, existing deep methods do not utilize geometrical information as strong priors. Therefore, they cannot handle extreme blurring caused by large camera shake or scene depth variations. In this paper, we propose a geometry-aware deep video deblurring method via a recurrent feature refinement module that exploits both optimization-based and deep-learning-based schemes. In addition to off-the-shelf deep geometry estimation modules, we design an effective fusion module for combining geometrical information with deep video features. Specifically, similar to model-based optimization, our proposed module recurrently refines video features as well as geometrical information to restore more precise latent frames. To evaluate the effectiveness and generalization of our framework, we perform tests on eight baseline networks whose structures are motivated by previous research. The experimental results show that our framework offers better performance than the eight baselines and achieves state-of-the-art performance on four video deblurring benchmark datasets.

20.
IEEE Trans Image Process ; 31: 664-677, 2022.
Article in English | MEDLINE | ID: mdl-34914591

ABSTRACT

Recent deep neural network-based research to enhance image compression performance can be divided into three categories: learnable codecs, postprocessing networks, and compact representation networks. Learnable codecs have been designed for end-to-end learning beyond the conventional compression modules. Postprocessing networks increase the quality of decoded images using example-based learning. Compact representation networks are learned to reduce the capacity of an input image, reducing the bit rate while maintaining the quality of the decoded image. However, these approaches are either not compatible with existing codecs or not optimal for increasing coding efficiency. Specifically, it is difficult to achieve optimal learning in previous studies using a compact representation network due to inaccurate consideration of the codec. In this paper, we propose a novel standard-compatible image compression framework based on auxiliary codec networks (ACNs). The ACNs are designed to imitate the image degradation operations of the existing codec, which delivers more accurate gradients to the compact representation network. Therefore, the compact representation and postprocessing networks can be learned effectively and optimally. We demonstrate that the proposed framework based on the JPEG and High Efficiency Video Coding standards substantially outperforms existing image compression algorithms in a standard-compatible manner.
