Search | VHL Regional Portal

Multitask Learning Strategy with Pseudo-Labeling: Face Recognition, Facial Landmark Detection, and Head Pose Estimation.

Lee, Yongju; Jang, Sungjun; Bae, Han Byeol; Jeon, Taejae; Lee, Sangyoun.

Sensors (Basel); 24(10), 2024 May 18.

Article in English | MEDLINE | ID: mdl-38794068

ABSTRACT

Most facial analysis methods perform well in standardized testing but not in real-world testing. The main reason is that training models cannot easily learn various human features and background noise, especially for facial landmark detection and head pose estimation tasks with limited and noisy training datasets. To alleviate the gap between standardized and real-world testing, we propose a pseudo-labeling technique using a face recognition dataset consisting of various people and background noise. The use of our pseudo-labeled training dataset can help to overcome the lack of diversity among the people in the dataset. Our integrated framework is constructed using complementary multitask learning methods to extract robust features for each task. Furthermore, introducing pseudo-labeling and multitask learning improves the face recognition performance by enabling the learning of pose-invariant features. Our method achieves state-of-the-art (SOTA) or near-SOTA performance on the AFLW2000-3D and BIWI datasets for facial landmark detection and head pose estimation, with competitive face verification performance on the IJB-C test dataset for face recognition. We demonstrate this through a novel testing methodology that categorizes cases as soft, medium, and hard based on the pose values of IJB-C. The proposed method achieves stable performance even when the dataset lacks diverse face identifications.

Subject(s)

Automated Facial Recognition, Face, Head, Humans, Face/anatomy & histology, Face/diagnostic imaging, Head/diagnostic imaging, Automated Facial Recognition/methods, Algorithms, Machine Learning, Facial Recognition, Factual Databases, Computer-Assisted Image Processing/methods
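
The abstract above mentions an evaluation protocol that splits IJB-C cases into soft, medium, and hard groups by head pose. A minimal sketch of such a split is shown below, assuming pose is given as yaw, pitch, and roll in degrees and that difficulty is set by the largest absolute angle. The function name `pose_bucket` and the 30 and 60 degree cut-offs are hypothetical choices for illustration; the abstract does not state the actual thresholds.

```python
def pose_bucket(yaw, pitch, roll, soft_max=30.0, medium_max=60.0):
    """Assign a difficulty bucket from head-pose angles in degrees.

    The cut-offs are illustrative assumptions, not values from the paper.
    """
    largest = max(abs(yaw), abs(pitch), abs(roll))
    if largest <= soft_max:
        return "soft"
    if largest <= medium_max:
        return "medium"
    return "hard"


# Toy poses (made up): near-frontal, half-profile, and strongly pitched faces.
samples = [(10.0, 5.0, 0.0), (-45.0, 0.0, 3.0), (0.0, 75.0, 0.0)]
buckets = [pose_bucket(*pose) for pose in samples]
```

Verification accuracy could then be reported per bucket, which is one way to check whether learned features stay stable as pose becomes more extreme.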

Multi-Granularity Aggregation with Spatiotemporal Consistency for Video-Based Person Re-Identification.

Lee, Hean Sung; Kim, Minjung; Jang, Sungjun; Bae, Han Byeol; Lee, Sangyoun.

Sensors (Basel); 24(7), 2024 Mar 30.

Article in English | MEDLINE | ID: mdl-38610439

ABSTRACT

Video-based person re-identification (ReID) aims to exploit relevant features from spatial and temporal knowledge. Widely used methods include the part- and attention-based approaches for suppressing irrelevant spatial-temporal features. However, it is still challenging to overcome inconsistencies across video frames due to occlusion and imperfect detection. These mismatches make temporal processing ineffective and create an imbalance of crucial spatial information. To address these problems, we propose the Spatiotemporal Multi-Granularity Aggregation (ST-MGA) method, which is specifically designed to accumulate relevant features with spatiotemporally consistent cues. The proposed framework consists of three main stages: extraction, which extracts spatiotemporally consistent partial information; augmentation, which augments the partial information with different granularity levels; and aggregation, which effectively aggregates the augmented spatiotemporal information. We first introduce the consistent part-attention (CPA) module, which extracts spatiotemporally consistent and well-aligned attentive parts. Sub-parts derived from CPA provide temporally consistent semantic information, solving misalignment problems in videos due to occlusion or inaccurate detection, and maximize the efficiency of aggregation through uniform partial information. To enhance the diversity of spatial and temporal cues, we introduce the Multi-Attention Part Augmentation (MA-PA) block, which incorporates fine parts at various granular levels, and the Long-/Short-term Temporal Augmentation (LS-TA) block, designed to capture both long- and short-term temporal relations. Using densely separated part cues, ST-MGA fully exploits and aggregates the spatiotemporal multi-granular patterns by comparing relations between parts and scales. In the experiments, the proposed ST-MGA renders state-of-the-art performance on several video-based ReID benchmarks (i.e., MARS, DukeMTMC-VideoReID, and LS-VID).
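
To make the idea of multi-granularity aggregation in the abstract above more concrete, the sketch below averages per-part features over time and then merges adjacent parts into coarser groups. It is a deliberately simplified stand-in, not the ST-MGA method itself: plain averaging replaces the attention modules (CPA, MA-PA, LS-TA), and the names `aggregate_video`, `multi_granularity`, and the `group_sizes` parameter are hypothetical.

```python
from statistics import fmean


def mean_vectors(vectors):
    """Element-wise mean of equally sized feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def multi_granularity(parts, group_sizes=(1, 2, 4)):
    """Merge adjacent part features into coarser parts for each group size."""
    levels = {}
    for size in group_sizes:
        if len(parts) % size != 0:
            raise ValueError("number of parts must be divisible by group size")
        levels[size] = [
            mean_vectors(parts[i:i + size]) for i in range(0, len(parts), size)
        ]
    return levels


def aggregate_video(frames, group_sizes=(1, 2, 4)):
    """frames: list of T frames, each a list of P part feature vectors."""
    # Temporal step: average each part over all frames (assumes aligned parts).
    temporal = [mean_vectors(part_over_time) for part_over_time in zip(*frames)]
    # Spatial step: build fine-to-coarse part features.
    return multi_granularity(temporal, group_sizes)


# Toy input (made up): 2 frames, 4 parts, 1-dimensional features.
frames = [
    [[1.0], [3.0], [5.0], [7.0]],
    [[3.0], [5.0], [7.0], [9.0]],
]
levels = aggregate_video(frames)
```

The temporal averaging only makes sense if part `p` refers to the same body region in every frame, which is the alignment problem the abstract attributes to occlusion and imperfect detection.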

A Study on Wheel Member Condition Recognition Using Machine Learning (Support Vector Machine).

Lee, Jin-Han; Lee, Jun-Hee; Yun, Kwang-Su; Bae, Han Byeol; Kim, Sun Young; Jeong, Jae-Hoon; Kim, Jin-Pyung.

Sensors (Basel); 23(20), 2023 Oct 13.

Article in English | MEDLINE | ID: mdl-37896551

ABSTRACT

The wheels of railway vehicles are of paramount importance in relation to railroad operations and safety. Currently, the management of railway vehicle wheels is restricted to post-event inspections of the wheels whenever physical phenomena, such as abnormal vibrations and noise, occur during the operation of railway vehicles. To address this issue, this paper proposes a method for predicting abnormalities in railway wheels in advance and enhancing the learning and prediction performance of machine learning algorithms. Data were collected during the operation of Line 4 of the Busan Metro in South Korea by directly attaching sensors to the railway vehicles. Through the analysis of key factors in the collected data, factors that can be used for tire condition classification were derived. Additionally, through data distribution analysis and correlation analysis, factors for classifying tire conditions were identified. As a result, it was determined that the z-axis of acceleration has a significant impact, and machine learning techniques such as SVM (Linear Kernel, RBF Kernel) and Random Forest were utilized based on acceleration data to classify tire conditions into in-service and defective states. The SVM (Linear Kernel) yielded the highest recognition rate at 98.70%.
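
The classification step described above (a linear-kernel SVM on acceleration data) can be sketched as follows. This is a minimal illustration under several assumptions: the feature is the root mean square of a z-axis acceleration window, the data are made-up toy values, and the SVM is a hand-written soft-margin linear model trained by subgradient descent on the hinge loss rather than the library implementation the authors may have used. All function names and hyperparameters are hypothetical.

```python
import math


def rms(window):
    """Root mean square of one window of z-axis acceleration samples."""
    return math.sqrt(sum(v * v for v in window) / len(window))


def standardize(values):
    """Return zero-mean, unit-variance values plus the statistics used."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values], mean, std


def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for x, y in zip(xs, ys):
            if y * (w * x + b) < 1.0:  # sample violates the margin
                grad_w -= y * x / n
                grad_b -= y / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy vibration amplitudes (made up): -1 = in-service, +1 = defective.
amplitudes = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
labels = [-1, -1, -1, 1, 1, 1]
features = [rms([a, -a, a, -a]) for a in amplitudes]
scaled, mean, std = standardize(features)
w, b = train_linear_svm(scaled, labels)


def predict(window):
    """Classify a new window using the training-set scaling."""
    x = (rms(window) - mean) / std
    return "defective" if w * x + b > 0 else "in-service"
```

For example, `predict([0.15, -0.15, 0.15, -0.15])` classifies a low-vibration window and `predict([0.95, -0.95, 0.95, -0.95])` a high-vibration one. Real sensor data would need more features and a held-out test set before any recognition rate could be quoted.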

Deep-Learning-Based Stress Recognition with Spatial-Temporal Facial Information.

Jeon, Taejae; Bae, Han Byeol; Lee, Yongju; Jang, Sungjun; Lee, Sangyoun.

Sensors (Basel); 21(22), 2021 Nov 11.

Article in English | MEDLINE | ID: mdl-34833572

ABSTRACT

In recent times, as interest in stress control has increased, many studies on stress recognition have been conducted. Several studies have been based on physiological signals, but the disadvantage of this strategy is that it requires physiological-signal-acquisition devices. Another strategy employs facial-image-based stress-recognition methods, which do not require devices, but predominantly use handcrafted features. However, such features have low discriminating power. We propose a deep-learning-based stress-recognition method using facial images to address these challenges. Given that deep-learning methods require extensive data, we constructed a large-capacity image database for stress recognition. Furthermore, we used temporal attention, which assigns a high weight to frames that are highly related to stress, as well as spatial attention, which assigns a high weight to regions that are highly related to stress. By adding a network that inputs the facial landmark information closely related to stress, we supplemented the network that receives only facial images as the input. Experimental results on our newly constructed database indicated that the proposed method outperforms contemporary deep-learning-based recognition methods.

Subject(s)

Deep Learning, Facial Recognition, Factual Databases, Face, Facial Expression
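
The temporal attention mentioned in the abstract above, which gives higher weight to frames more related to stress, can be illustrated with a softmax-weighted average of per-frame features. This is a generic sketch, not the authors' network: the relevance scores would normally come from a learned layer, and here they are simply passed in. The names `softmax` and `temporal_attention_pool` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(frame_features, frame_scores):
    """Weight each frame by its softmax score and sum over time."""
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
    return pooled, weights


# Toy input (made up): two frames with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0]]
uniform_pooled, uniform_weights = temporal_attention_pool(features, [0.0, 0.0])
skewed_pooled, skewed_weights = temporal_attention_pool(features, [0.0, 5.0])
```

With equal scores the pooling reduces to a plain average; with a higher score on the second frame, the pooled feature moves toward that frame. Spatial attention follows the same pattern with weights over image regions instead of frames.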

An Efficient Approach Using Knowledge Distillation Methods to Stabilize Performance in a Lightweight Top-Down Posture Estimation Network.

Park, Changhyun; Lee, Hean Sung; Kim, Woo Jin; Bae, Han Byeol; Lee, Jaeho; Lee, Sangyoun.

Sensors (Basel); 21(22), 2021 Nov 17.

Article in English | MEDLINE | ID: mdl-34833717

ABSTRACT

Multi-person pose estimation has been gaining considerable interest due to its use in several real-world applications, such as activity recognition, motion capture, and augmented reality. Although improvements in the accuracy and speed of multi-person pose estimation techniques have recently been studied, limitations still exist in balancing these two aspects. In this paper, a novel knowledge-distilled lightweight top-down pose network (KDLPN) is proposed that balances computational complexity and accuracy. For the first time in multi-person pose estimation, a network is presented that reduces computational complexity by applying a "Pelee" structure and shuffles pixels in the dense upsampling convolution layer to reduce the number of channels. Furthermore, to prevent performance degradation because of the reduced computational complexity, knowledge distillation is applied to establish the pose estimation network as a teacher network. The performance of the method is evaluated on the MSCOCO dataset. Experimental results demonstrate that KDLPN reduces the number of parameters by 95% relative to state-of-the-art methods with minimal performance degradation. Moreover, our method is compared with other pose estimation methods to substantiate the importance of computational complexity reduction and its effectiveness.

Subject(s)

Posture, Humans
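
As a rough illustration of the knowledge distillation mentioned in the abstract above, the sketch below combines two mean squared errors on flattened keypoint heatmaps: one against the ground truth and one against a teacher network's output. This weighted-sum form is a common distillation pattern and is an assumption here; the abstract does not give the exact loss used for KDLPN. The names `mse`, `distillation_loss`, and the weight `alpha` are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat heatmaps."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student, teacher, target, alpha=0.5):
    """Blend the ground-truth loss with a loss that mimics the teacher.

    alpha = 1.0 ignores the teacher; alpha = 0.0 ignores the ground truth.
    """
    return alpha * mse(student, target) + (1.0 - alpha) * mse(student, teacher)


# Toy 2-pixel heatmaps (made up).
student = [0.5, 0.5]
teacher = [0.75, 0.25]
target = [1.0, 0.0]
loss = distillation_loss(student, teacher, target, alpha=0.5)
```

The teacher term gives the small student network a softer target than the ground truth alone, which is the usual motivation for distillation when model capacity is reduced.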
