Results 1 - 20 of 58
1.
Eur J Pediatr; 183(9): 3797-3808, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38871980

ABSTRACT

Williams-Beuren syndrome (WBS) is a rare genetic disorder characterized by a distinctive facial gestalt, developmental delay, and supravalvular aortic stenosis and/or stenosis of the branches of the pulmonary artery. We aimed to develop and optimize accurate facial recognition models to assist in the diagnosis of WBS, and to evaluate their effectiveness using both five-fold cross-validation and an external test set. We used a total of 954 images from 135 patients with WBS, 124 patients with other genetic disorders, and 183 healthy children. The training set comprised 852 images of 104 WBS cases, 91 cases of other genetic disorders, and 145 healthy children, collected from September 2017 to December 2021 at the Guangdong Provincial People's Hospital. We constructed six binary classification models of facial recognition for WBS using EfficientNet-b3, ResNet-50, VGG-16, VGG-16BN, VGG-19, and VGG-19BN. Transfer learning was used to pre-train the models, and each model was trained with a variable cosine learning rate. Each model was first evaluated by five-fold cross-validation and then assessed on the external test set, which contained 102 images of 31 children with WBS, 33 children with other genetic disorders, and 38 healthy children. To compare the models' recognition capabilities with those of human experts in identifying cases of WBS, we recruited two pediatricians, a pediatric cardiologist, and a pediatric geneticist to identify the WBS patients based solely on their facial images. The model based on VGG-19BN achieved the best five-fold cross-validation performance, with an accuracy of 93.74% ± 3.18%, precision of 94.93% ± 4.53%, specificity of 96.10% ± 4.30%, and F1 score of 91.65% ± 4.28%, while the VGG-16BN model achieved the highest recall of 91.63% ± 5.96%. The VGG-19BN model also achieved the best performance on the external test set, with an accuracy of 95.10%, precision of 100%, recall of 83.87%, specificity of 93.42%, and F1 score of 91.23%. The best performance by human experts on the external test set yielded accuracy, precision, recall, specificity, and F1 scores of 77.45%, 60.53%, 77.42%, 83.10%, and 66.67%, respectively. The F1 score of each human expert was lower than those of the EfficientNet-b3 (84.21%), ResNet-50 (74.51%), VGG-16 (85.71%), VGG-16BN (85.71%), VGG-19 (83.02%), and VGG-19BN (91.23%) models.

CONCLUSION: The results show that facial recognition technology can be used to accurately diagnose patients with WBS. Facial recognition models based on VGG-19BN can play a crucial role in its clinical diagnosis. Their performance can be improved by expanding the size of the training dataset, optimizing the CNN architectures applied, and training with a variable cosine learning rate.

WHAT IS KNOWN: • The facial gestalt of WBS, often described as "elfin," includes a broad forehead, periorbital puffiness, a flat nasal bridge, full cheeks, and a small chin. • Recent studies have demonstrated the potential of deep convolutional neural networks for facial recognition as a diagnostic tool for WBS.

WHAT IS NEW: • This study develops six facial recognition models, based on EfficientNet-b3, ResNet-50, VGG-16, VGG-16BN, VGG-19, and VGG-19BN, to improve WBS diagnosis. • The VGG-19BN model achieved the best performance, with an accuracy of 95.10% and specificity of 93.42%. The facial recognition model based on VGG-19BN can play a crucial role in the clinical diagnosis of WBS.
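The abstract describes the training recipe (ImageNet transfer learning plus a variable cosine learning-rate schedule) without code. A minimal PyTorch sketch of that recipe, assuming torchvision's vgg19_bn as the backbone; the optimizer settings and epoch count are illustrative, not the authors' values:

```python
import torch
import torch.nn as nn
from torchvision import models

# Transfer learning: start from an ImageNet-pretrained VGG-19 with batch norm.
model = models.vgg19_bn(weights=models.VGG19_BN_Weights.IMAGENET1K_V1)
# Replace the final classifier layer with a binary head (WBS vs. non-WBS).
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 2)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
# "Variable cosine learning rate": anneal the LR along a cosine curve.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)
criterion = nn.CrossEntropyLoss()

def train_one_epoch(loader, device="cuda"):
    model.to(device).train()
    for images, labels in loader:  # loader yields face crops and 0/1 labels
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # one cosine step per epoch
```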


Subject(s)
Williams Syndrome, Humans, Williams Syndrome/diagnosis, Williams Syndrome/genetics, Child, Female, Male, Child, Preschool, Infant, Case-Control Studies, Adolescent, Facial Recognition, Automated Facial Recognition/methods
2.
BMC Pediatr; 24(1): 361, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38783283

ABSTRACT

BACKGROUND: Noonan syndrome (NS) is a rare genetic disease whose patients exhibit a facial morphology characterized by a high forehead, hypertelorism, ptosis, inner epicanthal folds, down-slanting palpebral fissures, a highly arched palate, a round nasal tip, and posteriorly rotated ears. Facial analysis technology has recently been applied to identify many genetic syndromes (GSs). However, few studies have investigated the identification of NS based on facial features.

OBJECTIVES: This study develops advanced models to enhance the accuracy of NS diagnosis.

METHODS: A total of 1,892 people were enrolled in this study, including 233 patients with NS, 863 patients with other GSs, and 796 healthy children. We took one to ten frontal photos of each subject to build a dataset, and then applied the multi-task convolutional neural network (MTCNN) for data pre-processing to generate standardized outputs with five crucial facial landmarks. The ImageNet dataset was used to pre-train the network so that it could capture generalizable features and minimize data wastage. We subsequently constructed seven models for facial identification based on the VGG16, VGG19, VGG16-BN, VGG19-BN, ResNet50, MobileNet-V2, and squeeze-and-excitation network (SENet) architectures. The identification performance of the seven models was evaluated and compared with that of six physicians.

RESULTS: All models exhibited high accuracy, precision, and specificity in recognizing NS patients. The VGG19-BN model delivered the best overall performance, with an accuracy of 93.76%, precision of 91.40%, specificity of 98.73%, and F1 score of 78.34%. The VGG16-BN model achieved the highest AUC value of 0.9787, and the models based on VGG architectures were superior to the others overall. The highest scores of the six physicians in terms of accuracy, precision, specificity, and F1 score were 74.00%, 75.00%, 88.33%, and 61.76%, respectively. Each facial recognition model was superior to the best physician on all metrics.

CONCLUSION: Computer-assisted facial recognition models can improve the rate of diagnosis of NS. The models based on VGG19-BN and VGG16-BN can play an important role in diagnosing NS in clinical practice.
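For readers unfamiliar with the MTCNN pre-processing step, here is a short sketch using the facenet-pytorch implementation of MTCNN, which jointly returns the five facial landmarks mentioned above. This is a stand-in for the authors' pipeline; the file name and crop size are hypothetical:

```python
from PIL import Image
from facenet_pytorch import MTCNN

# MTCNN jointly detects the face box and five landmarks
# (eye centers, nose tip, mouth corners) used for standardization.
mtcnn = MTCNN(image_size=224, margin=20, post_process=False)

img = Image.open("frontal_photo.jpg")
boxes, probs, landmarks = mtcnn.detect(img, landmarks=True)
print(landmarks[0])  # (5, 2) array of landmark coordinates

# The same detector returns an aligned, fixed-size face crop
# ready to feed a CNN classifier.
face_tensor = mtcnn(img)  # shape (3, 224, 224), or None if no face found
```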


Subject(s)
Noonan Syndrome, Humans, Noonan Syndrome/diagnosis, Child, Female, Male, Child, Preschool, Neural Networks, Computer, Infant, Adolescent, Automated Facial Recognition/methods, Diagnosis, Computer-Assisted/methods, Sensitivity and Specificity, Case-Control Studies
3.
Sensors (Basel); 24(17), 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39275635

ABSTRACT

In this paper, we study facial expression recognition (FER) using three modalities obtained from a light field camera: sub-aperture (SA), depth map, and all-in-focus (AiF) images. Our objective is to construct a more comprehensive and effective FER system by investigating multimodal fusion strategies. For this purpose, we employ EfficientNetV2-S, pre-trained on AffectNet, as our primary convolutional neural network. This model, combined with a BiGRU, is used to process SA images. We evaluate various fusion techniques at both decision and feature levels to assess their effectiveness in enhancing FER accuracy. Our findings show that the model using SA images surpasses state-of-the-art performance, achieving 88.13% ± 7.42% accuracy under the subject-specific evaluation protocol and 91.88% ± 3.25% under the subject-independent evaluation protocol. These results highlight our model's potential in enhancing FER accuracy and robustness, outperforming existing methods. Furthermore, our multimodal fusion approach, integrating SA, AiF, and depth images, demonstrates substantial improvements over unimodal models. The decision-level fusion strategy, particularly using average weights, proved most effective, achieving 90.13% ± 4.95% accuracy under the subject-specific evaluation protocol and 93.33% ± 4.92% under the subject-independent evaluation protocol. This approach leverages the complementary strengths of each modality, resulting in a more comprehensive and accurate FER system.
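Decision-level fusion with average weights, as used here, simply averages the class-probability outputs of the per-modality classifiers. A small PyTorch sketch under that assumption; the three logit tensors stand in for the SA, AiF, and depth model heads:

```python
import torch
import torch.nn.functional as F

def decision_level_fusion(logits_sa, logits_aif, logits_depth, weights=None):
    """Average-weight decision fusion over per-modality classifiers."""
    probs = [F.softmax(l, dim=1) for l in (logits_sa, logits_aif, logits_depth)]
    if weights is None:                      # "average weights": equal vote
        weights = [1.0 / len(probs)] * len(probs)
    fused = sum(w * p for w, p in zip(weights, probs))
    return fused.argmax(dim=1)               # predicted expression class

# Example: batch of 4 samples, 7 expression classes per modality head.
preds = decision_level_fusion(torch.randn(4, 7), torch.randn(4, 7),
                              torch.randn(4, 7))
```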


Subject(s)
Facial Expression, Neural Networks, Computer, Humans, Image Processing, Computer-Assisted/methods, Automated Facial Recognition/methods, Algorithms, Pattern Recognition, Automated/methods
4.
Sensors (Basel); 24(13), 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-39000930

ABSTRACT

Convolutional neural networks (CNNs) have made significant progress in facial expression recognition (FER). However, owing to challenges such as occlusion, lighting variations, and changes in head pose, FER in real-world environments remains highly challenging. Moreover, methods based solely on CNNs rely heavily on local spatial features, lack global information, and struggle to balance computational complexity against recognition accuracy, so CNN-based models still fall short of addressing FER adequately. To address these issues, we propose a lightweight facial expression recognition method based on a hybrid vision transformer. This method captures multi-scale facial features through an improved attention module, achieving richer feature integration, enhancing the network's perception of key facial expression regions, and improving feature extraction capabilities. Additionally, to further enhance the model's performance, we design a patch dropping (PD) module. This module emulates the attention allocation mechanism of the human visual system for local features, guiding the network to focus on the most discriminative features, reducing the influence of irrelevant features, and lowering computational costs. Extensive experiments demonstrate that our approach significantly outperforms other methods, achieving an accuracy of 86.51% on RAF-DB and nearly 70% on FER2013, with a model size of only 3.64 MB. These results show that our method provides a new perspective for the field of facial expression recognition.
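The patch dropping idea can be illustrated in a few lines of PyTorch: keep only the patch tokens with the highest importance scores and discard the rest. This sketch assumes the scores are already available (e.g., derived from attention); the keep ratio is illustrative:

```python
import torch

def patch_dropping(tokens, scores, keep_ratio=0.7):
    """Keep only the most informative patch tokens.

    tokens: (B, N, D) patch embeddings; scores: (B, N) importance
    scores (e.g., mean attention received from the class token).
    """
    B, N, D = tokens.shape
    k = max(1, int(N * keep_ratio))
    idx = scores.topk(k, dim=1).indices                    # top-k patches
    return tokens.gather(1, idx.unsqueeze(-1).expand(B, k, D))

tokens = torch.randn(2, 196, 256)       # 14x14 patches, 256-d features
scores = torch.rand(2, 196)
kept = patch_dropping(tokens, scores)   # (2, 137, 256): ~30% dropped
```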


Subject(s)
Facial Expression, Neural Networks, Computer, Humans, Automated Facial Recognition/methods, Algorithms, Image Processing, Computer-Assisted/methods, Face, Pattern Recognition, Automated/methods
5.
Sensors (Basel); 24(16), 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39205085

ABSTRACT

In recent years, significant progress has been made in facial expression recognition methods, but recognition in real-world environments still requires further research. This paper proposes a tri-cross-attention transformer with a multi-feature fusion network (TriCAFFNet) to improve facial expression recognition performance under challenging conditions. By combining LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, landmark features, and CNN (convolutional neural network) features from facial images, the model is given a rich input that improves its ability to discern subtle differences between images. Additionally, tri-cross-attention blocks are designed to facilitate information exchange between the different features, enabling them to guide one another and capture salient attention. Extensive experiments on several widely used datasets show that TriCAFFNet achieves state-of-the-art performance, with 92.17% on RAF-DB, 67.40% on AffectNet (7 cls), and 63.49% on AffectNet (8 cls).
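The LBP and HOG descriptors combined in TriCAFFNet are standard and easy to reproduce. A sketch with scikit-image, where the parameter choices (radius, cell size, bin counts) are illustrative rather than the paper's:

```python
import numpy as np
from skimage import io, color
from skimage.feature import local_binary_pattern, hog

img = color.rgb2gray(io.imread("face.jpg"))  # assumes an RGB input image

# LBP: micro-texture pattern per pixel, summarized as a histogram.
lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

# HOG: local gradient-orientation statistics over the whole face.
hog_feat = hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Concatenated hand-crafted descriptor, later fused with CNN and
# landmark features inside the network.
descriptor = np.concatenate([lbp_hist, hog_feat])
```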


Subject(s)
Facial Expression, Neural Networks, Computer, Humans, Algorithms, Image Processing, Computer-Assisted/methods, Face/anatomy & histology, Automated Facial Recognition/methods, Pattern Recognition, Automated/methods
6.
Sensors (Basel); 24(17), 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39275593

ABSTRACT

An estimated 10% to 20% of road accidents are related to fatigue, and accidents caused by drowsiness are up to twice as deadly as those caused by other factors. To reduce these numbers, strategies have been implemented such as advertising campaigns, the installation of driving recorders in vehicles used for road transport of goods and passengers, and the use of drowsiness detection systems in cars. The technologies used in the latter area are diverse: they can be based on measuring signals such as steering wheel movement or vehicle position on the road, or on driver monitoring. Driver monitoring has so far been little exploited and can be implemented through many different approaches. This work evaluates a multidimensional drowsiness index based on recording facial expressions, gaze direction, and head position, and studies the feasibility of implementing it in a low-cost electronic package. Specifically, the aim is to determine the driver's state by monitoring facial cues such as blink frequency, yawning, eye opening, gaze direction, and head position. For this purpose, an algorithm capable of detecting drowsiness has been developed. Two approaches are compared: facial recognition based on Haar features and facial recognition based on Histograms of Oriented Gradients (HOG). The implementation was carried out on a Raspberry Pi, a low-cost device that allows building a prototype that can detect drowsiness and interact with peripherals such as cameras or speakers. The results show that the proposed multi-index methodology detects drowsiness better than algorithms based on single-index detection.
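A rough sketch of the two face-detection approaches compared, using OpenCV's Haar cascade and dlib's HOG-based detector (both commonly run on a Raspberry Pi). The drowsiness indices themselves are left as comments, since the paper's exact algorithm is not given in the abstract:

```python
import cv2
import dlib

# The two detector families compared: Haar cascades vs. HOG + linear SVM.
haar = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
hog_detector = dlib.get_frontal_face_detector()

cap = cv2.VideoCapture(0)   # driver-facing camera
for _ in range(300):        # analyze a bounded burst of frames
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    haar_faces = haar.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    hog_faces = hog_detector(gray)  # list of dlib.rectangle
    # Blink frequency, yawning, eye opening, gaze direction and head pose
    # would be estimated on the detected face region and pooled into the
    # multidimensional drowsiness index.
cap.release()
```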


Asunto(s)
Algoritmos , Conducción de Automóvil , Humanos , Expresión Facial , Reconocimiento Facial/fisiología , Fases del Sueño/fisiología , Accidentes de Tránsito/prevención & control , Masculino , Adulto , Reconocimiento Facial Automatizado/métodos , Femenino
7.
Sensors (Basel); 24(14), 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39065979

ABSTRACT

By leveraging artificial intelligence and big data to analyze and assess classroom conditions, we can significantly enhance teaching quality. Nevertheless, many existing studies concentrate on evaluating classroom conditions for groups of students, neglecting the need for personalized instructional support for individual students. To address this gap and provide a more focused analysis of individual students in the classroom environment, we implemented an embedded application using face recognition technology and object detection algorithms. The Insightface face recognition algorithm was employed to identify students, with a classroom face dataset constructed for training; simultaneously, classroom behavioral data were collected and used to train the YOLOv5 algorithm to detect students' body regions, which were correlated with their facial regions to identify students accurately. These models were then deployed onto an embedded device, the Atlas 200 DK, for application development, enabling the recording of both overall classroom conditions and individual student behaviors. Test results show that detection precision for the various behavior types is above 0.67, and the average false detection rate for face recognition is 41.5%. The developed embedded application can reliably detect student behavior in a classroom setting, identify students, and capture image sequences of body regions associated with negative behavior for better management. These data enable teachers to gain a deeper understanding of their students, which is crucial for enhancing teaching quality and addressing individual student needs.
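Correlating YOLOv5 body boxes with recognized face boxes can be done with a simple containment test: a face belongs to the body box that (almost) fully encloses it. A hypothetical sketch of that association step, not the authors' exact rule:

```python
def face_inside_body(face_box, body_box, tol=0.9):
    """Link a recognized face to a detected body when the face box lies
    (almost) entirely inside the body box. Boxes are (x1, y1, x2, y2)."""
    fx1, fy1, fx2, fy2 = face_box
    bx1, by1, bx2, by2 = body_box
    ix1, iy1 = max(fx1, bx1), max(fy1, by1)
    ix2, iy2 = min(fx2, bx2), min(fy2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    face_area = (fx2 - fx1) * (fy2 - fy1)
    return face_area > 0 and inter / face_area >= tol

def link_students(face_ids, face_boxes, body_boxes):
    """Assign each identified face to a behavior-detection body box."""
    links = {}
    for sid, fbox in zip(face_ids, face_boxes):
        for i, bbox in enumerate(body_boxes):
            if face_inside_body(fbox, bbox):
                links[sid] = i
                break
    return links
```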


Asunto(s)
Algoritmos , Humanos , Estudiantes , Inteligencia Artificial , Cara/fisiología , Reconocimiento Facial/fisiología , Reconocimiento Facial Automatizado/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Femenino , Reconocimiento de Normas Patrones Automatizadas/métodos
8.
Sensors (Basel); 24(10), 2024 May 18.
Article in English | MEDLINE | ID: mdl-38794068

ABSTRACT

Most facial analysis methods perform well in standardized testing but not in real-world testing, mainly because models cannot easily learn the variety of human features and background noise, especially for facial landmark detection and head pose estimation tasks with limited and noisy training datasets. To narrow the gap between standardized and real-world testing, we propose a pseudo-labeling technique that uses a face recognition dataset containing a wide variety of people and background noise. Our pseudo-labeled training dataset helps overcome the lack of diversity among the people in existing datasets. Our integrated framework is constructed using complementary multitask learning methods to extract robust features for each task. Furthermore, introducing pseudo-labeling and multitask learning improves face recognition performance by enabling the learning of pose-invariant features. Our method achieves state-of-the-art (SOTA) or near-SOTA performance on the AFLW2000-3D and BIWI datasets for facial landmark detection and head pose estimation, with competitive face verification performance on the IJB-C test dataset for face recognition. We demonstrate this through a novel testing methodology that categorizes cases as soft, medium, or hard based on the pose values of IJB-C. The proposed method achieves stable performance even when the dataset lacks diverse face identities.
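A generic sketch of the pseudo-labeling step: a teacher model labels the diverse face recognition dataset, and only confident predictions are kept. Note this is written as classification for brevity, whereas the paper's landmark and pose tasks are regression, so the actual confidence filter would differ:

```python
import torch

@torch.no_grad()
def pseudo_label(teacher, unlabeled_loader, threshold=0.9, device="cuda"):
    """Label a diverse face dataset with a teacher model, keeping only
    confident predictions; the confident subset then augments training."""
    teacher.to(device).eval()
    images_kept, labels_kept = [], []
    for images in unlabeled_loader:  # loader yields image batches only
        images = images.to(device)
        probs = torch.softmax(teacher(images), dim=1)
        conf, labels = probs.max(dim=1)
        keep = conf >= threshold     # discard uncertain predictions
        images_kept.append(images[keep].cpu())
        labels_kept.append(labels[keep].cpu())
    return torch.cat(images_kept), torch.cat(labels_kept)
```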


Asunto(s)
Reconocimiento Facial Automatizado , Cara , Cabeza , Humanos , Cara/anatomía & histología , Cara/diagnóstico por imagen , Cabeza/diagnóstico por imagen , Reconocimiento Facial Automatizado/métodos , Algoritmos , Aprendizaje Automático , Reconocimiento Facial , Bases de Datos Factuales , Procesamiento de Imagen Asistido por Computador/métodos
9.
Neuroimage; 231: 117845, 2021 May 01.
Article in English | MEDLINE | ID: mdl-33582276

ABSTRACT

Recent advances in automated face recognition algorithms have increased the risk that de-identified research MRI scans may be re-identifiable by matching them to identified photographs using face recognition. A variety of software exists to de-face (remove faces from) MRI, but its ability to prevent face recognition has never been measured, and its image modifications can alter automated brain measurements. In this study, we compared three popular de-facing techniques and introduce our mri_reface technique, designed to minimize effects on brain measurements by replacing the face with a population average rather than removing it. For each technique, we measured 1) how well it prevented automated face recognition (i.e., effects on exceptionally motivated individuals) and 2) how it altered brain measurements from SPM12, FreeSurfer, and FSL (i.e., effects on the average user of de-identified data). Before de-facing, 97% of scans from a sample of 157 volunteers were correctly matched to photographs using automated face recognition. After de-facing with popular software, 28-38% of scans still retained enough data for successful automated face matching. Our proposed mri_reface performed similarly to the best existing method (fsl_deface) at preventing face recognition (28-30%), and it had the smallest effects on brain measurements in more pipelines than any other method, although these differences were modest.


Asunto(s)
Reconocimiento Facial Automatizado/métodos , Investigación Biomédica/métodos , Encéfalo/diagnóstico por imagen , Procesamiento de Imagen Asistido por Computador/métodos , Imagen por Resonancia Magnética/métodos , Neuroimagen/métodos , Adulto , Anciano , Anciano de 80 o más Años , Algoritmos , Reconocimiento Facial Automatizado/tendencias , Encéfalo/fisiología , Femenino , Humanos , Procesamiento de Imagen Asistido por Computador/tendencias , Imagen por Resonancia Magnética/tendencias , Masculino , Persona de Mediana Edad , Neuroimagen/tendencias , Programas Informáticos/tendencias
10.
Plast Surg Nurs; 41(2): 112-116, 2021.
Article in English | MEDLINE | ID: mdl-34033638

ABSTRACT

The number of applications for facial recognition technology is increasing due to the improvements in image quality, artificial intelligence, and computer processing power that have occurred over recent decades. Algorithms can convert facial anthropometric landmarks into a computer representation, which can help identify nonverbal information about an individual's health status. This article discusses the potential ways a facial recognition tool can support a health assessment. Because facial attributes may be considered biometric data, clinicians should be informed about the clinical, ethical, and legal issues associated with their use.


Asunto(s)
Reconocimiento Facial Automatizado/instrumentación , Estado de Salud , Evaluación en Enfermería/métodos , Inteligencia Artificial/tendencias , Reconocimiento Facial Automatizado/métodos , Humanos , Evaluación en Enfermería/normas
11.
PLoS One; 19(7): e0306250, 2024.
Article in English | MEDLINE | ID: mdl-39046954

ABSTRACT

With the continuous progress of technology, facial recognition is widely used in various scenarios as a mature biometric technology, but the accuracy of facial feature recognition remains a major challenge. This study proposes a facial length-feature and angle-feature recognition method for digital libraries, targeting the recognition of different facial features. First, an attention-based facial action network architecture is studied in depth to provide more accurate and comprehensive facial features. Second, an expression recognition network built on the length and angle features of facial expressions is explored to improve the recognition rate for different expressions. Finally, an end-to-end, attention-based network framework for facial feature points is constructed to improve the accuracy and stability of the facial feature recognition network. To verify the effectiveness of the proposed method, experiments were conducted on the FER-2013 facial expression dataset. Recognition rates for the seven common expressions ranged from 97.28% to 99.97%: the highest rate, 99.97%, was achieved for happiness and surprise, while the relatively low rate for anger, fear, and neutrality was 97.18%. These results verify that the method can effectively recognize and distinguish different facial expressions with high accuracy and robustness. The attention-based recognition method for facial feature points effectively optimizes the recognition of facial length and angle features and significantly improves the stability of facial expression recognition, especially in complex environments, providing reliable technical support for digital libraries and other fields. This study aims to promote the development of facial recognition technology in digital libraries and to improve their service quality and user experience.
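Length and angle features of the kind described are simple functions of landmark coordinates. A sketch with hypothetical landmark positions; the specific landmark pairs and triples used by the study are not given in the abstract:

```python
import numpy as np

def length_feature(p, q):
    """Euclidean distance between two facial landmarks."""
    return float(np.linalg.norm(np.asarray(p) - np.asarray(q)))

def angle_feature(p, q, r):
    """Angle (degrees) at landmark q formed by segments q->p and q->r."""
    v1 = np.asarray(p) - np.asarray(q)
    v2 = np.asarray(r) - np.asarray(q)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical landmarks: mouth corners and upper-lip midpoint.
left, right, top = (120, 210), (180, 212), (150, 195)
features = [length_feature(left, right),       # mouth width
            angle_feature(left, top, right)]   # mouth-arch angle
```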


Subject(s)
Face, Facial Expression, Libraries, Digital, Humans, Face/anatomy & histology, Automated Facial Recognition/methods
12.
Neural Netw; 175: 106275, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38653078

ABSTRACT

Face Anti-Spoofing (FAS) seeks to protect face recognition systems from spoofing attacks and is applied extensively in scenarios such as access control, electronic payment, and security surveillance systems. Face anti-spoofing requires the integration of local details and global semantic information. Existing CNN-based methods rely on small-stride or image-patch-based feature extraction structures, which struggle to capture spatial and cross-layer feature correlations effectively, while Transformer-based methods have limitations in extracting discriminative detailed features. To address these issues, we introduce a multi-stage CNN-Transformer-based framework that extracts local features through convolutional layers and long-distance feature relationships via self-attention. On this basis, we propose cross-attention multi-stage feature fusion, employing semantically high-stage features to query task-relevant features in low-stage features for further cross-stage feature fusion. To sharpen the discrimination of local features for subtle differences, we design pixel-wise material classification supervision and add an auxiliary branch in the intermediate layers of the model. Moreover, to address the limitations of a single acquisition environment and the scarcity of acquisition devices in existing Near-Infrared datasets, we create a large-scale Near-Infrared Face Anti-Spoofing dataset with 380k pictures of 1,040 identities. The proposed method achieves state-of-the-art results on OULU-NPU and on our Near-Infrared dataset with just 1.3 GFLOPs and 3.2M parameters, demonstrating its effectiveness.
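The cross-attention fusion described, with high-stage features querying low-stage features, maps naturally onto nn.MultiheadAttention. A minimal sketch, with the dimensions and the residual-plus-norm arrangement chosen for illustration rather than taken from the paper:

```python
import torch
import torch.nn as nn

class CrossStageFusion(nn.Module):
    """High-stage (semantic) tokens query low-stage (detail) tokens,
    then the retrieved details are fused back via a residual add."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, high, low):
        # high: (B, Nh, D) query; low: (B, Nl, D) key/value
        fused, _ = self.attn(query=high, key=low, value=low)
        return self.norm(high + fused)

fusion = CrossStageFusion()
high = torch.randn(2, 49, 256)    # e.g., 7x7 late-stage tokens
low = torch.randn(2, 196, 256)    # e.g., 14x14 early-stage tokens
out = fusion(high, low)           # (2, 49, 256)
```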


Subject(s)
Neural Networks, Computer, Humans, Automated Facial Recognition/methods, Image Processing, Computer-Assisted/methods, Face, Computer Security, Algorithms
13.
PLoS One; 19(7): e0301908, 2024.
Article in English | MEDLINE | ID: mdl-38990958

ABSTRACT

Real-time security surveillance and identity matching using face detection and recognition are central research areas within computer vision. Classical face detection techniques, including Haar-like features, MTCNN, and AdaBoost, employ template matching and geometric facial features to detect faces, striving for a balance between detection time and accuracy. To improve on this trade-off, the current research presents an enhanced FaceNet network. RetinaFace is employed to perform fast face detection and alignment, and FaceNet, with an improved loss function, is then used to achieve highly accurate face verification and recognition. The presented work comparatively evaluates the proposed framework against both traditional and deep learning techniques in terms of face detection and recognition performance. The experimental findings demonstrate that the enhanced FaceNet can meet real-time facial recognition requirements with a face recognition accuracy of 99.86%, which fulfills practical requirements. Consequently, the proposed solution holds significant potential for applications in face detection and recognition within the education sector for real-time security surveillance.
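The verification step of such a pipeline reduces to comparing embeddings of detected and aligned faces. A sketch of that final comparison, with the detector and embedder calls left hypothetical and the threshold illustrative:

```python
import numpy as np

def verify(embedding_a, embedding_b, threshold=0.5):
    """Face verification via cosine similarity of two face embeddings
    (e.g., 512-d FaceNet embeddings of RetinaFace-aligned crops)."""
    a = embedding_a / np.linalg.norm(embedding_a)
    b = embedding_b / np.linalg.norm(embedding_b)
    similarity = float(np.dot(a, b))
    return similarity >= threshold, similarity

# Hypothetical usage with any detector + embedder pair:
# crop_a, crop_b = retinaface_align(img_a), retinaface_align(img_b)
# same_person, score = verify(facenet(crop_a), facenet(crop_b))
```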


Subject(s)
Deep Learning, Humans, Face, Computer Security, Security Measures, Automated Facial Recognition/methods, Facial Recognition, Algorithms
14.
Neural Netw; 178: 106421, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38850638

ABSTRACT

Micro-expression recognition (MER) has drawn increasing attention due to its wide application in lie detection, criminal investigation, and psychological consultation. However, the best recognition accuracy on recent public datasets is still low compared with that of macro-expression recognition. In this paper, we propose a novel graph convolutional network (GCN) for MER that achieves state-of-the-art accuracy. Unlike existing GCNs with fixed graph structures, we define a stochastic graph structure in which some neighbors are selected randomly. As shown by numerical examples, this randomness enables better feature characterization while reducing computational complexity. The whole network consists of two branches: a spatial branch taking micro-expression images as input, and a temporal branch taking optical flow images as input. Because micro-expression datasets do not contain enough images to train the GCN, we employ transfer learning: stochastic GCNs (SGCNs) are first trained on a macro-expression dataset in the source network, and the well-trained SGCNs are then transferred to the target network. Our proposed method achieves state-of-the-art performance on all four well-known datasets. This paper explores stochastic GCNs and transfer learning with this random structure in the MER task, which is of great importance for improving recognition performance.
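The stochastic graph structure can be sketched as randomly masking the adjacency matrix on each forward pass, so each layer aggregates over a random subset of neighbors. A toy PyTorch version, with node counts and feature sizes chosen for illustration:

```python
import torch

def stochastic_adjacency(adj, keep_prob=0.7):
    """Randomly keep a subset of each node's neighbors per forward pass,
    giving the stochastic graph structure described above. adj: (N, N)."""
    mask = (torch.rand_like(adj) < keep_prob).float()
    a = adj * mask
    a = a + torch.eye(adj.size(0))          # always keep self-loops
    deg = a.sum(dim=1, keepdim=True).clamp(min=1.0)
    return a / deg                          # row-normalize

def gcn_layer(x, adj, weight):
    """One graph-convolution step: aggregate neighbors, then transform."""
    return torch.relu(stochastic_adjacency(adj) @ x @ weight)

x = torch.randn(68, 32)          # e.g., 68 landmark nodes, 32-d features
adj = (torch.rand(68, 68) < 0.1).float()
w = torch.randn(32, 64)
h = gcn_layer(x, adj, w)         # (68, 64)
```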


Subject(s)
Neural Networks, Computer, Stochastic Processes, Humans, Facial Expression, Image Processing, Computer-Assisted/methods, Algorithms, Machine Learning, Automated Facial Recognition/methods
15.
PLoS One; 19(8): e0308852, 2024.
Article in English | MEDLINE | ID: mdl-39172814

ABSTRACT

In this paper, we propose a method to reduce model architecture search time, considering MobileNetV2 for 3D face recognition tasks as a case study and introducing layer replication to enhance accuracy. For a given network, various layers can be replicated, and effective replication can yield better accuracy. Our proposed algorithm identifies the optimal layer replication configuration for the model. We considered two acceleration methods: distributed data-parallel training and concurrent model training. Our experiments demonstrate the effectiveness of the automatic model-finding process for layer replication, using both distributed data-parallel and concurrent training under different conditions. The accuracy of our model improved by up to 6% compared with previous work on 3D MobileNetV2, and by 8% compared with the vanilla MobileNetV2. Training with distributed data-parallel across four GPUs reduced model training time by up to 75% compared with traditional training on a single GPU. Additionally, the automatic model-finding process with concurrent training found an optimal solution 1,932 minutes faster than the distributed training approach.
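Layer replication itself is straightforward: insert extra copies of a chosen block into the network. A sketch against torchvision's 2D MobileNetV2 for illustration (the paper works with a 3D variant), replicating a shape-preserving inverted-residual block; the block index and copy count are stand-ins for what the search would explore:

```python
import copy
import torch.nn as nn
from torchvision.models import mobilenet_v2

def replicate_layer(model, index, copies):
    """Insert `copies` extra copies of one block in MobileNetV2's feature
    stack (only meaningful where the block's input and output shapes match,
    i.e., stride-1 blocks with equal channel counts)."""
    blocks = list(model.features)
    block = blocks[index]
    for _ in range(copies):
        blocks.insert(index + 1, copy.deepcopy(block))
    model.features = nn.Sequential(*blocks)
    return model

# The search would iterate over (block index, replica count) candidates,
# scoring each configuration by validation accuracy.
model = replicate_layer(mobilenet_v2(weights=None), index=8, copies=2)
```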


Subject(s)
Algorithms, Humans, Neural Networks, Computer, Automated Facial Recognition/methods, Imaging, Three-Dimensional/methods
16.
PLoS One; 19(5): e0304610, 2024.
Article in English | MEDLINE | ID: mdl-38820451

ABSTRACT

Face Morphing Attacks pose a threat to the security of identity documents, especially with respect to a subsequent access control process, because they allow both involved individuals to use the same document. Several algorithms are currently being developed to detect Morphing Attacks, often requiring large data sets of morphed face images for training. In the present study, face embeddings are used for two different purposes: first, to pre-select images for the subsequent large-scale generation of Morphing Attacks, and second, to detect potential Morphing Attacks. Previous studies have demonstrated the power of embeddings in both use cases. However, we aim to build on these studies by adding the more powerful MagFace model to both use cases, and by performing comprehensive analyses of the role of embeddings in pre-selection and attack detection in terms of the vulnerability of face recognition systems and attack detection algorithms. In particular, we use recent developments to assess the attack potential, but also investigate the influence of morphing algorithms. For the first objective, an algorithm is developed that pairs individuals based on the similarity of their face embeddings. Different state-of-the-art face recognition systems are used to extract embeddings in order to pre-select the face images and different morphing algorithms are used to fuse the face images. The attack potential of the differently generated morphed face images will be quantified to compare the usability of the embeddings for automatically generating a large number of successful Morphing Attacks. For the second objective, we compare the performance of the embeddings of two state-of-the-art face recognition systems with respect to their ability to detect morphed face images. Our results demonstrate that ArcFace and MagFace provide valuable face embeddings for image pre-selection. Various open-source and commercial-off-the-shelf face recognition systems are vulnerable to the generated Morphing Attacks, and their vulnerability increases when image pre-selection is based on embeddings compared to random pairing. In particular, landmark-based closed-source morphing algorithms generate attacks that pose a high risk to any tested face recognition system. Remarkably, more accurate face recognition systems show a higher vulnerability to Morphing Attacks. Among the systems tested, commercial-off-the-shelf systems were the most vulnerable to Morphing Attacks. In addition, MagFace embeddings stand out as a robust alternative for detecting morphed face images compared to the previously used ArcFace embeddings. The results endorse the benefits of face embeddings for more effective image pre-selection for face morphing and for more accurate detection of morphed face images, as demonstrated by extensive analysis of various designed attacks. The MagFace model is a powerful alternative to the often-used ArcFace model in detecting attacks and can increase performance depending on the use case. It also highlights the usability of embeddings to generate large-scale morphed face databases for various purposes, such as training Morphing Attack Detection algorithms as a countermeasure against attacks.
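The embedding-based pre-selection step pairs the most similar-looking subjects before morphing. A sketch under the assumption of pre-computed, one-row-per-subject embeddings (e.g., from ArcFace or MagFace); the random matrix stands in for real embeddings:

```python
import numpy as np

def preselect_pairs(embeddings, ids, top_k=1):
    """Pair each subject with their most look-alike other subject,
    by cosine similarity of face embeddings."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = e @ e.T
    np.fill_diagonal(sim, -np.inf)       # never pair a subject with self
    pairs = []
    for i in range(len(ids)):
        for j in np.argsort(sim[i])[::-1][:top_k]:
            pairs.append((ids[i], ids[int(j)], float(sim[i, j])))
    return pairs

emb = np.random.randn(100, 512)          # stand-in for real embeddings
pairs = preselect_pairs(emb, ids=list(range(100)))
```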


Subject(s)
Algorithms, Computer Security, Humans, Face, Image Processing, Computer-Assisted/methods, Automated Facial Recognition/methods, Facial Recognition
17.
Appl Ergon; 121: 104364, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39121521

ABSTRACT

Carragher and Hancock (2023) investigated how individuals performed in a one-to-one face matching task when assisted by an Automated Facial Recognition System (AFRS). Across five pre-registered experiments they found evidence of suboptimal aided performance, with AFRS-assisted individuals consistently failing to reach the level of performance the AFRS achieved alone. The current study reanalyses these data (Carragher and Hancock, 2023) to benchmark automation-aided performance against a series of statistical models of collaborative decision making, spanning a range of efficiency levels. Analyses using a Bayesian hierarchical signal detection model revealed that collaborative performance was highly inefficient, falling closest to the most suboptimal models of automation dependence tested. This pattern of results generalises previous reports of suboptimal human-automation interaction across a range of visual search, target detection, sensory discrimination, and numeric estimation decision-making tasks. The current study is the first to provide benchmarks of automation-aided performance in the one-to-one face matching task.
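For readers unfamiliar with signal detection theory, the basic sensitivity index underlying such analyses is d'. A minimal computation with illustrative counts (the study itself fits a Bayesian hierarchical SDT model, which this sketch does not reproduce):

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate), with a
    log-linear correction to avoid infinite z at rates of 0 or 1."""
    h = (hits + 0.5) / (hits + misses + 1)
    fa = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(h) - z(fa)

# Hypothetical counts from a one-to-one face matching block:
print(d_prime(hits=80, misses=20, false_alarms=10, correct_rejections=90))
```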


Asunto(s)
Reconocimiento Facial Automatizado , Automatización , Benchmarking , Análisis y Desempeño de Tareas , Humanos , Masculino , Femenino , Adulto , Reconocimiento Facial Automatizado/métodos , Teorema de Bayes , Toma de Decisiones , Adulto Joven , Ciencias Forenses/métodos , Reconocimiento Facial
18.
Neural Netw; 179: 106573, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39096753

ABSTRACT

Recognizing expressions from dynamic facial videos can reveal more natural human affective states, but it becomes a more challenging task in real-world scenes due to pose variations of the face, partial occlusions, and subtle dynamic changes in emotion sequences. Existing transformer-based methods often focus on self-attention to model global relations among spatial or temporal features, and cannot focus well on important expression-related locality structures in both spatial and temporal features of in-the-wild expression videos. To this end, we incorporate diverse graph structures into transformers and propose CDGT, a method that constructs diverse graph transformers for efficient emotion recognition from in-the-wild videos. Specifically, our method contains a spatial dual-graph transformer and a temporal hyperbolic-graph transformer. The former deploys dual-graph constrained attention to capture latent emotion-related graph geometry structures among local spatial tokens for efficient feature representation, especially for video frames with pose variations and partial occlusions. The latter adopts hyperbolic-graph constrained self-attention that explores important temporal graph structure information in hyperbolic space to model subtler changes of dynamic emotion. Extensive experimental results on in-the-wild video-based facial expression databases show that the proposed CDGT outperforms other state-of-the-art methods.


Asunto(s)
Emociones , Expresión Facial , Grabación en Video , Humanos , Emociones/fisiología , Algoritmos , Redes Neurales de la Computación , Reconocimiento Facial/fisiología , Reconocimiento de Normas Patrones Automatizadas/métodos , Reconocimiento Facial Automatizado/métodos
19.
PLoS One; 19(10): e0308566, 2024.
Article in English | MEDLINE | ID: mdl-39365809

ABSTRACT

Heterogeneity of the probe image is one of the most complex challenges faced by researchers and implementers of current surveillance systems, owing to the presence of multiple cameras working in different spectral ranges within a single surveillance setup. This paper proposes two approaches, spatial sparse representation (SSR) and frequency sparse representation (FSR), to recognize on-the-move heterogeneous face images with a database of a single sample per person (SSPP). The SCface database, with five visual and two infrared (IR) cameras, is taken as the benchmark for experiments, and the results are further confirmed on the CASIA NIR-VIS 2.0 face database with 17,580 visual and IR images. Comparisons are likewise performed for different scenarios, such as varying distances from the camera, varying face image sizes, and various visual and infrared (IR) modalities. A least-squares minimization approach is used to match face images, as it simplifies the recognition process. A side-by-side comparison of both proposed approaches with the state-of-the-art, classical principal component analysis (PCA), kernel Fisher analysis (KFA), and coupled kernel embedding (CKE) methods, along with the modern low-rank preserving projection via graph regularized reconstruction (LRPP-GRR) method, is also presented. Experimental results suggest that the proposed approaches achieve superior performance.
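In its simplest single-sample-per-person form, the least-squares matching idea reduces to scoring each gallery identity by its reconstruction residual. A toy NumPy sketch of that idea, not the full SSR/FSR formulation; the feature dimension and gallery size are illustrative:

```python
import numpy as np

def identify(probe, gallery, ids):
    """Match a probe face to the gallery identity with the smallest
    least-squares reconstruction residual (one sample per person).

    probe: (d,) feature vector; gallery: (d, n), one column per identity.
    """
    residuals = []
    for j in range(gallery.shape[1]):
        a = gallery[:, [j]]                       # single enrolled sample
        x, *_ = np.linalg.lstsq(a, probe, rcond=None)
        residuals.append(np.linalg.norm(probe - a @ x))
    return ids[int(np.argmin(residuals))]

gallery = np.random.randn(1024, 50)               # 50 enrolled identities
probe = gallery[:, 7] + 0.1 * np.random.randn(1024)
print(identify(probe, gallery, ids=list(range(50))))  # expected: 7
```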


Asunto(s)
Algoritmos , Humanos , Cara/anatomía & histología , Bases de Datos Factuales , Análisis de Componente Principal , Reconocimiento Facial Automatizado/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Reconocimiento de Normas Patrones Automatizadas/métodos , Reconocimiento Facial
20.
PLoS One; 19(8): e0307446, 2024.
Article in English | MEDLINE | ID: mdl-39178187

ABSTRACT

Facial expression recognition (FER) is a hot topic in computer vision, especially as deep learning-based methods gain traction in this field. However, traditional convolutional neural networks (CNNs) ignore the relative positions of key facial features (mouth, eyebrows, eyes, etc.) when facial expressions in real-world environments undergo rotation, displacement, or partial occlusion. In addition, most works in the literature do not take visual tempos into account when recognizing facial expressions that have high mutual similarity. To address these issues, we propose a visual-tempos 3D-CapsNet framework (VT-3DCapsNet). First, we propose a 3D-CapsNet model for emotion recognition, in which an improved 3D-ResNet architecture integrated with an AU-perceived attention module enhances the feature representation ability of the capsule network, expressing deeper hierarchical spatiotemporal features and extracting latent information (position, size, orientation) in key facial areas. Furthermore, we propose a temporal pyramid network (TPN)-based expression recognition module (TPN-ERM), which can learn high-level facial motion features from video frames to model differences in visual tempos, further improving the recognition accuracy of 3D-CapsNet. Extensive experiments were conducted on the extended Cohn-Kanade (CK+) database and the Acted Facial Expressions in the Wild (AFEW) database. The results demonstrate competitive performance of our approach compared with other state-of-the-art methods.


Subject(s)
Facial Expression, Neural Networks, Computer, Humans, Video Recording/methods, Automated Facial Recognition/methods, Deep Learning, Emotions/physiology, Facial Recognition/physiology, Imaging, Three-Dimensional/methods