Results 1 - 20 of 104
1.
PLoS One ; 19(8): e0308852, 2024.
Article in English | MEDLINE | ID: mdl-39172814

ABSTRACT

In this paper, we propose a method to reduce model architecture search time. We consider MobileNetV2 for 3D face recognition tasks as a case study and introduce layer replication to enhance accuracy. For a given network, various layers can be replicated, and effective replication can yield better accuracy. Our proposed algorithm identifies the optimal layer replication configuration for the model. We considered two acceleration methods: distributed data-parallel training and concurrent model training. Our experiments demonstrate the effectiveness of the automatic model finding process for layer replication, using both distributed data-parallel and concurrent training under different conditions. The accuracy of our model improved by up to 6% compared to the previous work on 3D MobileNetV2, and by 8% compared to the vanilla MobileNetV2. Training models with distributed data-parallel training across four GPUs reduced model training time by up to 75% compared to traditional training on a single GPU. Additionally, the automatic model finding process with concurrent training was 1,932 minutes faster than the distributed training approach in finding an optimal solution.


Subjects
Algorithms , Humans , Computer Neural Networks , Automated Facial Recognition/methods , Three-Dimensional Imaging/methods
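The training-time figure in the abstract above can be read as a speedup factor. The following is a minimal sketch, assuming the "up to 75%" reduction refers to wall-clock time on four GPUs relative to one; the function names are illustrative and do not come from the paper.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Speedup factor implied by a fractional reduction in wall-clock time."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved with `workers` devices."""
    return speedup / workers


# Assumption: the 75% figure is a reduction in wall-clock training time.
speedup = speedup_from_reduction(0.75)
efficiency = parallel_efficiency(speedup, 4)
print(speedup, efficiency)
```

Under that reading, a 75% reduction on four GPUs is a 4x speedup, which is the linear-scaling ceiling for four devices; since the abstract says "up to", typical runs were presumably somewhat below it.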
2.
Sensors (Basel) ; 24(16), 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39205085

ABSTRACT

In recent years, significant progress has been made in facial expression recognition methods. However, tasks related to facial expression recognition in real environments still require further research. This paper proposes a tri-cross-attention transformer with a multi-feature fusion network (TriCAFFNet) to improve facial expression recognition performance under challenging conditions. By combining LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, landmark features, and CNN (convolutional neural network) features from facial images, the model is provided with a rich input to improve its ability to discern subtle differences between images. Additionally, tri-cross-attention blocks are designed to facilitate information exchange between different features, enabling mutual guidance among different features to capture salient attention. Extensive experiments on several widely used datasets show that our TriCAFFNet achieves the SOTA performance on RAF-DB with 92.17%, AffectNet (7 cls) with 67.40%, and AffectNet (8 cls) with 63.49%, respectively.


Subjects
Facial Expression , Computer Neural Networks , Humans , Algorithms , Computer-Assisted Image Processing/methods , Face/anatomy & histology , Automated Facial Recognition/methods , Automated Pattern Recognition/methods
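One of the hand-crafted inputs named in the abstract above is the Local Binary Pattern. The following is a minimal sketch of the basic 3x3 LBP operator in plain Python. The abstract does not say which LBP variant, radius, or neighbour ordering TriCAFFNet uses, so the clockwise ordering and the `>=` threshold here are assumptions, and the function names are illustrative.

```python
def lbp_code(image, row, col):
    """8-neighbour LBP code for one interior pixel of a 2D grayscale grid."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    # The ordering is a convention; other orderings permute the bits.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
center_code = lbp_code(patch, 1, 1)
print(center_code)  # bits 0, 4, 5, 6, 7 are set -> 241
```

In a fusion pipeline, a histogram like the one returned by `lbp_histogram` would typically be computed per image region and concatenated into a texture descriptor before being combined with other features.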
3.
PLoS One ; 19(8): e0307446, 2024.
Article in English | MEDLINE | ID: mdl-39178187

ABSTRACT

Facial expression recognition (FER) is a hot topic in computer vision, especially as deep learning based methods are gaining traction in this field. However, traditional convolutional neural networks (CNNs) ignore the relative position relationship of key facial features (mouth, eyebrows, eyes, etc.) due to changes of facial expressions in real-world environments such as rotation, displacement or partial occlusion. In addition, most of the works in the literature do not take visual tempos into account when recognizing facial expressions that possess higher similarities. To address these issues, we propose a visual tempos 3D-CapsNet framework (VT-3DCapsNet). First, we propose a 3D-CapsNet model for emotion recognition, in which we introduce an improved 3D-ResNet architecture integrated with an AU-perceived attention module to enhance the feature representation ability of the capsule network by expressing deeper hierarchical spatiotemporal features and extracting latent information (position, size, orientation) in key facial areas. Furthermore, we propose the temporal pyramid network (TPN)-based expression recognition module (TPN-ERM), which can learn high-level facial motion features from video frames to model differences in visual tempos, further improving the recognition accuracy of 3D-CapsNet. Extensive experiments are conducted on the extended Cohn-Kanade (CK+) database and the Acted Facial Expressions in the Wild (AFEW) database. The results demonstrate competitive performance of our approach compared with other state-of-the-art methods.


Subjects
Facial Expression , Computer Neural Networks , Humans , Video Recording/methods , Automated Facial Recognition/methods , Deep Learning , Emotions/physiology , Facial Recognition/physiology , Three-Dimensional Imaging/methods
4.
PLoS One ; 19(7): e0301908, 2024.
Article in English | MEDLINE | ID: mdl-38990958

ABSTRACT

Real-time security surveillance and identity matching using face detection and recognition are central research areas within computer vision. The classical facial detection techniques include Haar-like, MTCNN, AdaBoost, and others. These techniques employ template matching and geometric facial features for detecting faces, striving for a balance between detection time and accuracy. To address this issue, the current research presents an enhanced FaceNet network. The RetinaFace is employed to perform expeditious face detection and alignment. Subsequently, FaceNet, with an improved loss function is used to achieve face verification and recognition with high accuracy. The presented work involves a comparative evaluation of the proposed network framework against both traditional and deep learning techniques in terms of face detection and recognition performance. The experimental findings demonstrate that an enhanced FaceNet can successfully meet the real-time facial recognition requirements, and the accuracy of face recognition is 99.86% which fulfills the actual requirement. Consequently, the proposed solution holds significant potential for applications in face detection and recognition within the education sector for real-time security surveillance.


Subjects
Deep Learning , Humans , Face , Computer Security , Security Measures , Automated Facial Recognition/methods , Facial Recognition , Algorithms
5.
Sci Justice ; 64(4): 421-442, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39025567

ABSTRACT

In today's biometric and commercial settings, state-of-the-art image processing relies solely on artificial intelligence and machine learning which provides a high level of accuracy. However, these principles are deeply rooted in abstract, complex "black-box systems". When applied to forensic image identification, concerns about transparency and accountability emerge. This study explores the impact of two challenging factors in automated facial identification: facial expressions and head poses. The sample comprised 3D faces with nine prototype expressions, collected from 41 participants (13 males, 28 females) of European descent aged 19.96 to 50.89 years. Pre-processing involved converting 3D models to 2D color images (256 × 256 px). Probes included a set of 9 images per individual with head poses varying by 5° in both left-to-right (yaw) and up-and-down (pitch) directions for neutral expressions. A second set of 3,610 images per individual covered viewpoints in 5° increments from -45° to 45° for head movements and different facial expressions, forming the targets. Pair-wise comparisons using ArcFace, a state-of-the-art face identification algorithm yielded 54,615,690 dissimilarity scores. Results indicate that minor head deviations in probes have minimal impact. However, the performance diminished as targets deviated from the frontal position. Right-to-left movements were less influential than up and down, with downward pitch showing less impact than upward movements. The lowest accuracy was for upward pitch at 45°. Dissimilarity scores were consistently higher for males than for females across all studied factors. The performance particularly diverged in upward movements, starting at 15°. Among tested facial expressions, happiness and contempt performed best, while disgust exhibited the lowest AUC values.


Subjects
Algorithms , Automated Facial Recognition , Facial Expression , Humans , Male , Female , Adult , Automated Facial Recognition/methods , Young Adult , Middle Aged , Three-Dimensional Imaging , Computer-Assisted Image Processing/methods , Biometric Identification/methods , Face/anatomy & histology , Head Movements/physiology , Posture/physiology
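The comparison count reported in the abstract above can be reproduced from the stated sample sizes. The following is a short worked check, assuming every probe image was compared against every target image across all participants; the variable names are illustrative. The breakdown of the 3,610 targets into viewpoints is an inference from the stated angle ranges and is not given in the abstract.

```python
participants = 41
probes_per_person = 9
targets_per_person = 3610

# Assumption: each probe is compared against each target, across all people.
n_probes = participants * probes_per_person
n_targets = participants * targets_per_person
n_comparisons = n_probes * n_targets

# Inference, not stated in the abstract: yaw and pitch each span -45..45
# degrees in 5-degree steps, giving 19 x 19 viewpoints.
angles = list(range(-45, 46, 5))
viewpoints = len(angles) ** 2
conditions_per_viewpoint, remainder = divmod(targets_per_person, viewpoints)

print(n_probes, n_targets, n_comparisons)
print(viewpoints, conditions_per_viewpoint, remainder)
```

The product of 369 probes and 148,010 targets is 54,615,690, matching the number of dissimilarity scores in the abstract. The 3,610 targets per person divide evenly into 361 viewpoints, which is consistent with ten image conditions per viewpoint, though the abstract does not spell out that breakdown.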
6.
Sensors (Basel) ; 24(13), 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-39000930

ABSTRACT

Convolutional neural networks (CNNs) have made significant progress in the field of facial expression recognition (FER). However, due to challenges such as occlusion, lighting variations, and changes in head pose, facial expression recognition in real-world environments remains highly challenging. At the same time, methods solely based on CNN heavily rely on local spatial features, lack global information, and struggle to balance the relationship between computational complexity and recognition accuracy. Consequently, the CNN-based models still fall short in their ability to address FER adequately. To address these issues, we propose a lightweight facial expression recognition method based on a hybrid vision transformer. This method captures multi-scale facial features through an improved attention module, achieving richer feature integration, enhancing the network's perception of key facial expression regions, and improving feature extraction capabilities. Additionally, to further enhance the model's performance, we have designed the patch dropping (PD) module. This module aims to emulate the attention allocation mechanism of the human visual system for local features, guiding the network to focus on the most discriminative features, reducing the influence of irrelevant features, and intuitively lowering computational costs. Extensive experiments demonstrate that our approach significantly outperforms other methods, achieving an accuracy of 86.51% on RAF-DB and nearly 70% on FER2013, with a model size of only 3.64 MB. These results demonstrate that our method provides a new perspective for the field of facial expression recognition.


Subjects
Facial Expression , Computer Neural Networks , Humans , Automated Facial Recognition/methods , Algorithms , Computer-Assisted Image Processing/methods , Face , Automated Pattern Recognition/methods
7.
PLoS One ; 19(7): e0306250, 2024.
Article in English | MEDLINE | ID: mdl-39046954

ABSTRACT

With the continuous progress of technology, facial recognition technology is widely used in various scenarios as a mature biometric technology. However, the accuracy of facial feature recognition has become a major challenge. This study proposes a face length feature and angle feature recognition method for digital libraries, targeting the recognition of different facial features. Firstly, an in-depth study is conducted on the architecture of facial action networks based on attention mechanisms to provide more accurate and comprehensive facial features. Secondly, a network architecture based on length and angle features of facial expressions, the expression recognition network is explored to improve the recognition rate of different expressions. Finally, an end-to-end network framework based on attention mechanism for facial feature points is constructed to improve the accuracy and stability of facial feature recognition network. To verify the effectiveness of the proposed method, experiments were conducted using the facial expression dataset FER-2013. The experimental results showed that the average recognition rate for the seven common expressions was 97.28% to 99.97%. The highest recognition rate for happiness and surprise was 99.97%, while the relatively low recognition rate for anger, fear, and neutrality was 97.18%. The data has verified that the research method can effectively recognize and distinguish different facial expressions, with high accuracy and robustness. The recognition method based on attention mechanism for facial feature points has effectively optimized the recognition process of facial length and angle features, significantly improving the stability of facial expression recognition, especially in complex environments, providing reliable technical support for digital libraries and other fields. 
This study aims to promote the development of facial recognition technology in digital libraries, improve the service quality and user experience of digital libraries.


Subjects
Face , Facial Expression , Digital Libraries , Humans , Face/anatomy & histology , Automated Facial Recognition/methods
8.
Sensors (Basel) ; 24(14), 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39065979

ABSTRACT

By leveraging artificial intelligence and big data to analyze and assess classroom conditions, we can significantly enhance teaching quality. Nevertheless, numerous existing studies primarily concentrate on evaluating classroom conditions for student groups, often neglecting the need for personalized instructional support for individual students. To address this gap and provide a more focused analysis of individual students in the classroom environment, we implemented an embedded application design using face recognition technology and target detection algorithms. The Insightface face recognition algorithm was employed to identify students by constructing a classroom face dataset and training it; simultaneously, classroom behavioral data were collected and trained, utilizing the YOLOv5 algorithm to detect students' body regions and correlate them with their facial regions to identify students accurately. Subsequently, these modeling algorithms were deployed onto an embedded device, the Atlas 200 DK, for application development, enabling the recording of both overall classroom conditions and individual student behaviors. Test results show that the detection precision for various types of behaviors is above 0.67. The average false detection rate for face recognition is 41.5%. The developed embedded application can reliably detect student behavior in a classroom setting, identify students, and capture image sequences of body regions associated with negative behavior for better management. These data empower teachers to gain a deeper understanding of their students, which is crucial for enhancing teaching quality and addressing the individual needs of students.


Subjects
Algorithms , Humans , Students , Artificial Intelligence , Face/physiology , Facial Recognition/physiology , Automated Facial Recognition/methods , Computer-Assisted Image Processing/methods , Female , Automated Pattern Recognition/methods
9.
Cogn Res Princ Implic ; 9(1): 41, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38902539

ABSTRACT

The human face is commonly used for identity verification. While this task was once exclusively performed by humans, technological advancements have seen automated facial recognition systems (AFRS) integrated into many identification scenarios. Although many state-of-the-art AFRS are exceptionally accurate, they often require human oversight or involvement, such that a human operator actions the final decision. Previously, we have shown that on average, humans assisted by a simulated AFRS (sAFRS) failed to reach the level of accuracy achieved by the same sAFRS alone, due to overturning the system's correct decisions and/or failing to correct sAFRS errors. The aim of the current study was to investigate whether participants' trust in automation was related to their performance on a one-to-one face matching task when assisted by a sAFRS. Participants (n = 160) completed a standard face matching task in two phases: an unassisted baseline phase, and an assisted phase where they were shown the identification decision (95% accurate) made by a sAFRS prior to submitting their own decision. While most participants improved with sAFRS assistance, those with greater relative trust in automation achieved larger gains in performance. However, the average aided performance of participants still failed to reach that of the sAFRS alone, regardless of trust status. Nonetheless, further analysis revealed a small sample of participants who achieved 100% accuracy when aided by the sAFRS. Our results speak to the importance of considering individual differences when selecting employees for roles requiring human-algorithm interaction, including identity verification tasks that incorporate facial recognition technologies.


Subjects
Automated Facial Recognition , Automation , Trust , Humans , Male , Female , Adult , Young Adult , Facial Recognition/physiology , Algorithms
10.
Neural Netw ; 178: 106421, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38850638

ABSTRACT

Micro-expression recognition (MER) has drawn increasing attention due to its wide application in lie detection, criminal detection and psychological consultation. However, the best recognition accuracy on recent public datasets is still low compared to the accuracy of macro-expression recognition. In this paper, we propose a novel graph convolution network (GCN) for MER achieving state-of-the-art accuracy. Unlike existing GCNs with a fixed graph structure, we define a stochastic graph structure in which some neighbors are selected randomly. As shown by numerical examples, randomness enables better feature characterization while reducing computational complexity. The whole network consists of two branches: a spatial branch taking micro-expression images as input and a temporal branch taking optical flow images as input. Because the micro-expression dataset does not have enough images for training the GCN, we employ a transfer learning mechanism. That is, different stochastic GCNs (SGCNs) are trained on the macro-expression dataset in the source network, and the well-trained SGCNs are then transferred to the target network. It is shown that our proposed method achieves state-of-the-art performance on all four well-known datasets. This paper explores stochastic GCNs and transfer learning with this random structure in the MER task, which is of great importance for improving recognition performance.


Subjects
Computer Neural Networks , Stochastic Processes , Humans , Facial Expression , Computer-Assisted Image Processing/methods , Algorithms , Machine Learning , Automated Facial Recognition/methods
11.
Eur J Pediatr ; 183(9): 3797-3808, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38871980

ABSTRACT

Williams-Beuren syndrome (WBS) is a rare genetic disorder characterized by special facial gestalt, delayed development, and supravalvular aortic stenosis or/and stenosis of the branches of the pulmonary artery. We aim to develop and optimize accurate models of facial recognition to assist in the diagnosis of WBS, and to evaluate their effectiveness by using both five-fold cross-validation and an external test set. We used a total of 954 images from 135 patients with WBS, 124 patients suffering from other genetic disorders, and 183 healthy children. The training set comprised 852 images of 104 WBS cases, 91 cases of other genetic disorders, and 145 healthy children from September 2017 to December 2021 at the Guangdong Provincial People's Hospital. We constructed six binary classification models of facial recognition for WBS by using EfficientNet-b3, ResNet-50, VGG-16, VGG-16BN, VGG-19, and VGG-19BN. Transfer learning was used to pre-train the models, and each model was modified with a variable cosine learning rate. Each model was first evaluated by using five-fold cross-validation and then assessed on the external test set. The latter contained 102 images of 31 children suffering from WBS, 33 children with other genetic disorders, and 38 healthy children. To compare the capabilities of these models of recognition with those of human experts in terms of identifying cases of WBS, we recruited two pediatricians, a pediatric cardiologist, and a pediatric geneticist to identify the WBS patients based solely on their facial images. We constructed six models of facial recognition for diagnosing WBS using EfficientNet-b3, ResNet-50, VGG-16, VGG-16BN, VGG-19, and VGG-19BN. The model based on VGG-19BN achieved the best performance in terms of five-fold cross-validation, with an accuracy of 93.74% ± 3.18%, precision of 94.93% ± 4.53%, specificity of 96.10% ± 4.30%, and F1 score of 91.65% ± 4.28%, while the VGG-16BN model achieved the highest recall value of 91.63% ± 5.96%. 
The VGG-19BN model also achieved the best performance on the external test set, with an accuracy of 95.10%, precision of 100%, recall of 83.87%, specificity of 93.42%, and F1 score of 91.23%. The best performance by human experts on the external test set yielded values of accuracy, precision, recall, specificity, and F1 scores of 77.45%, 60.53%, 77.42%, 83.10%, and 66.67%, respectively. The F1 score of each human expert was lower than those of the EfficientNet-b3 (84.21%), ResNet-50 (74.51%), VGG-16 (85.71%), VGG-16BN (85.71%), VGG-19 (83.02%), and VGG-19BN (91.23%) models. CONCLUSION: The results showed that facial recognition technology can be used to accurately diagnose patients with WBS. Facial recognition models based on VGG-19BN can play a crucial role in its clinical diagnosis. Their performance can be improved by expanding the size of the training dataset, optimizing the CNN architectures applied, and modifying them with a variable cosine learning rate. WHAT IS KNOWN: • The facial gestalt of WBS, often described as "elfin," includes a broad forehead, periorbital puffiness, a flat nasal bridge, full cheeks, and a small chin. • Recent studies have demonstrated the potential of deep convolutional neural networks for facial recognition as a diagnostic tool for WBS. WHAT IS NEW: • This study develops six models of facial recognition, EfficientNet-b3, ResNet-50, VGG-16, VGG-16BN, VGG-19, and VGG-19BN, to improve WBS diagnosis. • The VGG-19BN model achieved the best performance, with an accuracy of 95.10% and specificity of 93.42%. The facial recognition model based on VGG-19BN can play a crucial role in the clinical diagnosis of WBS.


Subjects
Williams Syndrome , Humans , Williams Syndrome/diagnosis , Williams Syndrome/genetics , Child , Female , Male , Preschool Child , Infant , Case-Control Studies , Adolescent , Facial Recognition , Automated Facial Recognition/methods
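The external-test-set figures for VGG-19BN in the abstract above can be related to one another through a binary confusion matrix. The following is a minimal sketch, assuming one image per child (102 images: 31 WBS and 71 non-WBS) and counts of TP=26, FP=0, FN=5, TN=71. These counts are a reconstruction from the reported percentages, not data taken from the paper, and `binary_metrics` is an illustrative helper.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),  # negative predictive value
        "f1": 2 * precision * recall / (precision + recall),
    }


# Assumed counts, reconstructed from the reported percentages.
m = binary_metrics(tp=26, fp=0, fn=5, tn=71)
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these assumed counts, accuracy (95.10%), precision (100%), recall (83.87%) and F1 (91.23%) match the abstract. Specificity comes out at 100%, while the 93.42% that the abstract reports as specificity coincides with the negative predictive value. That may reflect a different definition or counts that the abstract does not give.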
12.
Forensic Sci Int ; 361: 112108, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38908069

ABSTRACT

Mass disaster events can result in high levels of casualties that need to be identified. Whilst disaster victim identification (DVI) relies on the primary identifiers of DNA, fingerprints, and dental records, these require ante-mortem data that may not exist or be easily obtainable. Facial recognition technology may be able to assist. Automated facial recognition has advanced considerably, and ante-mortem facial images are readily available. Facial recognition could therefore be used to expedite the DVI process by narrowing down leads before primary identifiers are made available. This research explores the feasibility of using automated facial recognition technology to support DVI. We evaluated the performance of a commercial-off-the-shelf facial recognition algorithm on post-mortem images (representing images taken after a mass disaster) against ante-mortem images (representing a database that may exist within agencies that hold face databases for identity documents such as passports or driver's licenses). We explored facial recognition performance for different operational scenarios, with different levels of face image quality, and by cause of death. Our research is the largest facial recognition evaluation of post-mortem and ante-mortem images to date. We demonstrated that facial recognition technology would be valuable for DVI and that the performance varies by image quality and cause of death. We provide recommendations for future research.


Subjects
Algorithms , Automated Facial Recognition , Disaster Victims , Humans , Face/anatomy & histology , Face/diagnostic imaging , Computer-Assisted Image Processing , Male , Female , Photography
13.
Sci Rep ; 14(1): 12763, 2024 06 04.
Article in English | MEDLINE | ID: mdl-38834661

ABSTRACT

With the continuous progress of technology, the subject of life science plays an increasingly important role, among which the application of artificial intelligence in the medical field has attracted more and more attention. Bell facial palsy, a neurological ailment characterized by facial muscle weakness or paralysis, exerts a profound impact on patients' facial expressions and masticatory abilities, thereby inflicting considerable distress upon their overall quality of life and mental well-being. In this study, we designed a facial attribute recognition model specifically for individuals with Bell's facial palsy. The model utilizes an enhanced SSD network and scientific computing to perform a graded assessment of the patients' condition. By replacing the VGG network with a more efficient backbone, we improved the model's accuracy and significantly reduced its computational burden. The results show that the improved SSD network has an average precision of 87.9% in the classification of light, middle and severe facial palsy, and effectively performs the classification of patients with facial palsy, where scientific calculations also increase the precision of the classification. This is also one of the most significant contributions of this article, which provides intelligent means and objective data for future research on intelligent diagnosis and treatment as well as progressive rehabilitation.


Subjects
Bell Palsy , Humans , Bell Palsy/diagnosis , Bell Palsy/physiopathology , Computer Neural Networks , Female , Male , Facial Expression , Adult , Artificial Intelligence , Middle Aged , Facial Paralysis/diagnosis , Facial Paralysis/physiopathology , Facial Paralysis/psychology , Facial Recognition , Automated Facial Recognition/methods
14.
Traffic Inj Prev ; 25(6): 842-851, 2024.
Article in English | MEDLINE | ID: mdl-38717829

ABSTRACT

OBJECTIVE: Car crashes are one of the main causes of death worldwide among young people, and most of these fatalities occur to children who are seated in the front passenger seat and who, at the time of an accident, receive a direct impact from the airbags, which is lethal for children under 13 years of age. The present study seeks to raise awareness of this risk by interior monitoring with a child face detection system that serves to alert the driver that the child should not be sitting in the front passenger seat. METHODS: The system incorporates processing of the collected data and elements of deep learning such as transfer learning, fine-tuning and facial detection to identify the presence of children in a robust way, which was achieved by training with a dataset generated from scratch for this specific purpose. The MobileNetV2 architecture was used because of its good performance compared with the Inception architecture for this task and its low computational cost, which facilitates implementing the final model on a Raspberry Pi 4B. RESULTS: The resulting image dataset consisted of 102 empty seats, 71 children (0-13 years), and 96 adults (14-75 years). After data augmentation, there were 2,496 images for adults and 2,310 for children. The classification of faces without a sliding window gave 98% accuracy and 100% precision. Finally, using the proposed methodology, it was possible to detect children in the front passenger seat in real time, with a delay of 1 s per decision and a sliding window criterion, reaching an accuracy of 100%. CONCLUSIONS: Our 100% accuracy in an experimental environment is somewhat idealized in that the sensor was not blocked by direct sunlight, nor was it partially or completely covered by dirt or other debris common in vehicles transporting children. Nevertheless, the present study showed that it is possible to implement a robust noninvasive classification system on a Raspberry Pi 4 Model B in any automobile for the detection of a child in the front seat through deep learning methods such as a deep CNN.


Subjects
Traffic Accidents , Deep Learning , Humans , Child , Preschool Child , Adolescent , Infant , Traffic Accidents/prevention & control , Adult , Young Adult , Middle Aged , Aged , Newborn Infant , Female , Male , Child Restraint Systems/statistics & numerical data , Automated Facial Recognition , Face
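The abstract above mentions a sliding window criterion applied to per-frame classifications but does not define it. The following is a minimal sketch of one common way to implement such a criterion, a majority vote over the most recent frames. The window length, the voting rule, the label names, and the function name `smooth_predictions` are all assumptions for illustration and are not taken from the paper.

```python
from collections import Counter, deque


def smooth_predictions(frame_labels, window=5):
    """Majority vote over the last `window` per-frame labels.

    No decision is emitted until the window is full. Ties are broken in
    favour of the label that entered the current window first.
    """
    recent = deque(maxlen=window)
    decisions = []
    for label in frame_labels:
        recent.append(label)
        if len(recent) == window:
            decisions.append(Counter(recent).most_common(1)[0][0])
    return decisions


# Hypothetical per-frame classifier outputs with two spurious frames.
frames = ["child", "child", "adult", "child", "child",
          "child", "empty", "child"]
decisions = smooth_predictions(frames, window=5)
print(decisions)
```

A vote like this trades latency for stability: isolated misclassified frames are suppressed, at the cost of waiting for a full window before each decision, which fits the abstract's description of a fixed delay per decision.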
15.
Sensors (Basel) ; 24(10), 2024 May 18.
Article in English | MEDLINE | ID: mdl-38794068

ABSTRACT

Most facial analysis methods perform well in standardized testing but not in real-world testing. The main reason is that training models cannot easily learn various human features and background noise, especially for facial landmark detection and head pose estimation tasks with limited and noisy training datasets. To alleviate the gap between standardized and real-world testing, we propose a pseudo-labeling technique using a face recognition dataset consisting of various people and background noise. The use of our pseudo-labeled training dataset can help to overcome the lack of diversity among the people in the dataset. Our integrated framework is constructed using complementary multitask learning methods to extract robust features for each task. Furthermore, introducing pseudo-labeling and multitask learning improves the face recognition performance by enabling the learning of pose-invariant features. Our method achieves state-of-the-art (SOTA) or near-SOTA performance on the AFLW2000-3D and BIWI datasets for facial landmark detection and head pose estimation, with competitive face verification performance on the IJB-C test dataset for face recognition. We demonstrate this through a novel testing methodology that categorizes cases as soft, medium, and hard based on the pose values of IJB-C. The proposed method achieves stable performance even when the dataset lacks diverse face identifications.


Subjects
Automated Facial Recognition , Face , Head , Humans , Face/anatomy & histology , Face/diagnostic imaging , Head/diagnostic imaging , Automated Facial Recognition/methods , Algorithms , Machine Learning , Facial Recognition , Factual Databases , Computer-Assisted Image Processing/methods
16.
BMC Pediatr ; 24(1): 361, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38783283

ABSTRACT

BACKGROUND: Noonan syndrome (NS) is a rare genetic disease, and patients who suffer from it exhibit a facial morphology that is characterized by a high forehead, hypertelorism, ptosis, inner epicanthal folds, down-slanting palpebral fissures, a highly arched palate, a round nasal tip, and posteriorly rotated ears. Facial analysis technology has recently been applied to identify many genetic syndromes (GSs). However, few studies have investigated the identification of NS based on the facial features of the subjects. OBJECTIVES: This study develops advanced models to enhance the accuracy of diagnosis of NS. METHODS: A total of 1,892 people were enrolled in this study, including 233 patients with NS, 863 patients with other GSs, and 796 healthy children. We took one to 10 frontal photos of each subject to build a dataset, and then applied the multi-task convolutional neural network (MTCNN) for data pre-processing to generate standardized outputs with five crucial facial landmarks. The ImageNet dataset was used to pre-train the network so that it could capture generalizable features and minimize data wastage. We subsequently constructed seven models for facial identification based on the VGG16, VGG19, VGG16-BN, VGG19-BN, ResNet50, MobileNet-V2, and squeeze-and-excitation network (SENet) architectures. The identification performance of the seven models was evaluated and compared with that of six physicians. RESULTS: All models exhibited high accuracy, precision, and specificity in recognizing NS patients. The VGG19-BN model delivered the best overall performance, with an accuracy of 93.76%, precision of 91.40%, specificity of 98.73%, and F1 score of 78.34%. The VGG16-BN model achieved the highest AUC value of 0.9787, while all models based on VGG architectures were superior to the others on the whole. The highest scores of the six physicians in terms of accuracy, precision, specificity, and the F1 score were 74.00%, 75.00%, 88.33%, and 61.76%, respectively. The performance of each facial recognition model was superior to that of the best physician on all metrics. CONCLUSION: Models of computer-assisted facial recognition can improve the rate of diagnosis of NS. The models based on VGG19-BN and VGG16-BN can play an important role in diagnosing NS in clinical practice.
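
The reported F1 score for VGG19-BN is well below its accuracy and specificity, which suggests that sensitivity (recall) is the weaker metric. The abstract does not report sensitivity, so the following is only a minimal sketch, assuming the standard definition F1 = 2PR / (P + R), that recovers the recall implied by the reported precision and F1. The function name `implied_recall` is illustrative and does not come from the paper.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for R, given precision P and F1.

    Assumes the standard (balanced) F1 definition; the abstract does not
    state which averaging, if any, was used.
    """
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than twice the precision")
    return f1 * precision / denom


# Values reported for the VGG19-BN model in the abstract.
precision = 0.9140
f1 = 0.7834

recall = implied_recall(precision, f1)
print(f"implied recall: {recall:.3f}")
```

Under that assumption the implied recall is a little under 0.69, so a noticeable share of NS cases would be missed even though few positive predictions are wrong.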


Subjects
Noonan Syndrome, Humans, Noonan Syndrome/diagnosis, Child, Female, Male, Preschool Child, Computer Neural Networks, Infant, Adolescent, Automated Facial Recognition/methods, Computer-Assisted Diagnosis/methods, Sensitivity and Specificity, Case-Control Studies
17.
PLoS One ; 19(5): e0304610, 2024.
Article in English | MEDLINE | ID: mdl-38820451

ABSTRACT

Face Morphing Attacks pose a threat to the security of identity documents, especially with respect to a subsequent access control process, because they allow both involved individuals to use the same document. Several algorithms are currently being developed to detect Morphing Attacks, often requiring large data sets of morphed face images for training. In the present study, face embeddings are used for two different purposes: first, to pre-select images for the subsequent large-scale generation of Morphing Attacks, and second, to detect potential Morphing Attacks. Previous studies have demonstrated the power of embeddings in both use cases. However, we aim to build on these studies by adding the more powerful MagFace model to both use cases, and by performing comprehensive analyses of the role of embeddings in pre-selection and attack detection in terms of the vulnerability of face recognition systems and attack detection algorithms. In particular, we use recent developments to assess the attack potential, but also investigate the influence of morphing algorithms. For the first objective, an algorithm is developed that pairs individuals based on the similarity of their face embeddings. Different state-of-the-art face recognition systems are used to extract embeddings in order to pre-select the face images and different morphing algorithms are used to fuse the face images. The attack potential of the differently generated morphed face images is quantified to compare the usability of the embeddings for automatically generating a large number of successful Morphing Attacks. For the second objective, we compare the performance of the embeddings of two state-of-the-art face recognition systems with respect to their ability to detect morphed face images. Our results demonstrate that ArcFace and MagFace provide valuable face embeddings for image pre-selection. Various open-source and commercial-off-the-shelf face recognition systems are vulnerable to the generated Morphing Attacks, and their vulnerability increases when image pre-selection is based on embeddings compared to random pairing. In particular, landmark-based closed-source morphing algorithms generate attacks that pose a high risk to any tested face recognition system. Remarkably, more accurate face recognition systems show a higher vulnerability to Morphing Attacks. Among the systems tested, commercial-off-the-shelf systems were the most vulnerable to Morphing Attacks. In addition, MagFace embeddings stand out as a robust alternative for detecting morphed face images compared to the previously used ArcFace embeddings. The results endorse the benefits of face embeddings for more effective image pre-selection for face morphing and for more accurate detection of morphed face images, as demonstrated by extensive analysis of various designed attacks. The MagFace model is a powerful alternative to the often-used ArcFace model in detecting attacks and can increase performance depending on the use case. The study also highlights the usability of embeddings to generate large-scale morphed face databases for various purposes, such as training Morphing Attack Detection algorithms as a countermeasure against attacks.
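
The abstract says that individuals are paired by the similarity of their face embeddings but does not describe the pairing algorithm. As a rough illustration only, the sketch below assumes cosine similarity and a greedy strategy that repeatedly takes the most similar unused pair. The names `cosine_similarity` and `greedy_pairs` and the toy two-dimensional vectors are hypothetical; real embeddings from models such as ArcFace or MagFace have far more dimensions.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_pairs(embeddings):
    """Pair identities greedily by descending similarity (assumed strategy).

    embeddings: dict mapping an identity label to its embedding vector.
    Returns a list of (id_a, id_b, similarity); each identity is used at most once.
    """
    scored = sorted(
        (
            (cosine_similarity(embeddings[a], embeddings[b]), a, b)
            for a, b in combinations(sorted(embeddings), 2)
        ),
        reverse=True,
    )
    used, pairs = set(), []
    for sim, a, b in scored:
        if a not in used and b not in used:
            pairs.append((a, b, sim))
            used.update((a, b))
    return pairs


# Toy embeddings: A resembles B, and C resembles D.
toy = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [0.1, 0.9]}
pairs = greedy_pairs(toy)
for a, b, sim in pairs:
    print(a, b, round(sim, 3))
```

Greedy selection is simple but not globally optimal; an assignment solver could be substituted if the total similarity across all pairs matters.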


Subjects
Algorithms, Computer Security, Humans, Face, Computer-Assisted Image Processing/methods, Automated Facial Recognition/methods, Facial Recognition
18.
Neural Netw ; 175: 106275, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38653078

ABSTRACT

Face Anti-Spoofing (FAS) seeks to protect face recognition systems from spoofing attacks and is applied extensively in scenarios such as access control, electronic payment, and security surveillance systems. Face anti-spoofing requires the integration of local details and global semantic information. Existing CNN-based methods rely on small-stride or image patch-based feature extraction structures, which struggle to capture spatial and cross-layer feature correlations effectively. Meanwhile, Transformer-based methods have limitations in extracting discriminative detailed features. To address these issues, we introduce a multi-stage CNN-Transformer-based framework, which extracts local features through the convolutional layer and long-distance feature relationships via self-attention. Based on this, we propose a cross-attention multi-stage feature fusion, employing semantically high-stage features to query task-relevant features in low-stage features for further cross-stage feature fusion. To enhance the discrimination of local features for subtle differences, we design pixel-wise material classification supervision and add an auxiliary branch in the intermediate layers of the model. Moreover, to address the limitations of a single acquisition environment and scarcity of acquisition devices in the existing Near-Infrared dataset, we create a large-scale Near-Infrared Face Anti-Spoofing dataset with 380k pictures of 1040 identities. The proposed method achieves state-of-the-art performance on OULU-NPU and our proposed Near-Infrared dataset with just 1.3 GFLOPs and 3.2M parameters, which demonstrates the effectiveness of the proposed method.


Subjects
Computer Neural Networks, Humans, Automated Facial Recognition/methods, Computer-Assisted Image Processing/methods, Face, Computer Security, Algorithms
19.
Auris Nasus Larynx ; 51(3): 460-464, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38520978

ABSTRACT

OBJECTIVE: While subjective methods like the Yanagihara system and the House-Brackmann system are standard in evaluating facial paralysis, they are limited by intra- and inter-observer variability. Meanwhile, quantitative objective methods such as electroneurography and electromyography are time-consuming. Our aim was to introduce a swift, objective, and quantitative method for evaluating facial movements. METHODS: We developed an application software (app) that utilizes the facial recognition functionality of the iPhone (Apple Inc., Cupertino, USA) for facial movement evaluation. This app leverages the phone's front camera, infrared radiation, and infrared camera to provide detailed three-dimensional facial topology. It quantitatively compares left and right facial movements by region and displays the movement ratio of the affected side to the opposite side. Evaluations using the app were conducted on both normal and facial palsy subjects and were compared with conventional methods. RESULTS: Our app provided an intuitive user experience, completing evaluations in under a minute, and thus proving practical for regular use. Its evaluation scores correlated highly with the Yanagihara system, the House-Brackmann system, and electromyography. Furthermore, the app outperformed conventional methods in assessing detailed facial movements. CONCLUSION: Our novel iPhone app offers a valuable tool for the comprehensive and efficient evaluation of facial palsy.


Subjects
Automated Facial Recognition, Facial Nerve Diseases, Mobile Applications, Paralysis, Mobile Applications/standards, Facial Nerve Diseases/diagnosis, Paralysis/diagnosis, Automated Facial Recognition/instrumentation, Time Factors, Reproducibility of Results, Humans
20.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5209-5226, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38315605

ABSTRACT

Demographic biases in source datasets have been shown to be one of the causes of unfairness and discrimination in the predictions of Machine Learning models. One of the most prominent types of demographic bias is statistical imbalance in the representation of demographic groups in the datasets. In this article, we study the measurement of these biases by reviewing the existing metrics, including those that can be borrowed from other disciplines. We develop a taxonomy for the classification of these metrics, providing a practical guide for the selection of appropriate metrics. To illustrate the utility of our framework, and to further understand the practical characteristics of the metrics, we conduct a case study of 20 datasets used in Facial Emotion Recognition (FER), analyzing the biases present in them. Our experimental results show that many metrics are redundant and that a reduced subset of metrics may be sufficient to measure the amount of demographic bias. The article provides valuable insights for researchers in AI and related fields to mitigate dataset bias and improve the fairness and accuracy of AI models.
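
The abstract does not name the metrics it reviews. As an illustration of what measuring representational imbalance can look like, the sketch below computes two commonly used quantities, chosen here as assumptions and not taken from the article: the imbalance ratio (largest group count over smallest) and normalized Shannon evenness, a measure borrowed from ecology. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest to the smallest group count (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def shannon_evenness(labels):
    """Shannon entropy of group proportions divided by its maximum, log(k).

    Returns a value in (0, 1]; 1.0 means all observed groups are equally
    represented. Groups absent from the data are not counted, which is a
    limitation of this simple version.
    """
    counts = Counter(labels)
    k = len(counts)
    if k < 2:
        raise ValueError("evenness needs at least two observed groups")
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(k)


# Toy demographic labels for two hypothetical datasets.
balanced = ["group_a"] * 50 + ["group_b"] * 50
skewed = ["group_a"] * 90 + ["group_b"] * 10

for name, labels in [("balanced", balanced), ("skewed", skewed)]:
    print(name, imbalance_ratio(labels), round(shannon_evenness(labels), 3))
```

Both numbers respond to the same skew in this toy case, which is consistent with the abstract's observation that many metrics are redundant.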


Subjects
Factual Databases, Facial Expression, Humans, Automated Facial Recognition/methods, Algorithms, Bias, Machine Learning, Computer-Assisted Image Processing/methods, Demography, Face/anatomy & histology, Face/diagnostic imaging, Automated Pattern Recognition/methods