Results 1 - 20 of 26,743
1.
Med Eng Phys ; 130: 104198, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39160026

ABSTRACT

Detecting the intention of a reaching movement is important for myoelectric human-machine collaboration applications. A comprehensive set of handcrafted features was extracted from windows of electromyogram (EMG) signals recorded from the upper-limb muscles while subjects reached toward nine nearby targets, mimicking activities of daily living. A scoring-based feature selection method, neighborhood component analysis (NCA), selected the relevant feature subset. Finally, the target was recognized by a support vector machine (SVM) model. Classification performance was generalized by a nested cross-validation structure that selected the optimal feature subset in the inner loop. Given the low spatial resolution of the target locations on the display and the slight discrimination between signals for different targets, the best classification accuracy of 77.11% was achieved by concatenating the features of two segments with lengths of 2 s and 0.25 s. Because the EMG varies only subtly across targets, a wide range of features was applied to capture additional aspects of the information contained in the EMG signals. Furthermore, since NCA selected the features that provided the most discriminative power, it became feasible to employ various combinations of features, and even to concatenate features extracted from different parts of the movement, to improve classification performance.
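
A minimal sketch of the nested cross-validation scheme described above, written with scikit-learn on synthetic stand-in data. Mutual information stands in for NCA as the feature-scoring function, and the feature counts, candidate subset sizes, and SVM settings are illustrative assumptions rather than the paper's configuration:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(180, 120))   # 180 reaches x 120 handcrafted EMG features
y = rng.integers(0, 9, size=180)  # 9 reaching targets

# Inner loop picks the feature-subset size; outer loop estimates generalization.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(mutual_info_classif)),
    ("svm", SVC(kernel="rbf", C=1.0)),
])
inner = GridSearchCV(pipe, {"select__k": [10, 20, 40, 80]}, cv=3)
print(cross_val_score(inner, X, y, cv=5).mean())  # outer-loop accuracy
```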


Subjects
Electromyography; Movement; Pattern Recognition, Automated; Signal Processing, Computer-Assisted; Support Vector Machine; Humans; Male; Adult; Female; Young Adult; Activities of Daily Living
2.
Sensors (Basel) ; 24(15)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39123885

ABSTRACT

Pattern recognition (PR)-based myoelectric control systems can naturally provide multifunctional and intuitive control of upper-limb prostheses and restore lost limb function, but understanding their robustness remains an open scientific question. This study investigates how limb positions and electrode shifts, two factors suggested to cause classification deterioration, affect classifier performance by quantifying changes in the class distribution, using each factor as a class and computing the repeatability and modified separability indices. Ten intact-limb participants took part in the study. Linear discriminant analysis (LDA) was used as the classifier. The results confirmed previous findings that limb positions and electrode shifts deteriorate classification performance (a 14-21% decrease), with no difference between the factors (p > 0.05). When considering limb positions and electrode shifts as classes, we could classify them with accuracies of 96.13 ± 1.44% and 65.40 ± 8.23% for single and all motions, respectively. Testing on five amputees corroborated these findings. We have demonstrated that each factor introduces changes in the feature space that are statistically new class instances. Thus, the feature space contains two statistically classifiable clusters when the same motion is collected in two different limb positions or electrode shifts. Our results are a step forward in understanding the challenges PR schemes face in myoelectric control of prostheses, and further validation needs to be conducted on more amputee-related datasets.
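
A minimal sketch of the study's core idea: treat the recording condition (e.g., limb position) rather than the motion as the class label, and check whether a classifier can separate conditions. The data are synthetic stand-ins; above-chance accuracy would indicate that the factor shifts the feature distribution into statistically classifiable clusters:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 32))           # EMG features from repeated motions
position = rng.integers(0, 5, size=300)  # limb-position label for each trial

# Chance level is 0.2 for five positions; higher accuracy means the
# positions form distinguishable clusters in the feature space.
lda = LinearDiscriminantAnalysis()
print(cross_val_score(lda, X, position, cv=5).mean())
```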


Subjects
Amputees; Artificial Limbs; Electrodes; Electromyography; Pattern Recognition, Automated; Humans; Electromyography/methods; Male; Adult; Pattern Recognition, Automated/methods; Amputees/rehabilitation; Female; Discriminant Analysis; Young Adult; Extremities/physiology
3.
Sensors (Basel) ; 24(15)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39123896

ABSTRACT

For successful human-robot collaboration, it is crucial to establish and sustain high-quality interaction between humans and robots. The evolution of robot intelligence now enables robots to take a proactive role in initiating and sustaining human-robot interaction (HRI), thereby allowing humans to concentrate more on their primary tasks. In this paper, we introduce the Robot-Facilitated Interaction System (RFIS), in which mobile robots perform identification, tracking, re-identification, and gesture recognition in an integrated framework to ensure anytime readiness for HRI. We implemented the RFIS on an autonomous mobile robot used for transporting a patient, to demonstrate proactive, real-time, and user-friendly interaction with a caretaker involved in monitoring and nursing the patient. In the implementation, we focused on the efficient and robust integration of the various interaction facilitation modules within a real-time HRI system operating in an edge computing environment. Experimental results show that the RFIS, as a comprehensive system integrating caretaker recognition, tracking, re-identification, and gesture recognition, provides an overall high quality of interaction in HRI facilitation, with average accuracies exceeding 90% during real-time operation at 5 FPS.


Assuntos
Gestos , Robótica , Robótica/métodos , Humanos , Reconhecimento Automatizado de Padrão/métodos , Algoritmos , Inteligência Artificial
4.
Sensors (Basel) ; 24(15)2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39123907

ABSTRACT

Skeleton-based action recognition, renowned for its computational efficiency and insensitivity to lighting variations, has become a focal point in motion analysis. However, most current methods extract only global skeleton features, overlooking the potential semantic relationships among partial limb motions. For instance, the subtle differences between actions such as "brush teeth" and "brush hair" lie mainly in the motion of specific parts; relying solely on skeleton points is inadequate for capturing these nuances, even though combining limb movements provides a more holistic representation of an action. This motivates us to integrate fine-grained language descriptions into the learning of skeleton features to capture more discriminative skeleton behavior representations. To this end, we introduce a new Linguistic-Driven Partial Semantic Relevance Learning (LPSR) framework. We use state-of-the-art large language models (LLMs) to generate linguistic descriptions of local limb motions and use them to constrain the learning of local motions, and we also aggregate global skeleton point representations with the LLM-generated textual representations to obtain a more generalized cross-modal behavioral representation. On this basis, we propose a cyclic attentional interaction module to model the implicit correlations between partial limb motions. Extensive ablation experiments demonstrate the effectiveness of the proposed method, which also achieves state-of-the-art results.
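
A minimal sketch of one way to pull skeleton embeddings toward their paired text embeddings with a symmetric InfoNCE objective. This CLIP-style loss is an assumed stand-in for the paper's cross-modal aggregation, and the temperature value is illustrative:

```python
import torch
import torch.nn.functional as F

def alignment_loss(skel_emb: torch.Tensor, text_emb: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE: matched skeleton/text pairs attract, others repel."""
    s = F.normalize(skel_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = s @ t.T / temperature          # pairwise cosine similarities
    labels = torch.arange(len(s), device=s.device)
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2
```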


Assuntos
Semântica , Humanos , Linguística , Movimento/fisiologia , Reconhecimento Automatizado de Padrão/métodos , Algoritmos , Aprendizagem/fisiologia
5.
Sensors (Basel) ; 24(15)2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39123986

ABSTRACT

Human action recognition (HAR) technology based on radar signals has garnered significant attention from both industry and academia due to its exceptional privacy-preserving capabilities, noncontact sensing characteristics, and insensitivity to lighting conditions. However, the scarcity of accurately labeled human radar data poses a significant challenge in meeting the demand for large-scale training datasets required by deep model-based HAR technology, substantially impeding progress in this field. To address this issue, a semi-supervised learning algorithm, MF-Match, is proposed in this paper. The algorithm computes pseudo-labels for large-scale unlabeled radar data, enabling the model to extract embedded human behavioral information and enhance the accuracy of HAR algorithms. Furthermore, the method incorporates contrastive learning principles to improve the quality of the model-generated pseudo-labels and mitigate the impact of mislabeled pseudo-labels on recognition performance. Experimental results demonstrate that the method achieves action recognition accuracies of 86.69% and 91.48% on two widely used radar spectrum datasets, respectively, using only 10% labeled data, validating the effectiveness of the proposed approach.
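
A minimal sketch of confidence-thresholded pseudo-labeling of the kind MF-Match builds on (FixMatch-style); the threshold and the single forward pass per view are illustrative assumptions, and the paper's contrastive component is omitted:

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model: torch.nn.Module, unlabeled: torch.Tensor,
                      threshold: float = 0.95) -> torch.Tensor:
    """Keep only confident predictions on unlabeled spectrograms as targets."""
    with torch.no_grad():
        probs = F.softmax(model(unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf.ge(threshold).float()    # 1 where the model is confident
    logits = model(unlabeled)                # in practice, an augmented view
    per_sample = F.cross_entropy(logits, pseudo, reduction="none")
    return (per_sample * mask).mean()
```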


Assuntos
Algoritmos , Humanos , Radar , Aprendizado de Máquina Supervisionado , Reconhecimento Automatizado de Padrão/métodos , Atividades Humanas
6.
Sensors (Basel) ; 24(15)2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39124111

ABSTRACT

Due to the accelerating aging of populations in modern society, the accurate and timely recognition of, and response to, sudden abnormal behaviors of the elderly have become an urgent and important issue. In current research on computer-vision-based abnormal behavior recognition, most algorithms show poor generalization and recognition ability in practical applications, as well as the limitation of recognizing only single actions. To address these problems, an MSCS-DenseNet-LSTM model based on a multi-scale attention mechanism is proposed. The model integrates a Multi-Scale Convolutional Structure (MSCS) module into the initial convolutional layer of the DenseNet model, introduces an improved Inception X module into the Dense Block to form an Inception-Dense structure, and gradually performs feature fusion through each Dense Block module. A CBAM attention module is added to the dual-layer LSTM to enhance the model's generalization ability while ensuring accurate recognition of abnormal actions. Furthermore, to address the limitation of single-action abnormal behavior datasets, an RGB image dataset (RIDS) and a contour image dataset (CIDS) containing various abnormal behaviors were constructed. The experimental results show that the proposed MSCS-DenseNet-LSTM model achieved an accuracy, sensitivity, and specificity of 98.80%, 98.75%, and 98.82% on RIDS and 98.30%, 98.28%, and 98.38% on CIDS, respectively.
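
A minimal PyTorch sketch of the CBAM attention module mentioned above, channel attention followed by spatial attention as in the original CBAM design; the reduction ratio and kernel size are the usual defaults, assumed rather than taken from this paper:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # Channel attention: shared MLP over avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: convolution over channel-wise avg and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                          self.mlp(x.amax(dim=(2, 3))))
        x = x * w.view(b, c, 1, 1)                  # channel reweighting
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))   # spatial reweighting
```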


Assuntos
Algoritmos , Redes Neurais de Computação , Humanos , Reconhecimento Automatizado de Padrão/métodos , Comportamento/fisiologia , Processamento de Imagem Assistida por Computador/métodos
7.
Brain Behav ; 14(8): e3519, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39169422

ABSTRACT

BACKGROUND: Neurological disorders pose a significant health challenge, and their early detection is critical for effective treatment planning and prognosis. Traditional classification of neural disorders based on causes, symptoms, developmental stage, severity, and nervous system effects has limitations. Leveraging artificial intelligence (AI) and machine learning (ML) for pattern recognition offers a potent way to address these challenges. This study therefore proposes an innovative approach, the Aggregated Pattern Classification Method (APCM), for precise identification of neural disorder stages. METHOD: The APCM addresses prevalent issues in neural disorder detection, such as overfitting, robustness, and interoperability. It uses aggregative patterns and classification learning functions to mitigate these challenges and enhance overall recognition accuracy, even on imbalanced data. The analysis involves neural images, using observations from healthy individuals as a reference. Action response patterns from diverse inputs are mapped to identify similar features, establishing the disorder ratio. Stages are correlated based on the available responses and associated neural data, with a preference for classification learning. This classification requires image and labeled data to prevent additional flaws in pattern recognition. Recognition and classification proceed over multiple iterations, incorporating both similar and diverse neural features, and the learning process is fine-tuned for fine-grained classification using labeled and unlabeled input data. RESULTS: The proposed APCM achieves notably higher pattern recognition (by 15.03%) and fewer classification errors (CEs) (by 10.61%). The method effectively addresses overfitting, robustness, and interoperability issues, showcasing its potential as a powerful tool for detecting neural disorders at different stages; its ability to handle imbalanced data contributes to the overall success of the algorithm. CONCLUSION: The APCM emerges as a promising and effective approach for identifying precise neural disorder stages. By leveraging AI and ML, it resolves key challenges in pattern recognition, and the high pattern recognition and reduced CEs underscore its potential for clinical applications. However, the reliance on high-quality neural image data may limit the generalizability of the approach. Future research can further refine the method and enhance its interpretability, providing valuable insights into neural disorder progression and underlying biological mechanisms.


Assuntos
Aprendizado de Máquina , Humanos , Doenças do Sistema Nervoso/classificação , Doenças do Sistema Nervoso/diagnóstico , Reconhecimento Automatizado de Padrão/métodos , Inteligência Artificial
8.
Math Biosci Eng ; 21(7): 6631-6657, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39176412

ABSTRACT

Facial emotion recognition (FER) is widely used to analyze human emotion for many real-time applications, such as human-computer interfaces, emotion detection, forensics, biometrics, and human-robot collaboration. Nonetheless, existing methods are mostly unable to offer correct predictions with a minimal error rate. In this paper, an innovative facial emotion recognition framework, termed extended walrus-based deep learning with Botox feature selection network (EWDL-BFSN), is designed to detect facial emotions accurately. The main goals of the EWDL-BFSN are to identify facial emotions automatically and effectively by choosing the optimal features and tuning the hyperparameters of the classifier. A gradient wavelet anisotropic filter (GWAF) is used for image pre-processing in the EWDL-BFSN model, and SqueezeNet is used to extract significant features. The improved Botox optimization algorithm (IBoA) then chooses the best features. Lastly, FER and classification are accomplished through an enhanced optimization-based kernel residual 50 (EK-ResNet50) network, while a nature-inspired metaheuristic, the walrus optimization algorithm (WOA), is utilized to pick the hyperparameters of the EK-ResNet50 network. The EWDL-BFSN model was trained and tested with the publicly available CK+ and FER-2013 datasets. The Python platform was used for implementation, and performance metrics such as accuracy, sensitivity, specificity, and F1-score were compared with state-of-the-art methods. The proposed EWDL-BFSN model achieved overall accuracies of 99.37% and 99.25% on the CK+ and FER-2013 datasets, respectively, and proved its superiority in predicting facial emotions over state-of-the-art methods.
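
A minimal sketch of the SqueezeNet feature-extraction stage using torchvision with ImageNet weights as an assumed stand-in; the GWAF pre-processing, IBoA feature selection, and EK-ResNet50 classifier stages are omitted:

```python
import torch
from torchvision import models

# Pretrained SqueezeNet used as a frozen feature extractor.
net = models.squeezenet1_1(weights="IMAGENET1K_V1").eval()
with torch.no_grad():
    faces = torch.randn(4, 3, 224, 224)  # stand-in preprocessed face crops
    fmap = net.features(faces)           # (4, 512, 13, 13) feature maps
    feats = fmap.mean(dim=(2, 3))        # global-average-pooled 512-d vectors
print(feats.shape)                       # torch.Size([4, 512])
```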


Assuntos
Algoritmos , Aprendizado Profundo , Emoções , Expressão Facial , Humanos , Processamento de Imagem Assistida por Computador/métodos , Redes Neurais de Computação , Bases de Dados Factuais , Reconhecimento Automatizado de Padrão/métodos , Face , Reprodutibilidade dos Testes
9.
Sensors (Basel) ; 24(14)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39065968

ABSTRACT

Human action recognition based on optical and infrared video data is greatly affected by the environment, and feature extraction in traditional machine-learning classification methods is complex. This paper therefore proposes a method for human action recognition using Frequency-Modulated Continuous-Wave (FMCW) radar based on an asymmetric convolutional residual network. First, the radar echo data are analyzed and processed to extract micro-Doppler time-domain spectrograms of different actions. Second, a strategy combining asymmetric convolution and the Mish activation function is adopted in the residual blocks of the ResNet18 network to address the limitations of the blocks' linear and nonlinear transformations for micro-Doppler spectrum recognition and to enhance the network's ability to learn features effectively. Finally, the Improved Convolutional Block Attention Module (ICBAM) is integrated into the residual block to enhance the model's attention to and comprehension of the input data. The experimental results demonstrate that the proposed method achieves a high accuracy of 98.28% in action recognition and classification within complex scenes, surpassing classic deep learning approaches. Moreover, the method significantly improves recognition accuracy for actions with similar micro-Doppler signatures and demonstrates excellent noise robustness.
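
A minimal PyTorch sketch of a residual block pairing a square 3x3 convolution with asymmetric 3x1 and 1x3 branches and using Mish activations. This ACNet-style factorization is an assumed reading of the modification described above, not the paper's exact block:

```python
import torch
import torch.nn as nn

class AsymMishBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Square branch plus asymmetric branches, summed before normalization.
        self.conv3x3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv3x1 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.Mish()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act(self.bn1(self.conv3x3(x) + self.conv3x1(x) + self.conv1x3(x)))
        out = self.bn2(self.conv2(out))
        return self.act(out + x)  # identity shortcut, as in ResNet18

spec = torch.randn(2, 64, 32, 32)  # stand-in micro-Doppler feature maps
print(AsymMishBlock(64)(spec).shape)
```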


Assuntos
Redes Neurais de Computação , Radar , Humanos , Algoritmos , Aprendizado de Máquina , Atividades Humanas/classificação , Aprendizado Profundo , Reconhecimento Automatizado de Padrão/métodos
10.
Sensors (Basel) ; 24(14)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39065979

ABSTRACT

By leveraging artificial intelligence and big data to analyze and assess classroom conditions, teaching quality can be significantly enhanced. Nevertheless, many existing studies concentrate on evaluating classroom conditions for student groups, neglecting the need for personalized instructional support for individual students. To address this gap and provide a more focused analysis of individual students in the classroom, we implemented an embedded application using face recognition technology and object detection algorithms. The InsightFace face recognition algorithm was employed to identify students, trained on a purpose-built classroom face dataset; simultaneously, classroom behavioral data were collected for training, and the YOLOv5 algorithm was used to detect students' body regions and correlate them with their facial regions to identify students accurately. These models were then deployed onto an embedded device, the Atlas 200 DK, for application development, enabling the recording of both overall classroom conditions and individual student behaviors. Test results show that the detection precision for the various types of behavior is above 0.67, and the average false detection rate for face recognition is 41.5%. The developed embedded application can reliably detect student behavior in a classroom setting, identify students, and capture image sequences of body regions associated with negative behavior for better management. These data enable teachers to gain a deeper understanding of their students, which is crucial for enhancing teaching quality and addressing individual students' needs.
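
A minimal sketch of the face-to-body correlation step: each recognized face box is assigned to the detected body box that contains it. The containment measure, threshold, and greedy assignment are illustrative assumptions; the matching rule is not specified in the abstract:

```python
def face_in_body(body, face):
    """Fraction of the face box inside the body box; boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(body[0], face[0]), max(body[1], face[1])
    ix2, iy2 = min(body[2], face[2]), min(body[3], face[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (face[2] - face[0]) * (face[3] - face[1])
    return inter / area if area > 0 else 0.0

def match_faces_to_bodies(faces, bodies, min_overlap=0.9):
    """Greedily pair each face with the body box that best contains it."""
    pairs = []
    for fi, face in enumerate(faces):
        scores = [face_in_body(body, face) for body in bodies]
        if scores and max(scores) >= min_overlap:
            pairs.append((fi, scores.index(max(scores))))
    return pairs

# Example: one face sitting inside the first of two body detections.
print(match_faces_to_bodies([(60, 40, 100, 90)],
                            [(40, 20, 140, 300), (200, 30, 300, 310)]))
```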


Assuntos
Algoritmos , Humanos , Estudantes , Inteligência Artificial , Face/fisiologia , Reconhecimento Facial/fisiologia , Reconhecimento Facial Automatizado/métodos , Processamento de Imagem Assistida por Computador/métodos , Feminino , Reconhecimento Automatizado de Padrão/métodos
11.
Sensors (Basel) ; 24(13)2024 Jun 25.
Article in English | MEDLINE | ID: mdl-39000889

ABSTRACT

Emotions in speech are expressed in various ways, and a speech emotion recognition (SER) model may perform poorly on unseen corpora whose emotional factors differ from those expressed in the training databases. To construct an SER model that is robust to unseen corpora, regularization approaches and metric losses have been studied. In this paper, we propose an SER method that incorporates the relative difficulty and labeling reliability of each training sample. Inspired by the Proxy-Anchor loss, we propose a novel loss function that gives higher gradients to the samples in a minibatch whose emotion labels are more difficult to estimate. Since annotators may label an emotion based on emotional expression that resides in the conversational context or in another modality but is not apparent in the given speech utterance, some emotion labels may be unreliable, and these unreliable labels can affect the proposed loss function more severely. We therefore propose applying label smoothing to the samples misclassified by a pre-trained SER model. Experimental results show that SER performance on unseen corpora is improved by adopting the proposed loss function with label smoothing on the misclassified data.
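
A minimal sketch of applying label smoothing only to samples flagged as misclassified by a pre-trained model, with plain cross-entropy elsewhere. The smoothing factor is illustrative, and the paper's difficulty-based gradient weighting is omitted:

```python
import torch
import torch.nn.functional as F

def selective_smoothing_loss(logits: torch.Tensor, targets: torch.Tensor,
                             misclassified: torch.Tensor,  # bool, per sample
                             eps: float = 0.1) -> torch.Tensor:
    """Smooth the targets of unreliable (misclassified) samples only."""
    plain = F.cross_entropy(logits, targets, reduction="none")
    smooth = F.cross_entropy(logits, targets, reduction="none",
                             label_smoothing=eps)
    return torch.where(misclassified, smooth, plain).mean()
```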


Assuntos
Emoções , Fala , Humanos , Emoções/fisiologia , Fala/fisiologia , Algoritmos , Reprodutibilidade dos Testes , Reconhecimento Automatizado de Padrão/métodos , Bases de Dados Factuais
12.
Sensors (Basel) ; 24(13)2024 Jun 26.
Article in English | MEDLINE | ID: mdl-39000930

ABSTRACT

Convolutional neural networks (CNNs) have made significant progress in facial expression recognition (FER). However, due to challenges such as occlusion, lighting variations, and changes in head pose, facial expression recognition in real-world environments remains highly challenging. Moreover, methods based solely on CNNs rely heavily on local spatial features, lack global information, and struggle to balance computational complexity against recognition accuracy, so CNN-based models still fall short of addressing FER adequately. To address these issues, we propose a lightweight facial expression recognition method based on a hybrid vision transformer. The method captures multi-scale facial features through an improved attention module, achieving richer feature integration, enhancing the network's perception of key facial expression regions, and improving feature extraction. Additionally, to further enhance performance, we designed a patch dropping (PD) module. This module emulates the human visual system's attention allocation to local features, guiding the network to focus on the most discriminative features, reducing the influence of irrelevant features, and directly lowering computational cost. Extensive experiments demonstrate that our approach significantly outperforms other methods, achieving an accuracy of 86.51% on RAF-DB and nearly 70% on FER2013 with a model size of only 3.64 MB. These results show that our method provides a new perspective for the field of facial expression recognition.
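
A minimal sketch of a patch-dropping step: keep the patch tokens with the highest attention scores and discard the rest, shrinking the token sequence and hence the computation. The keep ratio and the use of a per-token attention score are illustrative assumptions:

```python
import torch

def drop_patches(tokens: torch.Tensor, attn_scores: torch.Tensor,
                 keep_ratio: float = 0.7) -> torch.Tensor:
    """tokens: (B, N, D); attn_scores: (B, N). Returns (B, k, D)."""
    b, n, d = tokens.shape
    k = max(1, int(n * keep_ratio))
    idx = attn_scores.topk(k, dim=1).indices          # most-attended patches
    return tokens.gather(1, idx.unsqueeze(-1).expand(b, k, d))

tok = torch.randn(2, 196, 64)
score = torch.randn(2, 196)
print(drop_patches(tok, score).shape)  # torch.Size([2, 137, 64])
```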


Assuntos
Expressão Facial , Redes Neurais de Computação , Humanos , Reconhecimento Facial Automatizado/métodos , Algoritmos , Processamento de Imagem Assistida por Computador/métodos , Face , Reconhecimento Automatizado de Padrão/métodos
13.
Sensors (Basel) ; 24(13)2024 Jun 28.
Article in English | MEDLINE | ID: mdl-39000981

ABSTRACT

This work presents a novel approach to elbow gesture recognition using an array of inductive sensors and a machine learning algorithm (MLA). The paper describes the design of the inductive sensor array, which is integrated into a flexible, wearable sleeve. The array consists of coils sewn onto the sleeve that form LC tank circuits together with externally connected inductors and capacitors. Changes in elbow position modulate the inductance of these coils, allowing the array to capture a range of elbow movements. The signal-processing chain and the random forest MLA used to recognize 10 different elbow gestures are described. Rigorous evaluation on 8 subjects, with data augmentation that expanded the dataset to 1270 trials per gesture, enabled the system to achieve remarkable accuracies of 98.3% and 98.5% under 5-fold cross-validation and leave-one-subject-out cross-validation, respectively. Test performance was then assessed using data collected from five new subjects; the high classification accuracy of 94% demonstrates the generalizability of the designed system. The proposed solution addresses the limitations of existing elbow gesture recognition designs and offers a practical and effective approach to intuitive human-machine interaction.
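
A minimal sketch of the two evaluation protocols on synthetic stand-in data: 5-fold cross-validation and leave-one-subject-out (LOSO) cross-validation with a random forest; the feature dimensions and forest size are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(800, 12))          # inductance-derived features
y = rng.integers(0, 10, size=800)       # 10 elbow gestures
subject = rng.integers(0, 8, size=800)  # subject ID per trial

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())                # 5-fold CV
print(cross_val_score(clf, X, y, cv=LeaveOneGroupOut(),
                      groups=subject).mean())                 # LOSO CV
```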


Assuntos
Algoritmos , Cotovelo , Gestos , Aprendizado de Máquina , Humanos , Cotovelo/fisiologia , Dispositivos Eletrônicos Vestíveis , Reconhecimento Automatizado de Padrão/métodos , Processamento de Sinais Assistido por Computador , Masculino , Adulto , Feminino
14.
Sci Rep ; 14(1): 15310, 2024 07 03.
Article in English | MEDLINE | ID: mdl-38961136

ABSTRACT

Human activity recognition has a wide range of applications in fields such as video surveillance, virtual reality, and human-computer intelligent interaction, and it has emerged as a significant research area in computer vision. Graph convolutional networks (GCNs) have recently been widely used in these fields and have achieved strong performance. However, challenges remain, including the over-smoothing problem caused by stacked graph convolutions and insufficient semantic correlation for capturing large movements across time sequences. The Vision Transformer (ViT) has been applied in many 2D and 3D imaging fields with surprisingly good results. In this work, we propose a novel human activity recognition method based on ViT (HAR-ViT). We integrate the enhanced AGCL (eAGCL) from 2s-AGCN into the ViT so that it can process spatio-temporal data (3D skeletons) and make full use of spatial features. A position encoder module orders the otherwise unordered information, while the transformer encoder efficiently compresses sequence features to enhance computation speed. Classification is performed by a multi-layer perceptron (MLP) head. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on three widely used datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics-Skeleton 400.
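
A minimal sketch of a transformer encoder over per-frame skeleton embeddings with a learned positional encoding and an MLP head; the dimensions (25 joints x 3 coordinates, as in NTU RGB+D) and layer counts are illustrative assumptions, and the eAGCL spatial stage is omitted:

```python
import torch
import torch.nn as nn

class SkeletonTransformer(nn.Module):
    def __init__(self, in_dim=75, d_model=128, n_frames=64, n_classes=60):
        super().__init__()
        self.proj = nn.Linear(in_dim, d_model)           # per-frame embedding
        self.pos = nn.Parameter(torch.zeros(1, n_frames, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Sequential(nn.LayerNorm(d_model),
                                  nn.Linear(d_model, n_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, 75) flattened joint coordinates, with T == n_frames
        h = self.encoder(self.proj(x) + self.pos)
        return self.head(h.mean(dim=1))                  # temporal pooling

print(SkeletonTransformer()(torch.randn(2, 64, 75)).shape)  # (2, 60)
```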


Assuntos
Atividades Humanas , Humanos , Redes Neurais de Computação , Algoritmos , Reconhecimento Automatizado de Padrão/métodos , Processamento de Imagem Assistida por Computador/métodos , Imageamento Tridimensional/métodos
15.
Int J Neural Syst ; 34(9): 2450049, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39010725

ABSTRACT

Abnormal behavior recognition is an important technology for detecting and identifying activities or events that deviate from normal behavior patterns, with wide applications in fields such as network security, financial fraud detection, and video surveillance. In recent years, deep convolutional networks (ConvNets) have been widely applied in abnormal behavior recognition and have achieved significant results. However, existing abnormal behavior detection algorithms mainly focus on improving accuracy and have not explored the real-time aspect of abnormal behavior recognition, which is crucial for quickly identifying abnormal behavior in public places and improving urban public safety. This paper therefore proposes an abnormal behavior recognition algorithm based on three-dimensional (3D) dense connections. The algorithm uses a multi-instance learning strategy to classify various types of abnormal behavior, and it employs dense connection modules and soft-threshold attention mechanisms to reduce the model's parameter count and enhance computational efficiency. Finally, redundant information in the sequence is reduced through attention allocation, mitigating its negative impact on recognition results. Experiments show that our method achieves a recognition accuracy of 95.61% on the UCF-Crime dataset, and comparative experiments demonstrate strong performance in both recognition accuracy and speed.
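
A minimal sketch of a multi-instance ranking objective of the kind commonly used for video anomaly detection on UCF-Crime, where each video is a bag of segment scores and only video-level labels exist. This is an assumed illustration of the multi-instance learning strategy, not the paper's exact loss:

```python
import torch

def mil_ranking_loss(normal_scores: torch.Tensor,
                     anomaly_scores: torch.Tensor,
                     margin: float = 1.0) -> torch.Tensor:
    """Scores: (B, n_segments) per-segment anomaly scores in [0, 1].
    The top segment of an anomalous video should outscore the top
    segment of a paired normal video by at least the margin."""
    top_anom = anomaly_scores.max(dim=1).values
    top_norm = normal_scores.max(dim=1).values
    return torch.clamp(margin - top_anom + top_norm, min=0).mean()
```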


Assuntos
Redes Neurais de Computação , Humanos , Reconhecimento Automatizado de Padrão/métodos , Aprendizado Profundo , Algoritmos , Crime , Comportamento/fisiologia
16.
IEEE J Biomed Health Inform ; 28(7): 3872-3881, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38954558

ABSTRACT

Electroencephalogram (EEG) signals are widely utilized in emotion recognition due to their high temporal resolution and reliability. However, the individual differences and non-stationary characteristics of EEG, along with the complexity and variability of emotions, make it challenging to generalize emotion recognition models across subjects. In this paper, an end-to-end framework is proposed to improve the performance of cross-subject emotion recognition. A novel evolutionary programming (EP)-based optimization strategy with neural networks (NNs) as base classifiers, termed NN ensemble with EP (EPNNE), is designed for cross-subject emotion recognition. The effectiveness of the proposed method is evaluated on the publicly available DEAP, FACED, SEED, and SEED-IV datasets. Numerical results demonstrate that the proposed method is superior to state-of-the-art cross-subject emotion recognition methods. The proposed end-to-end framework aids biomedical researchers in effectively assessing individual emotional states, thereby enabling efficient treatment and interventions.
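
A minimal sketch of evolutionary programming over ensemble combination weights: mutate candidate weight vectors, score them by validation accuracy, and keep the fittest. This is a simplified stand-in for EPNNE's EP-based optimization, whose actual search space is not detailed in the abstract:

```python
import numpy as np

def evolve_ensemble_weights(val_probs, y_val, gens=200, pop=30, seed=0):
    """val_probs: (n_models, n_samples, n_classes) per-NN class probabilities."""
    rng = np.random.default_rng(seed)
    n_models = val_probs.shape[0]

    def fitness(w):
        w = np.abs(w) / (np.abs(w).sum() + 1e-12)        # normalized weights
        pred = np.tensordot(w, val_probs, axes=1).argmax(axis=1)
        return (pred == y_val).mean()                    # validation accuracy

    population = rng.normal(size=(pop, n_models))
    for _ in range(gens):
        children = population + rng.normal(scale=0.1, size=population.shape)
        candidates = np.vstack([population, children])   # parents + mutants
        scores = np.array([fitness(w) for w in candidates])
        population = candidates[np.argsort(scores)[-pop:]]  # keep the fittest
    best = population[-1]
    return np.abs(best) / np.abs(best).sum()
```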


Assuntos
Eletroencefalografia , Emoções , Processamento de Sinais Assistido por Computador , Humanos , Eletroencefalografia/métodos , Emoções/fisiologia , Redes Neurais de Computação , Aprendizado de Máquina , Algoritmos , Reconhecimento Automatizado de Padrão/métodos , Bases de Dados Factuais , Adulto , Feminino , Masculino
17.
Sensors (Basel) ; 24(14)2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39066043

ABSTRACT

Human activity recognition (HAR) is pivotal in advancing applications ranging from healthcare monitoring to interactive gaming. Traditional HAR systems, which rely primarily on single data sources, face limitations in capturing the full spectrum of human activities. This study introduces a comprehensive approach to HAR by integrating two modalities: RGB imaging and advanced pose estimation features. Our methodology leverages the strengths of each modality to overcome the drawbacks of unimodal systems, providing a richer and more accurate representation of activities. We propose a two-stream network that processes skeletal and RGB data in parallel, enhanced by pose estimation techniques for refined feature extraction. The modalities are integrated through advanced fusion algorithms, significantly improving recognition accuracy. Extensive experiments on the UTD multimodal human action dataset (UTD-MHAD) demonstrate that the proposed approach outperforms existing state-of-the-art algorithms. This study not only sets a new benchmark for HAR systems but also highlights the importance of feature engineering in capturing the complexity of human movements and of integrating optimal features. Our findings pave the way for more sophisticated, reliable, and applicable HAR systems in real-world scenarios.
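
A minimal sketch of late fusion for a two-stream network: per-stream embeddings are projected and concatenated before a shared classification head. The feature dimensions and the 27-class output (the number of actions in UTD-MHAD) are illustrative; the paper's fusion algorithm is more advanced than plain concatenation:

```python
import torch
import torch.nn as nn

class TwoStreamFusion(nn.Module):
    def __init__(self, rgb_dim=512, pose_dim=128, n_classes=27):
        super().__init__()
        self.rgb_net = nn.Sequential(nn.Linear(rgb_dim, 256), nn.ReLU())
        self.pose_net = nn.Sequential(nn.Linear(pose_dim, 256), nn.ReLU())
        self.head = nn.Linear(512, n_classes)  # acts on concatenated streams

    def forward(self, rgb_feat: torch.Tensor, pose_feat: torch.Tensor):
        fused = torch.cat([self.rgb_net(rgb_feat),
                           self.pose_net(pose_feat)], dim=-1)
        return self.head(fused)

print(TwoStreamFusion()(torch.randn(4, 512), torch.randn(4, 128)).shape)
```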


Assuntos
Algoritmos , Atividades Humanas , Humanos , Processamento de Imagem Assistida por Computador/métodos , Movimento/fisiologia , Postura/fisiologia , Reconhecimento Automatizado de Padrão/métodos
18.
Sci Rep ; 14(1): 17155, 2024 07 26.
Article in English | MEDLINE | ID: mdl-39060307

ABSTRACT

Gait recognition has become an increasingly promising area of research in the search for noninvasive and effective methods of person identification, with potential applications in security systems and medical diagnosis. However, precisely recognizing and assessing gait patterns is difficult, particularly under changing conditions or from multiple viewpoints. In this study, we evaluated our proposed gait recognition model on the widely used CASIA-B dataset, aiming to address some of the existing limitations in this field. Fifty individuals were randomly selected from the dataset, and the resulting data were split evenly between training and testing. We first extract features from gait images using two well-known deep learning networks, MobileNetV1 and Xception, then concatenate these features and reduce their dimensionality via principal component analysis (PCA) to improve the model's performance. We subsequently assessed the model using two distinct classifiers: a random forest and a one-against-all support vector machine (OaA-SVM). The findings indicate that the OaA-SVM classifier performs best, with a mean accuracy of 98.77% over eleven viewing angles. This study contributes to the development of effective gait recognition algorithms that can be applied to heighten people's security and promote their well-being.
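
A minimal sketch of the classification pipeline on stand-in features: concatenate the two networks' feature vectors, reduce them with PCA, and classify with a one-against-all SVM; the feature sizes and PCA dimensionality are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
mobilenet_f = rng.normal(size=(500, 1024))  # stand-in MobileNetV1 features
xception_f = rng.normal(size=(500, 2048))   # stand-in Xception features
y = rng.integers(0, 50, size=500)           # 50 subjects

X = np.hstack([mobilenet_f, xception_f])    # feature concatenation
clf = make_pipeline(StandardScaler(), PCA(n_components=128),
                    OneVsRestClassifier(SVC(kernel="rbf")))
clf.fit(X, y)
print(clf.score(X, y))
```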


Assuntos
Marcha , Análise de Componente Principal , Máquina de Vetores de Suporte , Humanos , Marcha/fisiologia , Algoritmos , Aprendizado Profundo , Feminino , Masculino , Reconhecimento Automatizado de Padrão/métodos , Adulto
19.
Article in English | MEDLINE | ID: mdl-38869995

ABSTRACT

Gesture recognition is crucial for enhancing human-computer interaction and is particularly pivotal in rehabilitation contexts, aiding individuals recovering from physical impairments and significantly improving their mobility and interactive capabilities. However, current wearable hand gesture recognition approaches are often limited in detection performance, wearability, and generalization. We thus introduce EchoGest, a novel hand gesture recognition system based on a soft, stretchable, transparent artificial skin with integrated ultrasonic waveguides; it is the first system to use soft ultrasonic waveguides for hand gesture recognition. Ecoflex 00-31 and Ecoflex 00-45 Near Clear silicone elastomers were employed to fabricate the artificial skin and ultrasonic waveguides, while 0.1 mm diameter silver-plated copper wires connected the transducers in the waveguides to the electrical system. The wires are enclosed within an additional elastomer layer, yielding a sensing skin with a total thickness of around 500 µm. Ten participants wore the EchoGest system and performed static hand gestures from two gesture sets: 8 daily-life gestures and the 10 American Sign Language (ASL) digits 0-9. Leave-one-subject-out cross-validation yielded accuracies of 91.13% for the daily-life gestures and 88.5% for the ASL digits. The EchoGest system has significant potential in rehabilitation, particularly for tracking and evaluating hand mobility, which could substantially reduce the workload of therapists in both clinical and home-based settings. Integrating this technology could advance hand gesture recognition applications, from real-time sign language translation to innovative rehabilitation techniques.


Assuntos
Gestos , Mãos , Reconhecimento Automatizado de Padrão , Dispositivos Eletrônicos Vestíveis , Humanos , Feminino , Mãos/fisiologia , Adulto , Masculino , Reconhecimento Automatizado de Padrão/métodos , Adulto Jovem , Ultrassom , Algoritmos , Elastômeros de Silicone , Pele , Reprodutibilidade dos Testes
20.
Opt Express ; 32(10): 16645-16656, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38858865

ABSTRACT

Single-Photon Avalanche Diode (SPAD) direct Time-of-Flight (dToF) sensors provide depth imaging over long distances, enabling the detection of objects even in the absence of contrast in colour or texture. However, distant objects are represented by just a few pixels and are subject to noise from solar interference, limiting the applicability of existing computer vision techniques for high-level scene interpretation. We present a new SPAD-based vision system for human activity recognition, based on convolutional and recurrent neural networks, which is trained entirely on synthetic data. In tests using real data from a 64×32 pixel SPAD, captured over a distance of 40 m, the scheme successfully overcomes the limited transverse resolution (in which human limbs are approximately one pixel across), achieving an average accuracy of 89% in distinguishing between seven different activities. The approach analyses continuous streams of video-rate depth data at a maximal rate of 66 FPS when executed on a GPU, making it well-suited for real-time applications such as surveillance or situational awareness in autonomous systems.
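
A minimal sketch of a convolutional-recurrent classifier over sequences of low-resolution depth frames: a small CNN encodes each frame and a GRU aggregates the sequence. The layer sizes, the 32x64 frame shape (matching the 64x32-pixel sensor), and the GRU head are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class DepthActivityNet(nn.Module):
    def __init__(self, n_classes=7, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                        # per-frame encoder
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 8)), nn.Flatten(),  # 32 * 4 * 8 = 1024
        )
        self.gru = nn.GRU(1024, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, 1, 32, 64) sequence of depth frames
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1)).view(b, t, -1)     # encode each frame
        _, h = self.gru(f)                               # aggregate over time
        return self.head(h[-1])

print(DepthActivityNet()(torch.randn(2, 30, 1, 32, 64)).shape)  # (2, 7)
```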


Subjects
Photons; Humans; Human Activities; Neural Networks, Computer; Pattern Recognition, Automated/methods; Equipment Design