Results 1 - 20 of 18,184
1.
Biomed Phys Eng Express ; 10(6)2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39231462

ABSTRACT

Hand Movement Recognition (HMR) with sEMG is crucial for artificial hand prostheses. HMR performance depends largely on the feature information fed to the classifiers. However, sEMG recordings often capture noise such as power line interference (PLI) and motion artifacts, which can lead to redundant and insignificant features that degrade HMR performance and increase computational complexity. This study addresses these issues by proposing a novel procedure for automatically removing PLI and motion artifacts from experimental sEMG signals, making it possible to extract better features and improve the classification of hand movements. Empirical mode decomposition and energy entropy thresholding are utilized to select relevant mode components for artifact removal. Time domain features are then used to train classifiers (kNN, LDA, SVM) for hand movement classification, achieving average accuracies of 92.36%, 93.63%, and 98.12%, respectively, across subjects. Additionally, muscle contraction efforts are classified into low, medium, and high categories using this technique. Validation is performed on data from ten subjects performing eight hand movement classes and three muscle contraction efforts with three surface electrode channels. Results indicate that the proposed preprocessing improves average accuracy by 9.55% with the SVM classifier while significantly reducing computational time.
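
A minimal Python sketch of the kind of pipeline described, assuming the PyEMD package (pip install EMD-signal) for empirical mode decomposition; the mode-selection rule and feature set are illustrative stand-ins, not the authors' exact method:

```python
import numpy as np
from PyEMD import EMD               # assumed: PyEMD / EMD-signal package
from sklearn.svm import SVC

def denoise_emg(x, keep_thresh=0.05):
    imfs = EMD().emd(x)                        # decompose into IMFs
    e = (imfs ** 2).sum(axis=1)
    p = e / e.sum()                            # per-IMF energy fraction
    h = -p * np.log2(p + 1e-12)                # per-IMF energy-entropy term
    return imfs[h > keep_thresh].sum(axis=0)   # keep informative modes only

def td_features(x):
    mav = np.mean(np.abs(x))                                       # mean absolute value
    wl = np.sum(np.abs(np.diff(x)))                                # waveform length
    zc = np.sum(np.diff(np.signbit(x).astype(int)) != 0)           # zero crossings
    ssc = np.sum(np.diff(np.sign(np.diff(x))) != 0)                # slope sign changes
    return np.array([mav, wl, zc, ssc], dtype=float)

# X_raw: (n_windows, n_samples) sEMG windows, y: movement labels (assumed given)
# X = np.stack([td_features(denoise_emg(w)) for w in X_raw])
# clf = SVC(kernel="rbf").fit(X, y)
```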


Subject(s)
Algorithms , Artifacts , Electromyography , Hand , Movement , Pattern Recognition, Automated , Signal Processing, Computer-Assisted , Humans , Electromyography/methods , Hand/physiology , Pattern Recognition, Automated/methods , Male , Muscle Contraction , Adult , Artificial Limbs , Female , Motion , Muscle, Skeletal/physiology
2.
Sensors (Basel) ; 24(17)2024 Aug 24.
Article in English | MEDLINE | ID: mdl-39275411

ABSTRACT

Gait recognition based on gait silhouette profiles is currently a major approach in the field of gait recognition. In previous studies, models typically used gait silhouette images sized at 64 × 64 pixels as input data. In practical applications, however, silhouette images may be smaller than 64 × 64, leading to a loss of detail that significantly affects model accuracy. To address these challenges, we propose a gait recognition system named Multi-scale Feature Cross-Fusion Gait (MFCF-Gait). At the input stage of the model, we employ super-resolution algorithms to preprocess the data. During this process, we observed that the choice of super-resolution algorithm affects training outcomes even for larger silhouette images; improved super-resolution algorithms contribute to better model performance. In terms of model architecture, we introduce a multi-scale feature cross-fusion network. By integrating low-level feature information from higher-resolution images with high-level feature information from lower-resolution images, the model emphasizes smaller-scale details, thereby improving recognition accuracy for smaller silhouette images. Experimental results on the CASIA-B dataset demonstrate significant improvements. On 64 × 64 silhouette images, the accuracies for the NM, BG, and CL conditions reached 96.49%, 91.42%, and 78.24%, respectively; on 32 × 32 silhouette images, the corresponding accuracies were 94.23%, 87.68%, and 71.57%, showing notable enhancements.
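
As a rough illustration of the preprocessing step, the sketch below upscales small silhouettes before recognition; plain bicubic interpolation stands in for the learned super-resolution models the paper evaluates:

```python
import cv2
import numpy as np

def upscale_silhouette(sil: np.ndarray, size: int = 64) -> np.ndarray:
    """Upscale a small binary gait silhouette to size x size."""
    up = cv2.resize(sil, (size, size), interpolation=cv2.INTER_CUBIC)
    # re-binarize: interpolation introduces gray values on a binary mask
    return (up > 127).astype(np.uint8) * 255

# seq_64 = [upscale_silhouette(f) for f in seq_32]   # 32x32 frames -> 64x64
```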


Subject(s)
Algorithms , Gait , Gait/physiology , Humans , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods
3.
Sensors (Basel) ; 24(17)2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39275615

ABSTRACT

Speech emotion recognition is key to many fields, including human-computer interaction, healthcare, and intelligent assistance. While acoustic features extracted from human speech are essential for this task, not all of them contribute effectively to emotion recognition; successful models therefore require a reduced feature set. This work investigated whether splitting the features into two subsets based on their distribution and then applying commonly used feature reduction methods would impact accuracy. Filter reduction was employed using the Kruskal-Wallis test, followed by principal component analysis (PCA) and independent component analysis (ICA). A set of features was examined to determine whether the indiscriminate use of parametric feature reduction techniques affects the accuracy of emotion recognition. For this investigation, data from three databases (Berlin EmoDB, SAVEE, and RAVDESS) were organized into subsets according to their distribution before applying both PCA and ICA. The results showed a reduction from 6373 features to 170 for the Berlin EmoDB database with an accuracy of 84.3%; a final size of 130 features for SAVEE, with a corresponding accuracy of 75.4%; and 150 features for RAVDESS, with an accuracy of 59.9%.
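
The reduction chain lends itself to a direct SciPy/scikit-learn sketch; the significance level and component counts below are illustrative, not the paper's tuned values:

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.decomposition import PCA, FastICA

def kruskal_filter(X, y, alpha=0.05):
    """Keep features whose distributions differ significantly across emotions."""
    keep = []
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in np.unique(y)]
        _, p = kruskal(*groups)
        if p < alpha:
            keep.append(j)
    return np.array(keep)

# X: (n_samples, 6373) acoustic features, y: emotion labels (assumed given)
# idx = kruskal_filter(X, y)
# X_pca = PCA(n_components=170).fit_transform(X[:, idx])
# X_ica = FastICA(n_components=170).fit_transform(X[:, idx])
```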


Subject(s)
Emotions , Principal Component Analysis , Speech , Humans , Emotions/physiology , Speech/physiology , Databases, Factual , Algorithms , Pattern Recognition, Automated/methods
4.
Sensors (Basel) ; 24(17)2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39275635

ABSTRACT

In this paper, we study facial expression recognition (FER) using three modalities obtained from a light field camera: sub-aperture (SA), depth map, and all-in-focus (AiF) images. Our objective is to construct a more comprehensive and effective FER system by investigating multimodal fusion strategies. For this purpose, we employ EfficientNetV2-S, pre-trained on AffectNet, as our primary convolutional neural network. This model, combined with a BiGRU, is used to process SA images. We evaluate various fusion techniques at both decision and feature levels to assess their effectiveness in enhancing FER accuracy. Our findings show that the model using SA images surpasses state-of-the-art performance, achieving 88.13% ± 7.42% accuracy under the subject-specific evaluation protocol and 91.88% ± 3.25% under the subject-independent evaluation protocol. These results highlight our model's potential in enhancing FER accuracy and robustness, outperforming existing methods. Furthermore, our multimodal fusion approach, integrating SA, AiF, and depth images, demonstrates substantial improvements over unimodal models. The decision-level fusion strategy, particularly using average weights, proved most effective, achieving 90.13% ± 4.95% accuracy under the subject-specific evaluation protocol and 93.33% ± 4.92% under the subject-independent evaluation protocol. This approach leverages the complementary strengths of each modality, resulting in a more comprehensive and accurate FER system.
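
Decision-level fusion with average weights reduces to averaging the per-modality class probabilities; a minimal sketch, assuming each model already outputs softmax probabilities:

```python
import numpy as np

def fuse_decisions(probs_sa, probs_aif, probs_depth, weights=(1/3, 1/3, 1/3)):
    """Each input: (n_samples, n_classes) class probabilities per modality."""
    fused = (weights[0] * probs_sa
             + weights[1] * probs_aif
             + weights[2] * probs_depth)
    return fused.argmax(axis=1)   # predicted expression per sample
```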


Subject(s)
Facial Expression , Neural Networks, Computer , Humans , Image Processing, Computer-Assisted/methods , Automated Facial Recognition/methods , Algorithms , Pattern Recognition, Automated/methods
5.
Sensors (Basel) ; 24(17)2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39275707

ABSTRACT

Emotion recognition through speech is a technique employed in various scenarios of Human-Computer Interaction (HCI). Existing approaches have achieved significant results, but limitations persist, most notably in the quantity and diversity of data required by deep learning techniques. The lack of a standard for feature selection leads to continuous development and experimentation, and choosing and designing an appropriate network architecture constitutes another challenge. This study addresses the challenge of recognizing emotions in the human voice using deep learning techniques by proposing a comprehensive approach: developing preprocessing and feature selection stages and constructing a dataset, EmoDSc, that combines several available databases. The synergy between spectral features and spectrogram images is investigated. Independently, the weighted accuracy obtained using only spectral features was 89%, while using only spectrogram images it reached 90%. These results, although surpassing previous research, highlight the strengths and limitations of each representation operating in isolation. Based on this exploration, a neural network architecture composed of a CNN1D, a CNN2D, and an MLP that fuses spectral features and spectrogram images is proposed. The model, supported by the unified dataset EmoDSc, demonstrates a remarkable accuracy of 96%.
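
A compact PyTorch sketch of a two-branch CNN1D/CNN2D network with an MLP fusion head; layer sizes are illustrative, not the authors' configuration:

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, n_classes=7):
        super().__init__()
        self.branch1d = nn.Sequential(               # spectral vectors (B, 1, n_feats)
            nn.Conv1d(1, 16, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8), nn.Flatten())   # -> (B, 128)
        self.branch2d = nn.Sequential(               # spectrograms (B, 1, H, W)
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten())  # -> (B, 256)
        self.mlp = nn.Sequential(
            nn.Linear(128 + 256, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, spectral, spectrogram):
        z = torch.cat([self.branch1d(spectral), self.branch2d(spectrogram)], dim=1)
        return self.mlp(z)
```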


Subject(s)
Deep Learning , Emotions , Neural Networks, Computer , Humans , Emotions/physiology , Speech/physiology , Databases, Factual , Algorithms , Pattern Recognition, Automated/methods
6.
Sensors (Basel) ; 24(18)2024 Sep 22.
Article in English | MEDLINE | ID: mdl-39338868

ABSTRACT

Wearable technologies represent a significant advancement in facilitating communication between humans and machines. Powered by artificial intelligence (AI), human gestures detected by wearable sensors can provide people with seamless interaction with physical, digital, and mixed environments. In this paper, the foundations of a gesture-recognition framework for the teleoperation of infrared consumer electronics are established. This framework is based on force myography data of the upper forearm, acquired from a novel prototype soft pressure-based force myography (pFMG) armband. The sub-processes of the framework are detailed, including the acquisition of infrared and force myography data; pre-processing; feature construction/selection; classifier selection; post-processing; and interfacing/actuation. The gesture recognition system is evaluated using 12 subjects' force myography data obtained whilst performing five classes of gestures. Our results demonstrate average inter-session and inter-trial gesture recognition accuracies of approximately 92.2% and 88.9%, respectively. The gesture recognition framework successfully teleoperated several infrared consumer electronics as a wearable, safe, and affordable human-machine interface system. The contribution of this study centres on proposing and demonstrating a user-centred design methodology for direct human-machine interaction in applications where humans and devices are in the same loop or coexist, as typified by users and infrared-communicating devices in this study.
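
A hypothetical skeleton of the recognition loop (window, featurize, classify, map to an IR command); all names and the feature choice below are placeholders, not the framework's actual implementation:

```python
import numpy as np

def classify_window(window, clf, ir_codes):
    """window: (n_sensors, n_samples) pFMG pressure readings for one window."""
    feats = np.concatenate([window.mean(axis=1), window.std(axis=1)])
    gesture = clf.predict(feats[None, :])[0]   # clf: any trained classifier
    return ir_codes.get(gesture)               # e.g. an IR remote code to emit
```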


Subject(s)
Gestures , Wearable Electronic Devices , Humans , Artificial Intelligence , Infrared Rays , Adult , Male , Female , User-Computer Interface , Pattern Recognition, Automated/methods
7.
Sensors (Basel) ; 24(18)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39338902

ABSTRACT

In the evolving field of human-computer interaction (HCI), gesture recognition has emerged as a critical focus, with sensor-equipped smart gloves playing one of the most important roles. Despite the significance of dynamic gesture recognition, most research on data gloves has concentrated on static gestures, with only a small percentage addressing dynamic gestures or both. This study explores the development of a low-cost smart glove prototype designed to capture and classify dynamic hand gestures for game control. The prototype is equipped with five flex sensors, five force sensors, and one inertial measurement unit (IMU) sensor. To classify dynamic gestures, we developed a neural network-based classifier using a convolutional neural network (CNN) with three two-dimensional convolutional layers and rectified linear unit (ReLU) activation, achieving an accuracy of 90%. The developed glove effectively captures dynamic gestures for game control, achieving high classification accuracy, precision, and recall, as evidenced by the confusion matrix and training metrics. Despite limitations in the number of gestures and participants, the solution offers a cost-effective and accurate approach to gesture recognition, with potential applications in VR/AR environments.
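
A hypothetical PyTorch sketch of a classifier of the kind described (three 2D convolutional layers with ReLU over windowed glove-sensor data); channel counts and input layout are assumptions:

```python
import torch.nn as nn

class GloveCNN(nn.Module):
    def __init__(self, n_gestures=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((2, 2)))
        self.classifier = nn.Linear(64 * 4, n_gestures)

    def forward(self, x):              # x: (B, 1, sensors, timesteps)
        return self.classifier(self.features(x).flatten(1))
```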


Subject(s)
Gestures , Machine Learning , Neural Networks, Computer , Humans , Pattern Recognition, Automated/methods , Hand/physiology , User-Computer Interface
8.
Sci Rep ; 14(1): 22061, 2024 09 27.
Article in English | MEDLINE | ID: mdl-39333258

ABSTRACT

Hand gesture recognition based on sparse multichannel surface electromyography (sEMG) still poses a significant challenge to deployment as a muscle-computer interface. Many researchers have worked to develop sEMG-based hand gesture recognition systems, but existing systems still fall short of satisfactory performance due to ineffective feature enhancement, resulting in erratic and unstable predictions. To tackle these challenges comprehensively, we introduce a novel approach: a lightweight sEMG-based hand gesture recognition system using a 4-stream deep learning architecture. Each stream strategically combines Temporal Convolutional Network (TCN)-based time-varying features with Convolutional Neural Network (CNN)-based frame-wise features. In the first stream, we harness the TCN module to extract nuanced time-varying temporal features. The second stream integrates a hybrid Long Short-Term Memory (LSTM)-TCN module, extracting temporal features with the LSTM and enhancing them with the TCN to capture intricate long-range temporal relations. The third stream adopts a spatio-temporal strategy, merging the CNN and TCN modules; this integration facilitates concurrent comprehension of both spatial and temporal features, enriching the model's understanding of the underlying dynamics of the data. The fourth stream uses a skip connection mechanism to alleviate potential data loss and ensure robust information flow throughout the network; the four stream outputs are then concatenated, yielding a comprehensive and effective final feature representation. We employ a channel attention-based feature selection module to select the most effective features, reducing computational complexity, before feeding them into the classification module. The proposed model achieves an average accuracy of 94.31% and 98.96% on the Ninapro DB1 and DB9 datasets, respectively. This high accuracy demonstrates the superiority of the proposed model, with implications for enhancing the quality of life of individuals using prosthetic limbs and for advancing control systems in robotic human-machine interfaces.
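
The channel-attention feature selection admits a short squeeze-and-excitation style sketch; the reduction ratio is an assumption:

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(1)           # squeeze over time
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                             # x: (B, C, T)
        w = self.gate(self.pool(x).squeeze(-1))       # (B, C) channel weights
        return x * w.unsqueeze(-1)                    # re-weighted features
```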


Subject(s)
Electromyography , Gestures , Hand , Neural Networks, Computer , Humans , Electromyography/methods , Hand/physiology , Deep Learning , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Algorithms , Male
9.
Sci Rep ; 14(1): 22373, 2024 09 27.
Article in English | MEDLINE | ID: mdl-39333621

ABSTRACT

Spintronic devices offer a promising avenue for the development of nanoscale, energy-efficient artificial neurons for neuromorphic computing. It has previously been shown that with antiferromagnetic (AFM) oscillators, ultra-fast spiking artificial neurons can be made that mimic many unique features of biological neurons. In this work, we train an artificial neural network of AFM neurons to perform pattern recognition. A simple machine learning algorithm called spike pattern association neuron (SPAN), which relies on the temporal position of neuron spikes, is used during training. In under a microsecond of physical time, the AFM neural network is trained to recognize symbols composed from a grid by producing a spike within a specified time window. We further achieve multi-symbol recognition with the addition of an output layer to suppress undesirable spikes. Through the utilization of AFM neurons and the SPAN algorithm, we create a neural network capable of high-accuracy recognition with overall power consumption on the order of picojoules.
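
A minimal NumPy sketch of a SPAN-style update, following the published SPAN rule (each weight moves with the overlap between its kernel-convolved input spikes and the convolved desired-minus-actual output); the kernel and learning rate are illustrative:

```python
import numpy as np

def span_update(w, in_spikes, desired, actual, lr=0.01, tau=5.0, dt=0.1):
    """in_spikes: (n_inputs, T) binary; desired/actual: (T,) binary spike trains."""
    t = np.arange(0, 10 * tau, dt)
    kernel = (t / tau) * np.exp(1 - t / tau)            # alpha kernel
    conv = lambda s: np.convolve(s, kernel)[: s.shape[-1]]
    err = conv(desired) - conv(actual)                  # smoothed spike error
    for i in range(w.shape[0]):
        w[i] += lr * np.sum(conv(in_spikes[i]) * err) * dt
    return w
```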


Subject(s)
Algorithms , Neural Networks, Computer , Neurons , Neurons/physiology , Action Potentials/physiology , Machine Learning , Pattern Recognition, Automated/methods , Humans , Models, Neurological
10.
Article in English | MEDLINE | ID: mdl-39259642

ABSTRACT

Early-exiting has recently provided an ideal solution for accelerating activity inference by attaching internal classifiers to deep neural networks. It allows easy activity samples to be predicted at shallower layers, without executing deeper layers, leading to notable adaptiveness in the accuracy-speed trade-off under varying resource demands. However, most prior works optimize all the classifiers equally on all types of activity data. As a result, deeper classifiers see only hard samples during the test phase, which renders the model suboptimal due to the training-test distribution mismatch. This issue has rarely been explored in the context of activity recognition. In this paper, to close the gap, we propose to organize all these classifiers as a dynamic-depth network and jointly optimize them in a gradient-boosting-like manner. Specifically, gradient rescaling is employed to bound the gradients of parameters at different depths, which makes the training procedure more stable. In particular, we perform a prediction reweighting that emphasizes the current deep classifier while weakening the ensemble of its previous classifiers, so as to relieve the shortage of training data at deeper classifiers. Comprehensive experiments on multiple HAR benchmarks, including UCI-HAR, PAMAP2, UniMiB-SHAR, and USC-HAD, verify that the method achieves state-of-the-art accuracy and speed. A real implementation is measured on an ARM-based mobile device.
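
Gradient rescaling of this kind can be expressed as an identity-forward autograd function that scales gradients on the backward pass; the 1/num_exits factor below is one simple choice, not necessarily the paper's schedule, and the usage names are placeholders:

```python
import torch

class GradRescale(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x.view_as(x)          # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        return ctx.scale * grad_out, None   # scaled gradient flows to shared layers

# hypothetical usage before each internal classifier of a dynamic-depth network:
# feat_k = GradRescale.apply(feat_k, 1.0 / num_exits)
# logits_k = exit_heads[k](feat_k)
```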


Subject(s)
Algorithms , Neural Networks, Computer , Wearable Electronic Devices , Humans , Human Activities/classification , Deep Learning , Bees/physiology , Pattern Recognition, Automated/methods , Machine Learning
11.
PLoS Comput Biol ; 20(9): e1012423, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39255309

ABSTRACT

Zebrafish have become an essential model organism in screening for developmental neurotoxic chemicals and their molecular targets. The success of zebrafish as a screening model is partially due to their physical characteristics, including their relatively simple nervous system, rapid development, experimental tractability, and genetic diversity, combined with technical advantages that allow for the generation of large amounts of high-dimensional behavioral data. These data are complex and require advanced machine learning and statistical techniques to comprehensively analyze and capture spatiotemporal responses. To accomplish this goal, we have trained semi-supervised deep autoencoders using behavior data from unexposed larval zebrafish to extract quintessential "normal" behavior. Following training, our network was evaluated using data from larvae shown (using a traditional statistical framework) to have significant changes in behavior following exposure to toxicants that include nanomaterials, aromatics, per- and polyfluoroalkyl substances (PFAS), and other environmental contaminants. Further, our model identified new chemicals (Perfluoro-n-octadecanoic acid, 8-Chloroperfluorooctylphosphonic acid, and Nonafluoropentanamide) as capable of inducing abnormal behavior at multiple chemical-concentration pairs not captured using distance moved alone. Leveraging this deep learning model will allow for better characterization of exposure-induced behavioral phenotypes, facilitate improved genetic and neurobehavioral analysis in mechanistic determination studies, and provide a robust framework for analyzing the complex behaviors found in higher-order model systems.
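
The screening idea reduces to training an autoencoder on normal behavior and flagging large reconstruction errors; a minimal PyTorch sketch with illustrative sizes and threshold:

```python
import torch
import torch.nn as nn

class BehaviorAE(nn.Module):
    def __init__(self, n_features=240, latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                 nn.Linear(64, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(),
                                 nn.Linear(64, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

def abnormal(model, x, threshold):
    """x: (n_larvae, n_features) behavior vectors; returns a boolean mask."""
    with torch.no_grad():
        err = ((model(x) - x) ** 2).mean(dim=1)   # per-larva reconstruction error
    return err > threshold                        # True -> flagged as abnormal
```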


Subject(s)
Behavior, Animal , Zebrafish , Animals , Zebrafish/physiology , Behavior, Animal/drug effects , Deep Learning , Larva/drug effects , Pattern Recognition, Automated/methods , Computational Biology/methods , Neural Networks, Computer
12.
Sensors (Basel) ; 24(18)2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39338612

ABSTRACT

Facial expression recognition using convolutional neural networks (CNNs) is a prevalent research area, but network complexity poses obstacles for deployment on devices with limited computational resources, such as mobile devices. To address these challenges, researchers have developed lightweight networks that reduce model size and parameter counts without compromising accuracy. The LiteFer method introduced in this study incorporates depthwise-separable convolution and a lightweight attention mechanism, effectively reducing network parameters. Comprehensive comparative experiments on the RAFDB and FERPlus datasets demonstrate its superior performance over various state-of-the-art lightweight expression-recognition methods.
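
The depthwise-separable building block such lightweight networks rely on is easy to sketch in PyTorch: a per-channel spatial convolution followed by a 1x1 pointwise convolution, which cuts parameters sharply:

```python
import torch.nn as nn

def ds_conv(in_ch, out_ch, k=3):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1),                               # pointwise
        nn.BatchNorm2d(out_ch),
        nn.ReLU())
```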


Subject(s)
Neural Networks, Computer , Humans , Algorithms , Facial Expression , Pattern Recognition, Automated/methods
13.
Sensors (Basel) ; 24(16)2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39204927

ABSTRACT

This study delves into decoding hand gestures using surface electromyography (EMG) signals collected via a precision Myo-armband sensor, leveraging machine learning algorithms. The research entails rigorous data preprocessing to extract features and labels from raw EMG data. Following partitioning into training and testing sets, four traditional machine learning models are scrutinized for their efficacy in classifying finger movements across seven distinct gestures. The analysis includes meticulous parameter optimization and five-fold cross-validation to evaluate model performance. Among the models assessed, the Random Forest emerges as the top performer, consistently delivering superior precision, recall, and F1-score values across gesture classes, with ROC-AUC scores surpassing 99%. These findings underscore the Random Forest model as the optimal classifier for our EMG dataset, promising significant advancements in healthcare rehabilitation engineering and enhancing human-computer interaction technologies.
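
The evaluation protocol maps directly onto scikit-learn; the hyperparameters below are illustrative:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# X: (n_windows, n_features) EMG features, y: gesture labels (assumed given)
def evaluate_rf(X, y):
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    scores = cross_val_score(rf, X, y, cv=5, scoring="accuracy")  # 5-fold CV
    return scores.mean(), scores.std()
```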


Subject(s)
Algorithms , Electromyography , Gestures , Hand , Machine Learning , Humans , Electromyography/methods , Hand/physiology , Male , Female , Adult , Signal Processing, Computer-Assisted , Young Adult , Pattern Recognition, Automated/methods , Movement/physiology
14.
Sensors (Basel) ; 24(16)2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39204983

ABSTRACT

In cross-country skiing, ski poles play a crucial role in technique, propulsion, and overall performance. The kinematic parameters of ski poles can provide valuable information about the skier's technique, which is of great significance for coaches and athletes seeking to improve skiing performance. In this work, a new smart ski pole is proposed that combines a uniaxial load cell and an inertial measurement unit (IMU), aiming to provide comprehensive data measurement more easily and to serve as a training aid. The pole collects data directly related to skiing technique, such as pole force, pole angle, and inertial data, and its wireless design makes comprehensive data acquisition simpler and more efficient to use. In this experiment, characteristic data obtained from the ski poles during Double Poling by three skiers were extracted and compared with t-tests. The results showed significant differences among the three skiers in pole force, pole angle, and poling time. Spearman correlation analysis of the data from the skiers with good performance showed that pole force was significantly correlated with speed (r = 0.71) and with pole support angle (r = 0.76). In addition, this study combined the commonly used inertial sensor data with the load cell data as input to a ski technique recognition algorithm; the recognition accuracy for five cross-country skiing techniques (Diagonal Stride (DS), Double Poling (DP), Kick Double Poling (KDP), Two-stroke Glide (G2), and Five-stroke Glide (G5)) reached 99.5%, a marked improvement over similar recognition systems. The equipment is therefore expected to be a valuable training tool for coaches and athletes, helping them better understand and improve ski maneuver technique.
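
The reported correlations correspond to straightforward Spearman tests; a sketch with placeholder variable names:

```python
from scipy.stats import spearmanr

def force_correlations(pole_force, speed, pole_angle):
    """Per-cycle arrays from the smart pole (assumed); returns (r, p) pairs."""
    r_speed, p_speed = spearmanr(pole_force, speed)       # paper reports r = 0.71
    r_angle, p_angle = spearmanr(pole_force, pole_angle)  # paper reports r = 0.76
    return (r_speed, p_speed), (r_angle, p_angle)
```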


Subject(s)
Skiing , Skiing/physiology , Humans , Biomechanical Phenomena/physiology , Pattern Recognition, Automated/methods , Athletic Performance/physiology
15.
Sensors (Basel) ; 24(16)2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39205085

ABSTRACT

In recent years, significant progress has been made in facial expression recognition methods, but facial expression recognition in real environments still requires further research. This paper proposes a tri-cross-attention transformer with a multi-feature fusion network (TriCAFFNet) to improve facial expression recognition performance under challenging conditions. By combining LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, landmark features, and CNN (convolutional neural network) features from facial images, the model is provided with a rich input that improves its ability to discern subtle differences between images. Additionally, tri-cross-attention blocks are designed to facilitate information exchange among the different features, enabling mutual guidance to capture salient attention. Extensive experiments on several widely used datasets show that TriCAFFNet achieves state-of-the-art performance, with 92.17% on RAF-DB, 67.40% on AffectNet (7 cls), and 63.49% on AffectNet (8 cls).
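
Two of the handcrafted inputs (LBP and HOG) can be extracted with scikit-image; the parameters below are common defaults, not the paper's:

```python
import numpy as np
from skimage.feature import local_binary_pattern, hog

def lbp_hog_features(gray_face):
    """gray_face: (H, W) grayscale face crop."""
    lbp = local_binary_pattern(gray_face, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    hog_vec = hog(gray_face, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    return lbp_hist, hog_vec
```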


Subject(s)
Facial Expression , Neural Networks, Computer , Humans , Algorithms , Image Processing, Computer-Assisted/methods , Face/anatomy & histology , Automated Facial Recognition/methods , Pattern Recognition, Automated/methods
16.
PLoS One ; 19(8): e0305118, 2024.
Article in English | MEDLINE | ID: mdl-39208254

ABSTRACT

To address problems of image quality and morphological detail in the extraction of primary underglaze brown decorative patterns, this paper proposes an extraction method based on the coupling of single-scale gamma correction and gray sharpening. Single-scale gamma correction improves the contrast and brightness of the image through a nonlinear transformation but may lose image detail; gray sharpening enhances high-frequency components and improves image clarity but introduces noise. Coupling the two techniques compensates for their respective shortcomings. Experimental results show that the method improves the efficiency of primary underglaze brown decorative pattern extraction by retaining image detail while reducing the influence of noise, with an F1-score of 0.92745, mIoU of 0.82253, recall of 0.97942, precision of 0.92458, and accuracy of 0.92745.
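
Both operations are standard image transforms; a sketch with illustrative gamma and blend weights, using unsharp masking as the gray-sharpening step:

```python
import cv2
import numpy as np

def gamma_sharpen(gray, gamma=0.6, amount=1.0):
    """gray: (H, W) uint8 image of the decorative pattern."""
    g = ((gray / 255.0) ** gamma * 255).astype(np.uint8)      # gamma correction
    blur = cv2.GaussianBlur(g, (5, 5), 0)
    return cv2.addWeighted(g, 1 + amount, blur, -amount, 0)   # unsharp mask
```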


Subject(s)
Algorithms , Image Processing, Computer-Assisted/methods , Image Enhancement/methods , Pattern Recognition, Automated/methods
17.
J Neural Eng ; 21(5)2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39178906

ABSTRACT

Objective. The decline in the performance of electromyography (EMG)-based silent speech recognition is widely attributed to disparities in speech patterns, articulation habits, and individual physiology among speakers. Feature alignment, by learning a discriminative network that resolves domain offsets across speakers, is an effective way to address this problem. However, the prevailing adversarial network with a branching discriminator specializing in domain discrimination contributes insufficiently directly to the categorical predictions of the classifier. Approach. To this end, we propose a simplified discrepancy-based adversarial network with a streamlined end-to-end structure for EMG-based cross-subject silent speech recognition. Highly aligned features across subjects are obtained by introducing a nuclear-norm Wasserstein discrepancy metric at the back end of the classification network, which can be used for both classification and domain discrimination. Given the low-level and implicitly noisy nature of myoelectric signals, we devise a cascaded adaptive rectification network as the front-end feature extractor, adaptively reshaping the intermediate feature map with automatically learnable channel-wise thresholds. The resulting features effectively filter out subject-specific information while retaining the domain-invariant features critical for cross-subject recognition. Main results. A series of sentence-level classification experiments with 100 Chinese sentences demonstrates the efficacy of our method, which achieves an average accuracy of 89.46% on 40 new subjects when trained with data from 60 subjects. Notably, our method achieves a 10.07% improvement over the state-of-the-art model when tested on 10 new subjects with 20 subjects used for training, surpassing even its result with three times as many training subjects. Significance. Our study demonstrates improved classification performance of the proposed adversarial architecture on cross-subject myoelectric signals, offering a promising prospect for EMG-based speech interaction applications.
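
One plausible reading of the discrepancy term, sketched below, compares batch nuclear norms of the classifier's softmax outputs for two subject groups; this is an interpretation for illustration, not the paper's exact loss:

```python
import torch

def nuclear_norm_discrepancy(logits_src, logits_tgt):
    """logits_*: (B, n_classes) classifier outputs for source/target subjects."""
    p_src = torch.softmax(logits_src, dim=1)
    p_tgt = torch.softmax(logits_tgt, dim=1)
    # nuclear norm = sum of singular values of each prediction matrix
    return torch.linalg.svdvals(p_src).sum() - torch.linalg.svdvals(p_tgt).sum()
```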


Subject(s)
Electromyography , Humans , Electromyography/methods , Male , Female , Neural Networks, Computer , Adult , Speech Recognition Software , Young Adult , Pattern Recognition, Automated/methods , Speech/physiology
18.
Article in English | MEDLINE | ID: mdl-39186426

ABSTRACT

Hand motor impairment seriously affects the daily life of the elderly. We developed an electromyography (EMG) exosuit system with bidirectional hand support for bilateral coordination assistance, based on a dynamic gesture recognition model combining a graph convolutional network (GCN) and a long short-term memory network (LSTM). The system comprises a hardware subsystem and a software subsystem. The hardware subsystem includes an exosuit jacket, a backpack module, an EMG recognition module, and a bidirectional support glove. The software subsystem, built on the dynamic gesture recognition model, identifies dynamic and static gestures by extracting spatio-temporal features from the patient's EMG signals and controls glove movement. An offline training experiment built gesture recognition models for each subject and evaluated the feasibility of the approach; online control experiments verified the effectiveness of the exosuit system. The experimental results showed that the proposed model achieved a gesture recognition rate of 96.42% ± 3.26%, higher than three traditional recognition models. All subjects successfully completed two daily tasks within a short time, and the success rates of bilateral coordination assistance were 88.75% and 86.88%. The exosuit system can effectively assist patients through its bidirectional hand support strategy for bilateral coordination in daily tasks, and the proposed method can be applied to various limb-assistance scenarios.
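
A hypothetical PyTorch sketch of the recognition core: one graph-convolution step mixes information across electrode channels, and an LSTM models gesture dynamics over time; the adjacency and sizes are assumptions:

```python
import torch
import torch.nn as nn

class GCNLSTM(nn.Module):
    def __init__(self, n_channels=8, feat=16, hidden=32, n_gestures=6, A=None):
        super().__init__()
        A = torch.eye(n_channels) if A is None else A    # electrode adjacency
        d = A.sum(1)
        self.register_buffer("A_hat", A / torch.sqrt(d[:, None] * d[None, :]))
        self.gcn = nn.Linear(1, feat)                    # per-node transform
        self.lstm = nn.LSTM(n_channels * feat, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_gestures)

    def forward(self, x):               # x: (B, T, n_channels) EMG envelopes
        h = torch.relu(self.gcn(x.unsqueeze(-1)))         # (B, T, C, feat)
        h = torch.einsum("ij,btjf->btif", self.A_hat, h)  # mix across channels
        out, _ = self.lstm(h.flatten(2))                  # (B, T, C*feat)
        return self.head(out[:, -1])                      # last-step logits
```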


Subject(s)
Electromyography , Gestures , Hand , Humans , Hand/physiology , Male , Female , Exoskeleton Device , Adult , Algorithms , Neural Networks, Computer , Pattern Recognition, Automated/methods , Software , Activities of Daily Living , Young Adult , Feasibility Studies
19.
Article in English | MEDLINE | ID: mdl-39172614

ABSTRACT

Surface electromyography (sEMG), a human-machine interface for gesture recognition, has shown promising potential for decoding motor intentions, but a variety of nonideal factors restrict its practical application in assistive robots. In this paper, we summarized the current mainstream gesture recognition strategies and proposed a gesture recognition method based on multimodal canonical correlation analysis feature fusion classification (MCAFC) for a nonideal condition that occurs in daily life, i.e., posture variations. The deep features of the sEMG and acceleration signals were first extracted via convolutional neural networks. A canonical correlation analysis was subsequently performed to associate the deep features of the two modalities. The transformed features were utilized as inputs to a linear discriminant analysis classifier to recognize the corresponding gestures. Both offline and real-time experiments were conducted on eight non-disabled subjects. The experimental results indicated that MCAFC achieved an average classification accuracy, average motion completion rate, and average motion completion time of 93.44%, 94.05%, and 1.38 s, respectively, with multiple dynamic postures, indicating significantly better performance than that of comparable methods. The results demonstrate the feasibility and superiority of the proposed multimodal signal feature fusion method for gesture recognition with posture variations, providing a new scheme for myoelectric control.
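
The fusion-classification chain maps onto scikit-learn directly: CCA projects the two modalities' deep features into a correlated space, and LDA classifies the transformed features; the dimensions are illustrative:

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# F_emg, F_acc: (n_samples, d) deep features per modality; y: gesture labels
def cca_fuse_fit(F_emg, F_acc, y, n_comp=20):
    cca = CCA(n_components=n_comp).fit(F_emg, F_acc)
    Z_emg, Z_acc = cca.transform(F_emg, F_acc)            # correlated subspaces
    clf = LinearDiscriminantAnalysis().fit(np.hstack([Z_emg, Z_acc]), y)
    return cca, clf
```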


Subject(s)
Algorithms , Electromyography , Gestures , Hand , Neural Networks, Computer , Pattern Recognition, Automated , Posture , Humans , Posture/physiology , Hand/physiology , Male , Pattern Recognition, Automated/methods , Adult , Female , Young Adult , Discriminant Analysis , Deep Learning , Healthy Volunteers
20.
Neural Netw ; 179: 106573, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39096753

ABSTRACT

Recognizing expressions from dynamic facial videos can reveal more natural affective states of humans, but it is a more challenging task in real-world scenes due to pose variations of the face, partial occlusions, and subtle dynamic changes in emotion sequences. Existing transformer-based methods often rely on self-attention to model global relations among spatial or temporal features, which struggles to capture the important expression-related locality structures in both the spatial and temporal features of in-the-wild expression videos. To this end, we incorporate diverse graph structures into transformers and propose CDGT, a method that constructs diverse graph transformers for efficient emotion recognition from in-the-wild videos. Specifically, our method contains a spatial dual-graph transformer and a temporal hyperbolic-graph transformer. The former deploys dual-graph constrained attention to capture latent emotion-related graph geometry structures among local spatial tokens for efficient feature representation, especially for video frames with pose variations and partial occlusions. The latter adopts hyperbolic-graph constrained self-attention that explores important temporal graph structure information in hyperbolic space to model more subtle changes of dynamic emotion. Extensive experimental results on in-the-wild video-based facial expression databases show that the proposed CDGT outperforms other state-of-the-art methods.


Subject(s)
Emotions , Facial Expression , Video Recording , Humans , Emotions/physiology , Algorithms , Neural Networks, Computer , Facial Recognition/physiology , Pattern Recognition, Automated/methods , Automated Facial Recognition/methods