Results 1 - 20 of 22
1.
Front Hum Neurosci ; 18: 1391531, 2024.
Article in English | MEDLINE | ID: mdl-39099602

ABSTRACT

Hand gestures are a natural and intuitive form of communication, and integrating this communication method into robotic systems presents significant potential to improve human-robot collaboration. Recent advances in motor neuroscience have focused on replicating human hand movements from synergies, also known as movement primitives. Synergies, the fundamental building blocks of movement, serve as a potential strategy adopted by the central nervous system to generate and control movements. Identifying how synergies contribute to movement can help in the dexterous control of robotics, exoskeletons, and prosthetics, and extend their application to rehabilitation. In this paper, 33 static hand gestures were recorded through a single RGB camera and identified in real time through the MediaPipe framework as participants made various postures with their dominant hand. Assuming an open palm as the initial posture, uniform joint angular velocities were obtained for all these gestures. By applying a dimensionality reduction method, kinematic synergies were obtained from these joint angular velocities. Kinematic synergies that explain 98% of the variance of movements were utilized to reconstruct new hand gestures using convex optimization. Reconstructed hand gestures and selected kinematic synergies were translated onto a humanoid robot, Mitra, in real time, as the participants demonstrated various hand gestures. The results showed that using only a few kinematic synergies it is possible to generate various hand gestures with 95.7% accuracy. Furthermore, utilizing low-dimensional synergies in the control of high-dimensional end effectors holds promise for enabling near-natural human-robot collaboration.
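
A minimal sketch of the synergy-extraction idea: PCA over joint angular velocities, keeping components that explain 98% of variance (per the abstract), then mixing them to approximate a gesture. The array shapes are illustrative assumptions, and the least-squares step stands in for the paper's convex optimization.

```python
# Sketch: kinematic synergies via PCA and gesture reconstruction.
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 33 gestures x 21 joint angular velocities (deg/s).
rng = np.random.default_rng(0)
velocities = rng.standard_normal((33, 21))

pca = PCA(n_components=0.98)                 # keep 98% of explained variance
synergies = pca.fit(velocities).components_  # (k, 21) synergy basis

def reconstruct(target):
    """Least-squares mix of synergies approximating a velocity profile."""
    coeffs, *_ = np.linalg.lstsq(synergies.T, target - pca.mean_, rcond=None)
    return pca.mean_ + synergies.T @ coeffs

new_gesture = reconstruct(velocities[0])
print(synergies.shape, np.abs(new_gesture - velocities[0]).max())
```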

2.
Sci Rep ; 14(1): 15879, 2024 07 10.
Article in English | MEDLINE | ID: mdl-38982140

ABSTRACT

Spinal diseases and frozen shoulder are prevalent health problems in Asian populations. Early assessment and treatment are very important to prevent the disease from worsening and to reduce pain. In the field of computer vision, assessing the range of motion is a challenging problem. To realize efficient, real-time, and accurate assessment of the range of motion, this study proposes an assessment system combining MediaPipe and YOLOv5. On this basis, the Convolutional Block Attention Module (CBAM) is introduced into the YOLOv5 target detection model, which enhances the extraction of feature information, suppresses background interference, and improves the generalization ability of the model. To meet the requirements of large-scale computing, a client/server (C/S) framework structure is adopted: evaluation results are returned quickly after the client uploads the image data, providing a convenient and practical solution. In addition, a game, "Picking Bayberries", was developed as an auxiliary treatment method to provide patients with engaging rehabilitation training.
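
A sketch of the kind of per-frame measurement such a ROM pipeline needs: a shoulder-abduction angle from three MediaPipe Pose landmarks. The YOLOv5+CBAM detection stage and the client/server plumbing are omitted, and the input image path is a placeholder.

```python
# Sketch: joint angle from MediaPipe Pose landmarks (hip-shoulder-elbow).
import math
import cv2
import mediapipe as mp

def angle(a, b, c):
    """Angle ABC in degrees from three (x, y) points."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cos = (v1[0]*v2[0] + v1[1]*v2[1]) / (math.hypot(*v1) * math.hypot(*v2) + 1e-9)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

pose = mp.solutions.pose.Pose(static_image_mode=True)
image = cv2.imread("patient.jpg")                 # hypothetical input image
result = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
if result.pose_landmarks:
    lm = result.pose_landmarks.landmark
    P = mp.solutions.pose.PoseLandmark
    hip, shoulder, elbow = lm[P.LEFT_HIP], lm[P.LEFT_SHOULDER], lm[P.LEFT_ELBOW]
    abduction = angle((hip.x, hip.y), (shoulder.x, shoulder.y), (elbow.x, elbow.y))
    print(f"left shoulder abduction ~ {abduction:.1f} deg")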


Sujet(s)
Bursite , Amplitude articulaire , Maladies du rachis , Humains , Bursite/physiopathologie , Bursite/thérapie , Bursite/diagnostic , Maladies du rachis/diagnostic , Maladies du rachis/physiopathologie , Maladies du rachis/thérapie , Mâle , Femelle , Adulte , Adulte d'âge moyen
3.
Biomimetics (Basel) ; 9(7)2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39056841

ABSTRACT

Physicians, physical therapists, and occupational therapists have traditionally assessed hand motor function in hemiplegic patients but often struggle to evaluate complex hand movements. To address this issue, in 2019, we developed Fahrenheit, a device and algorithm that uses infrared camera image processing to estimate hand paralysis. However, due to Fahrenheit's dependency on specialized equipment, we conceived a simpler solution: developing a smartphone app that integrates MediaPipe. The objective of this study was to measure hand movements in stroke patients using both MediaPipe and Fahrenheit and to assess their criterion-related validity. The analysis revealed moderate-to-high correlations between the two methods. Consistent results were also observed in the peak angle and velocity comparisons across the severity stages. Because Fahrenheit determines finger recovery status based on these measures, it has the potential to transfer this function to MediaPipe. This study highlighted the potential use of MediaPipe in paralysis estimation applications.
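
A sketch of how MediaPipe can yield the peak-angle and peak-velocity measures the study compares: per-frame finger-joint angles from the 21-point hand model, differentiated over time. The video path, assumed frame rate, and choice of the index-finger PIP joint are illustrative assumptions.

```python
# Sketch: finger-joint angle and angular velocity from MediaPipe Hands.
import cv2
import numpy as np
import mediapipe as mp

def joint_angle(a, b, c):
    v1 = np.array([a.x - b.x, a.y - b.y])
    v2 = np.array([c.x - b.x, c.y - b.y])
    cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

cap = cv2.VideoCapture("hand_movement.mp4")       # hypothetical recording
angles = []
with mp.solutions.hands.Hands(max_num_hands=1) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        res = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if res.multi_hand_landmarks:
            lm = res.multi_hand_landmarks[0].landmark
            angles.append(joint_angle(lm[5], lm[6], lm[7]))  # index-finger PIP
cap.release()

if len(angles) > 1:
    fps = 30.0                                    # assumed frame rate
    velocity = np.diff(angles) * fps              # deg/s
    print(f"peak angle {max(angles):.1f} deg, "
          f"peak velocity {np.abs(velocity).max():.1f} deg/s")
```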

4.
PeerJ Comput Sci ; 10: e2110, 2024.
Article in English | MEDLINE | ID: mdl-38983218

ABSTRACT

Recognizing hand-object interactions presents a significant challenge in computer vision due to the varying nature of such interactions. Moreover, estimating the 3D position of a hand from a single frame can be problematic, especially when the hand obstructs the view of the object from the observer's perspective. In this article, we present a novel approach to recognizing objects and facilitating virtual interactions, using a steering wheel as an illustrative example. We propose a real-time solution for identifying hand-object interactions in eXtended reality (XR) environments. Our approach relies on data captured by a single RGB camera during a manipulation scenario involving a steering wheel. Our model pipeline consists of three key components: (a) a hand landmark detector based on the MediaPipe cross-platform hand-tracking solution; (b) a three-spoke steering wheel model tracker implemented using the faster region-based convolutional neural network (Faster R-CNN) architecture; and (c) a gesture recognition module designed to analyze interactions between the hand and the steering wheel. This approach not only offers a realistic experience of interacting with steering-based mechanisms but also contributes to reducing emissions in the real-world environment. Our experimental results demonstrate natural interaction with physical objects in virtual environments, showcasing the precision and stability of our system.
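
A sketch reducing the hand-wheel interaction test to geometry: do enough fingertip landmarks fall inside the wheel's detected bounding box? The Faster R-CNN tracker is out of scope here, so `wheel_box` is a stand-in for its output; coordinates are normalized to [0, 1] as in MediaPipe, and `min_tips` is an assumed threshold.

```python
# Sketch: geometric hand-on-wheel check over MediaPipe hand landmarks.
from collections import namedtuple

FINGERTIPS = (4, 8, 12, 16, 20)   # thumb..pinky tip indices in MediaPipe Hands

def hands_on_wheel(hand_landmarks, wheel_box, min_tips=3):
    """True if at least `min_tips` fingertips lie inside (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = wheel_box
    inside = sum(
        x0 <= hand_landmarks[i].x <= x1 and y0 <= hand_landmarks[i].y <= y1
        for i in FINGERTIPS
    )
    return inside >= min_tips

# Tiny demo with synthetic landmarks centered on the box.
Pt = namedtuple("Pt", "x y")
demo = [Pt(0.5, 0.5) for _ in range(21)]
print(hands_on_wheel(demo, (0.3, 0.3, 0.7, 0.7)))   # True
```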

5.
Sensors (Basel) ; 24(10)2024 May 15.
Article in English | MEDLINE | ID: mdl-38793987

ABSTRACT

To meet the increased demand for home workouts owing to the COVID-19 pandemic, this study proposes a new approach to real-time exercise posture classification based on a convolutional neural network (CNN) in an ensemble learning system. Using MediaPipe, the proposed system extracts the joint coordinates and angles of the human body, from which the CNN learns the complex patterns of various exercises. Additionally, the approach enhances classification performance by combining predictions from multiple image frames with an ensemble learning method. Infinity AI's Fitness Basic Dataset is employed for validation, and the experiments demonstrate high accuracy in classifying exercises such as arm raises, squats, and overhead presses. The proposed model classified exercise postures in real time with high accuracy (92.12%), precision (91.62%), recall (91.64%), and F1 score (91.58%), indicating its potential application in personalized fitness recommendations and physical therapy services.
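
A sketch of the ensemble step alone: averaging per-frame class probabilities from a CNN over a short window (soft voting) before deciding the exercise label. The CNN and the MediaPipe feature extraction are assumed to exist upstream; the class names are illustrative.

```python
# Sketch: soft-voting ensemble over per-frame softmax outputs.
import numpy as np

CLASSES = ["arm_raise", "squat", "overhead_press"]   # assumed label set

def ensemble_label(frame_probs):
    """frame_probs: (n_frames, n_classes) softmax outputs -> (label, confidence)."""
    mean = np.asarray(frame_probs).mean(axis=0)
    return CLASSES[int(mean.argmax())], float(mean.max())

window = np.array([[0.7, 0.2, 0.1],
                   [0.6, 0.3, 0.1],
                   [0.2, 0.1, 0.7]])   # one noisy frame gets outvoted
print(ensemble_label(window))          # ('arm_raise', 0.5)
```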


Subject(s)
COVID-19, Humans, Exercise/physiology, SARS-CoV-2, Posture/physiology, Machine Learning, Algorithms, Pandemics
6.
Sensors (Basel) ; 24(4)2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38400263

ABSTRACT

Stroke represents a medical emergency and can lead to the development of movement disorders such as abnormal muscle tone, limited range of motion, or abnormalities in coordination and balance. To help stroke patients recover as soon as possible, rehabilitation training methods employ various movement modes, such as ordinary movements and joint reactions, to induce active reactions in the limbs and gradually restore normal function. Evaluation of the rehabilitation effect can help physicians understand the rehabilitation needs of different patients, determine effective treatment methods and strategies, and improve treatment efficiency. To achieve real-time, accurate action detection, this article uses MediaPipe's action detection algorithm and proposes a model based on MPL-CNN. MediaPipe can identify key-point features of the patient's upper limbs and, simultaneously, key-point features of the hand. To detect the effect of rehabilitation training for upper limb movement disorders, an LSTM and a CNN are combined to form a new LSTM-CNN model, which identifies the action features of upper limb rehabilitation training extracted by MediaPipe. The MPL-CNN model can effectively assess the accuracy of rehabilitation movements during upper limb rehabilitation training for stroke patients. To ensure the scientific validity and unified standards of rehabilitation training movements, this article employs the postures in the Fugl-Meyer Upper Limb Rehabilitation Training Functional Assessment Form (FMA) and establishes an FMA upper limb rehabilitation dataset for experimental verification. Experimental results show that, at each stage of the Fugl-Meyer upper limb rehabilitation training evaluation, the MPL-CNN-based method's recognition accuracy of upper limb rehabilitation training actions reached 95%, while the average accuracy across the various upper limb rehabilitation training actions reached 97.54%. This shows that the model is highly robust across different action categories and that the MPL-CNN model is an effective and feasible solution. This MPL-CNN-based method can provide high-precision detection for evaluating the rehabilitation of upper limb movement disorders after stroke, helping clinicians evaluate a patient's rehabilitation progress and adjust the rehabilitation plan based on the evaluation results. This will help improve the personalization and precision of rehabilitation treatment and promote patient recovery.
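
A sketch of a generic CNN+LSTM classifier over MediaPipe keypoint sequences, in the spirit of the MPL-CNN described above. The exact MPL-CNN topology is not given in the abstract, so the layer sizes, sequence length (30 frames), feature width (33 pose + 21 hand landmarks, x/y each = 108), and number of FMA actions are assumptions.

```python
# Sketch: CNN -> LSTM classifier for keypoint sequences (Keras).
import tensorflow as tf

NUM_ACTIONS = 10    # assumed number of FMA training actions

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 108)),            # (frames, keypoint features)
    tf.keras.layers.Conv1D(64, 3, activation="relu"),  # local motion patterns
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(64),                          # temporal dependencies
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_ACTIONS, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```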


Subject(s)
Motor Disorders, Stroke Rehabilitation, Stroke, Humans, Upper Extremity/physiology, Hand, Movement/physiology, Treatment Outcome, Recovery of Function/physiology, Receptors, Thrombopoietin
7.
Data Brief ; 51: 109799, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38075615

ABSTRACT

Sign Language Recognition (SLR) is crucial for enabling communication between the deaf-mute and hearing communities. Nevertheless, the development of a comprehensive sign language dataset is a challenging task due to the complexity and variations in hand gestures. This challenge is particularly evident in the case of Bangla Sign Language (BdSL), where the limited availability of depth datasets impedes accurate recognition. To address this issue, we propose BdSL47, an open-access depth dataset for 47 one-handed static signs (10 digits, from ০ to ৯; and 37 letters, from অ to ँ) of BdSL. The dataset was created using the MediaPipe framework for extracting depth information. To classify the signs, we developed an Artificial Neural Network (ANN) model with a 63-node input layer, a 47-node output layer, and 4 hidden layers that included dropout in the last two hidden layers, an Adam optimizer, and a ReLU activation function. Based on the selected hyperparameters, the proposed ANN model effectively learns the spatial relationships and patterns from the depth-based gestural input features and gives an F1 score of 97.84%, indicating the effectiveness of the approach compared to the baselines provided. The availability of BdSL47 as a comprehensive dataset can help improve the accuracy of SLR for BdSL using more advanced deep-learning models.
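
A sketch of an ANN matching the architecture stated above: 63 inputs (21 MediaPipe hand landmarks × 3 coordinates), 4 hidden layers with dropout on the last two, ReLU activations, an Adam optimizer, and 47 outputs. The hidden-layer widths and dropout rate are not given in the abstract, so those values are assumptions.

```python
# Sketch: the described 63-in / 47-out ANN with dropout on the last two hidden layers.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(63,)),                # 21 landmarks x (x, y, z)
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),                      # assumed rate
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(47, activation="softmax"),   # 10 digits + 37 letters
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```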

8.
Sensors (Basel) ; 23(23)2023 Nov 25.
Article in English | MEDLINE | ID: mdl-38067779

ABSTRACT

Modern embedded systems have achieved relatively high processing power. They can be used for edge computing and computer vision, where data are collected and processed locally, without the need for network communication for decision-making and data analysis. Face detection, face recognition, and pose detection algorithms can be executed with acceptable performance on embedded systems and are used for home security and monitoring. However, popular machine learning frameworks such as MediaPipe require relatively high CPU usage while running, even when idle with no subject in the scene. Combined with the false detections that still occur, this wastes CPU time, elevates power consumption and overall system temperature, and generates unnecessary data. In this study, a low-cost, low-resolution infrared thermal sensor array was used to control the execution of MediaPipe's pose detection algorithm on single-board computers, so that it runs only when the thermal camera detects a possible subject in its field of view. A lightweight algorithm with several filtering layers was developed, which allowed the effective detection and isolation of a person in the thermal image. The resulting hybrid computer vision proved effective in reducing the average CPU workload, especially in environments with low activity, almost eliminating MediaPipe's false detections and reaching up to 30% power savings in the best-case scenario.
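
A sketch of the gating idea: run MediaPipe pose detection only when a low-resolution thermal frame suggests a person is present. The `read_thermal_frame` stub and the 8×8 grid emulate a low-cost sensor array; the temperature threshold and pixel count are assumptions, and the study's multi-layer filtering is collapsed into one cheap test.

```python
# Sketch: thermal-presence gate in front of MediaPipe pose detection.
import numpy as np
import mediapipe as mp

def read_thermal_frame():
    """Stand-in for the sensor driver: an 8x8 grid of temperatures (deg C)."""
    return np.random.uniform(18.0, 24.0, size=(8, 8))

def subject_present(frame, threshold_c=29.0, min_pixels=2):
    """Cheap presence filter: enough pixels warmer than the background."""
    return int((frame > threshold_c).sum()) >= min_pixels

pose = mp.solutions.pose.Pose()   # created once; only .process() is gated

def maybe_detect(rgb_image):
    if not subject_present(read_thermal_frame()):
        return None               # skip the expensive CPU work
    return pose.process(rgb_image)
```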


Subject(s)
Algorithms, Workload, Humans, Computers, Machine Learning
9.
Data Brief ; 51: 109771, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38053598

ABSTRACT

This dataset offers a comprehensive compilation of attention-related features captured during online classes. The dataset is generated through the integration of key components, including face detection, hand tracking, head pose estimation, and mobile phone detection modules. The data collection process leverages a web interface created with the Django web framework. Video frames of participating students are collected through their webcams following institutional guidelines and informed consent, decomposed into frames at a rate of 20 FPS, and converted from the BGR to the RGB color model. The aforesaid modules then process these video frames to extract raw data. The dataset consists of 16 features and one label column, encompassing numerical, categorical, and floating-point values. The dataset enables researchers and practitioners to explore and examine attention-related patterns and characteristics exhibited by students during online classes. Its composition and design offer a unique opportunity to delve into the correlations and interactions among face presence, hand movements, head orientations, and phone interactions. Researchers can leverage this dataset to investigate and develop machine learning models aimed at automatic attention detection, thereby contributing to enhanced remote learning experiences and educational outcomes. Its rich and diverse feature set, coupled with the underlying data collection methodology, provides ample opportunities for reuse across multiple domains, including education, psychology, and computer vision research.
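
A sketch of the frame-collection step described above: sampling webcam video at roughly 20 FPS and converting BGR to RGB before feature extraction. The capture source and stop condition are placeholders; the downstream detection modules are omitted.

```python
# Sketch: 20 FPS frame sampling with BGR -> RGB conversion.
import time
import cv2

cap = cv2.VideoCapture(0)        # webcam, per the described setup
interval = 1.0 / 20.0            # 20 FPS target
last = 0.0
frames = []
while len(frames) < 100:         # collect a short illustrative burst
    ok, bgr = cap.read()
    if not ok:
        break
    now = time.time()
    if now - last >= interval:
        frames.append(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB))
        last = now
cap.release()
print(f"collected {len(frames)} RGB frames")
```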

10.
Sensors (Basel) ; 23(24)2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38139492

ABSTRACT

This work addresses the design and implementation of a novel PhotoBiological Filter Classifier (PhBFC) to improve the accuracy of a static sign language translation system. The captured images are preprocessed by a contrast enhancement algorithm inspired by the capacity of the retinal photoreceptor cells of mammals, which are responsible for capturing light and transforming it into electric signals that the brain can interpret as images. This sign translation system supports effective communication not only between an agent and an operator but also between a community with hearing disabilities and other people. Additionally, this technology could be integrated into diverse devices and applications, further broadening its scope and extending its benefits for the community in general. The bioinspired photoreceptor model is evaluated under different conditions. To validate the advantages of applying photoreceptor cells, 100 tests were conducted per letter to be recognized on three different models (V1, V2, and V3), obtaining an average accuracy of 91.1% on V3, compared to 63.4% on V1, and an average of 55.5 Frames Per Second (FPS) in each letter classification iteration for V1, V2, and V3, demonstrating that the use of photoreceptor cells improves accuracy without affecting the processing time. The system has great application potential; it can be employed, for example, in Deep Learning (DL) for pattern recognition or for agent decision-making trained by reinforcement learning.
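
For flavor only, a classic photoreceptor response curve (Naka-Rushton) applied as image contrast enhancement. This merely illustrates the bioinspired idea: the paper's actual PhBFC filter is not specified in the abstract, and the exponent and semi-saturation choices here are assumptions.

```python
# Sketch: Naka-Rushton-style contrast enhancement, R = L^n / (L^n + sigma^n).
import numpy as np

def naka_rushton(image_u8, n=0.75):
    """Compress luminance with a photoreceptor-like saturating response."""
    L = image_u8.astype(np.float64) / 255.0
    sigma = max(L.mean(), 1e-6)        # adapt to the image's mean brightness
    out = L**n / (L**n + sigma**n)
    return (255 * out / (out.max() + 1e-9)).astype(np.uint8)

demo = (np.linspace(0, 255, 64).reshape(8, 8)).astype(np.uint8)
print(naka_rushton(demo).min(), naka_rushton(demo).max())
```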


Subject(s)
Gestures, Sign Language, Humans, Animals, Photoreceptor Cells, Algorithms, Mammals
11.
Sensors (Basel) ; 23(16)2023 Aug 14.
Article in English | MEDLINE | ID: mdl-37631693

ABSTRACT

Each of us has a unique manner of communicating to explore the world, and such communication helps us interpret life. Sign language is the preferred language of communication for hearing- and speech-disabled people. When a sign language user interacts with a non-sign language user, it becomes difficult for the signer to express themselves to the other person. A sign language recognition system can help a signer communicate with a non-sign language user. This study presents a sign language recognition system capable of recognizing Arabic Sign Language from recorded RGB videos. To achieve this, two datasets were considered: (1) a raw dataset and (2) a face-hand region-based segmented dataset produced from the raw dataset. Moreover, an operational-layer-based multi-layer perceptron, "SelfMLP", is proposed in this study to build CNN-LSTM-SelfMLP models for Arabic Sign Language recognition. MobileNetV2 and ResNet18-based CNN backbones and three SelfMLPs were used to construct six different models of the CNN-LSTM-SelfMLP architecture for performance comparison of Arabic Sign Language recognition. This study examined the signer-independent mode to deal with real-time application circumstances. As a result, MobileNetV2-LSTM-SelfMLP on the segmented dataset achieved the best accuracy of 87.69%, with 88.57% precision, 87.69% recall, 87.72% F1 score, and 99.75% specificity. Overall, face-hand region-based segmentation and SelfMLP-infused MobileNetV2-LSTM-SelfMLP surpassed previous findings on Arabic Sign Language recognition by 10.97% in accuracy.


Subject(s)
Deep Learning, Humans, Language, Sign Language, Communication
12.
Sensors (Basel) ; 23(14)2023 Jul 16.
Article in English | MEDLINE | ID: mdl-37514738

ABSTRACT

Substantial advancements in markerless motion capture accuracy exist, but discrepancies persist when measuring joint angles compared to those taken with a goniometer. This study integrates machine learning techniques with markerless motion capture with the aim of enhancing this accuracy. Two artificial-intelligence-based libraries, MediaPipe and LightGBM, were employed for markerless motion capture and shoulder abduction angle estimation. The motion of ten healthy volunteers was captured using smartphone cameras with right shoulder abduction angles ranging from 10° to 160°. The cameras were set diagonally at 45°, 30°, 15°, 0°, -15°, or -30° relative to the participant, situated at a distance of 3 m. To estimate the abduction angle, machine learning models were developed considering the angle data from the goniometer as the ground truth. Model performance was evaluated using the coefficient of determination R² and the mean absolute percentage error, which were 0.988 and 1.539%, respectively, for the trained model. This approach could estimate the shoulder abduction angle even when the camera was positioned diagonally with respect to the subject. Thus, the proposed models can be utilized for the real-time estimation of shoulder motion during rehabilitation or sports motion.
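
A sketch of the regression stage: a LightGBM model mapping MediaPipe-derived features to a goniometer-measured abduction angle, evaluated with MAPE as above. The feature construction and data shapes are illustrative; in the study the goniometer angle is the ground truth.

```python
# Sketch: LightGBM regression from pose features to abduction angle.
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.standard_normal((600, 12))   # e.g., landmark-derived angles per view
y = 10 + 150 * rng.random(600)       # target abduction angle, 10-160 deg

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)
model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
mape = float(np.mean(np.abs((y_te - pred) / y_te)) * 100)
print(f"MAPE: {mape:.2f}%")
```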


Subject(s)
Shoulder Joint, Shoulder, Humans, Artificial Intelligence, Range of Motion (Articular), Posture, Biomechanical Phenomena
13.
J Pers Med ; 13(5)2023 May 22.
Article in English | MEDLINE | ID: mdl-37241044

ABSTRACT

In this article, we introduce a new approach to human movement by defining the movement as a static super object represented by a single two-dimensional image. The described method is applicable to remote healthcare applications, such as physiotherapeutic exercises. It allows researchers to label and describe the entire exercise as a standalone object, isolated from the reference video. This approach allows us to perform various tasks, including detecting similar movements in a video, measuring and comparing movements, generating new similar movements, and defining choreography by controlling specific parameters in the human body skeleton. As a result of the presented approach, we can eliminate the need to label images manually, disregard the problem of finding the start and end of an exercise, overcome synchronization issues between movements, and perform any operation based on deep learning networks that process super objects in images in general. As part of this article, we demonstrate two application use cases: the first illustrates how to verify and score a fitness exercise, while the second illustrates how to generate similar movements in the human skeleton space, addressing the challenge of supplying sufficient training data for deep learning (DL) applications. A variational autoencoder (VAE) simulator and an EfficientNet-B7 classifier architecture embedded within a Siamese twin neural network are presented in order to demonstrate the two use cases. These use cases demonstrate the versatility of our concept in measuring, categorizing, and inferring human behavior and in generating gestures for other researchers.
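
One plausible reading of the "super object" idea, sketched below: flattening a whole movement into a single 2D image whose rows are joint coordinates and whose columns are time, so image networks such as the EfficientNet/Siamese setup above can consume it. The encoding details are assumptions; the abstract does not specify them.

```python
# Sketch: encode a pose sequence as one grayscale "super object" image.
import numpy as np

def movement_to_image(sequence):
    """sequence: (T, J, 2) normalized joint coords -> (2J, T) uint8 image."""
    T, J, _ = sequence.shape
    img = sequence.transpose(1, 2, 0).reshape(2 * J, T)  # stack x rows over y rows
    img = (img - img.min()) / (img.max() - img.min() + 1e-9)
    return (255 * img).astype(np.uint8)

demo = np.random.rand(120, 33, 2)       # 120 frames of 33 MediaPipe pose joints
print(movement_to_image(demo).shape)    # (66, 120)
```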

14.
Sensors (Basel) ; 24(1)2023 Dec 29.
Article in English | MEDLINE | ID: mdl-38203068

ABSTRACT

Musculoskeletal conditions affect millions of people globally; however, conventional treatments pose challenges concerning price, accessibility, and convenience. Many telerehabilitation solutions offer an engaging alternative but rely on complex hardware for body tracking. This work explores the feasibility of a model for 3D Human Pose Estimation (HPE) from monocular 2D videos (MediaPipe Pose) in a physiotherapy context by comparing its performance to ground-truth measurements. MediaPipe Pose was investigated in eight exercises typically performed in musculoskeletal physiotherapy sessions, with the Range of Motion (ROM) of the human joints as the evaluated parameter. The model showed the best performance for the shoulder abduction, shoulder press, elbow flexion, and squat exercises, with a MAPE ranging between 14.9% and 25.0%, Pearson's coefficient ranging between 0.963 and 0.996, and cosine similarity ranging between 0.987 and 0.999. Some exercises (e.g., seated knee extension and shoulder flexion) posed challenges due to unusual poses, occlusions, and depth ambiguities, possibly related to a lack of training data. This study demonstrates the potential of HPE from monocular 2D videos as a markerless, affordable, and accessible solution for musculoskeletal telerehabilitation. Future work should focus on exploring variations of 3D HPE models trained on physiotherapy-related datasets, such as the Fit3D dataset, and on post-processing techniques to enhance the model's performance.
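
A sketch of the three agreement metrics reported above (MAPE, Pearson's r, cosine similarity), computed between an estimated ROM series and a ground-truth one. The example arrays are made up.

```python
# Sketch: MAPE, Pearson correlation, and cosine similarity for ROM validation.
import numpy as np
from scipy.stats import pearsonr

def agreement(estimated, truth):
    e, t = np.asarray(estimated, float), np.asarray(truth, float)
    mape = float(np.mean(np.abs((t - e) / t)) * 100)
    r, _ = pearsonr(e, t)
    cos = float(e @ t / (np.linalg.norm(e) * np.linalg.norm(t)))
    return mape, float(r), cos

truth = np.array([30.0, 60.0, 90.0, 120.0, 150.0])   # reference ROM (deg)
est = np.array([27.0, 55.0, 95.0, 112.0, 158.0])     # model estimates
print("MAPE %.1f%%, r %.3f, cosine %.3f" % agreement(est, truth))
```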


Subject(s)
Telerehabilitation, Humans, Feasibility Studies, Exercise Therapy, Exercise, Knee Joint
15.
Sensors (Basel) ; 22(20)2022 Oct 20.
Article in English | MEDLINE | ID: mdl-36298342

ABSTRACT

Tremor is one of the common symptoms of Parkinson's disease (PD). Thanks to the recent evolution of digital technologies, monitoring of PD patients' hand movements employing contactless methods has gained momentum. Objective: We aimed to quantitatively assess hand movements in patients suffering from PD using the artificial intelligence (AI)-based hand-tracking technology of MediaPipe. Method: High-frame-rate videos and accelerometer data were recorded from 11 PD patients, two of whom showed classical Parkinsonian-type tremor. Video recordings were obtained in the OFF-state and 30 minutes after the patients took their standard oral medication (ON-state). First, we investigated the frequency and amplitude relationship between the video and accelerometer data. Then, we focused on quantifying the effect of taking standard oral treatments. Results: The data extracted from the video correlated well with the accelerometer-based measurement system. Our video-based approach identified the tremor frequency with a small error rate (mean absolute error 0.229 (±0.174) Hz) and the amplitude with a high correlation. The frequency and amplitude of hand movement differed before and after medication: patients experienced a decrease in mean frequency from 2.012 (±1.385) Hz to 1.526 (±1.007) Hz and in mean amplitude from 8.167 (±15.687) a.u. to 4.033 (±5.671) a.u. Conclusions: Our work achieved an automatic estimation of movement frequency, including tremor frequency, with a low error rate, and to the best of our knowledge, this is the first paper to present automated tremor analysis before/after medication in PD, in particular using high-frame-rate video data.
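
A sketch of the frequency analysis such a comparison requires: Welch's method over a landmark trajectory to find the dominant movement frequency. The synthetic 4 Hz signal and the 60 FPS frame rate are assumptions standing in for real wrist-landmark data, and the amplitude proxy is one simple choice among several.

```python
# Sketch: dominant hand-movement frequency from a landmark trajectory.
import numpy as np
from scipy.signal import welch

fps = 60.0                                   # assumed video frame rate
t = np.arange(0, 10, 1 / fps)
wrist_y = 0.02 * np.sin(2 * np.pi * 4.0 * t) + 0.003 * np.random.randn(t.size)

freqs, psd = welch(wrist_y - wrist_y.mean(), fs=fps, nperseg=256)
dominant = freqs[np.argmax(psd)]
amplitude = np.sqrt(2 * np.trapz(psd, freqs))   # RMS-based amplitude proxy
print(f"dominant frequency ~ {dominant:.2f} Hz, amplitude proxy {amplitude:.4f}")
```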


Subject(s)
Parkinson Disease, Tremor, Humans, Tremor/drug therapy, Tremor/diagnosis, Parkinson Disease/drug therapy, Artificial Intelligence, Movement, Hand
16.
Sensors (Basel) ; 22(17)2022 Sep 01.
Article in English | MEDLINE | ID: mdl-36081076

ABSTRACT

Technologies for pattern recognition are used in various fields. One of the most relevant and important directions is the use of pattern recognition technology, such as gesture recognition, in socially significant tasks: developing automatic sign language interpretation systems that work in real time. More than 5% of the world's population (about 430 million people, including 34 million children) are deaf-mute and not always able to use the services of a live sign language interpreter. Almost 80% of people with disabling hearing loss live in low- and middle-income countries. The development of low-cost systems for automatic sign language interpretation, without the use of expensive sensors and specialized cameras, would improve the lives of people with disabilities, contributing to their unhindered integration into society. To this end, this article analyzes suitable gesture recognition methods in the context of their use in automatic gesture recognition systems in order to determine the most suitable ones. From this analysis, an algorithm based on a palm definition model and linear models for recognizing the shapes of the numbers and letters of the Kazakh sign language is proposed. The advantage of the proposed algorithm is that it fully recognizes 41 of the 42 letters in the Kazakh sign alphabet; until now, only the Russian letters in the Kazakh alphabet had been recognized. In addition, a unified function has been integrated into our system to configure the frame depth map mode, which has improved recognition performance and can be used to create a multimodal database of video data of gesture words for the gesture recognition system.


Subject(s)
Pattern Recognition, Automated, Sign Language, Algorithms, Child, Gestures, Hand, Humans, Pattern Recognition, Automated/methods
17.
Sensors (Basel) ; 22(15)2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35957257

ABSTRACT

Fitness is important in people's lives. Good fitness habits can improve cardiopulmonary capacity, increase concentration, prevent obesity, and effectively reduce the risk of death. Home fitness does not require large equipment; dumbbells, yoga mats, and horizontal bars suffice to complete fitness exercises, and it effectively avoids contact with other people, so it is widely popular. People who work out at home use social media to obtain fitness knowledge, but their ability to learn correct form this way is limited, and incomplete movements are likely to lead to injury. A cheap, timely, and accurate fitness detection system can reduce the risk of fitness injuries and effectively improve people's fitness awareness. Many previous studies have engaged in the detection of fitness movements; among them, methods based on wearable devices, body nodes, and image deep learning have achieved good performance. However, a wearable device cannot detect a variety of fitness movements, may hinder the exercise of the fitness user, and has a high cost, while both body-node-based and image-deep-learning-based methods have lower costs but each has some drawbacks. Therefore, this paper used a method based on deep transfer learning to establish a fitness database and then trained a deep neural network to detect the type and completeness of fitness movements. We used YOLOv4 and MediaPipe to detect fitness movements in real time and stored the 1D fitness signal of the movement to build a database. Finally, an MLP was used to classify the 1D signal waveform of the fitness movement. In classifying fitness movement types, the mAP was 99.71%, accuracy was 98.56%, precision was 97.9%, recall was 98.56%, and the F1-score was 98.23%, which is quite a high performance. In classifying fitness movement completeness, accuracy was 92.84%, precision was 92.85%, recall was 92.84%, and the F1-score was 92.83%. The average FPS during detection was 17.5. Experimental results show that our method achieves higher accuracy compared to other methods.
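
A sketch of the final stage described above: an MLP classifying fixed-length 1D movement waveforms by completeness. The waveform construction (a joint-angle curve resampled to 64 samples per repetition) and the two-class setup are assumptions standing in for the paper's database.

```python
# Sketch: MLP classification of 1D fitness-movement waveforms.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
n, length = 300, 64
full = np.sin(np.linspace(0, np.pi, length))            # complete repetition
partial = np.sin(np.linspace(0, 0.6 * np.pi, length))   # cut-off repetition
X = np.vstack([full + 0.05 * rng.standard_normal((n, length)),
               partial + 0.05 * rng.standard_normal((n, length))])
y = np.array([1] * n + [0] * n)                         # 1 = complete

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=2)
clf.fit(X, y)
print(f"training accuracy: {clf.score(X, y):.3f}")
```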


Subject(s)
Machine Learning, Databases, Factual, Humans, Movement
18.
Disabil Rehabil Assist Technol ; : 1-14, 2022 May 26.
Article in English | MEDLINE | ID: mdl-35618260

ABSTRACT

PURPOSE: The article presents the design and development of a generic assistive system to establish an independent conversation platform for hearing-speech impaired and visually impaired persons. MATERIALS: The software system was developed in Python and HTML. METHODS: Considering the constraints associated with the above-mentioned impairments, the system implements both speech-to-text/gesture and text/gesture-to-speech conversion in its operation. The real-time hand-gesture-to-speech generation process is implemented using static image tracking, a CNN-based deep learning technique, and the MediaPipe hand-tracking solution. The software prototype terminals can be accessed through the internet using the MQTT protocol to accomplish communicative conversation between visually impaired and hearing-speech impaired persons. RESULTS: The software system exhibits average prediction times of approximately 1 s and 2 s for a four-letter audio word and a single hand gesture, respectively, which are commensurate with the average time complexity of human-to-human conversation. The average accuracy and loss for the hand gestures through CNN-based deep learning are 0.9996 and 0.0008, respectively. The confusion matrix for the prediction of alphabet-specific hand gestures shows satisfactory performance in gesture recognition. CONCLUSIONS: The software prototype of the generic assistive device shows its potential to establish exclusive communication between a visually impaired and a hearing-speech impaired person through the internet. The same software interface can also be used to accomplish communicative conversation between only visually impaired persons or only hearing-speech impaired persons. IMPLICATIONS FOR REHABILITATION: The article presents the design and development of a generic assistive interface that establishes an independent conversation platform for hearing-speech impaired and visually impaired people via the internet. The same software interface can also serve conversations between only visually impaired or only hearing-speech impaired persons. The design can be further extended by incorporating multi-modal impairments to make a universal assistive device for all-in-one communication.

19.
Sensors (Basel) ; 23(1)2022 Dec 20.
Article in English | MEDLINE | ID: mdl-36616601

ABSTRACT

In the discipline of hand gesture and dynamic sign language recognition, deep learning approaches with high computational complexity and a wide range of parameters have achieved remarkable success. However, the implementation of sign language recognition applications on mobile phones with restricted storage and computing capacities is usually greatly constrained by those limited resources. In light of this situation, we propose lightweight deep neural networks with advanced processing for real-time dynamic sign language recognition (DSLR). This paper presents a DSLR application to minimize the gap between hearing-impaired communities and the rest of society. The DSLR application was developed using two robust deep learning models, a GRU and a 1D CNN, combined with the MediaPipe framework. The authors implement advanced processing to solve most DSLR problems, especially in real-time detection, e.g., differences in depth and location. The solution method consists of three main parts. First, the input dataset is preprocessed with our algorithm to standardize the number of frames. Then, the MediaPipe framework extracts hand and pose landmarks (features) to detect and locate them. Finally, after unifying the depth and location of the body, the features are passed to the models to recognize the DSL accurately. To accomplish this, the authors built a new video-based American sign dataset named DSL-46, which contains 46 signs in daily use, recorded with all the needed details and properties. The results of the experiments show that the presented solution method can recognize dynamic signs extremely fast and accurately, even in real-time detection. The DSLR reaches an accuracy of 98.8%, 99.84%, and 88.40% on the DSL-46, LSA64, and LIBRAS-BSL datasets, respectively.
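
A sketch of the first preprocessing step named above: standardizing every clip to a fixed number of frames by uniform index sampling, padding by repeating the last frame when a clip is too short. The target of 30 frames is an assumption; the abstract does not state the number used.

```python
# Sketch: frame-count standardization for variable-length landmark clips.
import numpy as np

def standardize_frames(frames, target=30):
    """frames: (N, D) per-frame feature vectors -> (target, D) array."""
    frames = np.asarray(frames)
    n = len(frames)
    if n >= target:
        idx = np.linspace(0, n - 1, target).round().astype(int)  # uniform sample
        return frames[idx]
    pad = np.repeat(frames[-1][None], target - n, axis=0)        # repeat last frame
    return np.concatenate([frames, pad], axis=0)

clip = np.random.rand(47, 126)            # 47 frames of pose+hand landmarks
print(standardize_frames(clip).shape)     # (30, 126)
```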


Subject(s)
Deep Learning, Humans, Gestures, Pattern Recognition, Automated/methods, Algorithms, Hand
20.
Sensors (Basel) ; 23(1)2022 Dec 20.
Article in English | MEDLINE | ID: mdl-36616603

ABSTRACT

Motion analysis is an area with several applications in health, sports, and entertainment. The high cost of state-of-the-art equipment in the health field makes it unfeasible to apply this technique in clinical routines. In this vein, RGB-D and RGB equipment with joint tracking tools are tested as portable, low-cost solutions to enable computational motion analysis. The recent release of Google MediaPipe, a joint inference tracking technique that uses conventional RGB cameras, can be considered a milestone due to its ability to estimate depth coordinates in planar images. In light of this, this work aims to evaluate the measurement of angular variation from RGB-D and RGB sensor data against the Qualisys Track Manager gold standard. A total of 60 recordings were performed for each upper- and lower-limb movement in two different position configurations with respect to the sensors. Google's MediaPipe obtained results close to those of the Kinect V2 sensor in absolute error, RMS, and correlation to the gold standard, while presenting lower dispersion values and error metrics, which is favorable. In comparison with equipment commonly used in physical evaluations, MediaPipe's error was within the error range of short- and long-arm goniometers.


Subject(s)
Movement, Sports, Biomechanical Phenomena, Motion, Benchmarking