Results 1 - 20 of 18,156
1.
Nature ; 566(7743): 195-204, 2019 02.
Article in English | MEDLINE | ID: mdl-30760912

ABSTRACT

Machine learning approaches are increasingly used to extract patterns and insights from the ever-increasing stream of geospatial data, but current approaches may not be optimal when system behaviour is dominated by spatial or temporal context. Here, rather than amending classical machine learning, we argue that these contextual cues should be used as part of deep learning (an approach that is able to extract spatio-temporal features automatically) to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales, for example. The next step will be a hybrid modelling approach, coupling physical process models with the versatility of data-driven machine learning.


Subjects
Big Data; Computer Simulation; Deep Learning; Earth Sciences/methods; Forecasting/methods; Pattern Recognition, Automated/methods; Facial Recognition; Female; Geographic Mapping; Humans; Knowledge; Regression, Psychology; Reproducibility of Results; Seasons; Spatio-Temporal Analysis; Time Factors; Translating; Uncertainty; Weather
2.
Nature ; 569(7755): 208-214, 2019 05.
Article in English | MEDLINE | ID: mdl-31068721

ABSTRACT

Software implementations of brain-inspired computing underlie many important computational tasks, from image processing to speech recognition, artificial intelligence and deep learning applications. Yet, unlike real neural tissue, traditional computing architectures physically separate the core computing functions of memory and processing, making fast, efficient and low-energy computing difficult to achieve. To overcome such limitations, an attractive alternative is to design hardware that mimics neurons and synapses. Such hardware, when connected in networks or neuromorphic systems, processes information in a way more analogous to brains. Here we present an all-optical version of such a neurosynaptic system, capable of supervised and unsupervised learning. We exploit wavelength division multiplexing techniques to implement a scalable circuit architecture for photonic neural networks, successfully demonstrating pattern recognition directly in the optical domain. Such photonic neurosynaptic networks promise access to the high speed and high bandwidth inherent to optical systems, thus enabling the direct processing of optical telecommunication and visual data.


Subjects
Biomimetics/methods; Models, Neurological; Neural Networks, Computer; Pattern Recognition, Automated/methods; Photons; Supervised Machine Learning; Unsupervised Machine Learning; Action Potentials; Computer Systems; Computers; Nerve Net/cytology; Nerve Net/physiology; Neurons/cytology; Neurons/physiology; Synapses/physiology
3.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36416141

ABSTRACT

MOTIVATION: Most of the conventional deep neural network-based methods for drug-drug interaction (DDI) extraction consider only context information around drug mentions in the text. However, human experts use heterogeneous background knowledge about drugs to comprehend pharmaceutical papers and extract relationships between drugs. Therefore, we propose a novel method that simultaneously considers various heterogeneous information for DDI extraction from the literature. RESULTS: We first construct drug representations by conducting the link prediction task on a heterogeneous pharmaceutical knowledge graph (KG) dataset. We then effectively combine the text information of input sentences in the corpus and the information on drugs in the heterogeneous KG (HKG) dataset. Finally, we evaluate our DDI extraction method on the DDIExtraction-2013 shared task dataset. In the experiment, integrating heterogeneous drug information significantly improves the DDI extraction performance, and we achieved an F-score of 85.40%, which results in state-of-the-art performance. We evaluated our method on the DrugProt dataset and improved the performance significantly, achieving an F-score of 77.9%. Further analysis showed that each type of node in the HKG contributes to the performance improvement of DDI extraction, indicating the importance of considering multiple pieces of information. AVAILABILITY AND IMPLEMENTATION: Our code is available at https://github.com/tticoin/HKG-DDIE.git.


Subjects
Data Mining; Pattern Recognition, Automated; Humans; Pattern Recognition, Automated/methods; Data Mining/methods; Drug Interactions; Neural Networks, Computer; Pharmaceutical Preparations
4.
Opt Express ; 32(10): 16645-16656, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38858865

ABSTRACT

Single-Photon Avalanche Diode (SPAD) direct Time-of-Flight (dToF) sensors provide depth imaging over long distances, enabling the detection of objects even in the absence of contrast in colour or texture. However, distant objects are represented by just a few pixels and are subject to noise from solar interference, limiting the applicability of existing computer vision techniques for high-level scene interpretation. We present a new SPAD-based vision system for human activity recognition, based on convolutional and recurrent neural networks, which is trained entirely on synthetic data. In tests using real data from a 64×32 pixel SPAD, captured over a distance of 40 m, the scheme successfully overcomes the limited transverse resolution (in which human limbs are approximately one pixel across), achieving an average accuracy of 89% in distinguishing between seven different activities. The approach analyses continuous streams of video-rate depth data at a maximal rate of 66 FPS when executed on a GPU, making it well-suited for real-time applications such as surveillance or situational awareness in autonomous systems.


Subjects
Photons; Humans; Human Activities; Neural Networks, Computer; Pattern Recognition, Automated/methods; Equipment Design
5.
Anesthesiology ; 141(1): 32-43, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38466210

ABSTRACT

BACKGROUND: Research on electronic health record physiologic data is common, invariably including artifacts. Traditionally, these artifacts have been handled using simple filter techniques. The authors hypothesized that different artifact detection algorithms, including machine learning, may be necessary to provide optimal performance for various vital signs and clinical contexts. METHODS: In a retrospective single-center study, intraoperative operating room and intensive care unit (ICU) electronic health record datasets including heart rate, oxygen saturation, blood pressure, temperature, and capnometry were included. All records were screened for artifacts by at least two human experts. Classical artifact detection methods (cutoff, multiples of SD [z-value], interquartile range, and local outlier factor) and a supervised learning model implementing long short-term memory neural networks were tested for each vital sign against the human expert reference dataset. For each artifact detection algorithm, sensitivity and specificity were calculated. RESULTS: A total of 106 (53 operating room and 53 ICU) patients were randomly selected, resulting in 392,808 data points. Human experts annotated 5,167 (1.3%) data points as artifacts. The artifact detection algorithms demonstrated large variations in performance. The specificity was above 90% for all detection methods and all vital signs. The neural network showed significantly higher sensitivities than the classic methods for heart rate (ICU, 33.6%; 95% CI, 33.1 to 44.6), systolic invasive blood pressure (in both the operating room [62.2%; 95% CI, 57.5 to 71.9] and the ICU [60.7%; 95% CI, 57.3 to 71.8]), and temperature in the operating room (76.1%; 95% CI, 63.6 to 89.7). The CI for specificity overlapped for all methods. Generally, sensitivity was low, with only the z-value for oxygen saturation in the operating room reaching 88.9%. All other sensitivities were less than 80%. 
CONCLUSIONS: No single artifact detection method consistently performed well across different vital signs and clinical settings. Neural networks may be a promising artifact detection method for specific vital signs.
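
Two of the classical detectors named in the Methods (z-value and interquartile range), together with the sensitivity and specificity calculation against an expert reference, can be sketched as follows. This is only an illustration, not the authors' code: the thresholds (2 SD and 1.5 × IQR), the function names, and the sample heart-rate values with their expert labels are all assumptions.

```python
from statistics import mean, pstdev, quantiles


def zscore_flags(values, threshold=2.0):
    """Flag points lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [False] * len(values)
    return [abs(v - mu) / sd > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def sensitivity_specificity(predicted, reference):
    """Compare detector flags with the expert reference labels."""
    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)
    tn = sum((not p) and (not r) for p, r in pairs)
    fp = sum(p and (not r) for p, r in pairs)
    fn = sum((not p) and r for p, r in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical heart-rate samples (beats per minute); 0 and 210 are artifacts.
heart_rate = [72, 74, 73, 75, 71, 74, 0, 73, 72, 76, 74, 210]
expert_labels = [v in (0, 210) for v in heart_rate]

for name, detector in (("z-value", zscore_flags), ("IQR", iqr_flags)):
    sens, spec = sensitivity_specificity(detector(heart_rate), expert_labels)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this toy series the two artifacts inflate the standard deviation, so the z-value detector misses one of them while the IQR detector does not, which illustrates why a single method may not suit every signal.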


Subjects
Algorithms; Artifacts; Electronic Health Records; Machine Learning; Vital Signs; Humans; Retrospective Studies; Vital Signs/physiology; Male; Female; Middle Aged; Aged; Pattern Recognition, Automated/methods
6.
Nature ; 559(7714): 370-376, 2018 07.
Article in English | MEDLINE | ID: mdl-29973727

ABSTRACT

From bacteria following simple chemical gradients [1] to the brain distinguishing complex odour information [2], the ability to recognize molecular patterns is essential for biological organisms. This type of information-processing function has been implemented using DNA-based neural networks [3], but has been limited to the recognition of a set of no more than four patterns, each composed of four distinct DNA molecules. Winner-take-all computation [4] has been suggested [5,6] as a potential strategy for enhancing the capability of DNA-based neural networks. Compared to the linear-threshold circuits [7] and Hopfield networks [8] used previously [3], winner-take-all circuits are computationally more powerful [4], allow simpler molecular implementation and are not constrained by the number of patterns and their complexity, so both a large number of simple patterns and a small number of complex patterns can be recognized. Here we report a systematic implementation of winner-take-all neural networks based on DNA-strand-displacement [9,10] reactions. We use a previously developed seesaw DNA gate motif [3,11,12], extended to include a simple and robust component that facilitates the cooperative hybridization [13] that is involved in the process of selecting a 'winner'. We show that with this extended seesaw motif DNA-based neural networks can classify patterns into up to nine categories. Each of these patterns consists of 20 distinct DNA molecules chosen from the set of 100 that represents the 100 bits in 10 × 10 patterns, with the 20 DNA molecules selected tracing one of the handwritten digits '1' to '9'. The network successfully classified test patterns with up to 30 of the 100 bits flipped relative to the digit patterns 'remembered' during training, suggesting that molecular circuits can robustly accomplish the sophisticated task of classifying highly complex and noisy information on the basis of similarity to a memory.
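
The winner-take-all idea can be illustrated in software. The sketch below is only an analogy to the molecular circuit: the binary weights, the three toy memories, and the function names are assumptions, and nothing here models DNA strand displacement. Each memory is a 100-bit pattern with 20 bits set, and the input is assigned to the memory with the largest weighted sum.

```python
def make_pattern(on_bits, size=100):
    """Return a 0/1 list of length `size` with the given positions set to 1."""
    bits = [0] * size
    for i in on_bits:
        bits[i] = 1
    return bits


def hamming(a, b):
    """Number of positions at which two bit patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def winner_take_all(pattern, memories):
    """Compute one weighted sum per memory and return the label with the largest."""
    sums = {
        label: sum(x * w for x, w in zip(pattern, weights))
        for label, weights in memories.items()
    }
    return max(sums, key=sums.get), sums


# Hypothetical memories: three 100-bit patterns with 20 bits set in each.
memories = {
    "1": make_pattern(range(0, 20)),
    "2": make_pattern(range(20, 40)),
    "3": make_pattern(range(40, 60)),
}

# Noisy copy of "2" with 30 bits flipped: 8 of its bits cleared, 22 other bits set.
noisy = make_pattern(list(range(28, 40)) + list(range(0, 11)) + list(range(40, 51)))

winner, sums = winner_take_all(noisy, memories)
print(hamming(noisy, memories["2"]), winner, sums)
```

The toy input is built so that the correct memory still wins by a single unit after 30 flips, which shows how the decision depends only on which weighted sum is largest, not on its absolute value.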


Subjects
DNA/chemistry; Models, Neurological; Neural Networks, Computer; Pattern Recognition, Automated/methods; Memory; Neurons/physiology
7.
J Ultrasound Med ; 43(6): 1025-1036, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38400537

ABSTRACT

OBJECTIVES: To automatically recognize and classify thyroid nodules and to reduce the high classification error rates that arise when the samples are imbalanced. METHODS: An improved k-nearest neighbor (KNN) algorithm is proposed, and a method for automatic thyroid nodule classification based on it is established. The improved KNN algorithm considers not only the number of class labels for each class among the k nearest neighbors, but also the corresponding weights. It also uses the Minkowski distance measure instead of the Euclidean distance measure. RESULTS: A total of 508 ultrasound images of thyroid nodules, including 415 benign nodules and 93 malignant nodules, were used. Experimental results show that the improved KNN achieves an accuracy of 0.872549, a precision of 0.867347, a recall of 1, and an F1-score of 0.928962. We also considered the influence of different distance weights, the value of k, and different distance measures on the classification results. CONCLUSIONS: A comparison shows that our method performs better than the traditional KNN and other classical machine learning methods.
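
A weighted KNN vote with a Minkowski distance can be sketched as follows. The abstract does not specify the weighting scheme, so the inverse-distance vote, the optional per-class weights, the exponent p = 3, and the toy two-feature data are all assumptions rather than the authors' method. The last function checks that the reported F1-score is consistent with the reported precision and recall.

```python
from collections import defaultdict


def minkowski(a, b, p=3):
    """Minkowski distance of order p (p=2 would give the Euclidean distance)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def weighted_knn_predict(train, query, k=3, p=3, class_weights=None, eps=1e-9):
    """Predict a label from the k nearest neighbours with weighted votes.

    `train` is a list of (features, label) pairs. Each neighbour votes with
    class_weight / distance, so close neighbours and up-weighted (for example,
    minority) classes count for more.
    """
    class_weights = class_weights or {}
    scored = sorted((minkowski(features, query, p), label) for features, label in train)
    votes = defaultdict(float)
    for distance, label in scored[:k]:
        votes[label] += class_weights.get(label, 1.0) / (distance + eps)
    return max(votes, key=votes.get)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical, imbalanced toy data: four benign points and two malignant points.
train = [
    ((0.0, 0.0), "benign"), ((1.0, 0.0), "benign"),
    ((0.0, 1.0), "benign"), ((1.0, 1.0), "benign"),
    ((5.0, 5.0), "malignant"), ((6.0, 5.0), "malignant"),
]

print(weighted_knn_predict(train, (4.0, 4.0)))
print(weighted_knn_predict(train, (0.5, 0.5)))
print(round(f1_score(0.867347, 1.0), 6))
```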


Subjects
Algorithms; Thyroid Nodule; Ultrasonography; Thyroid Nodule/diagnostic imaging; Thyroid Nodule/classification; Humans; Ultrasonography/methods; Reproducibility of Results; Thyroid Gland/diagnostic imaging; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods
8.
Sensors (Basel) ; 24(6)2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38544240

ABSTRACT

Radio frequency (RF) technology has been applied to enable advanced behavioral sensing in human-computer interaction, owing to its device-free sensing capability and wide availability on Internet of Things devices. However, enabling finger gesture-based identification with high accuracy can be challenging due to low RF signal resolution and user heterogeneity. In this paper, we propose MeshID, a novel RF-based user identification scheme that enables identification through finger gestures with high accuracy. MeshID significantly improves the sensing sensitivity on RF signal interference, and hence is able to extract subtle individual biometrics through velocity distribution profiling (VDP) features from less-distinct finger motions such as drawing digits in the air. We design an efficient few-shot model retraining framework based on a first component reverse module, achieving high model robustness and performance in a complex environment. We conduct comprehensive real-world experiments, and the results show that MeshID achieves a user identification accuracy of 95.17% on average in three indoor environments. The results indicate that MeshID outperforms the state-of-the-art in identification performance at lower cost.


Subjects
Algorithms; Gestures; Humans; Pattern Recognition, Automated/methods; Fingers; Motion
9.
Sensors (Basel) ; 24(12)2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38931629

ABSTRACT

Existing end-to-end speech recognition methods typically employ hybrid decoders based on CTC and Transformer. However, the issue of error accumulation in these hybrid decoders hinders further improvements in accuracy. Additionally, most existing models are built upon Transformer architecture, which tends to be complex and unfriendly to small datasets. Hence, we propose a Nonlinear Regularization Decoding Method for Speech Recognition. Firstly, we introduce the nonlinear Transformer decoder, breaking away from traditional left-to-right or right-to-left decoding orders and enabling associations between any characters, mitigating the limitations of Transformer architectures on small datasets. Secondly, we propose a novel regularization attention module to optimize the attention score matrix, reducing the impact of early errors on later outputs. Finally, we introduce the tiny model to address the challenge of overly large model parameters. The experimental results indicate that our model demonstrates good performance. Compared to the baseline, our model achieves recognition improvements of 0.12%, 0.54%, 0.51%, and 1.2% on the Aishell1, Primewords, Free ST Chinese Corpus, and Common Voice 16.1 datasets of Uyghur, respectively.


Subjects
Algorithms; Speech Recognition Software; Humans; Speech/physiology; Nonlinear Dynamics; Pattern Recognition, Automated/methods
10.
Sensors (Basel) ; 24(12)2024 Jun 16.
Article in English | MEDLINE | ID: mdl-38931682

ABSTRACT

Monitoring activities of daily living (ADLs) plays an important role in measuring and responding to a person's ability to manage their basic physical needs. Effective recognition systems for monitoring ADLs must successfully recognize naturalistic activities that also realistically occur at infrequent intervals. However, existing systems primarily focus on either recognizing more separable, controlled activity types or are trained on balanced datasets where activities occur more frequently. In our work, we investigate the challenges associated with applying machine learning to an imbalanced dataset collected from a fully in-the-wild environment. This analysis shows that the combination of preprocessing techniques to increase recall and postprocessing techniques to increase precision can result in more desirable models for tasks such as ADL monitoring. In a user-independent evaluation using in-the-wild data, these techniques resulted in a model that achieved an event-based F1-score of over 0.9 for brushing teeth, combing hair, walking, and washing hands. This work tackles fundamental challenges in machine learning that will need to be addressed in order for these systems to be deployed and reliably work in the real world.


Subjects
Activities of Daily Living; Human Activities; Machine Learning; Humans; Algorithms; Walking/physiology; Pattern Recognition, Automated/methods
11.
Sensors (Basel) ; 24(12)2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38931728

ABSTRACT

There has been a resurgence of applications focused on human activity recognition (HAR) in smart homes, especially in the field of ambient intelligence and assisted-living technologies. However, such applications present numerous significant challenges to any automated analysis system operating in the real world, such as variability, sparsity, and noise in sensor measurements. Although state-of-the-art HAR systems have made considerable strides in addressing some of these challenges, they suffer from a practical limitation: they require successful pre-segmentation of continuous sensor data streams prior to automated recognition, i.e., they assume that an oracle is present during deployment, and that it is capable of identifying time windows of interest across discrete sensor events. To overcome this limitation, we propose a novel graph-guided neural network approach that performs activity recognition by learning explicit co-firing relationships between sensors. We accomplish this by learning a more expressive graph structure representing the sensor network in a smart home in a data-driven manner. Our approach maps discrete input sensor measurements to a feature space through the application of attention mechanisms and hierarchical pooling of node embeddings. We demonstrate the effectiveness of our proposed approach by conducting several experiments on CASAS datasets, showing that the resulting graph-guided neural network outperforms the state-of-the-art method for HAR in smart homes across multiple datasets and by large margins. These results are promising because they push HAR for smart homes closer to real-world applications.


Subjects
Human Activities; Neural Networks, Computer; Humans; Algorithms; Pattern Recognition, Automated/methods
12.
Sensors (Basel) ; 24(8)2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38676207

ABSTRACT

Teaching gesture recognition is a technique used to recognize the hand movements of teachers in classroom teaching scenarios. This technology is widely used in education, including for classroom teaching evaluation, enhancing online teaching, and assisting special education. However, current research on gesture recognition in teaching mainly focuses on detecting the static gestures of individual students and analyzing their classroom behavior. To analyze the teacher's gestures and mitigate the difficulty of single-target dynamic gesture recognition in multi-person teaching scenarios, this paper proposes skeleton-based teaching gesture recognition (ST-TGR), which learns through spatio-temporal representation. This method mainly uses the human pose estimation technique RTMPose to extract the coordinates of the keypoints of the teacher's skeleton and then inputs the recognized sequence of the teacher's skeleton into the MoGRU action recognition network for classifying gesture actions. The MoGRU action recognition module mainly learns the spatio-temporal representation of target actions by stacking a multi-scale bidirectional gated recurrent unit (BiGRU) and using improved attention mechanism modules. To validate the generalization of the action recognition network model, we conducted comparative experiments on datasets including NTU RGB+D 60, UT-Kinect Action3D, SBU Kinect Interaction, and Florence 3D. The results indicate that, compared with most existing baseline models, the model proposed in this article exhibits better performance in recognition accuracy and speed.


Subjects
Gestures; Humans; Pattern Recognition, Automated/methods; Algorithms; Teaching
13.
Sensors (Basel) ; 24(9)2024 May 05.
Article in English | MEDLINE | ID: mdl-38733038

ABSTRACT

With the continuous advancement of autonomous driving and monitoring technologies, there is increasing attention on non-intrusive target monitoring and recognition. This paper proposes an ArcFace SE-attention model-agnostic meta-learning approach (AS-MAML) by integrating attention mechanisms into residual networks for pedestrian gait recognition using frequency-modulated continuous-wave (FMCW) millimeter-wave radar through meta-learning. We enhance the feature extraction capability of the base network using channel attention mechanisms and integrate the additive angular margin loss function (ArcFace loss) into the inner loop of MAML to constrain inner loop optimization and improve radar discrimination. Then, this network is used to classify small-sample micro-Doppler images obtained from millimeter-wave radar as the data source for pose recognition. Experimental tests were conducted on pose estimation and image classification tasks. The results demonstrate significant detection and recognition performance, with an accuracy of 94.5%, accompanied by a 95% confidence interval. Additionally, on the open-source dataset DIAT-µRadHAR, which is specially processed to increase classification difficulty, the network achieves a classification accuracy of 85.9%.


Subjects
Pedestrians; Radar; Humans; Algorithms; Gait/physiology; Pattern Recognition, Automated/methods; Machine Learning
14.
Sensors (Basel) ; 24(15)2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39123907

ABSTRACT

Skeleton-based action recognition, renowned for its computational efficiency and indifference to lighting variations, has become a focal point in the realm of motion analysis. However, most current methods extract only global skeleton features, overlooking the potential semantic relationships among various partial limb motions. For instance, the subtle differences between actions such as "brush teeth" and "brush hair" are mainly distinguished by specific elements. Although combining limb movements provides a more holistic representation of an action, relying solely on skeleton points proves inadequate for capturing these nuances. This motivates us to integrate fine-grained language descriptions into the learning process of skeleton features in order to capture more discriminative skeleton behavior representations. To this end, we introduce a new Linguistic-Driven Partial Semantic Relevance Learning framework (LPSR) in this work. While using state-of-the-art large language models (LLMs) to generate linguistic descriptions of local limb motions and further constrain the learning of local motions, we also aggregate global skeleton point representations and textual representations (which are generated by an LLM) to obtain a more generalized cross-modal behavioral representation. On this basis, we propose a cyclic attentional interaction module to model the implicit correlations between partial limb motions. Numerous ablation experiments demonstrate the effectiveness of the proposed method, which also obtains state-of-the-art results.


Subjects
Semantics; Humans; Linguistics; Movement/physiology; Pattern Recognition, Automated/methods; Algorithms; Learning/physiology
15.
Sensors (Basel) ; 24(15)2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39123986

ABSTRACT

Human action recognition (HAR) technology based on radar signals has garnered significant attention from both industry and academia due to its exceptional privacy-preserving capabilities, noncontact sensing characteristics, and insensitivity to lighting conditions. However, the scarcity of accurately labeled human radar data poses a significant challenge in meeting the demand for large-scale training datasets required by deep model-based HAR technology, thus substantially impeding technological advancements in this field. To address this issue, a semi-supervised learning algorithm, MF-Match, is proposed in this paper. This algorithm computes pseudo-labels for larger-scale unsupervised radar data, enabling the model to extract embedded human behavioral information and enhance the accuracy of HAR algorithms. Furthermore, the method incorporates contrastive learning principles to improve the quality of model-generated pseudo-labels and mitigate the impact of mislabeled pseudo-labels on recognition performance. Experimental results demonstrate that this method achieves action recognition accuracies of 86.69% and 91.48% on two widely used radar spectrum datasets, respectively, utilizing only 10% labeled data, thereby validating the effectiveness of the proposed approach.
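
Pseudo-labelling of unlabelled samples can be sketched in a few lines. The abstract does not describe how MF-Match selects pseudo-labels, so the confidence threshold, the function name, and the example probabilities below are assumptions, and the contrastive learning component is not shown.

```python
def pseudo_label(probabilities, threshold=0.9):
    """Return (sample_index, class_index) pairs for confident predictions.

    `probabilities` holds one list of class probabilities per unlabelled sample.
    A sample receives a pseudo-label only when its top probability reaches
    `threshold`; the remaining samples stay unlabelled.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


# Hypothetical model outputs for three unlabelled radar spectrograms.
outputs = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
print(pseudo_label(outputs))
```

Raising the threshold trades the number of pseudo-labelled samples for fewer mislabelled ones, which is the tension the abstract refers to when it mentions mitigating the impact of mislabelled pseudo-labels.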


Subjects
Algorithms; Humans; Radar; Supervised Machine Learning; Pattern Recognition, Automated/methods; Human Activities
16.
Sensors (Basel) ; 24(3)2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38339542

ABSTRACT

Japanese Sign Language (JSL) is vital for communication in Japan's deaf and hard-of-hearing community. However, the JSL alphabet comprises 46 patterns, a mixture of static and dynamic signs, and the dynamic ones have been excluded in most studies, probably because of this large number of patterns. Few researchers have worked on the dynamic JSL alphabet, and the accuracy they report is unsatisfactory. To overcome these challenges, we propose a dynamic JSL recognition system using effective feature extraction and feature selection approaches. The procedure consists of hand pose estimation, effective feature extraction, and machine learning techniques. We collected a video dataset capturing JSL gestures through standard RGB cameras and employed MediaPipe for hand pose estimation. Four types of features were proposed. The significance of these features is that the same feature generation method can be used regardless of the number of frames or whether the features are dynamic or static. We employed a random forest (RF) based feature selection approach to select the most promising features. Finally, we fed the reduced features into a kernel-based Support Vector Machine (SVM) classifier. Evaluations conducted on our newly created dynamic Japanese sign language alphabet dataset and the LSA64 dynamic dataset yielded recognition accuracies of 97.20% and 98.40%, respectively. This approach not only addresses the complexities of JSL but also holds the potential to bridge communication gaps, offering effective communication for the deaf and hard-of-hearing, and has broader implications for sign language recognition systems globally.


Subjects
Pattern Recognition, Automated; Sign Language; Humans; Japan; Pattern Recognition, Automated/methods; Hand; Algorithms; Gestures
17.
Sensors (Basel) ; 24(14)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39065968

ABSTRACT

Human action recognition based on optical and infrared video data is greatly affected by the environment, and feature extraction in traditional machine learning classification methods is complex; therefore, this paper proposes a method for human action recognition using Frequency Modulated Continuous Wave (FMCW) radar based on an asymmetric convolutional residual network. First, the radar echo data are analyzed and processed to extract the micro-Doppler time domain spectrograms of different actions. Second, a strategy combining asymmetric convolution and the Mish activation function is adopted in the residual block of the ResNet18 network to address the limitations of linear and nonlinear transformations in the residual block for micro-Doppler spectrum recognition. This approach aims to enhance the network's ability to learn features effectively. Finally, the Improved Convolutional Block Attention Module (ICBAM) is integrated into the residual block to enhance the model's attention and comprehension of input data. The experimental results demonstrate that the proposed method achieves a high accuracy of 98.28% in action recognition and classification within complex scenes, surpassing classic deep learning approaches. Moreover, this method significantly improves the recognition accuracy for actions with similar micro-Doppler features and demonstrates excellent anti-noise recognition performance.


Subjects
Neural Networks, Computer; Radar; Humans; Algorithms; Machine Learning; Human Activities/classification; Deep Learning; Pattern Recognition, Automated/methods
18.
Sensors (Basel) ; 24(14)2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39066043

ABSTRACT

Human activity recognition (HAR) is pivotal in advancing applications ranging from healthcare monitoring to interactive gaming. Traditional HAR systems, primarily relying on single data sources, face limitations in capturing the full spectrum of human activities. This study introduces a comprehensive approach to HAR by integrating two critical modalities: RGB imaging and advanced pose estimation features. Our methodology leverages the strengths of each modality to overcome the drawbacks of unimodal systems, providing a richer and more accurate representation of activities. We propose a two-stream network that processes skeletal and RGB data in parallel, enhanced by pose estimation techniques for refined feature extraction. The integration of these modalities is facilitated through advanced fusion algorithms, significantly improving recognition accuracy. Extensive experiments conducted on the UTD multimodal human action dataset (UTD MHAD) demonstrate that the proposed approach exceeds the performance of existing state-of-the-art algorithms, yielding improved outcomes. This study not only sets a new benchmark for HAR systems but also highlights the importance of feature engineering in capturing the complexity of human movements and the integration of optimal features. Our findings pave the way for more sophisticated, reliable, and applicable HAR systems in real-world scenarios.


Subjects
Algorithms; Human Activities; Humans; Image Processing, Computer-Assisted/methods; Movement/physiology; Posture/physiology; Pattern Recognition, Automated/methods
19.
Sensors (Basel) ; 24(8)2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38676024

ABSTRACT

In recent decades, technological advancements have transformed the industry, highlighting the efficiency of automation and safety. The integration of augmented reality (AR) and gesture recognition has emerged as an innovative approach to create interactive environments for industrial equipment. Gesture recognition enhances AR applications by allowing intuitive interactions. This study presents a web-based architecture for the integration of AR and gesture recognition, designed to interact with industrial equipment. Emphasizing hardware-agnostic compatibility, the proposed structure offers an intuitive interaction with equipment control systems through natural gestures. Experimental validation, conducted using Google Glass, demonstrated the practical viability and potential of this approach in industrial operations. The development focused on optimizing the system's software and implementing techniques such as normalization, clamping, conversion, and filtering to achieve accurate and reliable gesture recognition under different usage conditions. The proposed approach promotes safer and more efficient industrial operations, contributing to research in AR and gesture recognition. Future work will include improving the gesture recognition accuracy, exploring alternative gestures, and expanding the platform integration to improve the user experience.
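
The normalization, clamping, and filtering steps mentioned above can be sketched for a single hand coordinate. The abstract gives no implementation details, so the frame width of 640 pixels, the exponential moving-average filter, and all function names are assumptions; the "conversion" step is not specified in the abstract and is left out.

```python
def normalize(value, low, high):
    """Map `value` from the range [low, high] onto [0, 1]."""
    if high == low:
        raise ValueError("low and high must differ")
    return (value - low) / (high - low)


def clamp(value, low=0.0, high=1.0):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


def smooth(samples, alpha=0.5):
    """Exponential moving average: each output blends the new sample with the previous output."""
    smoothed = []
    previous = None
    for sample in samples:
        previous = sample if previous is None else alpha * sample + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def preprocess(raw_x, width=640, alpha=0.5):
    """Normalize pixel x-coordinates, clamp them to the frame, then filter jitter."""
    return smooth([clamp(normalize(x, 0, width)) for x in raw_x], alpha)


# Hypothetical x-coordinates in pixels; 960 lies outside a 640-pixel-wide frame.
print(preprocess([0, 320, 960]))
```

Clamping before filtering keeps a single out-of-frame detection from dragging the smoothed position outside the valid range.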


Subjects
Augmented Reality; Gestures; Humans; Industry; Software; Pattern Recognition, Automated/methods; User-Computer Interface
20.
Sensors (Basel) ; 24(8)2024 Apr 14.
Article in English | MEDLINE | ID: mdl-38676137

ABSTRACT

Human action recognition (HAR) is a growing area of machine learning with a wide range of applications. One challenging aspect of HAR is recognizing human actions while playing music, further complicated by the need to recognize the musical notes being played. This paper proposes a deep learning-based method for simultaneous HAR and musical note recognition in music performances. We conducted experiments on performances of the Morin khuur, a traditional Mongolian instrument. The proposed method consists of two stages. First, we created a new dataset of Morin khuur performances. We used motion capture systems and depth sensors to collect data that includes hand keypoints, instrument segmentation information, and detailed movement information. We then analyzed RGB images, depth images, and motion data to determine which type of data provides the most valuable features for recognizing actions and notes in music performances. The second stage utilizes a Spatial Temporal Attention Graph Convolutional Network (STA-GCN) to recognize musical notes as continuous gestures. The STA-GCN model is designed to learn the relationships between hand keypoints and instrument segmentation information, which are crucial for accurate recognition. Evaluation on our dataset demonstrates that our model outperforms the traditional ST-GCN model, achieving an accuracy of 81.4%.


Subjects
Deep Learning; Music; Humans; Neural Networks, Computer; Human Activities; Pattern Recognition, Automated/methods; Gestures; Algorithms; Movement/physiology