Results 1 - 20 of 104
1.
IEEE J Biomed Health Inform ; 28(5): 2687-2698, 2024 May.
Article in English | MEDLINE | ID: mdl-38442051

ABSTRACT

Self-supervised Human Activity Recognition (HAR) has gradually been gaining attention in the ubiquitous computing community. Its current focus lies primarily in overcoming the challenge of manually labeling the complicated and intricate sensor data from wearable devices, which are often hard to interpret. However, current self-supervised algorithms encounter three main challenges: performance variability caused by data augmentations in the contrastive learning paradigm, limitations imposed by traditional self-supervised models, and the computational load that current mainstream transformer encoders place on wearable devices. To comprehensively tackle these challenges, this paper proposes a powerful self-supervised approach for HAR from the novel perspective of a denoising autoencoder, the first of its kind to explore how to reconstruct masked sensor data on top of a commonly employed, well-designed, and computationally efficient fully convolutional network. Extensive experiments demonstrate that our proposed Masked Convolutional AutoEncoder (MaskCAE) outperforms current state-of-the-art algorithms in self-supervised, fully supervised, and semi-supervised settings without relying on any data augmentations, filling the gap of masked sensor data modeling in the HAR area. Visualization analyses show that MaskCAE effectively captures temporal semantics in time-series sensor data, indicating its great potential for modeling abstracted sensor data. An actual implementation is evaluated on an embedded platform.
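As a rough illustration of the masked-reconstruction idea described in this abstract, the sketch below masks random time steps of a sensor window and trains a small 1D convolutional autoencoder to reconstruct them; the layer sizes, mask ratio, and window shape are assumptions, not the actual MaskCAE configuration.

```python
import torch
import torch.nn as nn

class ConvAutoEncoder(nn.Module):
    """Small fully convolutional autoencoder for (batch, channels, time) windows."""
    def __init__(self, channels=3, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Conv1d(hidden, channels, kernel_size=5, padding=2)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def masked_reconstruction_loss(model, x, mask_ratio=0.5):
    # Zero out a random subset of time steps and reconstruct only those positions.
    # mask_ratio is an assumed hyperparameter, not the paper's value.
    mask = (torch.rand(x.shape[0], 1, x.shape[2]) < mask_ratio).float()
    recon = model(x * (1.0 - mask))
    return ((recon - x) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)

model = ConvAutoEncoder()
loss = masked_reconstruction_loss(model, torch.randn(8, 3, 128))  # 8 windows, 3 axes, 128 steps
loss.backward()
```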


Subjects
Algorithms, Human Activities, Humans, Human Activities/classification, Computer-Assisted Signal Processing, Wearable Electronic Devices, Supervised Machine Learning, Neural Networks (Computer)
2.
IEEE J Biomed Health Inform ; 28(5): 2733-2744, 2024 May.
Article in English | MEDLINE | ID: mdl-38483804

ABSTRACT

Human Activity Recognition (HAR) has recently attracted widespread attention, with the effective application of this technology helping people in areas such as healthcare, smart homes, and gait analysis. Deep learning methods have shown remarkable performance in HAR. A pivotal challenge is the trade-off between recognition accuracy and computational efficiency, especially in resource-constrained mobile devices. This challenge necessitates the development of models that enhance feature representation capabilities without imposing additional computational burdens. Addressing this, we introduce a novel HAR model leveraging deep learning, ingeniously designed to navigate the accuracy-efficiency trade-off. The model comprises two innovative modules: 1) Pyramid Multi-scale Convolutional Network (PMCN), which is designed with a symmetric structure and is capable of obtaining a rich receptive field at a finer level through its multiscale representation capability; 2) Cross-Attention Mechanism, which establishes interrelationships among sensor dimensions, temporal dimensions, and channel dimensions, and effectively enhances useful information while suppressing irrelevant data. The proposed model is rigorously evaluated across four diverse datasets: UCI, WISDM, PAMAP2, and OPPORTUNITY. Additional ablation and comparative studies are conducted to comprehensively assess the performance of the model. Experimental results demonstrate that the proposed model achieves superior activity recognition accuracy while maintaining low computational overhead.
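The following sketch illustrates the general pattern of multi-scale 1D convolution combined with an attention gate over the resulting channels; kernel sizes, channel counts, and the attention form are assumptions and do not reproduce the paper's PMCN or cross-attention modules.

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    """Parallel 1D convolutions at several kernel sizes plus a channel attention gate
    (illustrative layer sizes; not the paper's PMCN/cross-attention design)."""
    def __init__(self, in_ch=6, out_ch=32, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernels)
        self.attn = nn.Sequential(
            nn.Linear(out_ch * len(kernels), out_ch * len(kernels)), nn.Sigmoid())

    def forward(self, x):                        # x: (batch, channels, time)
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        weights = self.attn(feats.mean(dim=2))   # gate channels using globally pooled features
        return feats * weights.unsqueeze(-1)

out = MultiScaleBlock()(torch.randn(4, 6, 128))  # -> (4, 96, 128)
```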


Subjects
Deep Learning, Human Activities, Humans, Human Activities/classification, Computer-Assisted Signal Processing, Neural Networks (Computer), Algorithms, Factual Databases, Ambulatory Monitoring/methods, Ambulatory Monitoring/instrumentation
3.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5345-5361, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38376962

ABSTRACT

Federated human activity recognition (FHAR) has attracted much attention due to its great potential for privacy protection. Existing FHAR methods can collaboratively learn a global activity recognition model based on unimodal or multimodal data distributed across local clients. However, it remains questionable whether existing methods work well in a more common scenario where local data come from different modalities, e.g., some local clients may provide motion signals while others can only provide visual data. In this article, we study the new problem of cross-modal federated human activity recognition (CM-FHAR), which is conducive to promoting the large-scale use of HAR models on more local devices. CM-FHAR has at least three dedicated challenges: 1) distributed common cross-modal feature learning, 2) modality-dependent discriminative feature learning, and 3) the modality imbalance issue. To address these challenges, we propose a modality-collaborative activity recognition network (MCARN), which comprehensively learns a global activity classifier shared across all clients together with multiple modality-dependent private activity classifiers. To produce modality-agnostic and modality-specific features, we learn an altruistic encoder and an egocentric encoder under the constraint of a separation loss and an adversarial modality discriminator collaboratively learned in the hypersphere. To address the modality imbalance issue, we propose an angular margin adjustment scheme that improves the modality discriminator on modality-imbalanced data by enhancing the intra-modality compactness of the dominant modality and increasing the inter-modality discrepancy. Moreover, we propose a relation-aware global-local calibration mechanism to constrain class-level pairwise relationships for the parameters of the private classifiers. Finally, through decentralized optimization with alternating steps of adversarial local updating and modality-aware global aggregation, the proposed MCARN obtains state-of-the-art performance on both modality-balanced and modality-imbalanced data.
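A highly simplified sketch of the federated pattern described here: clients average only a shared global classifier while keeping modality-specific parameters local. This is plain federated averaging on the shared part and omits MCARN's adversarial discriminator, margin scheme, and calibration mechanism; the "classifier." parameter naming is an assumption.

```python
import copy
import torch

def aggregate_shared(client_models, shared_prefix="classifier."):
    """Average only the parameters whose names start with shared_prefix (assumed naming)."""
    global_state = copy.deepcopy(client_models[0].state_dict())
    for name in global_state:
        if name.startswith(shared_prefix):
            stacked = torch.stack([m.state_dict()[name].float() for m in client_models])
            global_state[name] = stacked.mean(dim=0)
    return global_state

def broadcast_shared(client_models, global_state, shared_prefix="classifier."):
    """Copy the averaged shared parameters back; private (modality-specific) parts stay local."""
    for m in client_models:
        local = m.state_dict()
        for name, tensor in global_state.items():
            if name.startswith(shared_prefix):
                local[name] = tensor.clone()
        m.load_state_dict(local)

def make_client():
    # Toy client: a private encoder plus the shared classifier head.
    return torch.nn.ModuleDict({"encoder": torch.nn.Linear(16, 8),
                                "classifier": torch.nn.Linear(8, 4)})

clients = [make_client() for _ in range(3)]
broadcast_shared(clients, aggregate_shared(clients))
```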


Subjects
Algorithms, Human Activities, Automated Pattern Recognition, Humans, Human Activities/classification, Automated Pattern Recognition/methods, Machine Learning
4.
IEEE Trans Image Process ; 30: 6240-6254, 2021.
Article in English | MEDLINE | ID: mdl-34224352

ABSTRACT

The task of human interaction understanding involves both recognizing the action of each individual in the scene and decoding the interaction relationships among people, which is useful for a range of vision applications such as camera surveillance, video-based sports analysis, and event retrieval. This paper divides the task into two problems, grouping people into clusters and assigning labels to each of them, and presents an approach to solving both in a joint manner. Our method does not assume the number of groups is known beforehand, as this would substantially restrict its applicability. Observing that the two problems are highly correlated, the key idea is to model the pairwise interaction relations among people via a complete graph and its associated energy function, so that the labeling and grouping problems translate into minimization of the energy function. We implement this joint framework by fusing both deep features and rich contextual cues, and learn the fusion parameters from data. An alternating search algorithm is developed to efficiently solve the associated inference problem. By combining the grouping and labeling results obtained with our method, we achieve semantic-level understanding of human interactions. Extensive experiments qualitatively and quantitatively evaluate the effectiveness of our approach, which outperforms state-of-the-art methods on several important benchmarks. An ablation study is also performed to verify the effectiveness of different modules within our approach.
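A toy sketch of the joint formulation: define an energy over group assignments and action labels, then lower it with alternating greedy updates. The unary and pairwise terms below are random stand-ins for the learned deep features and contextual cues, not the paper's energy function.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 6, 4                        # people, action labels
unary = rng.random((N, L))         # score of person i taking label l
affinity = rng.random((N, N))      # how strongly persons i and j appear to interact
affinity = (affinity + affinity.T) / 2 - 0.5   # centred so weak pairs prefer separate groups

def energy(g, y):
    pair = sum(affinity[i, j] * (1.0 if g[i] == g[j] else -1.0)
               for i in range(N) for j in range(i + 1, N))
    return -unary[np.arange(N), y].sum() - pair

g = np.arange(N)                   # start with every person in their own group
y = unary.argmax(axis=1)
for _ in range(10):                # alternating (coordinate-descent style) search
    for i in range(N):             # update grouping with labels fixed
        g[i] = min(set(g), key=lambda c: energy(np.where(np.arange(N) == i, c, g), y))
    for i in range(N):             # update labels with grouping fixed
        y[i] = min(range(L), key=lambda l: energy(g, np.where(np.arange(N) == i, l, y)))
print(g, y)
```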


Subjects
Human Activities/classification, Computer-Assisted Image Processing/methods, Machine Learning, Algorithms, Humans, Video Recording
5.
IEEE Trans Image Process ; 30: 6583-6593, 2021.
Article in English | MEDLINE | ID: mdl-34270424

ABSTRACT

Human-Object Interaction (HOI) detection is an important task for understanding how humans interact with objects. Most existing works treat this task as an exhaustive triplet 〈human, verb, object〉 classification problem. In this paper, we decompose it and propose a novel two-stage graph model that learns the knowledge of interactiveness and interaction in one network, namely the Interactiveness Proposal Graph Network (IPGN). In the first stage, we design a fully connected graph for learning interactiveness, which distinguishes whether a human-object pair is interactive or not. Concretely, it generates interactiveness features that encode high-level semantic interactiveness knowledge for each pair. Class-agnostic interactiveness is a more general and simpler objective, which can be used to provide reasonable proposals for the graph construction in the second stage. In the second stage, a sparsely connected graph is constructed with all interactive pairs selected by the first stage. Specifically, we use the interactiveness knowledge to guide the message passing; in contrast to feature similarity, it explicitly represents the connections between the nodes. Benefiting from valid graph reasoning, the node features are well encoded for interaction learning. Experiments show that the proposed method achieves state-of-the-art performance on both the V-COCO and HICO-DET datasets.
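A minimal sketch of the two-stage idea: keep only human-object pairs whose interactiveness score passes a threshold, then pass messages over the resulting sparse graph. The scores, features, and the single averaging step are illustrative assumptions, not the IPGN architecture.

```python
import torch

def sparse_message_passing(node_feats, pair_scores, threshold=0.5):
    # node_feats: (N, D); pair_scores: (N, N) interactiveness scores in [0, 1]
    adj = (pair_scores > threshold).float()
    adj = adj * pair_scores                      # weight surviving edges by their score
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
    messages = adj @ node_feats / deg            # average neighbor features
    return node_feats + messages                 # residual update

feats = torch.randn(5, 16)
scores = torch.rand(5, 5)
updated = sparse_message_passing(feats, scores)
```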


Subjects
Computer-Assisted Image Processing/methods, Neural Networks (Computer), Algorithms, Animals, Factual Databases, Human Activities/classification, Humans, Semantics
6.
IEEE Trans Image Process ; 30: 3691-3704, 2021.
Article in English | MEDLINE | ID: mdl-33705316

ABSTRACT

This article presents a novel keypoints-based attention mechanism for visual recognition in still images. Deep Convolutional Neural Networks (CNNs) for recognizing images with distinctive classes have shown great success, but their performance in discriminating fine-grained changes is not at the same level. We address this by proposing an end-to-end CNN model that learns meaningful features linking fine-grained changes using our novel attention mechanism. It captures the spatial structures in images by identifying semantic regions (SRs) and their spatial distributions, which proves to be key to modeling subtle changes in images. We automatically identify these SRs by grouping the detected keypoints in a given image. The "usefulness" of these SRs for image recognition is measured by an attention mechanism focusing on the parts of the image most relevant to a given task. The framework applies to both traditional and fine-grained image recognition tasks and does not require manually annotated regions (e.g., bounding boxes of body parts, objects, etc.) for learning and prediction. Moreover, the proposed keypoints-driven attention mechanism can be easily integrated into existing CNN models. The framework is evaluated on six diverse benchmark datasets, where the model outperforms state-of-the-art approaches by considerable margins: Distracted Driver V1 (Acc: 3.39%), Distracted Driver V2 (Acc: 6.58%), Stanford-40 Actions (mAP: 2.15%), People Playing Musical Instruments (mAP: 16.05%), Food-101 (Acc: 6.30%), and Caltech-256 (Acc: 2.59%).
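A rough sketch of keypoint-driven region attention: cluster detected keypoints into regions, pool a feature per region, and weight the regions with a softmax score. The features, the stand-in scorer, and the cluster count are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

def region_attention_pool(keypoints, point_feats, n_regions=4):
    # keypoints: (K, 2) image coordinates; point_feats: (K, D) features per keypoint
    labels = KMeans(n_clusters=n_regions, n_init=10).fit_predict(keypoints)
    regions = np.stack([point_feats[labels == r].mean(axis=0) for r in range(n_regions)])
    scores = regions.sum(axis=1)                       # stand-in attention scorer
    weights = np.exp(scores) / np.exp(scores).sum()    # softmax over regions
    return (weights[:, None] * regions).sum(axis=0)    # attended image descriptor

desc = region_attention_pool(np.random.rand(30, 2), np.random.rand(30, 16))
```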


Subjects
Deep Learning, Human Activities/classification, Computer-Assisted Image Processing/methods, Female, Humans, Male, Semantics
7.
IEEE Trans Image Process ; 30: 2562-2574, 2021.
Article in English | MEDLINE | ID: mdl-33232232

ABSTRACT

Human motion prediction, which aims at predicting future human skeletons given past ones, is a typical sequence-to-sequence problem. Extensive efforts have therefore been devoted to exploring different RNN-based encoder-decoder architectures. However, by generating target poses conditioned on previously generated ones, these models are prone to issues such as error accumulation. In this paper, we argue that this issue is mainly caused by the autoregressive decoding manner. Hence, a novel Non-AuToregressive model (NAT) is proposed with a fully non-autoregressive decoding scheme, as well as a context encoder and a positional encoding module. More specifically, the context encoder embeds the given poses from temporal and spatial perspectives. The frame decoder predicts each future pose independently. The positional encoding module injects a positional signal into the model to indicate the temporal order. Besides, a multitask training paradigm is presented for both low-level human skeleton prediction and high-level human action recognition, resulting in considerable improvement on the prediction task. Our approach is evaluated on the Human3.6M and CMU-Mocap benchmarks and outperforms state-of-the-art autoregressive methods.
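The sketch below shows the non-autoregressive decoding pattern: encode the observed poses once, then predict every future frame in parallel from a positional signal rather than feeding predictions back in. Dimensions and the MLP decoder are assumptions, not the paper's NAT model.

```python
import math
import torch
import torch.nn as nn

def positional_encoding(length, dim):
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)
    div = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32) * (-math.log(10000.0) / dim))
    pe = torch.zeros(length, dim)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

class NonAutoregressivePredictor(nn.Module):
    def __init__(self, pose_dim=66, hidden=128, horizon=10):
        super().__init__()
        self.context = nn.GRU(pose_dim, hidden, batch_first=True)   # context encoder
        self.decoder = nn.Sequential(nn.Linear(hidden * 2, hidden), nn.ReLU(),
                                     nn.Linear(hidden, pose_dim))   # per-frame decoder
        self.horizon, self.hidden = horizon, hidden

    def forward(self, past):                      # past: (batch, T_obs, pose_dim)
        _, h = self.context(past)                 # h: (1, batch, hidden)
        ctx = h[-1].unsqueeze(1).expand(-1, self.horizon, -1)
        pe = positional_encoding(self.horizon, self.hidden).unsqueeze(0).expand(ctx.shape[0], -1, -1)
        return self.decoder(torch.cat([ctx, pe], dim=-1))   # all future frames at once

future = NonAutoregressivePredictor()(torch.randn(2, 25, 66))  # -> (2, 10, 66)
```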


Subjects
Computer-Assisted Image Processing/methods, Machine Learning, Movement/physiology, Human Activities/classification, Humans, Intention, Statistical Models, Video Recording
8.
Nat Commun ; 11(1): 1551, 2020 03 25.
Article in English | MEDLINE | ID: mdl-32214095

ABSTRACT

Recognizing human physical activities using wireless sensor networks has attracted significant research interest due to its broad range of applications, such as healthcare, rehabilitation, athletics, and senior monitoring. There are critical challenges inherent in designing a sensor-based activity recognition system operating in and around a lossy medium such as the human body to gain a trade-off among power consumption, cost, computational complexity, and accuracy. We introduce an innovative wireless system based on magnetic induction for human activity recognition to tackle these challenges and constraints. The magnetic induction system is integrated with machine learning techniques to detect a wide range of human motions. This approach is successfully evaluated using synthesized datasets, laboratory measurements, and deep recurrent neural networks.
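A generic sketch of a deep recurrent classifier over multi-channel signal windows, in the spirit of the recurrent networks mentioned above; the channel count, layer sizes, and class count are assumptions unrelated to the actual magnetic induction system.

```python
import torch
import torch.nn as nn

class RecurrentActivityClassifier(nn.Module):
    def __init__(self, channels=4, hidden=64, n_classes=8):
        super().__init__()
        self.rnn = nn.LSTM(channels, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):             # x: (batch, time, channels)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])  # classify from the last time step

logits = RecurrentActivityClassifier()(torch.randn(16, 200, 4))
```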


Subjects
Deep Learning, Human Activities/classification, Magnetic Phenomena, Physiologic Monitoring/methods, Computer-Assisted Signal Processing, Humans, Motion (Physics), Wearable Electronic Devices, Wireless Technology
9.
Sensors (Basel) ; 20(5)2020 Mar 08.
Article in English | MEDLINE | ID: mdl-32182668

ABSTRACT

Over the past few years, the Internet of Things (IoT) has developed greatly, with smart home devices gradually entering people's lives. To maximize the impact of such deployments, home-based activity recognition is required to first recognize behaviors within smart home environments and then use this information to provide better health and social care services. Activity recognition identifies people's activities from information about their interaction with the environment collected by sensors embedded within the home. In this paper, binary data collected by anonymous binary sensors such as pressure sensors, contact sensors, and passive infrared sensors are used to recognize activities. A radial basis function neural network (RBFNN) with a localized stochastic-sensitive autoencoder (LiSSA) is proposed for home-based activity recognition. An autoencoder (AE) is introduced to extract useful features from the binary sensor data by converting binary inputs into continuous ones, exposing more of the hidden information. The generalization capability of the proposed method is enhanced by minimizing both the training error and a stochastic sensitivity measure, improving the classifier's ability to tolerate uncertainties in the sensor data. Four binary home-based activity recognition datasets, OrdonezA, OrdonezB, Ulster, and activities of daily living data from van Kasteren (vanKasterenADL), are used to evaluate the effectiveness of the proposed method. Compared with well-known benchmark approaches including the support vector machine (SVM), multilayer perceptron neural network (MLPNN), random forest, and an RBFNN-based method, the proposed method yielded the best performance, with 98.35%, 86.26%, 96.31%, and 92.31% accuracy on the four datasets, respectively.
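A sketch of the idea of penalizing stochastic sensitivity: add to the usual training loss a term measuring how much the model's output changes under small random input perturbations. The model, noise scale, and weighting below are assumptions, not the LiSSA/RBFNN formulation.

```python
import torch
import torch.nn as nn

def sensitivity_regularised_loss(model, x, y, noise_std=0.05, weight=0.1):
    # Training error plus a perturbation-based sensitivity penalty (assumed form).
    criterion = nn.CrossEntropyLoss()
    clean_out = model(x)
    perturbed_out = model(x + noise_std * torch.randn_like(x))
    sensitivity = ((perturbed_out - clean_out) ** 2).mean()
    return criterion(clean_out, y) + weight * sensitivity

model = nn.Sequential(nn.Linear(12, 32), nn.ReLU(), nn.Linear(32, 5))
loss = sensitivity_regularised_loss(model, torch.rand(8, 12), torch.randint(0, 5, (8,)))
loss.backward()
```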


Subjects
Human Activities/classification, Ambulatory Monitoring/methods, Nerve Net, Adult, Home Care Services, Humans, Internet of Things, Male, Stochastic Processes, Support Vector Machine
10.
IEEE J Biomed Health Inform ; 24(1): 131-143, 2020 01.
Article in English | MEDLINE | ID: mdl-30716055

ABSTRACT

Detecting irregularities in the daily behaviors of the elderly is an important issue in home care. Many mechanisms have been developed to detect the health condition of the elderly based on explicit irregularities in several biomedical parameters or in specific behaviors. However, few research works focus on detecting implicit irregularities involving combinations of diverse behaviors, which can reflect the cognitive and physical wellbeing of elders but cannot be directly identified from sensor data. This paper proposes an Implicit IRregularity Detection (IIRD) mechanism that aims to detect such implicit irregularities by developing an unsupervised learning algorithm based on daily behaviors. The proposed IIRD mechanism identifies the distance and similarity between daily behaviors, which are important features for distinguishing regular from irregular daily behaviors and for detecting implicit irregularities in the health condition of the elderly. Performance results show that the proposed IIRD outperforms existing unsupervised machine-learning mechanisms in terms of detection accuracy and irregularity recall.
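A toy sketch of unsupervised irregularity detection on daily behavior vectors: cluster the days and flag those unusually far from their nearest cluster center. The feature construction, cluster count, and percentile threshold are assumptions, not the IIRD mechanism.

```python
import numpy as np
from sklearn.cluster import KMeans

days = np.random.rand(60, 10)                 # 60 days x 10 daily-behavior features
km = KMeans(n_clusters=3, n_init=10).fit(days)
dist = np.linalg.norm(days - km.cluster_centers_[km.labels_], axis=1)
irregular = dist > np.percentile(dist, 95)    # flag the top 5% most distant days
print(np.where(irregular)[0])
```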


Subjects
Home Care Services, Human Activities/classification, Unsupervised Machine Learning, Aged, Algorithms, Factual Databases, Humans, Physiologic Monitoring
11.
IEEE Trans Pattern Anal Mach Intell ; 42(2): 502-508, 2020 02.
Article in English | MEDLINE | ID: mdl-30802849

ABSTRACT

We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds. Modeling the spatial-audio-temporal dynamics even for actions occurring in 3 second videos poses many challenges: meaningful events do not include only people, but also objects, animals, and natural phenomena; visual and auditory events can be symmetrical in time ("opening" is "closing" in reverse), and either transient or sustained. We describe the annotation process of our dataset (each video is tagged with one action or activity label among 339 different classes), analyze its scale and diversity in comparison to other large-scale video datasets for action recognition, and report results of several baseline models addressing separately, and jointly, three modalities: spatial, temporal and auditory. The Moments in Time dataset, designed to have a large coverage and diversity of events in both visual and auditory modalities, can serve as a new challenge to develop models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis.


Subjects
Factual Databases, Video Recording, Animals, Human Activities/classification, Humans, Computer-Assisted Image Processing, Automated Pattern Recognition
12.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2684-2701, 2020 10.
Article in English | MEDLINE | ID: mdl-31095476

ABSTRACT

Research on depth-based human activity analysis has achieved outstanding performance and demonstrated the effectiveness of 3D representations for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, of a realistic number of distinct class categories, of diversity in camera views, of varied environmental conditions, and of variety in human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, collected from 106 distinct subjects and containing more than 114 thousand video samples and 8 million frames. The dataset contains 120 different action classes covering daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset and show the advantage of applying deep learning methods to 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and propose a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework for this task, which yields promising results on the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.


Subjects
Deep Learning, Human Activities/classification, Computer-Assisted Image Processing/methods, Automated Pattern Recognition/methods, Algorithms, Benchmarking, Humans, Semantics, Video Recording
13.
IEEE J Biomed Health Inform ; 24(1): 27-38, 2020 01.
Article in English | MEDLINE | ID: mdl-31107668

ABSTRACT

PURPOSE: To evaluate and enhance the generalization performance of machine learning physical activity intensity prediction models developed with raw acceleration data on populations monitored by different activity monitors. METHOD: Five datasets from four studies, each containing only hip- or wrist-based raw acceleration data (two hip- and three wrist-based), were extracted. The five datasets were then used to develop and validate artificial neural networks (ANN) in three setups to classify activity intensity categories (sedentary behavior, light, and moderate-to-vigorous). To examine generalizability, the ANN models were developed using within-dataset (leave-one-subject-out) cross validation and then cross-tested on the other datasets with different accelerometers. To enhance the models' generalizability, a combination of four of the five datasets was used for training and the fifth dataset for validation. Finally, all five datasets were merged to develop a single model generalizable across the datasets (50% of the subjects from each dataset for training, the remainder for validation). RESULTS: The models showed high performance in within-dataset cross validation (accuracy 71.9-95.4%, Kappa K = 0.63-0.94). The performance of the within-dataset validated models decreased when applied to datasets with different accelerometers (41.2-59.9%, K = 0.21-0.48). The models trained on merged datasets consisting of hip and wrist data predicted the left-out dataset with acceptable performance (65.9-83.7%, K = 0.61-0.79). The model trained with all five datasets performed with acceptable accuracy across the datasets (80.4-90.7%, K = 0.68-0.89). CONCLUSIONS: Integrating heterogeneous datasets in the training set appears to be a viable approach for enhancing the generalization performance of the models, whereas within-dataset validation is not sufficient to understand the models' performance on other populations monitored with different accelerometers.
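A sketch of the leave-one-subject-out evaluation described in the METHOD section, using a generic classifier; the feature matrix, subject ids, and model choice are placeholders rather than the study's ANN setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import accuracy_score

X = np.random.rand(300, 20)                   # windowed acceleration features (placeholder)
y = np.random.randint(0, 3, 300)              # sedentary / light / moderate-to-vigorous
subjects = np.repeat(np.arange(10), 30)       # subject id per window

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
print(f"LOSO accuracy: {np.mean(scores):.3f}")
```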


Subjects
Accelerometry/methods, Exercise/physiology, Machine Learning, Automated Pattern Recognition/methods, Adult, Factual Databases, Human Activities/classification, Humans, Statistical Models, Physiologic Monitoring, Neural Networks (Computer)
14.
IEEE J Biomed Health Inform ; 24(1): 292-299, 2020 01.
Article in English | MEDLINE | ID: mdl-30969934

ABSTRACT

Human activity recognition has been widely used in healthcare applications such as elderly monitoring, exercise supervision, and rehabilitation monitoring. Compared with other approaches, sensor-based wearable human activity recognition is less affected by environmental noise and is therefore promising for achieving higher recognition accuracy. However, one major issue with existing wearable human activity recognition methods is that, although the average recognition accuracy is acceptable, the accuracy for some activities (e.g., ascending and descending stairs) is low, mainly due to relatively scarce training data and complex behavior patterns for these activities. Another issue is that recognition accuracy is low when training data from the test subject are limited, which is a common case in practice. In addition, the use of neural networks leads to large computational complexity and thus high power consumption. To address these issues, we propose a new human activity recognition method with a two-stage end-to-end convolutional neural network and a data augmentation method. Compared with state-of-the-art methods (including neural network based methods and others), the proposed methods achieve significantly improved recognition accuracy and reduced computational complexity.
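A minimal sketch of sensor-window augmentation of the kind alluded to here (random jitter and per-channel scaling); the paper's actual augmentation is not specified in the abstract, so these transforms are illustrative assumptions.

```python
import numpy as np

def augment_window(window, jitter_std=0.02, scale_range=(0.9, 1.1), rng=None):
    # window: (time, channels) accelerometer segment; scale each channel, then add noise
    rng = np.random.default_rng() if rng is None else rng
    scale = rng.uniform(*scale_range, size=(1, window.shape[1]))
    return window * scale + rng.normal(0.0, jitter_std, size=window.shape)

augmented = [augment_window(np.random.randn(128, 3)) for _ in range(4)]
```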


Subjects
Human Activities/classification, Movement/physiology, Neural Networks (Computer), Accelerometry/methods, Adult, Female, Humans, Male, Middle Aged, Ambulatory Monitoring/methods, Automated Pattern Recognition, Wearable Electronic Devices, Young Adult
15.
IEEE J Biomed Health Inform ; 24(4): 1206-1214, 2020 04.
Article in English | MEDLINE | ID: mdl-31443058

ABSTRACT

Ecological Momentary Assessment (EMA) is an in-the-moment data collection method which avoids retrospective biases and maximizes ecological validity. A challenge in designing EMA systems is finding a time to ask EMA questions that increases participant engagement and improves the quality of data collection. In this work, we introduce SEP-EMA, a machine learning-based method for providing transition-based context-aware EMA prompt timings. We compare our proposed technique with traditional time-based prompting for 19 individuals living in smart homes. Results reveal that SEP-EMA increased participant response rate by 7.19% compared to time-based prompting. Our findings suggest that prompting during activity transitions makes the EMA process more usable and effective by increasing EMA response rates and mitigating loss of data due to low response rates.
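A toy sketch of transition-based prompting: issue an EMA prompt whenever the predicted activity label changes, subject to a minimum gap between prompts. The label stream and the gap of 30 steps are illustrative assumptions, not the SEP-EMA policy.

```python
def transition_prompt_times(predicted_labels, min_gap=30):
    # Return the time indices at which a prompt would be issued.
    prompts, last = [], -min_gap
    for t in range(1, len(predicted_labels)):
        if predicted_labels[t] != predicted_labels[t - 1] and t - last >= min_gap:
            prompts.append(t)
            last = t
    return prompts

print(transition_prompt_times(["sleep"] * 50 + ["cook"] * 40 + ["eat"] * 40))  # -> [50, 90]
```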


Subjects
Algorithms, Ecological Momentary Assessment, Human Activities/classification, Aged, Female, Humans, Male, Middle Aged, Automated Pattern Recognition, Retrospective Studies, Time Factors
16.
IEEE Trans Pattern Anal Mach Intell ; 42(1): 126-139, 2020 01.
Article in English | MEDLINE | ID: mdl-30296212

ABSTRACT

With the popularity of mobile sensor technology, smart wearable devices open an unprecedented opportunity to solve the challenging human activity recognition (HAR) problem by learning expressive representations from multi-dimensional daily sensor signals. This inspires us to develop a new algorithm applicable to both camera-based and wearable sensor-based HAR systems. Although competitive classification accuracy has been reported, existing methods often struggle to distinguish visually similar activities composed of the same activity patterns in different temporal orders. In this paper, we propose a novel probabilistic algorithm to compactly encode the temporal order of activity patterns for HAR. Specifically, the algorithm learns an optimal set of latent patterns such that their temporal structure is what matters in recognizing different human activities. Then, a novel probabilistic First-Take-All (pFTA) approach is introduced to generate compact features from the orders of these latent patterns to encode an entire sequence, so that the temporal structural similarity between different sequences can be efficiently measured by the Hamming distance between their compact features. Experiments on three public HAR datasets show the proposed pFTA approach achieves competitive performance in terms of both accuracy and efficiency.
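A sketch of encoding the temporal order of latent patterns as bits and comparing sequences by Hamming distance, in the spirit of a First-Take-All code; detecting each pattern by its correlation peak time is an illustrative stand-in for the learned latent patterns.

```python
import numpy as np
from itertools import combinations

def order_code(signal, patterns):
    # Time at which each pattern matches best (peak of a sliding correlation).
    times = [np.argmax(np.correlate(signal, p, mode="valid")) for p in patterns]
    # One bit per pattern pair: does pattern i occur before pattern j?
    return np.array([int(times[i] < times[j])
                     for i, j in combinations(range(len(patterns)), 2)])

def hamming(a, b):
    return int(np.sum(a != b))

patterns = [np.sin(np.linspace(0, 3, 20)), np.cos(np.linspace(0, 3, 20)), np.ones(20)]
c1 = order_code(np.random.randn(200), patterns)
c2 = order_code(np.random.randn(200), patterns)
print(hamming(c1, c2))
```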


Subjects
Human Activities/classification, Automated Pattern Recognition/methods, Algorithms, Factual Databases, Humans, Computer-Assisted Image Processing, Statistical Models, Video Recording, Wearable Electronic Devices
17.
IEEE Trans Pattern Anal Mach Intell ; 42(3): 622-635, 2020 03.
Article in English | MEDLINE | ID: mdl-30489262

ABSTRACT

A first-person video captures what the camera wearer (the actor) experiences through physical interactions with the surroundings. In this paper, we focus on the problem of Force from Motion: estimating the active force and torque exerted by the actor to drive her/his activity from a first-person video. We use two physical cues inherent in first-person video. (1) Ego-motion: the camera motion is generated by a resultant of force interactions, which allows us to understand the effect of the active force using Newtonian mechanics. (2) Visual semantics: the first-person visual scene is arranged to afford the actor's activity, and is thus indicative of the activity's physical context. We estimate the active force and torque using a dynamical system that describes the transition (dynamics) of the actor's physical state (position, orientation, and linear/angular momentum), where the latent physical state is indirectly observed through the first-person video. We approximate the physical state with the 3D camera trajectory, reconstructed up to scale and orientation. The absolute scale factor and gravitational field are learned from the ego-motion and visual semantics of the first-person video. Inspired by optimal control theory, we solve the dynamical system by minimizing reprojection error. Our method shows quantitatively equivalent reconstruction compared to IMU measurements in terms of gravity and scale recovery, and outperforms methods based on 2D optical flow on an active action recognition task. We apply our method to first-person videos of mountain biking, urban bike racing, skiing, speedflying with a parachute, and wingsuit flying, where inertial measurements are not accessible.
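A worked toy example of the Newtonian step: given a (scaled) 3D camera trajectory sampled over time, estimate acceleration by finite differences and recover the net active force as F = m(a - g). The mass, frame rate, and trajectory are made-up numbers for illustration, not values from the paper.

```python
import numpy as np

dt, mass = 1.0 / 30.0, 75.0                       # 30 Hz video, 75 kg actor (assumed)
g = np.array([0.0, 0.0, -9.81])                   # gravity (m/s^2)

t = np.arange(0, 2, dt)
traj = np.stack([t, 0.5 * t ** 2, np.zeros_like(t)], axis=1)  # toy 3D trajectory in meters

vel = np.gradient(traj, dt, axis=0)               # finite-difference velocity
acc = np.gradient(vel, dt, axis=0)                # finite-difference acceleration
force = mass * (acc - g)                          # net active force per frame (N)
print(force[15])                                  # approximately [0, 75, 735.75] N
```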


Subjects
Human Activities/classification, Computer-Assisted Image Processing/methods, Movement/physiology, Video Recording/methods, Acceleration, Humans, Sports
18.
IEEE Trans Neural Netw Learn Syst ; 31(5): 1747-1756, 2020 05.
Article in English | MEDLINE | ID: mdl-31329134

ABSTRACT

Recent years have witnessed the success of deep learning methods in human activity recognition (HAR). The longstanding shortage of labeled activity data inherently calls for a plethora of semisupervised learning methods, and one of the most challenging and common issues with semisupervised learning is the imbalanced distribution of labeled data over classes. Although the problem has long existed in broad real-world HAR applications, it is rarely explored in the literature. In this paper, we propose a semisupervised deep model for imbalanced activity recognition from multimodal wearable sensory data. We aim to address not only the challenges of multimodal sensor data (e.g., interperson variability and interclass similarity) but also the limited labeled data and class-imbalance issues simultaneously. In particular, we propose a pattern-balanced semisupervised framework to extract and preserve diverse latent patterns of activities. Furthermore, we exploit the independence of multi-modalities of sensory data and attentively identify salient regions that are indicative of human activities from inputs by our recurrent convolutional attention networks. Our experimental results demonstrate that the proposed model achieves a competitive performance compared to a multitude of state-of-the-art methods, both semisupervised and supervised ones, with 10% labeled training data. The results also show the robustness of our method over imbalanced, small training data sets.


Subjects
Human Activities/classification, Neural Networks (Computer), Automated Pattern Recognition/classification, Automated Pattern Recognition/methods, Supervised Machine Learning/classification, Humans
19.
Int J Behav Nutr Phys Act ; 16(1): 106, 2019 11 14.
Article in English | MEDLINE | ID: mdl-31727080

ABSTRACT

BACKGROUND: Globally, the International Classification of Activities for Time-Use Statistics (ICATUS) is one of the most widely used time-use classifications to identify time spent in various activities. Comprehensive 24-h activities that can be extracted from ICATUS provide possible implications for the use of time-use data in relation to activity-health associations; however, these activities are not classified in a way that makes such analysis feasible. This study, therefore, aimed to develop criteria for classifying ICATUS activities into sleep, sedentary behaviour (SB), light physical activity (LPA), and moderate-to-vigorous physical activity (MVPA), based on expert assessment. METHOD: We classified activities from the Trial ICATUS 2005 and final ICATUS 2016. One author assigned METs and codes for wakefulness status and posture, to all subclass activities in the Trial ICATUS 2005. Once coded, one author matched the most detailed level of activities from the ICATUS 2016 with the corresponding activities in the Trial ICATUS 2005, where applicable. The assessment and harmonisation of each ICATUS activity were reviewed independently and anonymously by four experts, as part of a Delphi process. Given a large number of ICATUS activities, four separate Delphi panels were formed for this purpose. A series of Delphi survey rounds were repeated until a consensus among all experts was reached. RESULTS: Consensus about harmonisation and classification of ICATUS activities was reached by the third round of the Delphi survey in all four panels. A total of 542 activities were classified into sleep, SB, LPA, and MVPA categories. Of these, 390 activities were from the Trial ICATUS 2005 and 152 activities were from the final ICATUS 2016. The majority of ICATUS 2016 activities were harmonised into the ICATUS activity groups (n = 143). CONCLUSIONS: Based on expert consensus, we developed a classification system that enables ICATUS-based time-use data to be classified into sleep, SB, LPA, and MVPA categories. Adoption and consistent use of this classification system will facilitate standardisation of time-use data processing for the purpose of sleep, SB and physical activity research, and improve between-study comparability. Future studies should test the applicability of the classification system by applying it to empirical data.
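A simplified sketch of the kind of rule such a classification enables: map an activity's assigned MET value, together with wakefulness and posture codes, to sleep, SB, LPA, or MVPA. The conventional 1.5 and 3.0 MET cut-points are used here for illustration; the study's actual activity-by-activity assignments were made by expert consensus.

```python
def classify_activity(mets, awake=True, sitting_or_lying=False):
    # Conventional cut-points used for illustration only.
    if not awake:
        return "sleep"
    if mets <= 1.5 and sitting_or_lying:
        return "SB"     # sedentary behaviour
    if mets < 3.0:
        return "LPA"    # light physical activity
    return "MVPA"       # moderate-to-vigorous physical activity

print(classify_activity(1.3, awake=True, sitting_or_lying=True))   # SB
print(classify_activity(4.0))                                      # MVPA
```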


Subjects
Exercise, Human Activities/classification, Sedentary Behavior, Sleep/physiology, Surveys and Questionnaires/standards, Humans
20.
Environ Monit Assess ; 191(Suppl 1): 336, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31222398

ABSTRACT

Soil concentrations of 12 heavy metals that have been linked to various anthropogenic activities were measured in samples collected from the uppermost horizon in approximately 1000 wetlands across the conterminous US as part of the 2011 National Wetland Condition Assessment (NWCA). The heavy metals were silver (Ag), cadmium (Cd), cobalt (Co), chromium (Cr), copper (Cu), nickel (Ni), lead (Pb), antimony (Sb), tin (Sn), vanadium (V), tungsten (W), and zinc (Zn). Using thresholds to distinguish natural background concentrations from human-mediated additions, we evaluated wetland soil heavy metal concentrations in the conterminous US and four regions using a Heavy Metal Index (HMI) that reflects human-mediated heavy metal loads based on the number of elements above expected background concentration. We also examined the individual elements to detect concentrations of heavy metals above expected background that frequently occur in wetland soils. Our data show that wetland soils of the conterminous US typically have low heavy metal loads, and that most of the measured elements occur nationally in concentrations below thresholds that relate to anthropogenic activities. However, we found that soil lead is more common in wetland soils than other measured elements, occurring nationally in 11.3% of the wetland area in concentrations above expected natural background (> 35 ppm). Our data show positive relationships between soil lead concentration and four individual landscape metrics: road density, percent impervious surface, housing unit density, and population density in a 1-km radius buffer area surrounding a site. These relationships, while evident on a national level, are strongest in the eastern US, where the highest road densities and greatest population densities occur. Because lead can be strongly bound to wetland soils in particular, maintenance of the good condition of our nation's wetlands is likely to minimize risk of lead mobilization.
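A minimal sketch of a Heavy Metal Index computed as the number of elements whose soil concentration exceeds its background threshold; only the lead threshold (35 ppm) comes from the text above, and the other thresholds are placeholders.

```python
def heavy_metal_index(concentrations_ppm, thresholds_ppm):
    # Count how many measured elements exceed their background threshold.
    return sum(1 for element, value in concentrations_ppm.items()
               if value > thresholds_ppm.get(element, float("inf")))

thresholds = {"Pb": 35.0, "Zn": 120.0, "Cu": 40.0}      # Zn and Cu thresholds are made up
sample = {"Pb": 48.2, "Zn": 80.0, "Cu": 55.1, "Cd": 0.3}
print(heavy_metal_index(sample, thresholds))            # -> 2 (Pb and Cu exceed)
```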


Subjects
Environmental Monitoring/methods, Human Activities, Heavy Metals/analysis, Soil Pollutants/analysis, Wetlands, Environmental Monitoring/statistics & numerical data, Human Activities/classification, Human Activities/statistics & numerical data, Humans, Risk Factors, United States