ABSTRACT
In this study, we propose a data augmentation method based on relabeling for machine learning in indoor localization of caregiving and nursing staff with Bluetooth Low Energy (BLE) technology. Indoor localization is used to monitor staff-to-patient assistance in caregiving and to gain insights into workload management. However, improving accuracy is challenging when only a limited amount of training data is available. In this paper, we propose a data augmentation method that reuses the Received Signal Strength (RSS) from different beacons by relabeling it to locations with fewer samples, resolving data imbalance. The standard deviation and the Kullback-Leibler divergence between minority and majority classes are used to measure signal patterns and find matching beacons for relabeling. Based on how beacons are matched between classes, two variations of relabeling are implemented: full matching and partial matching. Performance is evaluated on a real-world dataset we collected over five days in a nursing care facility equipped with 25 BLE beacons. A Random Forest model is used for location recognition, and performance is compared using the weighted F1-score to account for class imbalance. By increasing the beacon data with our proposed relabeling method for data augmentation, we achieve a higher minority-class F1-score than augmentation with Random Sampling, the Synthetic Minority Oversampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN). Our method makes full use of the collected beacon data by leveraging majority-class samples, and full matching demonstrated a 6 to 8% improvement over the original baseline in overall weighted F1-score.
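As an illustration of the beacon-matching step described above (not the authors' implementation; the dummy RSS data, histogram bins, and thresholding idea are assumptions for the sketch), the RSS distributions of a minority-class location and a majority-class location can be compared with the standard deviation and the Kullback-Leibler divergence to decide whether the majority-class samples are candidates for relabeling:

```python
import numpy as np
from scipy.stats import entropy

def rss_histogram(rss_values, bins=np.arange(-100, -29, 5)):
    """Normalized histogram of RSS readings (dBm), smoothed to avoid empty bins."""
    hist, _ = np.histogram(rss_values, bins=bins)
    hist = hist.astype(float) + 1e-6
    return hist / hist.sum()

def beacon_match_score(minority_rss, majority_rss):
    """Lower score = more similar signal pattern (KL divergence plus std gap)."""
    kl = entropy(rss_histogram(minority_rss), rss_histogram(majority_rss))
    std_gap = abs(np.std(minority_rss) - np.std(majority_rss))
    return kl + std_gap

# Hypothetical RSS samples (dBm) from one beacon at a minority-class location
# and another beacon at a majority-class location.
rng = np.random.default_rng(0)
rss_minority = rng.normal(-70, 4, size=40)
rss_majority = rng.normal(-71, 5, size=400)

score = beacon_match_score(rss_minority, rss_majority)
print(f"match score: {score:.3f}")
# If the score falls below a chosen threshold, the majority-class samples could
# be relabeled to the minority location to balance the classes.
```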
Subject(s)
Machine Learning , Recognition, Psychology , Humans , Data Collection , Research Design , Technology
ABSTRACT
In this paper, we propose a notification optimization method that provides multiple alternative times as reminders for a forecasted activity, with and without probabilistic consideration of the activity that needs to be completed and notified. Various factors must be considered when sending notifications to people after obtaining a forecast of their activity. Notifications should not be sent based on forecasted results alone, because future daily activities are unpredictable. It is therefore important to strike a balance between providing useful reminders and avoiding excessive interruptions, especially when the forecasted activity has a low probability. Our study investigates the impact of low-probability forecasted activities and optimizes the notification time with reinforcement learning. We also show the gaps between forecasted activities, which are useful for self-improvement in balancing important tasks, such as tasks completed as planned and additional tasks to be completed. For evaluation, we use two datasets: an existing dataset and data we collected in the field with the technology we have developed. In the data collection, we obtained 23 activities from six participants. To evaluate the effectiveness of these approaches, we assess the percentage of positive responses, the user response rate, and the response duration as performance criteria. Our proposed method provides a more effective way to optimize notifications: by incorporating the probability level of the activity that needs to be completed and notified into the state, we achieve a better response rate than the baseline, with an advantage reaching 27.15%, and the other criteria are also improved by using probability.
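The reinforcement learning formulation is not specified in detail here, so the following is only a minimal tabular Q-learning sketch under stated assumptions: the state combines a hypothetical probability bucket of the forecasted activity with a time-of-day slot, the actions are alternative notification times or skipping, and the reward is a stand-in for the user's response.

```python
import random
from collections import defaultdict

# State: (probability bucket of the forecasted activity, time-of-day slot).
# Action: which of several alternative notification times to use, or skip.
ACTIONS = ["notify_now", "notify_in_1h", "notify_in_3h", "skip"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

q_table = defaultdict(float)  # (state, action) -> estimated value

def choose_action(state):
    """Epsilon-greedy action selection over the candidate notification times."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

def update(state, action, reward, next_state):
    """Standard Q-learning update."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    q_table[(state, action)] += ALPHA * (
        reward + GAMMA * best_next - q_table[(state, action)]
    )

# One simulated interaction: a low-probability forecast in the morning slot.
state = ("prob_low", "morning")
action = choose_action(state)
reward = 1.0 if action == "skip" else -0.2   # stand-in for the user's response
update(state, action, reward, ("prob_low", "afternoon"))
```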
ABSTRACT
One of the biggest challenges of activity data collection is the need to rely on users and keep them engaged to continually provide labels. Recent breakthroughs in mobile platforms have proven effective in bringing deep neural network-powered intelligence into mobile devices. This study proposes a novel on-device personalization of data labeling for an activity recognition system using mobile sensing. The key idea behind this system is that estimated activities personalized for a specific individual user can be used as feedback to motivate user contribution and improve data labeling quality. First, we exploited fine-tuning with a Deep Recurrent Neural Network to address the lack of sufficient training data and minimize the need to train deep learning models on mobile devices from scratch. Second, we utilized a model pruning technique to reduce the computation cost of on-device personalization without affecting accuracy. Finally, we built a robust activity data labeling system by integrating the two techniques outlined above, allowing the mobile application to create a personalized experience for the user. To demonstrate the proposed model's capability and feasibility, we developed and deployed the proposed system in realistic settings. For our experimental setup, we gathered more than 16,800 activity windows from 12 activity classes using smartphone sensors. We empirically evaluated the quality of the proposed system by comparing it with a machine learning baseline. Our results indicate that the proposed system effectively improved activity recognition accuracy for individual users and reduced inference cost and latency on mobile devices. Based on our findings, we highlight critical and promising future research directions regarding the design of efficient activity data collection with on-device personalization.
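A minimal sketch of the two techniques named above, assuming a PyTorch recurrent backbone (the architecture, layer names, data shapes, and 30% pruning ratio are illustrative assumptions, not the paper's actual model): freeze the pretrained recurrent layers, fine-tune only the classifier head on a user's labeled windows, then apply magnitude pruning to the head to cut computation.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class ActivityRNN(nn.Module):
    """Illustrative recurrent classifier for sensor windows."""
    def __init__(self, n_features=6, hidden=64, n_classes=12):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):               # x: (batch, time, features)
        _, (h, _) = self.rnn(x)
        return self.head(h[-1])

model = ActivityRNN()                   # pretrained weights would be loaded here

# 1) Personalization: freeze the backbone, fine-tune only the classifier head.
for p in model.rnn.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)

x = torch.randn(8, 128, 6)              # a user's labeled windows (dummy data)
y = torch.randint(0, 12, (8,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()

# 2) Pruning: remove 30% of the smallest-magnitude weights in the head.
prune.l1_unstructured(model.head, name="weight", amount=0.3)
prune.remove(model.head, "weight")       # make the sparsity permanent
```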
ABSTRACT
Sensor-based human activity recognition has various applications in healthcare, elderly smart homes, sports, etc. There are numerous works in this field that recognize various human activities from sensor data. However, those works assume clean data with almost no missing values, which is a genuine concern for real-life healthcare centers. Therefore, to address this problem, we explored sensor-based activity recognition when some of the data are lost in a random pattern. In this paper, we propose a novel method to improve activity recognition under missing data without any data recovery. For the missing data pattern, we considered data missing at random, which is a realistic pattern for sensor data collection. Initially, we created different percentages of random missing data only in the test data, while training was performed on good-quality data. In our proposed approach, we explicitly induce different percentages of missing data at random in the raw sensor data so that the model is trained with missing data. Learning with missing data enables the model to handle missing data during the classification of activities whose test data also contain missing values. This approach demonstrates the plausibility of the machine learning model, as it can learn and predict from an identical domain. We exploited several time-series statistical features to extract better features for recognizing various human activities. We explored both support vector machine and random forest models for activity classification. We developed a synthetic dataset to empirically evaluate the performance and show that the method can effectively improve the recognition accuracy from 80.8% to 97.5%. Afterward, we tested our approach on activities from two challenging benchmark datasets: the Human Activity Sensing Consortium (HASC) dataset and a single chest-mounted accelerometer dataset. We examined the method for different missing percentages, varied window sizes, and diverse window sliding widths. Our explorations demonstrated improved recognition performance even in the presence of missing data. The achieved results provide persuasive findings on sensor-based activity recognition in the presence of missing data.
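The following sketch illustrates the core idea of training with induced missing data (the window size, missing rate, feature set, and dummy data are assumptions, not the paper's exact configuration): random samples are set to NaN in each window, simple statistical features that ignore the missing values are extracted, and a Random Forest is trained on the degraded windows.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def induce_missing(window, missing_rate):
    """Randomly drop a fraction of samples (set to NaN) in a sensor window."""
    out = window.copy()
    mask = rng.random(out.shape) < missing_rate
    out[mask] = np.nan
    return out

def statistical_features(window):
    """Per-axis mean, std, min, max, computed while ignoring missing samples."""
    return np.concatenate([
        np.nanmean(window, axis=0), np.nanstd(window, axis=0),
        np.nanmin(window, axis=0), np.nanmax(window, axis=0),
    ])

# Dummy accelerometer windows: 200 windows x 256 samples x 3 axes, 4 classes.
windows = rng.normal(size=(200, 256, 3))
labels = rng.integers(0, 4, size=200)

# Train on windows degraded with 30% induced missing data.
X = np.array([statistical_features(induce_missing(w, 0.3)) for w in windows])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
```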
Subject(s)
Human Activities , Machine Learning , Pattern Recognition, Automated , Humans , Support Vector Machine
ABSTRACT
Wearable sensor-based systems and devices have expanded into different application domains, especially the healthcare arena. Automatic age and gender estimation has several important applications, and gait has been demonstrated to be a profound motion cue for various applications. A gait-based age and gender estimation challenge was launched at the 12th IAPR International Conference on Biometrics (ICB), 2019. Initially, 18 teams from 14 countries registered for the competition. The goal of this challenge was to find smart approaches to age and gender estimation from sensor-based gait data. For this purpose, we employed a large wearable sensor-based gait dataset with 745 subjects (357 females and 388 males), aged 2 to 78 years, in the training set and 58 subjects (19 females and 39 males) in the test set, covering several walking patterns. The gait data sequences were collected from three IMUZ sensors placed on a waist belt or at the top of a backpack. There were 67 submitted solutions from ten teams for age and gender estimation. This paper extensively analyzes the methods and the results achieved by the various approaches. Based on this analysis, we found that deep learning-based solutions led the competition compared with conventional handcrafted methods. The best result achieved a 24.23% prediction error for gender estimation and a 5.39 mean absolute error for age estimation by employing an angle-embedded gait dynamic image and a temporal convolution network.
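The temporal convolution network mentioned above is only named, not specified; as a rough illustration of that component alone (the channel count, kernel size, and dilation schedule are assumptions, not the winning team's configuration), a stack of dilated causal convolution blocks over a gait sequence might look like this:

```python
import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    """One dilated causal convolution block with a residual connection."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        pad = (kernel_size - 1) * dilation          # left padding keeps causality
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=pad, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                            # x: (batch, channels, time)
        out = self.conv(x)[..., :x.size(-1)]         # trim the extra right padding
        return self.relu(out + x)

# Stack blocks with growing dilation to widen the receptive field over a gait
# sequence, then pool over time for age regression or gender classification.
tcn = nn.Sequential(TemporalBlock(32, dilation=1),
                    TemporalBlock(32, dilation=2),
                    TemporalBlock(32, dilation=4))
features = tcn(torch.randn(4, 32, 300)).mean(dim=-1)   # (batch, 32)
```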
Subject(s)
Gait Analysis/methods , Wearable Electronic Devices , Adolescent , Adult , Aged , Ageism , Algorithms , Biometry/methods , Child , Child, Preschool , Female , Humans , Male , Middle Aged , Smartphone , Young Adult
ABSTRACT
Labeling activity data is a central part of the design and evaluation of human activity recognition systems. The performance of these systems greatly depends on the quantity and "quality" of annotations; therefore, it is inevitable to rely on users and to keep them motivated to provide activity labels. As mobile and embedded devices increasingly use deep learning models to infer user context, we propose to exploit on-device deep learning inference using a long short-term memory (LSTM)-based method to alleviate the labeling effort and ground truth data collection in activity recognition systems using smartphone sensors. The novel idea behind this is that estimated activities are used as feedback to motivate users to collect accurate activity labels. To enable evaluation, we conduct experiments with two conditions: we compare the proposed method, which shows estimated activities obtained from on-device deep learning inference, with the traditional method, which shows notification sentences without estimated activities, both delivered through smartphone notifications. Evaluation on the gathered dataset shows that our proposed method improves both data quality (i.e., the performance of a classification model) and data quantity (i.e., the number of data collected), reflecting that our method can improve activity data collection and thereby enhance human activity recognition systems. We discuss the results, limitations, challenges, and implications for on-device deep learning inference that supports activity data collection. We also publish the preliminary dataset collected to the research community for activity recognition.
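As a sketch of the feedback loop described above (the model, activity names, window shape, and notification text are illustrative assumptions, not the study's implementation), an LSTM classifier can be run on the latest sensor window and its prediction shown to the user as the suggested label to confirm or correct:

```python
import torch
import torch.nn as nn

ACTIVITIES = ["walking", "sitting", "standing", "running"]   # illustrative labels

class LSTMClassifier(nn.Module):
    def __init__(self, n_features=6, hidden=32, n_classes=len(ACTIVITIES)):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (1, time, features)
        _, (h, _) = self.lstm(x)
        return self.fc(h[-1])

model = LSTMClassifier().eval()                 # trained weights would be loaded here
window = torch.randn(1, 256, 6)                 # latest accelerometer/gyroscope window

with torch.no_grad():
    probs = torch.softmax(model(window), dim=-1)
conf, idx = probs.max(dim=-1)

# The estimated activity becomes the pre-filled label shown in the notification,
# which the user can confirm or correct -- the feedback loop described above.
print(f"Were you {ACTIVITIES[idx.item()]}? (confidence {conf.item():.2f})")
```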
Subject(s)
Deep Learning , Human Activities , Humans , Neural Networks, Computer , Smartphone , Wireless Technology
ABSTRACT
Integrating speech recognition technology into an electronic health record (EHR) has been studied in recent years. However, full adoption of such systems still faces challenges, such as handling speech errors, transforming raw data into an understandable format, and controlling the transition from one field to the next with speech commands. To reduce errors, cost, and documentation time, we propose a smartphone-based dialogue system care record (DSCR) for nursing documentation. We describe the effects of DSCR on (1) documentation speed, (2) documentation accuracy, and (3) user satisfaction. We tested the application with 12 participants to examine the usability and feasibility of DSCR. The evaluation shows that DSCR can collect data efficiently, achieving 96% documentation accuracy. Average documentation speed increased by 15% (P = 0.012) compared to traditional electronic forms (e-forms). The participants' average satisfaction rating was 4.8 using DSCR compared to 3.6 using e-forms on a scale of 1 to 5 (P = 0.032).
Subject(s)
Data Collection/methods , Electronic Health Records , Language , Speech Recognition Software , User-Computer Interface
ABSTRACT
In this paper, we address zero-shot learning for sensor-based activity recognition using word embeddings. The goal of zero-shot learning is to estimate an unknown activity class (i.e., an activity that does not exist in a given training dataset) by learning to recognize components of activities expressed in semantic vectors. Existing zero-shot methods mainly use two kinds of representation as semantic vectors: attribute vectors and word embedding vectors. However, few zero-shot activity recognition methods based on embedding vectors have been studied; for sensor-based activity recognition in particular, no such studies exist, to the best of our knowledge. In this paper, we compare and thoroughly evaluate the zero-shot method with different semantic vectors: (1) attribute vectors, (2) embedding vectors, and (3) expanded embedding vectors, and analyze their correlation with performance. Our results indicate that the performance of the three spaces is similar, but the use of word embeddings leads to a more efficient method, since this type of semantic vector can be generated automatically. Moreover, our suggested method achieved higher accuracy than attribute-vector methods in cases where similar information exists in both the given sensor data and the semantic vector; the results of this study help select suitable classes and sensor data to build a training dataset.
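A minimal sketch of zero-shot recognition with word embeddings under stated assumptions (the ridge regressor, the 50-dimensional random "embeddings", and the class names are placeholders, not the paper's method): sensor features are mapped into the semantic vector space using seen classes only, and an unseen class can then be predicted as the nearest class embedding by cosine similarity.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
EMB_DIM = 50

# Hypothetical word embeddings of activity class names (e.g., from word2vec).
class_embeddings = {"walk": rng.normal(size=EMB_DIM),
                    "run": rng.normal(size=EMB_DIM),
                    "cook": rng.normal(size=EMB_DIM)}   # "cook" is unseen in training

# Training: learn a mapping from sensor features to the semantic space using
# only the seen classes ("walk", "run").
X_train = rng.normal(size=(100, 20))                      # dummy sensor features
y_train = rng.choice(["walk", "run"], size=100)
Y_train = np.stack([class_embeddings[c] for c in y_train])
mapper = Ridge(alpha=1.0).fit(X_train, Y_train)

# Inference: project a new sample and pick the nearest class embedding,
# which may belong to an unseen class such as "cook".
sample = rng.normal(size=(1, 20))
projected = mapper.predict(sample)
names = list(class_embeddings)
sims = cosine_similarity(projected, np.stack([class_embeddings[n] for n in names]))
print("predicted class:", names[int(sims.argmax())])
```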
Subject(s)
Human Activities , Pattern Recognition, Automated , Semantics , Databases as Topic , Humans
ABSTRACT
BACKGROUND: The prevalence of non-communicable diseases is increasing throughout the world, including developing countries. OBJECTIVE: The intent was to conduct a study of a preventive medical service in a developing country, combining eHealth checkups and teleconsultation, and to assess stratification rules and the short-term effects of the intervention. METHODS: We developed an eHealth system that comprises a set of sensor devices in an attaché case, a data transmission system linked to a mobile network, and a data management application. We provided eHealth checkups for the populations of five villages and the employees of five factories/offices in Bangladesh. Each individual's health condition was automatically categorized into four grades based on international diagnostic standards: green (healthy), yellow (caution), orange (affected), and red (emergent). We provided teleconsultation for orange- and red-grade subjects, with teleprescription for these subjects as required. RESULTS: The first checkup was provided to 16,741 subjects. After one year, 2361 subjects participated in the second checkup, and the systolic blood pressure of these subjects had significantly decreased from an average of 121 mmHg to an average of 116 mmHg (P<.001). Based on these results, we propose a cost-effective method using a machine learning technique (the random forest method) with the medical interview, subject profiles, and checkup results as predictors, to avoid costly blood sugar measurements and ensure the sustainability of the program in developing countries. CONCLUSIONS: The results of this study demonstrate the benefits of an eHealth checkup and teleconsultation program as an effective health care system in developing countries.
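To make the proposed screening idea concrete (the feature names, synthetic data, and target definition below are hypothetical stand-ins, not the study's variables or results), a random forest can be trained on interview, profile, and checkup predictors to flag subjects for whom a blood sugar measurement is likely warranted:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
# Hypothetical interview/profile/checkup predictors.
df = pd.DataFrame({
    "age": rng.integers(18, 80, n),
    "bmi": rng.normal(24, 4, n),
    "systolic_bp": rng.normal(120, 15, n),
    "diabetes_in_family": rng.integers(0, 2, n),
})
# Synthetic target: whether a costly blood sugar measurement would flag the subject.
high_blood_sugar = (0.03 * df["age"] + 0.2 * df["bmi"]
                    + 3 * df["diabetes_in_family"] + rng.normal(0, 2, n)) > 8

X_tr, X_te, y_tr, y_te = train_test_split(df, high_blood_sugar, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```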
Subject(s)
Chronic Disease/prevention & control , Developing Countries , Preventive Medicine/methods , Remote Consultation , Adolescent , Adult , Aged , Aged, 80 and over , Child , Delivery of Health Care , Electronic Prescribing , Female , Humans , Male , Middle Aged , Remote Consultation/instrumentation , Risk Factors , Telemedicine , Young Adult
ABSTRACT
This paper presents a new approach called EmbedHDP, which aims to enhance the evaluation models used for assessing sentence suggestions in nursing care record applications. The primary objective is to determine how well the proposed evaluation metric aligns with human evaluators who are caregivers, which is crucial because the provided suggestions are directly relevant to the health or condition of the elderly. The motivation for this proposal arises from challenges observed in previous models. Our analysis examines the mechanisms of current evaluation metrics such as BERTScore, cosine similarity, ROUGE, and BLEU to achieve reliable metric evaluation, and several limitations were identified. In some cases, BERTScore encountered difficulties in effectively evaluating the nursing care record domain, consistently scoring generated sentence suggestions above 60%. Cosine similarity is a widely used method, but it has limitations regarding word order, which can lead to misjudgments of semantic differences within similar word sets. Another technique, ROUGE, relies on lexical overlap but tends to ignore semantic accuracy. Additionally, while BLEU is helpful, it may not fully capture semantic coherence in its evaluations. After calculating the correlation coefficients, we found that EmbedHDP is effective in evaluating nursing care records due to its ability to handle a variety of sentence structures and medical terminology, providing differentiated and contextually relevant assessments. This research used a dataset comprising 320 pairs of sentences of equivalent lengths. The results revealed that EmbedHDP outperformed the other evaluation models, achieving a coefficient score of 61%, followed by cosine similarity with 59% and BERTScore with 58%. This shows the effectiveness of our proposed approach in improving the evaluation of sentence suggestions in nursing care record applications.
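As an illustration of the meta-evaluation step described above (the ratings and metric scores below are dummy values, and the two metrics are generic stand-ins, not EmbedHDP itself), each candidate metric can be compared with caregiver judgments via a rank correlation coefficient:

```python
import numpy as np
from scipy.stats import spearmanr

# Dummy scores for ten generated sentence suggestions.
human_ratings = np.array([4, 2, 5, 3, 1, 4, 5, 2, 3, 4])      # caregiver judgments
metric_a = np.array([0.81, 0.55, 0.90, 0.62, 0.40, 0.78, 0.88, 0.50, 0.60, 0.83])
metric_b = np.array([0.70, 0.66, 0.72, 0.69, 0.65, 0.71, 0.73, 0.67, 0.68, 0.70])

# The metric whose scores correlate most strongly with human ratings is
# considered the most aligned with caregiver evaluation.
for name, scores in [("metric_a", metric_a), ("metric_b", metric_b)]:
    rho, _ = spearmanr(scores, human_ratings)
    print(f"{name}: Spearman rho = {rho:.2f}")
```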
ABSTRACT
BACKGROUND AND OBJECTIVE: This study describes the integration of a spoken dialogue system and nursing records in an Android smartphone application, intended to help nurses reduce documentation time and improve the overall experience in a healthcare setting. The application also incorporates the collection of personal sensor data and activity labels for activity recognition. METHODS: We developed a joint model based on a bidirectional long short-term memory network and conditional random fields (Bi-LSTM-CRF) to identify user intention and extract record details from user utterances. We then transformed the unstructured data into record inputs in the smartphone application. RESULTS: The joint model achieved the highest F1-score of 96.79%. Moreover, we conducted an experiment to demonstrate the proposed model's capability and feasibility for recording in realistic settings. Our preliminary evaluation results indicate that when using the dialogue-based method, we could increase the percentage of documentation speed to 58.13% compared to the traditional keyboard-based method. CONCLUSIONS: Based on our findings, we highlight critical and promising future research directions regarding the design of efficient spoken dialogue systems and nursing records.
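The sequence-labeling part of such a joint model can be sketched as follows (an illustrative sketch only, assuming the third-party pytorch-crf package; the vocabulary size, tag set, hidden size, and dummy inputs are assumptions, and the intent-classification branch of the joint model is omitted):

```python
import torch
import torch.nn as nn
from torchcrf import CRF   # third-party "pytorch-crf" package (assumed installed)

class BiLSTMCRF(nn.Module):
    """Illustrative slot-tagging model: BiLSTM emissions decoded by a CRF layer."""
    def __init__(self, vocab_size=1000, embed=64, hidden=64, n_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.lstm = nn.LSTM(embed, hidden // 2, bidirectional=True, batch_first=True)
        self.emit = nn.Linear(hidden, n_tags)
        self.crf = CRF(n_tags, batch_first=True)

    def loss(self, tokens, tags):
        emissions = self.emit(self.lstm(self.embed(tokens))[0])
        return -self.crf(emissions, tags)            # negative log-likelihood

    def predict(self, tokens):
        emissions = self.emit(self.lstm(self.embed(tokens))[0])
        return self.crf.decode(emissions)             # best tag sequence per utterance

model = BiLSTMCRF()
tokens = torch.randint(0, 1000, (2, 12))               # two dummy utterances, 12 tokens
tags = torch.randint(0, 9, (2, 12))                    # dummy entity tags (e.g., BIO scheme)
print(model.loss(tokens, tags).item(), model.predict(tokens)[0])
```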
Subject(s)
Nursing Records , Smartphone , Data Collection , Electronic Health Records , Humans
ABSTRACT
The integration of digital voice assistants in nursing residences is becoming increasingly important for improving nursing productivity in documentation. A key idea behind such a system is training natural language understanding (NLU) modules that enable the machine to classify the purpose of a user utterance (intent) and extract valuable pieces of information present in the utterance (entities). One of the main obstacles when creating robust NLU is the lack of sufficient labeled data, which generally relies on human labeling. This process is cost-intensive and time-consuming, particularly in the high-level nursing care domain, which requires abstract knowledge. In this paper, we propose an automatic dialogue labeling framework for NLU tasks, specifically for nursing record systems. First, we apply data augmentation techniques to create a collection of variant sample utterances. The individual evaluation results strongly show a stratification rate with regard to both fluency and accuracy of the utterances. We also investigate the possibility of applying deep generative models to our augmented dataset. A preliminary character-based model based on long short-term memory (LSTM) obtains an accuracy of 90% and generates various reasonable texts with a BLEU score of 0.76. Second, we introduce an approach to intent and entity labeling that uses feature embeddings and semantic similarity-based clustering. We also empirically evaluate different embedding methods for learning good representations that are most suitable for our data and clustering tasks. Experimental results show that fastText embeddings produce strong performance for both intent and entity labeling, achieving F1-scores of 0.79 and 0.78 and silhouette scores of 0.67 and 0.61, respectively.
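A minimal sketch of the embedding-and-clustering labeling idea (the utterance vectors below are random placeholders standing in for averaged fastText embeddings, and the cluster count is an assumption): semantically similar utterances are grouped so that one manual intent or entity label can be propagated to a whole cluster, and the silhouette score measures how well separated the clusters are.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Assume each augmented utterance has already been embedded, e.g. by averaging
# fastText word vectors; random vectors are used here as placeholders.
rng = np.random.default_rng(0)
utterance_vectors = rng.normal(size=(200, 300))     # 200 utterances x 300 dimensions

# Cluster semantically similar utterances; each cluster is then assigned one
# intent (or entity) label, turning one manual label into many.
n_intents = 5
kmeans = KMeans(n_clusters=n_intents, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(utterance_vectors)

# Silhouette score indicates how well separated the resulting clusters are.
print("silhouette:", round(silhouette_score(utterance_vectors, cluster_ids), 3))
```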
ABSTRACT
BACKGROUND: For more than 30 years, there has been close cooperation between Japanese and German scientists with regard to information systems in health care. Collaboration has been formalized by an agreement between the respective scientific associations. Following this agreement, two joint workshops took place to explore the similarities and differences of electronic health record systems (EHRS) against the background of the two national healthcare systems that share many commonalities. OBJECTIVES: To establish a framework and requirements for the quality of EHRS that may also serve as a basis for comparing different EHRS. METHODS: Donabedian's three dimensions of quality of medical care were adapted to the outcome, process, and structural quality of EHRS and their management. These quality dimensions were proposed before the first workshop of EHRS experts and enriched during the discussions. RESULTS: The Quality Requirements Framework of EHRS (QRF-EHRS) was defined and complemented by requirements for high quality EHRS. The framework integrates three quality dimensions (outcome, process, and structural quality), three layers of information systems (processes and data, applications, and physical tools) and three dimensions of information management (strategic, tactical, and operational information management). CONCLUSIONS: Describing and comparing the quality of EHRS is in fact a multidimensional problem as given by the QRF-EHRS framework. This framework will be utilized to compare Japanese and German EHRS, notably those that were presented at the second workshop.
Subject(s)
Electronic Health Records/standards , Information Management/standards , Congresses as Topic , Electronic Health Records/economics , Germany , Humans , Japan , Software
ABSTRACT
Insufficient healthcare facilities and the unavailability of medical experts in rural areas are the two major reasons that people remain unreached by healthcare services. With the recent penetration of mobile phones and the demand for basic healthcare services, remote health consultancy over mobile phones has become popular in developing countries. In this paper, we introduce two such representative initiatives from Bangladesh and discuss the technical challenges they face in serving remote patients. To address these issues, we have prototyped a box with the necessary diagnostic tools, which we call a "portable clinic", and a software tool, "GramHealth", for managing patient information. We carried out experiments in three villages in Bangladesh to observe the usability of the portable clinic and verify the functionality of GramHealth. We present a qualitative analysis of the results obtained from the experiments. The GramHealth database has a unique combination of structured, semi-structured, and unstructured data. We are currently examining these data to see whether they can be treated as Big Data and, if so, how to analyze them and what to expect from them to enable better clinical decision support.