Automatic Annotation of Unlabeled Data from Smartphone-Based Motion and Location Sensors.

Pius Owoh, Nsikak; Mahinderjit Singh, Manmeet; Zaaba, Zarul Fitri

Affiliation

Pius Owoh N; School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia. onp15_com079@student.usm.my.
Mahinderjit Singh M; School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia. manmeet@usm.my.
Zaaba ZF; School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia. zarulfitri@usm.my.

Sensors (Basel); 18(7), 2018 Jul 03.

Article in English | MEDLINE | ID: mdl-29970823

ABSTRACT

Automatic data annotation eliminates most of the challenges posed by manual methods of annotating sensor data. It significantly improves users’ experience during sensing activities because their active involvement in the labeling process is reduced. An unsupervised learning technique such as clustering can be used to annotate sensor data automatically. However, a lingering issue with clustering is the validation of the generated clusters. In this paper, we adopted the k-means clustering algorithm to annotate unlabeled sensor data, with the goal of detecting sensitive location information of mobile crowd sensing users. Furthermore, we proposed a cluster validation index for the k-means algorithm, based on Multiple Pair-Frequency. We then trained three classifiers (Support Vector Machine, K-Nearest Neighbor, and Naïve Bayes) using the cluster labels generated by the k-means algorithm. The accuracy, precision, and recall of these classifiers were evaluated on the classification of “non-sensitive” and “sensitive” data from motion and location sensors. The Support Vector Machine and K-Nearest Neighbor classifiers recorded very high accuracy scores, while the Naïve Bayes classifier recorded a fairly high accuracy score. With the hybrid (unsupervised and supervised) machine learning technique presented in this paper, unlabeled sensor data was automatically annotated and then classified.
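
The cluster-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional synthetic features, the helper names (`make_blob`, `kmeans`, `knn_predict`), and the choice of k = 2 are assumptions made for illustration, and the proposed Multiple Pair-Frequency validation index is not reproduced because the abstract does not define it. Only the Python standard library is used; the K-Nearest Neighbor step stands in for the three classifiers named above.

```python
import math
import random


def make_blob(center, n, rng, spread=0.5):
    """Hypothetical stand-in for sensor feature vectors: n Gaussian points around a center."""
    return [tuple(rng.gauss(c, spread) for c in center) for _ in range(n)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels), one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, labels


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(zip(train_points, train_labels),
                     key=lambda pl: math.dist(pl[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


# Step 1: unlabeled data (two well-separated synthetic groups, an assumption).
rng = random.Random(42)
unlabeled = make_blob((0.0, 0.0), 40, rng) + make_blob((10.0, 10.0), 40, rng)

# Step 2: automatic annotation, using cluster ids as labels.
centroids, cluster_labels = kmeans(unlabeled, k=2)

# Step 3: supervised classification of new samples using the generated labels.
new_samples = [(0.2, -0.1), (9.8, 10.3)]
predictions = [knn_predict(unlabeled, cluster_labels, s) for s in new_samples]
```

The cluster ids returned by `kmeans` are arbitrary integers; mapping them to names such as “sensitive” and “non-sensitive” is a separate step that this sketch leaves out.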

Keywords

activity recognition; clustering; data security; multivariate data; sensitive data

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Language: En | Publication year: 2018 | Document type: Article