Results 1 - 16 of 16
1.
IEEE Int Conf Mob Data Manag ; 2023: 148-157, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37965426

ABSTRACT

Human mobility data is useful for various applications in urban planning, transportation, and public health, but collecting and sharing real-world trajectories can be challenging due to privacy and data quality issues. To address these problems, recent research focuses on generating synthetic trajectories, mainly using generative adversarial networks (GANs) trained by real-world trajectories. In this paper, we hypothesize that by explicitly capturing the modality of transportation (e.g., walking, biking, driving), we can generate not only more diverse and representative trajectories for different modalities but also more realistic trajectories that preserve the geographical density, trajectory, and transition level properties by capturing both cross-modality and modality-specific patterns. Towards this end, we propose a Clustering-based Sequence Generative Adversarial Network (CSGAN) that simultaneously clusters the trajectories based on their modalities and learns the essential properties of real-world trajectories to generate realistic and representative synthetic trajectories. To measure the effectiveness of generated trajectories, in addition to typical density and trajectory level statistics, we define several new metrics for a comprehensive evaluation, including modality distribution and transition probabilities both globally and within each modality. Our extensive experiments with real-world datasets show the superiority of our model in various metrics over state-of-the-art models.
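The evaluation metrics named in this abstract (modality distribution, and transition probabilities both globally and within each modality) can be illustrated with a minimal sketch. It assumes each trajectory is a modality label plus a sequence of discrete grid-cell IDs; the data layout and function names are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def modality_distribution(trajectories):
    """Fraction of trajectories belonging to each transportation modality."""
    counts = Counter(mode for mode, _ in trajectories)
    total = sum(counts.values())
    return {mode: n / total for mode, n in counts.items()}

def transition_probabilities(trajectories, mode=None):
    """Empirical P(next cell | current cell), optionally within one modality."""
    counts = defaultdict(Counter)
    for m, cells in trajectories:
        if mode is not None and m != mode:
            continue
        # Count every consecutive pair of cells in the trajectory.
        for a, b in zip(cells, cells[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

# Hypothetical toy data: (modality, sequence of grid-cell IDs).
trajs = [
    ("walk", ["A", "B", "C"]),
    ("walk", ["A", "B", "B"]),
    ("drive", ["A", "C", "D"]),
    ("drive", ["A", "C", "A"]),
]
dist = modality_distribution(trajs)
global_tp = transition_probabilities(trajs)
walk_tp = transition_probabilities(trajs, mode="walk")
```

Comparing these tables between real and synthetic trajectory sets (for example with a divergence measure) is one plausible way to score a generator; the specific distance used in the paper is not stated in the abstract.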

2.
Proc Mach Learn Res ; 202: 40669-40680, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37933246

ABSTRACT

A fundamental problem in data management is to find the elements in an array that match a query. Recently, learned indexes have been used extensively to solve this problem: they learn a model that predicts the location of items in the array. They have been shown empirically to outperform non-learned methods (e.g., B-trees or binary search, which answer queries in O(log n) time) by orders of magnitude. However, the success of learned indexes has not been theoretically justified. The only existing attempt shows the same O(log n) query time, but with a constant-factor improvement in space complexity over non-learned methods, under some assumptions on the data distribution. In this paper, we significantly strengthen this result, showing that under mild assumptions on the data distribution, and with the same space complexity as non-learned methods, learned indexes can answer queries in O(log log n) expected query time. We also show that, allowing for slightly larger but still near-linear space overhead, a learned index can achieve O(1) expected query time. Our results theoretically prove that learned indexes are orders of magnitude faster than non-learned methods, theoretically grounding their empirical success.
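The "predict a position, then correct it" idea behind learned indexes can be sketched as follows. This is a toy illustration only, assuming a non-empty array of numeric keys and a single linear model fit through the smallest and largest key; the class name is hypothetical, and this is not the construction analyzed in the paper, so it carries none of the stated complexity guarantees.

```python
import bisect

class LinearLearnedIndex:
    """Toy learned index: a linear model predicts a key's position in a
    sorted array, and a bounded local search corrects the prediction."""

    def __init__(self, keys):
        self.keys = sorted(keys)  # assumes at least one key
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Model: position ~ slope * key + intercept, fit through the endpoints.
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # The worst prediction error over stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - i)
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return an index of key in the sorted array, or -1 if absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Binary search restricted to the window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else -1

index = LinearLearnedIndex([2, 3, 5, 7, 11, 13, 17, 19, 23, 29])
found = index.lookup(13)
missing = index.lookup(4)
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less work the corrective search has to do, which is the intuition the theoretical results make precise.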

3.
IEEE Trans Neural Netw Learn Syst ; 34(12): 10711-10723, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35544501

ABSTRACT

Learning low-dimensional representations of bipartite graphs enables e-commerce applications, such as recommendation, classification, and link prediction. A layerwise-trained bipartite graph neural network (L-BGNN) embedding method, which is unsupervised, efficient, and scalable, is proposed in this work. To aggregate the information across and within two partitions of a bipartite graph, a customized interdomain message passing (IDMP) operation and an intradomain alignment (IDA) operation are adopted by the proposed L-BGNN method. Furthermore, we develop a layerwise training algorithm for L-BGNN to capture the multihop relationship of large bipartite networks and improve training efficiency. We conduct extensive experiments on several datasets and downstream tasks of various scales to demonstrate the effectiveness and efficiency of the L-BGNN method as compared with state-of-the-art methods. Our codes are publicly available at https://github.com/TianXieUSC/L-BGNN.

4.
Proc ACM Manag Data ; 1(1)2023 May.
Article in English | MEDLINE | ID: mdl-38770441

ABSTRACT

Range aggregate queries (RAQs) are an integral part of many real-world applications, where fast, approximate answers to the queries are often desired. Recent work has studied answering RAQs using machine learning (ML) models, where a model of the data is learned to answer the queries. However, there is no theoretical understanding of why and when the ML-based approaches perform well. Furthermore, since the ML approaches model the data, they fail to capitalize on any query-specific information to improve performance in practice. In this paper, we focus on modeling "queries" rather than data and train neural networks to learn the query answers. This change of focus allows us to theoretically study our ML approach and to provide a distribution- and query-dependent error bound for neural networks when answering RAQs. We confirm our theoretical results by developing NeuroSketch, a neural network framework to answer RAQs in practice. An extensive experimental study on real-world, TPC-benchmark, and synthetic datasets shows that NeuroSketch answers RAQs multiple orders of magnitude faster than the state of the art and with better accuracy.

5.
Article in English | MEDLINE | ID: mdl-38384746

ABSTRACT

Mobile apps that use location data are pervasive, spanning domains such as transportation, urban planning and healthcare. Important use cases for location data rely on statistical queries, e.g., identifying hotspots where users work and travel. Such queries can be answered efficiently by building histograms. However, precise histograms can expose sensitive details about individual users. Differential privacy (DP) is a mature and widely-adopted protection model, but most approaches for DP-compliant histograms work in a data-independent fashion, leading to poor accuracy. The few proposed data-dependent techniques attempt to adjust histogram partitions based on dataset characteristics, but they do not perform well due to the addition of noise required to achieve DP. In addition, they use ad-hoc criteria to decide the depth of the partitioning. We identify density homogeneity as a main factor driving the accuracy of DP-compliant histograms, and we build a data structure that splits the space such that data density is homogeneous within each resulting partition. We propose a self-tuning approach to decide the depth of the partitioning structure that optimizes the use of privacy budget. Furthermore, we provide an optimization that scales the proposed split approach to large datasets while maintaining accuracy. We show through extensive experiments on large-scale real-world data that the proposed approach achieves superior accuracy compared to existing approaches.
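For context, the data-independent baseline that this abstract contrasts against can be sketched as a uniform grid with Laplace noise. The sketch assumes points lie in the unit square and that each user contributes one point, so adding or removing a user changes exactly one cell count by 1; the function names are hypothetical, and this is not the density-homogeneous partitioning the paper proposes.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_grid_histogram(points, cells_per_side, epsilon, rng):
    """Epsilon-DP count histogram over a uniform grid on the unit square.

    Each point falls in exactly one cell, so the histogram has sensitivity 1
    and Laplace noise of scale 1/epsilon per cell gives epsilon-DP.
    """
    counts = [[0] * cells_per_side for _ in range(cells_per_side)]
    for x, y in points:
        # Clamp so that coordinates equal to 1.0 land in the last cell.
        i = min(int(x * cells_per_side), cells_per_side - 1)
        j = min(int(y * cells_per_side), cells_per_side - 1)
        counts[i][j] += 1
    return [[c + laplace_noise(1.0 / epsilon, rng) for c in row]
            for row in counts]

rng = random.Random(0)
points = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (1.0, 0.6)]
noisy = dp_grid_histogram(points, cells_per_side=2, epsilon=1.0, rng=rng)
```

Because every cell receives noise of the same scale regardless of how many points it holds, sparse regions are dominated by noise and dense regions are summarized too coarsely, which is the accuracy problem that motivates data-dependent partitioning.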

6.
IEEE Int Conf Mob Data Manag ; 2022: 361-366, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36345435

ABSTRACT

Accurately monitoring the number of individuals inside a building is vital to limiting COVID-19 transmission. Low adoption of contact tracing apps due to privacy concerns has increased pervasiveness of passive digital tracking alternatives. Large arrays of WiFi access points can conveniently track mobile devices on university and industry campuses. The CrowdMap system employed by the University of Southern California enables such tracking by collecting aggregate statistics from connections to access points around campus. However, since these devices can be used to infer the movement of individuals, there is still a significant risk that even aggregate occupancy statistics will violate the location privacy of individuals. We examine the use of Differential Privacy in reporting statistics from this system as measured using point and range count queries. We propose discretization schemes to model the positions of users given only user connections to WiFi access points. Using this information we are able to release accurate counts of occupants in areas of campus buildings such as labs, hallways, and large discussion halls with minimized risk to individual users' privacy.

7.
Proceedings VLDB Endowment ; 16(2): 167-179, 2022 Oct.
Article in English | MEDLINE | ID: mdl-37220471

ABSTRACT

Fairness in data-driven decision-making studies scenarios where individuals from certain population segments may be unfairly treated when being considered for loan or job applications, access to public resources, or other types of services. In location-based applications, decisions are based on individual whereabouts, which often correlate with sensitive attributes such as race, income, and education. While fairness has received significant attention recently, e.g., in machine learning, there is little focus on achieving fairness when dealing with location data. Due to their characteristics and specific type of processing algorithms, location data pose important fairness challenges. We introduce the concept of spatial data fairness to address the specific challenges of location data and spatial queries. We devise a novel building block to achieve fairness in the form of fair polynomials. Next, we propose two mechanisms based on fair polynomials that achieve individual spatial fairness, corresponding to two common location-based decision-making types: distance-based and zone-based. Extensive experimental results on real data show that the proposed mechanisms achieve spatial fairness without sacrificing utility.

8.
JCO Clin Cancer Inform ; 4: 839-853, 2020 09.
Article in English | MEDLINE | ID: mdl-32970482

ABSTRACT

PURPOSE: Unplanned health care encounters (UHEs) such as emergency room visits can occur commonly during cancer chemotherapy treatments. Patients at an increased risk of UHEs are typically identified by clinicians using performance status (PS) assessments based on a descriptive scale, such as the Eastern Cooperative Oncology Group (ECOG) scale. Such assessments can be bias prone, resulting in PS score disagreements between assessors. We therefore propose to evaluate PS using physical activity measurements (eg, energy expenditure) from wearable activity trackers. Specifically, we examined the feasibility of using a wristband (band) and a smartphone app for PS assessments. METHODS: We conducted an observational study on a cohort of patients with solid tumor receiving highly emetogenic chemotherapy. Patients were instructed to wear the band for a 60-day activity-tracking period. During clinic visits, we obtained ECOG scores assessed by physicians, coordinators, and patients themselves. UHEs occurring during the activity-tracking period plus a 90-day follow-up period were later compiled. We defined our primary outcome as the percentage of patients adherent to band-wear ≥ 80% of 10 am to 8 pm for ≥ 80% of the activity-tracking period. In an exploratory analysis, we computed hourly metabolic equivalent of task (MET) and counted 10 am to 8 pm hours with > 1.5 METs as nonsedentary physical activity hours. RESULTS: Forty-one patients completed the study (56.1% female; 61.0% age 40-60 years); 68% were adherent to band-wear. ECOG score disagreement between assessors ranged from 35.3% to 50.0%. In our exploratory analysis, lower average METs and nonsedentary hours, but not higher ECOG scores, were associated with higher 150-day UHEs. CONCLUSION: The use of a wearable activity tracker is generally feasible in a similar population of patients with cancer. A larger randomized controlled trial should be conducted to confirm the association between lower nonsedentary hours and higher UHEs.
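The adherence rule and the nonsedentary-hour count described in the methods can be written out as a short worked example. It assumes band-wear is summarized as hours worn per day within the 10 am to 8 pm window and that MET values are already hourly and restricted to that window; the function names and input layout are hypothetical, since the abstract does not describe the data format.

```python
def is_adherent(worn_hours_per_day, window_hours=10, threshold=0.8):
    """True if the band was worn for >= 80% of the 10 am to 8 pm window
    on >= 80% of the days in the activity-tracking period."""
    good_days = sum(1 for h in worn_hours_per_day
                    if h / window_hours >= threshold)
    return good_days / len(worn_hours_per_day) >= threshold

def nonsedentary_hours(hourly_mets, cutoff=1.5):
    """Count the hours in the window whose MET value exceeds the cutoff."""
    return sum(1 for met in hourly_mets if met > cutoff)

# Hypothetical 60-day period: 48 days with 8 h of wear, 12 days with 5 h.
adherent = is_adherent([8] * 48 + [5] * 12)
# One day fewer at 8 h falls just below the 80%-of-days requirement.
not_adherent = is_adherent([8] * 47 + [5] * 13)
active = nonsedentary_hours([1.0, 1.5, 1.6, 2.3])
```

Note that the MET cutoff is strict (> 1.5), so an hour at exactly 1.5 METs counts as sedentary under this reading.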


Subject(s)
Fitness Trackers; Neoplasms; Adult; Cohort Studies; Delivery of Health Care; Exercise; Female; Humans; Male; Middle Aged; Neoplasms/drug therapy

9.
Geoinformatica ; 24(4): 951-985, 2020.
Article in English | MEDLINE | ID: mdl-32837253

ABSTRACT

Monitoring location updates from mobile users has important applications in many areas, ranging from public health (e.g., COVID-19 contact tracing) and national security to social networks and advertising. However, sensitive information can be derived from movement patterns, so protecting the privacy of mobile users is a major concern. Users may only be willing to disclose their locations when some condition is met, for instance in proximity of a disaster area or an event of interest. Currently, such functionality can be achieved using searchable encryption. Such cryptographic primitives provide provable guarantees for privacy, and allow decryption only when the location satisfies some predicate. Nevertheless, they rely on expensive pairing-based cryptography (PBC), whose direct application to the domain of location updates leads to impractical solutions. We propose secure and efficient techniques for private processing of location updates that complement the use of PBC and lead to significant gains in performance by reducing the number of required pairing operations. We implement two optimizations that further improve performance: materialization of the results of expensive mathematical operations, and parallelization. We also propose a heuristic that brings down the computational overhead by enlarging an alert zone by a small factor (given as a system parameter), thereby trading off a small and controlled amount of privacy for significant performance gains. Extensive experimental results show that the proposed techniques significantly improve performance compared to the baseline, and reduce the searchable encryption overhead to a level that is practical in a computing environment with reasonable resources, such as the cloud.

10.
JCO Clin Cancer Inform ; 4: 583-601, 2020 06.
Article in English | MEDLINE | ID: mdl-32598179

ABSTRACT

PURPOSE: Performance status (PS) is a key factor in oncologic decision making, but conventional scales used to measure PS vary among observers. Consumer-grade biometric sensors have previously been identified as objective alternatives to the assessment of PS. Here, we investigate how one such biometric sensor can be used during a clinic visit to identify patients who are at risk for complications, particularly unexpected hospitalizations that may delay treatment or result in low physical activity. We aim to provide a novel and objective means of predicting tolerability to chemotherapy. METHODS: Thirty-eight patients across three centers in the United States who were diagnosed with a solid tumor with plans for treatment with two cycles of highly emetogenic chemotherapy were included in this single-arm, observational prospective study. A noninvasive motion-capture system quantified patient movement from chair to table and during the get-up-and-walk test. Activity levels were recorded using a wearable sensor over a 2-month period. Changes in kinematics from two motion-capture data points pre- and post-treatment were tested for correlation with unexpected hospitalizations and physical activity levels as measured by a wearable activity sensor. RESULTS: Among 38 patients (mean age, 48.3 years; 53% female), kinematic features from chair to table were the best predictors for unexpected health care encounters (area under the curve, 0.775 ± 0.029) and physical activity (area under the curve, 0.830 ± 0.080). Chair-to-table acceleration of the nonpivoting knee (t = 3.39; P = .002) was most correlated with unexpected health care encounters. Get-up-and-walk kinematics were most correlated with physical activity, particularly the right knee acceleration (t = -2.95; P = .006) and left arm angular velocity (t = -2.4; P = .025). 
CONCLUSION: Chair-to-table kinematics are good predictors of unexpected hospitalizations, whereas the get-up-and-walk kinematics are good predictors of low physical activity.


Subject(s)
Acceleration; Biomechanical Phenomena; Female; Humans; Male; Middle Aged; Prospective Studies

11.
AMIA Jt Summits Transl Sci Proc ; 2020: 654-663, 2020.
Article in English | MEDLINE | ID: mdl-32477688

ABSTRACT

Atrial fibrillation (AF) is the most common cardiac arrhythmia as well as a significant risk factor in heart failure and coronary artery disease. AF can be detected by using a short ECG recording. However, discriminating atrial fibrillation from normal sinus rhythm, other arrhythmias, and strong noise, given a short ECG recording, is challenging. Towards this end, we propose MultiFusionNet, a deep learning network that uses a multiplicative fusion method to combine two deep neural networks trained on different sources of knowledge, i.e., extracted features and raw data. Thus, MultiFusionNet can exploit the relevant extracted features to improve upon the utilization of the deep learning model on the raw data. Our experiments show that this approach offers the most accurate AF classification and outperforms recently published algorithms that use either extracted features or raw data separately. Finally, we show that our multiplicative fusion method for combining the two sub-networks outperforms several other combining methods.
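One simple form of multiplicative combination is the product rule applied to the class probabilities produced by two models. The sketch below assumes fusion happens at the output level and uses made-up probability vectors over the four classes mentioned in the abstract; the abstract does not say at which layer MultiFusionNet fuses its sub-networks, so this illustrates the general idea rather than the authors' architecture.

```python
def multiplicative_fusion(p_features, p_raw):
    """Fuse two class-probability vectors by element-wise product, then
    renormalize so that the fused probabilities sum to 1."""
    if len(p_features) != len(p_raw):
        raise ValueError("probability vectors must have the same length")
    prod = [a * b for a, b in zip(p_features, p_raw)]
    total = sum(prod)
    if total == 0.0:
        raise ValueError("the two models share no class with nonzero mass")
    return [v / total for v in prod]

# Hypothetical outputs over: AF, normal rhythm, other arrhythmia, noise.
p_feat = [0.5, 0.3, 0.1, 0.1]   # model trained on extracted features
p_raw = [0.4, 0.4, 0.1, 0.1]    # model trained on the raw ECG signal
fused = multiplicative_fusion(p_feat, p_raw)
predicted_class = fused.index(max(fused))
```

A product emphasizes classes on which both models agree and suppresses classes that either model considers unlikely, which is the usual motivation for multiplicative over additive (averaging) fusion.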

12.
J Patient Rep Outcomes ; 3(1): 41, 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31313047

ABSTRACT

BACKGROUND: Patient performance status is routinely used in oncology to estimate physical functioning, an important factor in clinical treatment decisions and eligibility for clinical trials. However, validity and reliability data for ratings of performance status have not been optimal. This study recruited oncology patients who were about to begin emetogenic palliative or adjuvant chemotherapy for treatment of solid tumors. We employed actigraphy as the gold standard for physical activity level. Correspondences between actigraphy and oncologists' and patients' ratings of performance status were examined and compared with the correspondences of actigraphy and several patient reported outcomes (PROs). The study was designed to determine feasibility of the measurement approaches and if PROs can improve the accuracy of assessment of performance status. METHODS: Oncologists and patients made performance status ratings at visit 1. Patients wore an actigraph and entered weekly PROs on a smartphone app. Data for days 1-14 after visit 1 were analyzed. Chart reviews were conducted to tabulate all unexpected medical events across days 1-150. RESULTS: Neither oncologist nor patient ratings of performance status predicted steps/hour (actigraphy). The PROMIS® Physical Function PRO (average of Days 1, 7, 14) was associated with steps/hour at high (for men) and moderate (for women) levels; the PROMIS® Fatigue PRO predicted steps for men, but not for women. Unexpected medical events occurred in 57% of patients. Only body weight in female patients predicted events; oncologist and patient performance status ratings, steps/hour, and other PROs did not. CONCLUSIONS: PROMIS® Physical Function and Fatigue PROs show good correspondence with steps/hour making them easy, useful tools for oncologists to improve their assessment of performance status, especially for male patients. Female patients had lower levels of steps/hour than males and lower correlations among the predictors, suggesting the need for further work to improve performance status assessment in women. Assessment of pre-morbid sedentary behavior alongside current Physical Functioning and Fatigue PROs may allow for a more valid determination of disease-related activity level and performance status.

13.
IEEE J Transl Eng Health Med ; 7: 2800207, 2019.
Article in English | MEDLINE | ID: mdl-30800535

ABSTRACT

This paper examines how features extracted from full-day data recorded by wearable sensors are able to differentiate between infants with typical development and those with or at risk for developmental delays. Wearable sensors were used to collect full-day (8-13 h) leg movement data from infants with typical development ([Formula: see text]) and infants at risk for developmental delay ([Formula: see text]). At 24 months, at-risk infants were assessed as having good ([Formula: see text]) or poor ([Formula: see text]) developmental outcomes. With this limited size dataset, our statistical analysis indicated that accelerometer features collected earlier in infancy differentiated between at-risk infants with poor and good outcomes at 24 months, as well as infants with typical development. This paper also tested how these features performed on a subset of the data for which the infant movement was known, i.e., 5-min intervals more representative of clinical observations. Our results on this limited dataset indicated that features for full-day data showed more group differences than similar features for the 5-min intervals, supporting the usefulness of full-day movement monitoring.

14.
Clin Biomech (Bristol, Avon) ; 56: 61-69, 2018 07.
Article in English | MEDLINE | ID: mdl-29803824

ABSTRACT

BACKGROUND: Biomechanical characterization of human performance with respect to fatigue and fitness is relevant in many settings, but it is usually limited to either fully qualitative assessments or invasive methods that require a significant experimental setup consisting of numerous sensors, force plates, and motion detectors. Qualitative assessments are difficult to standardize due to their intrinsically subjective nature; invasive methods, on the other hand, provide reliable metrics but are not feasible for large-scale applications. METHODS: Presented here is a dynamical toolset for detecting performance groups using a non-invasive system based on the Microsoft Kinect motion capture sensor, and a case study of 37 cancer patients performing two clinically monitored tasks before and after therapy regimens. Dynamical features are extracted from the motion time series data and evaluated based on their ability to i) cluster patients into coherent fitness groups using unsupervised learning algorithms and ii) predict Eastern Cooperative Oncology Group performance status via supervised learning. FINDINGS: The unsupervised patient clustering is comparable to clustering based on physician-assigned Eastern Cooperative Oncology Group status in that both have similar concordance with change in weight before and after therapy as well as with unexpected hospitalizations throughout the study. The extracted dynamical features can predict physician, coordinator, and patient Eastern Cooperative Oncology Group status with an accuracy of approximately 80%. INTERPRETATION: The non-invasive Microsoft Kinect sensor and the proposed dynamical toolset, comprising data preprocessing, feature extraction, dimensionality reduction, and machine learning, offer a low-cost and general method for performance segregation and can complement existing qualitative clinical assessments.


Subject(s)
Body Weight; Monitoring, Physiologic; Movement; Neoplasms/physiopathology; Algorithms; Biomechanical Phenomena; Cluster Analysis; Female; Hospitalization; Humans; Machine Learning; Male; Self Report; Software; Weight Gain; Weight Loss

15.
J Pathol Inform ; 9: 2, 2018.
Article in English | MEDLINE | ID: mdl-29531847

ABSTRACT

The advent of digital pathology has introduced new avenues of diagnostic medicine. Among them, crowdsourcing has attracted researchers' attention in recent years, allowing them to engage thousands of untrained individuals in research and diagnosis. While several articles exist in this regard, prior works have not collectively documented them. We therefore aim to review the applications of crowdsourcing in human pathology in a semi-systematic manner. We first introduce a novel method for performing a systematic search of the literature. Utilizing this method, we then collect hundreds of articles and screen them against a predefined set of criteria. Furthermore, we crowdsource part of the screening process to examine another potential application of crowdsourcing. Finally, we review the selected articles and characterize the prior uses of crowdsourcing in pathology.

16.
Int J Bioinform Res Appl ; 3(1): 4-23, 2007.
Article in English | MEDLINE | ID: mdl-18048170

ABSTRACT

We propose a novel approach for recognising static and dynamic hand gestures by analysing the raw data streams generated by the sensors attached to the human hands. We utilise the concept of 'range of motion' in the movement of fingers and exploit this characteristic to analyse the acquired data for recognising hand signs. Our approach for hand gesture recognition addresses two major problems: user-dependency and device-dependency. Furthermore, we show that our approach neither requires calibration nor involves training. We apply our approach for recognising American Sign Language (ASL) signs and show that more than 75% accuracy in sign recognition can be achieved.


Subject(s)
Biomechanical Phenomena/methods; Computational Biology/methods; Sign Language; Algorithms; Artificial Intelligence; Calibration; Gestures; Hand; Humans; Information Storage and Retrieval; Models, Statistical; Motion; Nonverbal Communication; Pattern Recognition, Automated; Pattern Recognition, Visual; Recognition, Psychology; Visual Perception