ABSTRACT
Healthcare researchers are increasingly utilizing smartphone sensor data as a scalable and cost-effective approach to studying individualized health-related behaviors in real-world settings. However, to develop reliable and robust digital behavioral signatures that may help in the early prediction of individualized disease trajectories and future prognosis, there is a critical need to quantify the variability that may be present in the underlying sensor data due to variations in the smartphone hardware and software used by a large population. Using sensor data collected in real-world settings from 3000 participants' smartphones for up to 84 days, we compared differences in the completeness, correctness, and consistency of the three most common smartphone sensors (the accelerometer, gyroscope, and GPS) within and across Android and iOS devices. Our findings show considerable variation in sensor data quality within and across Android and iOS devices. Sensor data from iOS devices showed significantly lower anomalous point density (APD) than data from Android devices across all sensors (p < 1 × 10⁻⁴). iOS devices also showed a considerably lower missing data ratio (MDR) for the accelerometer than for the GPS data (p < 1 × 10⁻⁴). Notably, quality features derived from raw sensor data alone could predict the device type (Android vs. iOS) with up to 0.98 accuracy (95% CI [0.977, 0.982]). Such significant differences in the quantity and quality of sensor data gathered from iOS and Android platforms could lead to considerable variation in health-related inferences derived from heterogeneous consumer-owned smartphones. Our research highlights the importance of assessing, measuring, and adjusting for such critical differences in smartphone sensor-based assessments.
Understanding the factors contributing to the variation in sensor data based on daily device usage will help develop reliable, standardized, inclusive, and practically applicable digital behavioral patterns that may be linked to health outcomes in real-world settings.
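The two quality features central to this abstract, missing data ratio (MDR) and anomalous point density (APD), can be illustrated with a minimal sketch. The definitions below are assumptions chosen for illustration (per-window sample counting against a nominal rate, and a physical-plausibility range check); the study's exact formulas are not given in the abstract.

```python
# Illustrative (assumed) definitions of two sensor-quality features named in
# the abstract: missing data ratio (MDR) and anomalous point density (APD).
# These are plausible per-window computations over a timestamped accelerometer
# stream, not the study's actual implementation.

def missing_data_ratio(timestamps, expected_hz, window_s):
    """Fraction of expected samples that never arrived in the window."""
    expected = expected_hz * window_s
    observed = len(timestamps)
    return max(0.0, (expected - observed) / expected)

def anomalous_point_density(values, lo, hi):
    """Fraction of samples outside a physically plausible range [lo, hi]."""
    if not values:
        return 0.0
    anomalous = sum(1 for v in values if v < lo or v > hi)
    return anomalous / len(values)

# toy example: a 10 s window sampled at a nominal 50 Hz, with dropouts
ts = [i / 50.0 for i in range(400)]   # only 400 of the 500 expected samples
acc = [0.0] * 395 + [99.0] * 5        # 5 physically implausible spikes (in g)
print(missing_data_ratio(ts, expected_hz=50, window_s=10))   # 0.2
print(anomalous_point_density(acc, lo=-16.0, hi=16.0))       # 0.0125
```

Window length, nominal sampling rate, and plausibility bounds are all device- and sensor-specific choices, which is one reason such features end up discriminating Android from iOS so strongly.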
Subjects
Accelerometry, Smartphone, Humans, Accelerometry/instrumentation, Data Accuracy, Female, Male, Adult

ABSTRACT
In 2021, the National Guideline Alliance for the Royal College of Obstetricians and Gynaecologists reviewed the body of evidence, including two meta-analyses, implicating supine sleeping position as a risk factor for growth restriction and stillbirth. While they concluded that pregnant people should be advised to avoid going to sleep on their back after 28 weeks' gestation, their main critique of the evidence was that, to date, all studies were retrospective and sleeping position was not objectively measured. As such, the Alliance noted that it would not be possible to prospectively study the associations between sleeping position and adverse pregnancy outcomes. Our aim was to demonstrate the feasibility of building a vision-based model for automated and accurate detection and quantification of sleeping position throughout the third trimester: a model intended to be developed further and used by researchers as a tool to either confirm or disprove the aforementioned associations. We completed a Canada-wide, cross-sectional study in 24 participants in the third trimester. Infrared videos of eleven simulated sleeping positions unique to pregnancy and a sitting position, both with and without bed sheets covering the body, were prospectively collected. We extracted 152,618 images from 48 videos, semi-randomly down-sampled and annotated 5,970 of them, and fed them into a deep learning algorithm, which trained and validated six models via six-fold cross-validation. The performance of the models was evaluated using an unseen testing set. The models detected the twelve positions, with and without bed sheets covering the body, achieving an average precision of 0.72 and 0.83, respectively, and an average recall ("sensitivity") of 0.67 and 0.76, respectively. For the supine class with and without bed sheets covering the body, the models achieved an average precision of 0.61 and 0.75, respectively, and an average recall of 0.74 and 0.81, respectively.
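The per-class precision and recall figures reported above can be made concrete with a small sketch. The detection counts below are hypothetical toy values (chosen to land near the reported supine-without-sheets figures), not data from the study.

```python
# Illustrative computation of per-class precision and recall, the two metrics
# reported for the sleeping-position detection models. Counts are toy values.

def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); recall ("sensitivity") = TP/(TP+FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# hypothetical detections for a "supine" class on an unseen test set
p, r = precision_recall(tp=75, fp=25, fn=18)
print(round(p, 2), round(r, 2))  # 0.75 0.81
```

Reporting both metrics matters here: a supine detector with high recall but low precision would over-count the very position the guideline advises against, while the reverse would miss it.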
ABSTRACT
BACKGROUND: Smartphones are increasingly used in health research. They provide a continuous connection between participants and researchers to monitor long-term health trajectories of large populations at a fraction of the cost of traditional research studies. However, despite the potential of using smartphones in remote research, there is an urgent need to develop effective strategies to reach, recruit, and retain the target populations in a representative and equitable manner. OBJECTIVE: We aimed to investigate the impact of combining different recruitment and incentive distribution approaches used in remote research on cohort characteristics and long-term retention. The real-world factors significantly impacting active and passive data collection were also evaluated. METHODS: We conducted a secondary data analysis of participant recruitment and retention using data from a large remote observation study aimed at understanding real-world factors linked to cold, influenza, and the impact of traumatic brain injury on daily functioning. We conducted recruitment in 2 phases between March 15, 2020, and January 4, 2022. Over 10,000 smartphone owners in the United States were recruited to provide 12 weeks of daily surveys and smartphone-based passive-sensing data. Using multivariate statistics, we investigated the potential impact of different recruitment and incentive distribution approaches on cohort characteristics. Survival analysis was used to assess the effects of sociodemographic characteristics on participant retention across the 2 recruitment phases. Associations between passive data-sharing patterns and demographic characteristics of the cohort were evaluated using logistic regression. RESULTS: We analyzed over 330,000 days of engagement data collected from 10,000 participants. 
Our key findings are as follows: First, the overall characteristics of participants recruited using digital advertisements on social media and news media differed significantly from those of participants recruited using crowdsourcing platforms (Prolific and Amazon Mechanical Turk; P<.001). Second, participant retention varied significantly across study phases, recruitment sources, and socioeconomic and demographic factors (P<.001). Third, notable differences in passive data collection were associated with device type (Android vs iOS) and participants' sociodemographic characteristics; Black or African American participants were significantly less likely to share passive sensor data streams than non-Hispanic White participants (odds ratio 0.44-0.49, 95% CI 0.35-0.61; P<.001). Fourth, participants were more likely to adhere to baseline surveys if the surveys were administered immediately after enrollment. Fifth, technical glitches could significantly impact real-world data collection in remote settings, which can severely undermine the generation of reliable evidence. CONCLUSIONS: Our findings highlight several factors, such as recruitment platforms, incentive distribution frequency, the timing of baseline surveys, device heterogeneity, and technical glitches in data collection infrastructure, that could impact remote long-term data collection. Together, these empirical findings could help inform best practices for monitoring anomalies during real-world data collection and for recruiting and retaining target populations in a representative and equitable manner.
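The data-sharing result above is reported as an odds ratio with a 95% CI, the standard way logistic-regression coefficients are presented. A minimal sketch of that conversion, using a toy coefficient and standard error (not values from the study):

```python
# Illustrative conversion of a logistic-regression coefficient (log-odds) to
# an odds ratio with a Wald 95% CI, the form used for the passive-data-sharing
# finding (e.g., OR 0.44-0.49, 95% CI 0.35-0.61). Beta and SE are toy values.
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Exponentiate a log-odds coefficient and its Wald 95% CI endpoints."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# hypothetical coefficient for a binary group indicator
or_est, ci_lo, ci_hi = odds_ratio_ci(beta=-0.78, se=0.17)
print(round(or_est, 2), round(ci_lo, 2), round(ci_hi, 2))  # 0.46 0.33 0.64
```

An OR below 1 with a CI excluding 1 (as in the reported 0.35-0.61) indicates the group is significantly less likely to share the data stream, holding the model's other covariates fixed.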