Results 1 - 16 of 16
1.
J Chem Inf Model ; 63(12): 3659-3668, 2023 06 26.
Article in English | MEDLINE | ID: mdl-37312524

ABSTRACT

Machine learning models are increasingly being utilized to predict outcomes of organic chemical reactions. A large amount of reaction data is used to train these models, which is in stark contrast to how expert chemists discover and develop new reactions by leveraging information from a small number of relevant transformations. Transfer learning and active learning are two strategies that can operate in low-data situations, which may help fill this gap and promote the use of machine learning for tackling real-world challenges in organic synthesis. This Perspective introduces active and transfer learning and connects these to potential opportunities and directions for further research, especially in the area of prospective development of chemical transformations.


Subject(s)
Machine Learning , Prospective Studies , Chemistry Techniques, Synthetic
2.
J Med Internet Res ; 25: e43664, 2023 04 20.
Article in English | MEDLINE | ID: mdl-37079370

ABSTRACT

BACKGROUND: Although evidence supporting the feasibility of large-scale mobile health (mHealth) systems continues to grow, privacy protection remains an important implementation challenge. The potential scale of publicly available mHealth applications and the sensitive nature of the data involved will inevitably attract unwanted attention from adversarial actors seeking to compromise user privacy. Although privacy-preserving technologies such as federated learning (FL) and differential privacy (DP) offer strong theoretical guarantees, it is not clear how such technologies actually perform under real-world conditions. OBJECTIVE: Using data from the University of Michigan Intern Health Study (IHS), we assessed the privacy protection capabilities of FL and DP against the trade-offs in the associated model's accuracy and training time. Using a simulated external attack on a target mHealth system, we aimed to measure the effectiveness of such an attack under various levels of privacy protection on the target system and measure the costs to the target system's performance associated with the chosen levels of privacy protection. METHODS: A neural network classifier that attempts to predict IHS participant daily mood ecological momentary assessment score from sensor data served as our target system. An external attacker attempted to identify participants whose average mood ecological momentary assessment score is lower than the global average. The attack followed techniques in the literature, given the relevant assumptions about the abilities of the attacker. For measuring attack effectiveness, we collected attack success metrics (area under the curve [AUC], positive predictive value, and sensitivity), and for measuring privacy costs, we calculated the target model training time and measured the model utility metrics. Both sets of metrics are reported under varying degrees of privacy protection on the target. 
RESULTS: We found that FL alone does not provide adequate protection against the privacy attack proposed above, where the attacker's AUC in determining which participants exhibit lower-than-average mood is over 0.90 in the worst-case scenario. However, under the highest level of DP tested in this study, the attacker's AUC fell to approximately 0.59 with only a 10 percentage point decrease in the target's R² and a 43% increase in model training time. Attack positive predictive value and sensitivity followed similar trends. Finally, we showed that participants in the IHS most likely to require strong privacy protection are also most at risk from this particular privacy attack and subsequently stand to benefit the most from these privacy-preserving technologies. CONCLUSIONS: Our results demonstrated both the necessity of proactive privacy protection research and the feasibility of the current FL and DP methods implemented in a real mHealth scenario. Our simulation methods characterized the privacy-utility trade-off in our mHealth setup using highly interpretable metrics, providing a framework for future research into privacy-preserving technologies in data-driven health and medical applications.


Subject(s)
Privacy , Telemedicine , Humans , Algorithms , Computer Security , Neural Networks, Computer , Telemedicine/methods
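Editorial sketch: the abstract above does not say which DP mechanism was used. As an assumption, the snippet below illustrates one common approach in DP model training: clip each update to a fixed L2 norm, then add Gaussian noise scaled to that norm. The names `privatize_update`, `clip_norm`, and `noise_multiplier` are hypothetical and not taken from the study.

```python
import math
import random


def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Clip `update` to L2 norm `clip_norm`, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm.
    This is an illustrative mechanism, not the one used in the study.
    """
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [v * scale + rng.gauss(0.0, sigma) for v in update]


rng = random.Random(0)
raw = [3.0, 4.0]  # L2 norm is 5

# With no noise, only the clipping step acts on the update.
clipped = privatize_update(raw, clip_norm=1.0, noise_multiplier=0.0, rng=rng)

# With noise, the released update is perturbed around the clipped value.
noisy = privatize_update(raw, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

A larger `noise_multiplier` generally means stronger privacy and lower accuracy, which is the kind of trade-off the abstract reports.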
3.
J Med Internet Res ; 25: e46700, 2023 03 30.
Article in English | MEDLINE | ID: mdl-36995757

ABSTRACT

Brauneck and colleagues have combined technical and legal perspectives in their timely and valuable paper "Federated Machine Learning, Privacy-Enhancing Technologies, and Data Protection Laws in Medical Research: Scoping Review." Researchers who design mobile health (mHealth) systems must adopt the same privacy-by-design approach that privacy regulations (eg, General Data Protection Regulation) do. In order to do this successfully, we will have to overcome implementation challenges in privacy-enhancing technologies such as differential privacy. We will also have to pay close attention to emerging technologies such as private synthetic data generation.


Subject(s)
Biomedical Research , Telemedicine , Humans , Privacy , Computer Security , Machine Learning
4.
Exp Gerontol ; 173: 112107, 2023 03.
Article in English | MEDLINE | ID: mdl-36731807

ABSTRACT

Aging is a ubiquitous biological process that limits the maximal lifespan of most organisms. Significant efforts by many groups have identified mechanisms that, when triggered by natural or artificial stimuli, are sufficient to either enhance or decrease maximal lifespan. Previous aging studies using the nematode Caenorhabditis elegans (C. elegans) generated a wealth of publicly available transcriptomics datasets linking changes in gene expression to lifespan regulation. However, a comprehensive comparison of these datasets across studies in the context of aging biology is missing. Here, we carry out a systematic meta-analysis of over 1200 bulk RNA sequencing (RNASeq) samples obtained from 74 peer-reviewed publications on aging-related transcriptomic changes in C. elegans. Using both differential expression analyses and machine learning approaches, we mine the pooled data for novel pro-longevity genes. We find that both approaches identify known and propose novel pro-longevity genes. Further, we find that inter-lab experimental variance complicates the application of machine learning algorithms, a limitation that was not solved using bulk RNA-Seq batch correction and normalization techniques. Taken as a whole, our results indicate that machine learning approaches may hold promise for the identification of genes that regulate aging but will require more sophisticated batch correction strategies or standardized input data to reliably identify novel pro-longevity genes.


Subject(s)
Caenorhabditis elegans Proteins , Caenorhabditis elegans , Animals , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , RNA-Seq , Aging/genetics , Longevity/genetics
5.
NPJ Digit Med ; 6(1): 4, 2023 Jan 11.
Article in English | MEDLINE | ID: mdl-36631665

ABSTRACT

Gamification, the application of gaming elements to increase enjoyment and engagement, has the potential to improve the effectiveness of digital health interventions, but the effectiveness of competition-based gamification components during residency remains poorly understood. To address this gap, we evaluate the effect of a smartphone-based gamified team competition intervention on daily step count and sleep duration via a micro-randomized trial on medical interns. Our aim is to assess potential improvements in the factors (namely step count and sleep) that may help interns cope with stress and improve well-being. In 1779 interns, the team competition intervention significantly increases the mean daily step count by 105.8 steps (SE 35.8, p = 0.03) relative to the no-competition arm, but does not significantly affect the mean daily sleep minutes (p = 0.76). Moderator analyses indicate that the causal effects of competition on daily step count and sleep minutes decreased by 14.5 steps (SE 10.2, p = 0.16) and 1.9 minutes (SE 0.6, p = 0.003) for each additional week-in-study, respectively. Intra-institutional competition negatively moderates the causal effect of competition upon daily step count by -90.3 steps (SE 86.5, p = 0.30). Our results show that gamified team competition delivered via mobile app significantly increases daily physical activity, which suggests that team competition can function as a mobile health intervention tool to increase short-term physical activity levels for medical interns. Future improvements in strategies for forming competition opponents and introducing occasional competition breaks may improve the overall effectiveness.

6.
J Comput Chem ; 43(27): 1880-1886, 2022 10 15.
Article in English | MEDLINE | ID: mdl-36000759

ABSTRACT

Conformer-RL is an open-source Python package for applying deep reinforcement learning (RL) to the task of generating a diverse set of low-energy conformations for a single molecule. The library features a simple interface to train a deep RL conformer generation model on any covalently bonded molecule or polymer, including most drug-like molecules. Under the hood, it implements state-of-the-art RL algorithms and graph neural network architectures tuned specifically for molecular structures. Conformer-RL is also a platform for researching new algorithms and neural network architectures for conformer generation, as the library contains modular class interfaces for RL environments and agents, allowing users to easily swap components with their own implementations. Additionally, it comes with tools to visualize and save generated conformers for further analysis. Conformer-RL is well-tested and thoroughly documented with tutorials for each of the functionalities mentioned above, and is available on PyPi and Github: https://github.com/ZimmermanGroup/conformer-rl.


Subject(s)
Neural Networks, Computer , Reinforcement, Psychology , Algorithms , Molecular Conformation , Polymers
7.
Chem Sci ; 13(22): 6655-6668, 2022 Jun 07.
Article in English | MEDLINE | ID: mdl-35756521

ABSTRACT

Transfer and active learning have the potential to accelerate the development of new chemical reactions, using prior data and new experiments to inform models that adapt to the target area of interest. This article shows how specifically tuned machine learning models, based on random forest classifiers, can expand the applicability of Pd-catalyzed cross-coupling reactions to types of nucleophiles unknown to the model. First, model transfer is shown to be effective when reaction mechanisms and substrates are closely related, even when models are trained on relatively small numbers of data points. Then, a model simplification scheme is tested and found to provide comparable predictivity on reactions of new nucleophiles that include unseen reagent combinations. Lastly, for a challenging target where model transfer only provides a modest benefit over random selection, an active transfer learning strategy is introduced to improve model predictions. Simple models, composed of a small number of decision trees with limited depths, are crucial for securing generalizability, interpretability, and performance of active transfer learning.
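Editorial sketch: the abstract does not state how the active learning step chooses the next experiments. One widely used heuristic, assumed here purely for illustration, is uncertainty sampling: run the candidates on which a small tree ensemble is most evenly split. The function name and the vote fractions below are hypothetical.

```python
def most_uncertain(vote_fractions, k):
    """Return indices of the k candidates whose fraction of positive
    tree votes is closest to 0.5 (i.e. the ensemble is least decided)."""
    ranked = sorted(
        range(len(vote_fractions)),
        key=lambda i: abs(vote_fractions[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical fraction of trees predicting "reaction succeeds"
# for five untested reagent combinations.
fractions = [0.95, 0.50, 0.10, 0.60, 0.35]
picked = most_uncertain(fractions, k=2)
```

In this toy input the candidates at indices 1 and 3 are selected, since their vote fractions (0.50 and 0.60) are nearest to an even split.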

8.
JMIR Mhealth Uhealth ; 9(3): e23728, 2021 03 30.
Article in English | MEDLINE | ID: mdl-33783362

ABSTRACT

BACKGROUND: The use of wearables facilitates data collection at a previously unobtainable scale, enabling the construction of complex predictive models with the potential to improve health. However, the highly personal nature of these data requires strong privacy protection against data breaches and the use of data in a way that users do not intend. One method to protect user privacy while taking advantage of sharing data across users is federated learning, a technique that allows a machine learning model to be trained using data from all users while only storing a user's data on that user's device. By keeping data on users' devices, federated learning protects users' private data from data leaks and breaches on the researcher's central server and provides users with more control over how and when their data are used. However, there are few rigorous studies on the effectiveness of federated learning in the mobile health (mHealth) domain. OBJECTIVE: We review federated learning and assess whether it can be useful in the mHealth field, especially for addressing common mHealth challenges such as privacy concerns and user heterogeneity. The aims of this study are to describe federated learning in an mHealth context, apply a simulation of federated learning to an mHealth data set, and compare the performance of federated learning with the performance of other predictive models. METHODS: We applied a simulation of federated learning to predict the affective state of 15 subjects using physiological and motion data collected from a chest-worn device for approximately 36 minutes. We compared the results from this federated model with those from a centralized or server model and with the results from training individual models for each subject. 
RESULTS: In a 3-class classification problem using physiological and motion data to predict whether the subject was undertaking a neutral, amusing, or stressful task, the federated model achieved 92.8% accuracy on average, the server model achieved 93.2% accuracy on average, and the individual model achieved 90.2% accuracy on average. CONCLUSIONS: Our findings support the potential for using federated learning in mHealth. The results showed that the federated model performed better than a model trained separately on each individual and nearly as well as the server model. As federated learning offers more privacy than a server model, it may be a valuable option for designing sensitive data collection methods.


Subject(s)
Privacy , Telemedicine , Computer Simulation , Humans , Machine Learning , Research Design
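Editorial sketch: the federated setup described above (each user's data stays on that user's device, and only model parameters are shared) can be illustrated with a toy federated-averaging loop. This is a minimal simulation under stated assumptions, not the study's code: the one-parameter model, the learning rate, and the synthetic data are all hypothetical.

```python
import random


def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps for y ~ w * x on one client's data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(clients, rounds=20):
    """Each round: clients train locally, the server averages their weights.

    The average is weighted by each client's number of samples; raw data
    never leaves the client lists.
    """
    w = 0.0
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        local = [local_update(w, d) for d in clients]
        w = sum(wi * len(d) for wi, d in zip(local, clients)) / total
    return w


# Synthetic data: three clients, all generated with true slope 2.0.
rng = random.Random(0)
true_w = 2.0
clients = []
for _ in range(3):
    xs = [rng.uniform(-1, 1) for _ in range(50)]
    clients.append([(x, true_w * x + rng.gauss(0, 0.1)) for x in xs])

w_fed = federated_average(clients)
```

The server only ever sees the per-client weights `local`, which is the property the abstract credits for the privacy benefit.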
9.
J Med Internet Res ; 22(3): e15033, 2020 03 31.
Article in English | MEDLINE | ID: mdl-32229469

ABSTRACT

BACKGROUND: Individuals in stressful work environments often experience mental health issues, such as depression. Reducing depression rates is difficult because of persistently stressful work environments and inadequate time or resources to access traditional mental health care services. Mobile health (mHealth) interventions provide an opportunity to deliver real-time interventions in the real world. In addition, the delivery times of interventions can be based on real-time data collected with a mobile device. To date, data and analyses informing the timing of delivery of mHealth interventions are generally lacking. OBJECTIVE: This study aimed to investigate when to provide mHealth interventions to individuals in stressful work environments to improve their behavior and mental health. The mHealth interventions targeted 3 categories of behavior: mood, activity, and sleep. The interventions aimed to improve 3 different outcomes: weekly mood (assessed through a daily survey), weekly step count, and weekly sleep time. We explored when these interventions were most effective, based on previous mood, step, and sleep scores. METHODS: We conducted a 6-month micro-randomized trial on 1565 medical interns. Medical internship, during the first year of physician residency training, is highly stressful, resulting in depression rates several-fold higher than those of the general population. Every week, interns were randomly assigned to receive push notifications related to a particular category (mood, activity, sleep, or no notifications). Every day, we collected interns' daily mood valence, sleep, and step data. We assessed the causal effect moderation by the previous week's mood, steps, and sleep. Specifically, we examined changes in the effect of notifications containing mood, activity, and sleep messages based on the previous week's mood, step, and sleep scores. Moderation was assessed with a weighted and centered least-squares estimator.
RESULTS: We found that the previous week's mood negatively moderated the effect of notifications on the current week's mood with an estimated moderation of -0.052 (P=.001). That is, notifications had a better impact on mood when the studied interns had a low mood in the previous week. Similarly, we found that the previous week's step count negatively moderated the effect of activity notifications on the current week's step count, with an estimated moderation of -0.039 (P=.01) and that the previous week's sleep negatively moderated the effect of sleep notifications on the current week's sleep with an estimated moderation of -0.075 (P<.001). For all three of these moderators, we estimated that the treatment effect was positive (beneficial) when the moderator was low, and negative (harmful) when the moderator was high. CONCLUSIONS: These findings suggest that an individual's current state meaningfully influences their receptivity to mHealth interventions for mental health. Timing interventions to match an individual's state may be critical to maximizing the efficacy of interventions. TRIAL REGISTRATION: ClinicalTrials.gov NCT03972293; http://clinicaltrials.gov/ct2/show/NCT03972293.


Subject(s)
Internship and Residency/standards , Telemedicine/methods , Female , Humans , Male
10.
J Chem Inf Model ; 60(3): 1290-1301, 2020 03 23.
Article in English | MEDLINE | ID: mdl-32091880

ABSTRACT

In a departure from conventional chemical approaches, data-driven models of chemical reactions have recently been shown to be statistically successful using machine learning. These models, however, are largely black box in character and have not provided the kind of chemical insights that historically advanced the field of chemistry. To examine the knowledge base of machine-learning models (what does the machine learn?), this article deconstructs black-box machine-learning models of a diverse chemical reaction data set. Through experimentation with chemical representations and modeling techniques, the analysis provides insights into the nature of how statistical accuracy can arise, even when the model lacks informative physical principles. By peeling back the layers of these complicated models, we arrive at a minimal, chemically intuitive model (with no machine learning involved). This model is based on systematic reaction-type classification and Evans-Polanyi relationships within reaction types, which are easily visualized and interpreted. Through exploring this simple model, we gain deeper understanding of the data set and uncover a means for expert interactions to improve the model's reliability.


Subject(s)
Machine Learning , Reproducibility of Results
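Editorial sketch: an Evans-Polanyi relationship is a linear relation between activation energy and reaction enthalpy within one reaction type, E_a = E_0 + alpha * dH. The snippet below fits such a line by ordinary least squares. The energies are made-up illustrative values, not data from the article, and `fit_evans_polanyi` is a hypothetical helper name.

```python
def fit_evans_polanyi(delta_h, e_act):
    """Least-squares fit of E_a = e0 + alpha * dH for one reaction type.

    Returns (e0, alpha).
    """
    n = len(delta_h)
    mean_h = sum(delta_h) / n
    mean_e = sum(e_act) / n
    sxx = sum((h - mean_h) ** 2 for h in delta_h)
    sxy = sum((h - mean_h) * (e - mean_e) for h, e in zip(delta_h, e_act))
    alpha = sxy / sxx
    e0 = mean_e - alpha * mean_h
    return e0, alpha


# Hypothetical reaction enthalpies and activation energies (same units),
# generated exactly on the line E_a = 20 + 0.5 * dH.
delta_h = [-30.0, -20.0, -10.0, 0.0, 10.0]
e_act = [20.0 + 0.5 * h for h in delta_h]

e0, alpha = fit_evans_polanyi(delta_h, e_act)
```

Fitting one such line per reaction class is what makes a model of this kind easy to plot and inspect, in line with the interpretability argument in the abstract.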
11.
J Chem Inf Model ; 59(9): 3645-3654, 2019 09 23.
Article in English | MEDLINE | ID: mdl-31381340

ABSTRACT

Reaction databases provide a great deal of useful information to assist planning of experiments but do not provide any interpretation or chemical concepts to accompany this information. In this work, reactions are labeled with experimental conditions, and network analysis shows that consistencies within clusters of data points can be leveraged to organize this information. In particular, this analysis shows how particular experimental conditions (specifically solvent) are effective in enabling specific organic reactions (Friedel-Crafts, Aldol addition, Claisen condensation, Diels-Alder, and Wittig), including variations within each reaction class. Network analysis shows data points for reactions tend to break into clusters that depend on the catalyst and chemical structure. This type of clustering, which mimics how a chemist reasons, is derived directly from the network. Therefore, the findings of this work could augment synthesis planning by providing predictions in a fashion that mimics human chemists. To numerically evaluate solvent prediction ability, three methods are compared: network analysis (through the k-nearest neighbor algorithm), a support vector machine, and a deep neural network. The most accurate method in 4 of the 5 test cases is the network analysis, with deep neural networks also showing good prediction scores. The network analysis tool was evaluated by an expert panel of chemists, who generally agreed that the algorithm produced accurate solvent choices while simultaneously being transparent in the underlying reasons for its predictions.


Subject(s)
Chemistry Techniques, Synthetic , Machine Learning , Models, Chemical , Catalysis , Cycloaddition Reaction , Humans , Solvents/chemistry
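Editorial sketch: the abstract evaluates network analysis "through the k-nearest neighbor algorithm". A minimal k-NN classifier with majority voting is shown below. The two-dimensional reaction descriptors and solvent labels are hypothetical placeholders; the article's actual features are not given in the abstract.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training points (Euclidean distance).

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical reaction descriptors paired with the solvent used.
train = [
    ((0.0, 0.1), "THF"),
    ((0.1, 0.0), "THF"),
    ((0.2, 0.2), "toluene"),
    ((1.0, 1.0), "toluene"),
    ((0.9, 1.1), "toluene"),
]

pred = knn_predict(train, (0.05, 0.05), k=3)
```

Because the prediction is just a vote among identifiable neighboring reactions, the neighbors themselves can be shown to a chemist as the reason for the suggestion.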
12.
Ann Behav Med ; 52(6): 446-462, 2018 05 18.
Article in English | MEDLINE | ID: mdl-27663578

ABSTRACT

Background: The just-in-time adaptive intervention (JITAI) is an intervention design aiming to provide the right type/amount of support, at the right time, by adapting to an individual's changing internal and contextual state. The availability of increasingly powerful mobile and sensing technologies underpins the use of JITAIs to support health behavior, as in such a setting an individual's state can change rapidly, unexpectedly, and in his/her natural environment. Purpose: Despite the increasing use and appeal of JITAIs, a major gap exists between the growing technological capabilities for delivering JITAIs and research on the development and evaluation of these interventions. Many JITAIs have been developed with minimal use of empirical evidence, theory, or accepted treatment guidelines. Here, we take an essential first step towards bridging this gap. Methods: Building on health behavior theories and the extant literature on JITAIs, we clarify the scientific motivation for JITAIs, define their fundamental components, and highlight design principles related to these components. Examples of JITAIs from various domains of health behavior research are used for illustration. Conclusions: As we enter a new era of technological capacity for delivering JITAIs, it is critical that researchers develop sophisticated and nuanced health behavior theories capable of guiding the construction of such interventions. Particular attention has to be given to better understanding the implications of providing timely and ecologically sound support for intervention adherence and retention.


Subject(s)
Behavioral Medicine/methods , Health Behavior , Patient Compliance , Research Design , Telemedicine/methods , Humans
13.
Adv Neural Inf Process Syst ; 30: 5973-5981, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29225449

ABSTRACT

Contextual bandits have become popular as they offer a middle ground between very simple approaches based on multi-armed bandits and very complex approaches using the full power of reinforcement learning. They have demonstrated success in web applications and have a rich body of associated theoretical guarantees. Linear models are well understood theoretically and preferred by practitioners because they are not only easily interpretable but also simple to implement and debug. Furthermore, if the linear model is true, we get very strong performance guarantees. Unfortunately, in emerging applications in mobile health, the time-invariant linear model assumption is untenable. We provide an extension of the linear model for contextual bandits that has two parts: baseline reward and treatment effect. We allow the former to be complex but keep the latter simple. We argue that this model is plausible for mobile health applications. At the same time, it leads to algorithms with strong performance guarantees as in the linear model setting, while still allowing for complex nonlinear baseline modeling. Our theory is supported by experiments on data gathered in a recently concluded mobile health study.
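Editorial sketch: the two-part reward model described above can be written as reward = f(context) + action * (treatment effect), with f allowed to be complex. One way to see why the simple part stays estimable, assuming actions are randomized with a known probability `pi`, is to center the action: multiplying the reward by (action - pi) removes the baseline in expectation. The simulation below is an illustration of that idea with a scalar context and hypothetical parameter values, not the paper's algorithm.

```python
import math
import random


def estimate_treatment_effect(records, pi):
    """Estimate theta in reward = f(x) + a * theta * x + noise.

    Uses action centering: E[(a - pi) * r | x] = pi * (1 - pi) * theta * x,
    so the unknown baseline f(x) drops out in expectation.
    `records` is a list of (x, a, r) tuples.
    """
    num = 0.0
    den = 0.0
    for x, a, r in records:
        pseudo = (a - pi) * r / (pi * (1 - pi))  # mean is theta * x
        num += x * pseudo
        den += x * x
    return num / den  # least squares through the origin


rng = random.Random(0)
theta_true = 1.5  # hypothetical linear treatment effect
pi = 0.5          # known randomization probability

records = []
for _ in range(20000):
    x = rng.uniform(-1, 1)
    a = 1 if rng.random() < pi else 0
    baseline = math.sin(5 * x) + 1.0  # nonlinear baseline, never modeled
    r = baseline + a * theta_true * x + rng.gauss(0, 0.1)
    records.append((x, a, r))

theta_hat = estimate_treatment_effect(records, pi)
```

The estimator never fits the baseline, which mirrors the abstract's point that the baseline reward can be complex while the treatment effect model stays simple.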

14.
Stat Med ; 35(12): 1944-71, 2016 05 30.
Article in English | MEDLINE | ID: mdl-26707831

ABSTRACT

The use and development of mobile interventions are experiencing rapid growth. In "just-in-time" mobile interventions, treatments are provided via a mobile device, and they are intended to help an individual make healthy decisions 'in the moment,' and thus have a proximal, near future impact. Currently, the development of mobile interventions is proceeding at a much faster pace than that of associated data science methods. A first step toward developing data-based methods is to provide an experimental design for testing the proximal effects of these just-in-time treatments. In this paper, we propose a 'micro-randomized' trial design for this purpose. In a micro-randomized trial, treatments are sequentially randomized throughout the conduct of the study, with the result that each participant may be randomized at the 100s or 1000s of occasions at which a treatment might be provided. Further, we develop a test statistic for assessing the proximal effect of a treatment as well as an associated sample size calculator. We conduct simulation evaluations of the sample size calculator in various settings. Rules of thumb that might be used in designing a micro-randomized trial are discussed. This work is motivated by our collaboration on the HeartSteps mobile application designed to increase physical activity. Copyright © 2015 John Wiley & Sons, Ltd.


Subject(s)
Mobile Applications , Randomized Controlled Trials as Topic/standards , Sample Size , Exercise , Health Promotion/methods , Humans , Mobile Applications/statistics & numerical data , Randomized Controlled Trials as Topic/methods , Randomized Controlled Trials as Topic/statistics & numerical data , Statistics as Topic
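Editorial sketch: in a micro-randomized trial each participant is randomized again at every decision occasion. The toy simulation below generates such data and estimates the proximal effect with a plain difference in means. This is a deliberate simplification: the paper develops its own test statistic and sample size calculator, which are not reproduced here, and all parameter values are hypothetical.

```python
import random


def simulate_mrt(n_participants, n_occasions, p_treat, effect, rng):
    """Randomize treatment independently at every occasion for every
    participant; the proximal outcome shifts by `effect` when treated."""
    rows = []
    for pid in range(n_participants):
        for t in range(n_occasions):
            a = 1 if rng.random() < p_treat else 0
            y = rng.gauss(0.0, 1.0) + effect * a
            rows.append((pid, t, a, y))
    return rows


def proximal_effect(rows):
    """Difference in mean proximal outcome, treated minus untreated."""
    treated = [y for _, _, a, y in rows if a == 1]
    control = [y for _, _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


rng = random.Random(1)
rows = simulate_mrt(n_participants=40, n_occasions=200,
                    p_treat=0.5, effect=0.3, rng=rng)
est = proximal_effect(rows)
```

With 40 participants and 200 occasions each there are 8000 randomizations in total, which illustrates how repeated within-person randomization gives many observations for estimating a proximal effect.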
15.
Health Psychol ; 34S: 1220-8, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26651463

ABSTRACT

OBJECTIVE: This article presents an experimental design, the microrandomized trial, developed to support optimization of just-in-time adaptive interventions (JITAIs). JITAIs are mHealth technologies that aim to deliver the right intervention components at the right times and locations to optimally support individuals' health behaviors. Microrandomized trials offer a way to optimize such interventions by enabling modeling of causal effects and time-varying effect moderation for individual intervention components within a JITAI. METHOD: The article describes the microrandomized trial design, enumerates research questions that this experimental design can help answer, and provides an overview of the data analyses that can be used to assess the causal effects of studied intervention components and investigate time-varying moderation of those effects. RESULTS: Microrandomized trials enable causal modeling of proximal effects of the randomized intervention components and assessment of time-varying moderation of those effects. CONCLUSION: Microrandomized trials can help researchers understand whether their interventions are having intended effects, when and for whom they are effective, and what factors moderate the interventions' effects, enabling creation of more effective JITAIs.


Subject(s)
Adaptation, Psychological , Early Medical Intervention/methods , Health Behavior , Randomized Controlled Trials as Topic/methods , Telemedicine/methods , Early Medical Intervention/trends , Humans , Research Design/standards , Telemedicine/trends
16.
PLoS One ; 8(5): e58977, 2013.
Article in English | MEDLINE | ID: mdl-23650495

ABSTRACT

Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets in biology, we can leverage statistical and machine learning methods to help us achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction, and is very closely related to some of the recent methods proposed for gene-disease association inference. The second method, called Catapult (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine where the features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies, on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Finally, by measuring the performance of the methods using two different evaluation strategies, we show that even though both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas Catapult is better suited to correctly identifying gene-trait associations overall [corrected].


Subject(s)
Genetic Association Studies/methods , Algorithms , Animals , Gene Regulatory Networks , Humans , Models, Genetic , Models, Statistical , Protein Interaction Mapping , Social Networking , Support Vector Machine
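Editorial sketch: the Katz measure scores a pair of nodes by counting walks between them, with longer walks damped by a factor beta per step: score(i, j) = sum over l of beta^l * (A^l)[i][j]. The truncated version below is an illustration on a three-node path graph. The graph, `beta`, and `max_len` are hypothetical choices, not the settings used in the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def katz_scores(adj, beta, max_len):
    """Truncated Katz measure: sum of beta**l * A**l for l = 1..max_len."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # holds A**l, starting at l = 1
    weight = beta                    # holds beta**l
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)
        weight *= beta
    return scores


# Path graph 0 - 1 - 2 (e.g. gene, gene, trait in a toy network).
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = katz_scores(adj, beta=0.1, max_len=3)
```

Nodes 0 and 2 share no edge but still receive a nonzero score through the two-step walk via node 1, which is how a walk-based measure can suggest links for sparsely annotated nodes.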