Results 1 - 20 of 25
1.
PLoS One ; 19(7): e0307136, 2024.
Article in English | MEDLINE | ID: mdl-39024327

ABSTRACT

Maintaining stable blood glucose (BG) levels within the normal range is crucial for preventing long-term health complications when managing a chronic disease such as Type 1 diabetes (T1D), as well as for managing body weight. Accurately forecasting blood glucose levels is therefore of significant importance to clinicians and to specific user groups, such as patients with T1D. In recent years, Continuous Glucose Monitoring (CGM) devices have been developed and are now in use; however, the ability to forecast future blood glucose values is essential for better management. Previous studies proposed documenting food intake in order to enhance forecasting accuracy. Unfortunately, these methods require participants to manually record daily activities such as food intake, drinking, and exercise, which yields somewhat inaccurate data and is hard to maintain over time. To reduce the burden on participants and improve the accuracy of BG level predictions, as well as to optimize training and prediction times, this study proposes a framework that continuously tracks participants' movements using a smartwatch. The framework analyzes sensor data and allows users to document their activities. We developed a model incorporating BG data, smartwatch sensor data, and user-documented activities, and applied it to a dataset we collected from a dozen participants. Our results indicate that documented activities did not enhance BG level predictions. However, using smartwatch sensors, such as heart rate and step-detector data, in addition to the blood glucose measurements from the last sixty minutes, significantly improved the predictions.
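As a rough illustration of the input described above (not the authors' model), one could assemble a feature vector from the last sixty minutes of CGM readings plus smartwatch-derived heart-rate and step features, and compare against a simple trend-extrapolation baseline. The 5-minute sampling rate and the feature names are assumptions:

```python
def make_features(glucose_hist, heart_rates, steps):
    # glucose_hist: last 12 CGM readings (60 min at an assumed 5-min sampling rate)
    trend = glucose_hist[-1] - glucose_hist[0]          # net change over the hour
    hr_mean = sum(heart_rates) / len(heart_rates)        # smartwatch heart-rate summary
    return list(glucose_hist) + [hr_mean, sum(steps), trend]

def naive_forecast(glucose_hist, horizon_steps=6):
    # linear extrapolation of the last-hour trend -- a baseline, not the paper's model
    slope = (glucose_hist[-1] - glucose_hist[0]) / (len(glucose_hist) - 1)
    return glucose_hist[-1] + slope * horizon_steps
```

A learned model would replace `naive_forecast`, consuming the feature vector from `make_features`.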


Subject(s)
Blood Glucose Self-Monitoring , Blood Glucose , Humans , Blood Glucose/analysis , Blood Glucose Self-Monitoring/instrumentation , Blood Glucose Self-Monitoring/methods , Diabetes Mellitus, Type 1/blood , Male , Female , Adult , Wearable Electronic Devices
2.
J Biomed Inform ; 156: 104665, 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38852777

ABSTRACT

OBJECTIVE: Develop a new method for continuous prediction that utilizes a single temporal pattern ending with an event of interest and its multiple instances detected in the temporal data. METHODS: Use temporal abstraction to transform time series, instantaneous events, and time intervals into a uniform representation using symbolic time intervals (STIs). Introduce a new approach to event prediction using a single time intervals-related pattern (TIRP), which can learn models to predict whether and when an event of interest will occur, based on multiple instances of a pattern that end with the event. RESULTS: The proposed methods achieved an average improvement of 5% in AUROC over LSTM-FCN, the best-performing of the evaluated baseline models (RawXGB, ResNet, LSTM-FCN, and ROCKET), on real-life datasets. CONCLUSION: The proposed methods for predicting events continuously have the potential to be used in a wide range of real-world and real-time applications in diverse domains with heterogeneous multivariate temporal data. For example, they could be used to predict panic attacks early using wearable devices or to predict complications early in intensive care unit patients.
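The temporal-abstraction step described above can be sketched as follows: a sampled series is first symbolized against value thresholds, and consecutive identical symbols are then merged into symbolic time intervals. The thresholds and symbol names here are illustrative assumptions, not the paper's abstraction scheme:

```python
def to_symbols(series, low, high):
    # map each sample to a symbolic state: Low / Normal / High
    return ['L' if v < low else 'H' if v > high else 'N' for v in series]

def to_intervals(symbols):
    # merge runs of identical symbols into (symbol, start, end) time intervals
    intervals, start = [], 0
    for i in range(1, len(symbols) + 1):
        if i == len(symbols) or symbols[i] != symbols[start]:
            intervals.append((symbols[start], start, i - 1))
            start = i
    return intervals
```

The resulting intervals form the uniform STI representation over which TIRPs can be mined.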

3.
PLoS One ; 19(3): e0297270, 2024.
Article in English | MEDLINE | ID: mdl-38437185

ABSTRACT

Professional bicycle racing is a popular sport that has attracted significant attention in recent years. The evolution and ubiquitous use of sensors allow cyclists to measure many metrics, including power, heart rate, speed, cadence, and more, in both training and racing. In this paper we explore, for the first time, the assignment of a subset of a team's cyclists to an upcoming race. We introduce RaceFit, a model that recommends cyclists for participation in an upcoming race based on their recent workouts and past assignments. RaceFit consists of binary classifiers that are trained on pairs of a cyclist and a race, described by their relevant properties (features): the cyclist's demographic properties and features extracted from his workout data from recent weeks, as well as properties of the race, such as its distance, elevation gain, and more. Two main approaches are introduced: recommending for each stage in a race and aggregating the stage-level recommendations to the race level, or recommending for the entire race directly. Model training is based on binary labels that represent the participation of a cyclist in a race (or in a stage) in past events. We evaluated RaceFit rigorously on a large dataset of cyclists and race data from three pro-cycling teams, achieving up to 80% precision@i. The first experiment showed that using TP or STRAVA data performs the same. The best-performing configuration of the framework used a five-week time window, imputation was effective, and the CatBoost classifier performed best. With any of these parameter settings, however, the model always performed better than the baselines, in which cyclists are assigned based on their popularity in historical data. Additionally, we present the top-ranked predictive features.
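The cyclist-race pairing described above can be sketched as a feature-construction step: each training row combines a cyclist's demographics and recent workout summary with the race's properties. The field names, the `(weeks_ago, load)` workout schema, and the five-week window are assumptions for illustration; the paper's classifier (CatBoost) would then consume such rows:

```python
def pair_features(cyclist, race, workouts, weeks=5):
    # workouts: list of (weeks_ago, training_load) tuples -- a hypothetical schema
    recent = [load for weeks_ago, load in workouts if weeks_ago < weeks]
    mean_load = sum(recent) / len(recent) if recent else 0.0
    # one row per (cyclist, race) pair; binary label = did the cyclist ride this race
    return [cyclist["age"], mean_load, race["distance_km"], race["elevation_gain_m"]]
```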


Subject(s)
Bicycling , Sports , Benchmarking , Heart Rate
4.
Artif Intell Med ; 139: 102525, 2023 05.
Article in English | MEDLINE | ID: mdl-37100504

ABSTRACT

Prevention and treatment of complications are the backbone of medical care, particularly in critical care settings. Early detection and prompt intervention can potentially prevent complications from occurring and improve outcomes. In this study, we use four longitudinal vital-signs variables of intensive care unit patients, focusing on predicting acute hypertensive episodes (AHEs). These episodes represent elevations in blood pressure and may result in clinical damage or indicate a change in a patient's clinical situation, such as an elevation in intracranial pressure or kidney failure. Prediction of AHEs may allow clinicians to anticipate changes in the patient's condition and respond early to prevent them from occurring. Temporal abstraction was employed to transform the multivariate temporal data into a uniform representation of symbolic time intervals, from which frequent time-intervals-related patterns (TIRPs) are mined and used as features for AHE prediction. A novel TIRP metric for classification, called coverage, is introduced, which measures the coverage of a TIRP's instances in a time window. For comparison, several baseline models were applied to the raw time series data, including logistic regression and sequential deep learning models. Our results show that using frequent TIRPs as features outperforms the baseline models, and that the coverage metric outperforms the other TIRP metrics. Two approaches to predicting AHEs under real-life application conditions were evaluated: using a sliding window to continuously predict whether a patient would experience an AHE within a specific prediction period ahead, our models produced an AUC-ROC of 82%, but with a low AUPRC; alternatively, predicting whether an AHE would occur at some point during the entire admission resulted in an AUC-ROC of 74%.
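A plausible reading of the coverage metric named above is the fraction of a time window covered by the union of a TIRP's instance spans; the exact definition in the paper may differ, so treat this as a hedged sketch:

```python
def coverage(instances, window_start, window_end):
    # instances: (start, end) spans where the TIRP was detected in the window
    covered, cursor = 0, window_start
    for s, e in sorted(instances):
        s, e = max(s, cursor), min(e, window_end)  # clip to window, avoid double-counting
        if e > s:
            covered += e - s
            cursor = e
    return covered / (window_end - window_start)
```

Overlapping instances are merged, so a TIRP repeated densely in one spot does not inflate the score.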


Subject(s)
Hypertension , Intensive Care Units , Humans , Critical Illness , Blood Pressure , Critical Care , Hypertension/diagnosis
5.
J Biomed Inform ; 134: 104198, 2022 10.
Article in English | MEDLINE | ID: mdl-36100163

ABSTRACT

Mortality prevention in the elderly T2D population having Chronic Kidney Disease (CKD) may be possible through risk assessment and predictive modeling. In this study we investigate the ability to predict mortality using heterogeneous Electronic Health Records data. Temporal abstraction is employed to transform the heterogeneous multivariate temporal data into a uniform representation of symbolic time intervals, from which frequent Time Intervals Related Patterns (TIRPs) are then discovered. In this study, however, a novel representation of the TIRPs is introduced, which enables their incorporation into deep learning networks. We describe the use of iTirps and bTirps, in which a TIRP is represented over time by an integer vector and a binary vector, respectively. While a bTirp represents whether any instance of the TIRP was present, an iTirp represents how many instances were present. While the framework showed encouraging results, a major challenge is often the large number of TIRPs, which may cause the models to under-perform. We introduce a novel TIRP selection method, called TIRP Ranking Criteria (TRC), which consists of TIRP metrics such as the differences in a TIRP's recurrence, its frequency, and its average duration between the classes. Additionally, we introduce an advanced version, called TRC Redundant TIRP Removal (TRC-RTR), in which TIRPs that are highly correlated are candidates for removal. The selected subset of iTirps/bTirps is then fed into a deep learning architecture such as a Recurrent Neural Network or a Convolutional Neural Network. Furthermore, a predictive committee is utilized, in which raw data and iTirp data are both used as input. Our results show that iTirp-based models that use a subset of iTirps selected by the TRC-RTR method outperform models that use raw data or models that use the full set of discovered iTirps.
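The selection step described above might be sketched as: rank TIRPs by how strongly a per-patient metric differs between the classes (a stand-in for the TRC criteria), then greedily drop TIRPs that correlate highly with an already-kept one (the TRC-RTR idea). The specific gap score and the 0.9 correlation cutoff are assumptions:

```python
def pearson(x, y):
    n = len(x); mx = sum(x) / n; my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x); vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def trc_select(features, labels, k, r_max=0.9):
    # features: {tirp_name: per-patient metric values}; labels: 0/1 per patient
    def class_gap(vals):
        pos = [v for v, y in zip(vals, labels) if y]
        neg = [v for v, y in zip(vals, labels) if not y]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    ranked = sorted(features, key=lambda t: class_gap(features[t]), reverse=True)
    kept = []
    for t in ranked:  # keep best-ranked TIRPs that are not redundant with kept ones
        if all(abs(pearson(features[t], features[u])) <= r_max for u in kept):
            kept.append(t)
        if len(kept) == k:
            break
    return kept
```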


Subject(s)
Diabetes Mellitus, Type 2 , Electronic Health Records , Aged , Humans , Neural Networks, Computer
6.
Cell Syst ; 13(9): 711-723.e7, 2022 09 21.
Article in English | MEDLINE | ID: mdl-35921844

ABSTRACT

Multicellular synchronization is a ubiquitous phenomenon in living systems. However, how noisy and heterogeneous behaviors of individual cells are integrated across a population toward multicellular synchronization is unclear. Here, we study the process of multicellular calcium synchronization of the endothelial cell monolayer in response to mechanical stimuli. We applied information theory to quantify the asymmetric information transfer between pairs of cells and defined quantitative measures of how single cells receive or transmit information within a multicellular network. Our analysis revealed that multicellular synchronization was established by gradual enhancement of information spread from the single-cell to the multicellular scale. Synchronization was associated with heterogeneity in the cells' communication properties, reinforcement of the cells' state, and information flow. Altogether, we suggest a phenomenological model where cells gradually learn their local environment, adjust, and reinforce their internal state to stabilize the multicellular network architecture to support information flow from local to global scales toward multicellular synchronization.


Subject(s)
Calcium , Information Theory , Cell Communication
7.
J Biomed Inform ; 134: 104169, 2022 10.
Article in English | MEDLINE | ID: mdl-36038065

ABSTRACT

Temporal knowledge discovery in clinical problems is crucial in the data science era. Meaningful computational progress has been made in the discovery of frequent temporal patterns, which may capture potentially meaningful knowledge. However, for temporal knowledge discovery and acquisition, effective visualization is essential, and there remains much room for contribution. Visualization of frequent temporal patterns has been relatively under-researched, yet it offers meaningful opportunities for usable tools that assist domain experts, or researchers, in exploring and acquiring temporal knowledge. In this paper, a novel approach for the visualization of an enumeration tree of frequent temporal patterns is introduced, whether the patterns are mined from a single population or compared between two separate populations. While this approach is relevant to any sequence-based patterns, we demonstrate its use in the most complex scenario, that of time intervals related patterns (TIRPs). The interface enables users to browse an enumeration tree of frequent patterns or search for specific patterns, as well as to discover the most discriminating TIRPs between two populations. For that purpose, a novel visualization of the temporal patterns is introduced using a bubble chart, in which each bubble represents a temporal pattern and the chart axes represent the various metrics of the patterns, such as their frequency, recurrence, and more. This provides a fast overview of the patterns as a whole, as well as access to specific ones. We present a comprehensive and rigorous user study on two real-life datasets, demonstrating the usability advantages of the novel approaches.


Subject(s)
Data Visualization , Pattern Recognition, Automated , Time
8.
Artif Intell Med ; 130: 102325, 2022 08.
Article in English | MEDLINE | ID: mdl-35809964

ABSTRACT

Mortality in the elderly type II diabetic population can sometimes be prevented through intervention, for which risk assessment through predictive modeling is required. Since Electronic Health Records data are typically heterogeneous and sparse, Temporal Abstraction and time intervals mining are employed to discover frequent Time Intervals Related Patterns (TIRPs). While TIRPs are used as features for a predictive model, the temporal relations between them in general, and among each TIRP's instances in particular, are not represented. We introduce a novel TIRP-based representation called integer-TIRP (iTirp), in which the TIRPs become channels containing values that represent the number of TIRP instances detected at each time point. The iTirp representation is then fed into a deep learning architecture that learns these kinds of temporal relations, using a Recurrent Neural Network or a Convolutional Neural Network. Additionally, a predictive committee is introduced, in which raw data and iTirp data are concatenated as inputs. Our results show that iTirp-based models outperform deep learning with raw data, resulting in an AUC of 82%.
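The channel construction described above can be sketched directly: given the detected instance spans of one TIRP, the iTirp channel counts instances active at each time point, and the bTirp variant reduces that to presence/absence. Inclusive integer time points are an assumption here:

```python
def itirp_channel(instances, horizon):
    # instances: (start, end) inclusive spans where this TIRP was detected
    counts = [0] * horizon
    for s, e in instances:
        for t in range(max(s, 0), min(e, horizon - 1) + 1):
            counts[t] += 1          # how many instances cover time point t
    return counts

def btirp_channel(instances, horizon):
    # binary variant: was any instance of the TIRP present at time t?
    return [1 if c > 0 else 0 for c in itirp_channel(instances, horizon)]
```

Stacking one such channel per TIRP yields a multichannel sequence suitable for an RNN or CNN.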


Subject(s)
Diabetes Mellitus, Type 2 , Neural Networks, Computer , Aged , Electronic Health Records , Humans
9.
J Biomed Inform ; 117: 103734, 2021 05.
Article in English | MEDLINE | ID: mdl-33711544

ABSTRACT

Outcome prediction in Electronic Health Records (EHR), and specifically in critical care, is attracting increasing exploration and research. In this study, we used clinical data from the Intensive Care Unit (ICU), focusing on ICU-acquired sepsis. In the current literature, several evaluation approaches are reported, inspired by epidemiological designs, some of which do not always reflect real-life application conditions. This problem is relevant to outcome prediction in longitudinal EHR data, and to longitudinal data generally, while in this study we focused on ICU data. Unlike most previous studies, which investigated all sepsis admissions, we focused specifically on ICU-acquired sepsis. Due to the sparse nature of the longitudinal data, we employed Temporal Abstraction and Time Interval-Related Patterns discovery, the results of which are further used as classification features. Two experiments were designed using three different outcome-prediction study designs from the literature, implementing various levels of real-life conditions to evaluate the prediction models. The first experiment focused on predicting whether, and when during the admission, a patient would suffer from ICU-acquired sepsis, given a sliding observation time window, and on comparing the behavior of the three study designs. The second experiment focused only on predicting whether the patient would suffer from ICU-acquired sepsis, based on data taken relative to the admission start time. Our results show that using Temporal Discretization for Classification (TD4C) led to better performance than Equal-Width Discretization, Knowledge-Based discretization, or SAX. Also, using a two-state abstraction was better than three or four states. The default binary TIRP representation method performed better than Mean Duration, Horizontal Support, and horizontally normalized horizontal support. 
Using XGBoost as a classifier performed better than Logistic Regression, a Neural Net, or Random Forest. Additionally, we demonstrate why the use of a case-crossover-control design is most appropriate for evaluation under real-life application conditions, unlike other, incomplete designs that may even result in seemingly "better performance".
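The sliding-observation-window setup of the first experiment can be sketched as generating (window, label) training pairs, where a window is labeled positive if the event occurs within a prediction horizon after it ends. The unit of time and the labeling rule are assumptions for illustration:

```python
def sliding_samples(series, events, win, horizon, step=1):
    # series: one per-time-unit signal; events: set of event onset times
    samples = []
    for start in range(0, len(series) - win + 1, step):
        end = start + win
        # positive iff an event onset falls inside (end, end + horizon]
        label = any(end <= t < end + horizon for t in events)
        samples.append((series[start:end], int(label)))
    return samples
```

Each pair would then be abstracted and mined for TIRPs before classification; here the raw slice stands in for that step.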


Subject(s)
Intensive Care Units , Sepsis , Critical Care , Electronic Health Records , Female , Humans , Prognosis , Sepsis/diagnosis , Sepsis/epidemiology
10.
J Biomed Inform ; 90: 103092, 2019 02.
Article in English | MEDLINE | ID: mdl-30654029
11.
J Biomed Inform ; 75: 83-95, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28987378

ABSTRACT

Increasingly, frequent temporal patterns discovered in longitudinal patient records are proposed as features for classification and prediction, and as a means to cluster patient clinical trajectories. However, to justify that, we must demonstrate that most frequent temporal patterns are indeed consistently discoverable within the records of different patient subsets within similar patient populations. We have developed several measures for the consistency of the discovery of temporal patterns. We focus on time-interval relations patterns (TIRPs) that can be discovered within different subsets of the same patient population. We expect the discovered TIRPs (1) to be frequent in each subset, (2) to preserve their "local" metrics - the absolute frequency of each pattern, measured by a Proportion Test, and (3) to preserve their "global" characteristics - their overall distribution, measured by a Kolmogorov-Smirnov test. We also wanted to examine the effect on consistency, over a variety of settings, of varying the minimal frequency threshold for TIRP discovery, and of using a TIRP-filtering criterion that we previously introduced, the Semantic Adjacency Criterion (SAC). We applied our methodology to three medical domains (oncology, infectious hepatitis, and diabetes). We found that, within the minimal frequency ranges we had examined, 70-95% of the discovered TIRPs were consistently discoverable; 40-48% of them maintained their local frequency. TIRP global distribution similarity varied widely, from 0% to 65%. Increasing the threshold usually increased the percentage of TIRPs that were repeatedly discovered across different patient subsets within the same domain, and the probability of a similar TIRP distribution. Using the SAC principle enhanced, for most minimal support levels, the percentage of repeating TIRPs, their local consistency, and their global consistency. The effect of using the SAC was further strengthened as the minimal frequency threshold was raised.
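The "local" consistency check above compares a TIRP's frequency in two patient subsets; a standard two-proportion z-test is one way to do this (the paper's exact Proportion Test may differ, so this is a hedged sketch):

```python
import math

def two_proportion_z(k1, n1, k2, n2):
    # z statistic for H0: the TIRP is equally frequent in the two subsets,
    # where k_i of n_i patients in subset i exhibit the TIRP
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)                      # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se
```

|z| below the critical value (about 1.96 at the 5% level) would count the TIRP as locally consistent across the two subsets.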


Subject(s)
Medical Records , Pattern Recognition, Automated/methods , Algorithms , Chronic Disease , Diabetes Mellitus/pathology , Hepatitis, Viral, Human/pathology , Humans , Time and Motion Studies
12.
J Biomed Inform ; 75: 70-82, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28823923

ABSTRACT

Prediction of medical events, such as clinical procedures, is essential for preventing disease, understanding disease mechanisms, and increasing patient quality of care. Although longitudinal clinical data from Electronic Health Records provide opportunities to develop predictive models, the use of these data faces significant challenges. Primarily, while the data are longitudinal and represent thousands of conceptual events having duration, they are also sparse, complicating the application of traditional analysis approaches. The framework presented here takes advantage of the events' durations and the gaps between them. International standards for electronic healthcare data represent data elements, such as procedures, conditions, and drug exposures, using eras, or time intervals. Such eras contain both an event and a duration and enable the application of time intervals mining - a relatively new subfield of data mining. In this study, we present Maitreya, a framework for time intervals analytics in longitudinal clinical data. Maitreya discovers frequent time intervals related patterns (TIRPs), which we use as prognostic markers for modelling clinical events. We introduce three novel TIRP metrics that are normalized versions of the horizontal support, which represents the number of TIRP instances per patient. We evaluate Maitreya on 28 frequent and clinically important procedures, using the three novel TIRP representation metrics in comparison to no temporal representation and to previous TIRP metrics. We also evaluate the epsilon value that makes Allen's relations more flexible, with settings of 30, 60, 90, and 180 days, in comparison to the default of zero. For twenty-two of these procedures, the use of temporal patterns as predictors was superior to non-temporal features, and the use of the vertically normalized horizontal support metric to represent TIRPs as features was most effective. Using an epsilon value of thirty days was slightly better than zero.
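The epsilon relaxation of Allen's relations mentioned above can be sketched for a small subset of the relations: two era endpoints within epsilon of each other are treated as meeting rather than strictly before. This covers only a few of Allen's thirteen relations and is an illustration, not Maitreya's implementation:

```python
def allen_relation(a, b, eps=0):
    # a, b: (start, end) intervals with a[0] <= b[0]; eps relaxes endpoint equality
    if b[0] - a[1] > eps:
        return 'before'            # b starts clearly after a ends
    if abs(b[0] - a[1]) <= eps:
        return 'meets'             # endpoints coincide up to eps
    return 'overlaps' if a[1] < b[1] else 'contains'
```

With eps=30 days, an era ending on day 100 and one starting on day 120 would "meet" instead of being merely "before", making sparse EHR eras comparable.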


Subject(s)
Electronic Health Records , Time and Motion Studies , Algorithms , Humans
13.
Artif Intell Med ; 81: 12-32, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28456512

ABSTRACT

BACKGROUND AND OBJECTIVES: Labeling instances by domain experts for classification is often time consuming and expensive. To reduce such labeling efforts, we had proposed the application of active learning (AL) methods, introduced our CAESAR-ALE framework for classifying the severity of clinical conditions, and shown its significant reduction of labeling efforts. The use of any of three AL methods (one well known [SVM-Margin], and two that we introduced [Exploitation and Combination_XA]) significantly reduced (by 48% to 64%) condition-labeling efforts, compared to standard passive (random instance-selection) SVM learning. Furthermore, our new AL methods achieved maximal accuracy using 12% fewer labeled cases than the SVM-Margin AL method. However, because labelers have varying levels of expertise, a major issue associated with learning methods, and AL methods in particular, is how best to use the labeling provided by a committee of labelers. First, we wanted to know, based on the labelers' learning curves, whether using AL methods (versus standard passive learning methods) has an effect on the intra-labeler variability (within the learning curve of each labeler) and the inter-labeler variability (among the learning curves of different labelers). Then, we wanted to examine the effect of learning (either passively or actively) from the labels created by the majority consensus of a group of labelers. METHODS: We used our CAESAR-ALE framework for classifying the severity of clinical conditions, the three AL methods, and the passive learning method, as mentioned above, to induce the classification models. We used a dataset of 516 clinical conditions and their severity labeling, represented by features aggregated from the medical records of 1.9 million patients treated at Columbia University Medical Center. 
We analyzed the variance of the classification performance within (intra-labeler), and especially among (inter-labeler), the classification models that were induced using the labels provided by seven labelers. We also compared the performance of the passive and active learning models when using the consensus label. RESULTS: The AL methods produced, for the models induced from each labeler, smoother intra-labeler learning curves during the training phase, compared to the models produced when using the passive learning method. The mean standard deviation of the learning curves of the three AL methods over all labelers (mean: 0.0379; range: [0.0182 to 0.0496]) was significantly lower (p=0.049) than the intra-labeler standard deviation when using the passive learning method (mean: 0.0484; range: [0.0275 to 0.0724]). Using the AL methods resulted in a lower mean inter-labeler AUC standard deviation among the AUC values of the labelers' different models during the training phase, compared to the variance of the induced models' AUC values when using passive learning. The inter-labeler AUC standard deviation using the passive learning method (0.039) was almost twice as high as the inter-labeler standard deviation using our two new AL methods (0.02 and 0.019, respectively). The SVM-Margin AL method resulted in an inter-labeler standard deviation (0.029) that was higher by almost 50% than that of our two AL methods. The difference in the inter-labeler standard deviation between the passive learning method and the SVM-Margin learning method was significant (p=0.042). The difference between the SVM-Margin and Exploitation methods was insignificant (p=0.29), as was the difference between the Combination_XA and Exploitation methods (p=0.67). 
Finally, using the consensus label led to a learning curve that had a higher mean intra-labeler variance, but eventually resulted in an AUC that was at least as high as the AUC achieved using the gold-standard label and that was always higher than the expected mean AUC of a randomly selected labeler, regardless of the choice of learning method (including a passive learning method). Using a paired t-test, the difference between the intra-labeler AUC standard deviation when using the consensus label, versus that value when using the other two labeling strategies, was significant only when using the passive learning method (p=0.014), but not when using any of the three AL methods. CONCLUSIONS: The use of AL methods (a) reduces intra-labeler variability in the performance of the induced models during the training phase, and thus reduces the risk of halting the process at a local minimum that is significantly different in performance from the rest of the learned models; and (b) reduces inter-labeler performance variance, and thus reduces the dependence on the use of a particular labeler. In addition, the use of a consensus label, agreed upon by a rather uneven group of labelers, might be at least as good as using the gold-standard labeler, who might not be available, and certainly better than randomly selecting one of the group's individual labelers. Finally, using the AL methods when provided with the consensus label reduced the intra-labeler AUC variance during the learning phase, compared to using passive learning.
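The SVM-Margin strategy referenced above queries the unlabeled instance closest to the current decision boundary. A minimal sketch with a linear scorer (the real framework trains an SVM; the weight vector here is assumed given) selects the instance with the smallest absolute decision value:

```python
def margin_select(unlabeled, weights, bias=0.0):
    # SVM-Margin-style active learning: query the instance nearest the boundary
    def dist(x):
        return abs(sum(w * v for w, v in zip(weights, x)) + bias)
    return min(range(len(unlabeled)), key=lambda i: dist(unlabeled[i]))
```

In an AL loop, the selected instance is sent to the labeler committee, added to the training set, and the model is retrained before the next query.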


Subject(s)
Data Mining/methods , Electronic Health Records/classification , Supervised Machine Learning , Area Under Curve , Humans , Learning Curve , Observer Variation , Phenotype , Reproducibility of Results , Severity of Illness Index , Time Factors
14.
Article in English | MEDLINE | ID: mdl-27429447

ABSTRACT

Accurate prognosis of outcome events, such as clinical procedures or disease diagnosis, is central in medicine. The emergence of longitudinal clinical data, like the Electronic Health Records (EHR), represents an opportunity to develop automated methods for predicting patient outcomes. However, these data are highly dimensional and very sparse, complicating the application of predictive modeling techniques. Further, their temporal nature is not fully exploited by current methods; temporal abstraction has recently been used for this purpose, resulting in a symbolic time intervals representation. We present Maitreya, a framework for the prediction of outcome events that leverages these symbolic time intervals. Using Maitreya, we discover temporal patterns in the clinical records that serve as prognostic markers and use these markers to train predictive models for eight clinical procedures. To decrease the number of patterns that are used as features, we propose the use of three one-class feature selection methods. We evaluate the performance of Maitreya under several parameter settings, including the one-class feature selection, and compare our results to those of atemporal approaches. In general, we found that the use of temporal patterns outperformed the atemporal methods when representing the number of pattern occurrences.


Subject(s)
Data Mining/methods , Medical Informatics/methods , Prognosis , Treatment Outcome , Algorithms , Electronic Health Records/classification , Humans , Time Factors
15.
J Biomed Inform ; 61: 44-54, 2016 06.
Article in English | MEDLINE | ID: mdl-27016383

ABSTRACT

Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. The labeling of EHRs is expensive and in many cases requires employing professionals with a high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate the three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the "CAESAR dataset," was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers' efforts compared to the learning methods applied by the original CAESAR framework, in which the classifier was trained on the entire set of conditions; depending on the AL strategy used in the current study, the reduction ranged from 48% to 64%, which can result in significant savings in both time and money. As for the PPV (precision) measure, CAESAR-ALE achieved more than a 13% absolute improvement in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts, while increasing accuracy given the same (or even a smaller) number of acquired conditions. 
We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise.


Subject(s)
Data Curation , Electronic Health Records , Problem-Based Learning , Algorithms , Automation , Humans
16.
Article in English | MEDLINE | ID: mdl-26559926

ABSTRACT

Small molecules are indispensable to modern medical therapy. However, their use may lead to unintended, negative medical outcomes commonly referred to as adverse drug reactions (ADRs). These effects vary widely in mechanism, severity, and populations affected, making ADR prediction and identification important public health concerns. Current methods rely on clinical trials and postmarket surveillance programs to find novel ADRs; however, clinical trials are limited by small sample size, whereas postmarket surveillance methods may be biased and inherently leave patients at risk until sufficient clinical evidence has been gathered. Systems pharmacology, an emerging interdisciplinary field combining network and chemical biology, provides important tools to uncover and understand ADRs and may mitigate the drawbacks of traditional methods. In particular, network analysis allows researchers to integrate heterogeneous data sources and quantify the interactions between biological and chemical entities. Recent work in this area has combined chemical, biological, and large-scale observational health data to predict ADRs in both individual patients and global populations. In this review, we explore the rapid expansion of systems pharmacology in the study of ADRs. We enumerate the existing methods and strategies and illustrate progress in the field with a model framework that incorporates crucial data elements, such as diet and comorbidities, known to modulate ADR risk. Using this framework, we highlight avenues of research that may currently be underexplored, representing opportunities for future work.


Subject(s)
Drug-Related Side Effects and Adverse Reactions/metabolism , Systems Biology/methods , Animals , Drug-Related Side Effects and Adverse Reactions/pathology , Drug-Related Side Effects and Adverse Reactions/physiopathology , Humans
17.
J Biomed Inform ; 42(1): 11-21, 2009 Feb.
Article in English | MEDLINE | ID: mdl-18721900

ABSTRACT

We designed and implemented a generic search engine (Vaidurya), as part of our Digital clinical-Guideline Library (DeGeL) framework. Two search methods were implemented in addition to full-text search: (1) concept-based search, which relies on pre-indexing the guidelines in a clinically meaningful fashion, and (2) context-sensitive search, which relies on first semi-structuring the guidelines according to a given ontology, then searching for terms within specific labeled text segments. The Vaidurya engine is fully functional and is used within the DeGeL system. We describe the Vaidurya ontological and algorithmic framework; we also briefly summarize the results of a detailed evaluation in the clinical-guideline domain, demonstrating that both concept-based and context-sensitive ontology-independent search are highly feasible and significantly improve on free text search retrieval performance. We conclude by analyzing the limitations and advantages of the approach, and the steps that we have started to take to extend it based on user feedback.
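The two search modes described above can be contrasted with a minimal sketch. The documents, concept labels, and segment names below are invented for illustration and are not Vaidurya's actual data model or API.

```python
# Hypothetical guideline documents: pre-indexed concepts (for concept-based
# search) and labeled text segments (for context-sensitive search).
docs = [
    {"id": 1,
     "concepts": {"diabetes", "screening"},
     "segments": {"eligibility": "adults over 45", "actions": "order HbA1c test"}},
    {"id": 2,
     "concepts": {"hypertension"},
     "segments": {"eligibility": "adults with high blood pressure", "actions": "start ACE inhibitor"}},
]

def concept_search(docs, concept):
    """Concept-based search: match against pre-indexed concept labels."""
    return [d["id"] for d in docs if concept in d["concepts"]]

def context_search(docs, segment, term):
    """Context-sensitive search: look for a term only inside one labeled segment."""
    return [d["id"] for d in docs if term in d["segments"].get(segment, "")]

print(concept_search(docs, "diabetes"))               # [1]
print(context_search(docs, "eligibility", "adults"))  # [1, 2]
```

The contrast shows why context-sensitive search can be more precise than full-text search: "adults" is matched only where it appears in the eligibility segment, not anywhere in the document.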


Subject(s)
Information Storage and Retrieval/methods , Medical Informatics/methods , Practice Guidelines as Topic , Abstracting and Indexing , Algorithms , Humans , Internet , Libraries, Digital , Medical Subject Headings , User-Computer Interface , Vocabulary, Controlled
18.
AMIA Annu Symp Proc ; 2009: 452-6, 2009 Nov 14.
Article in English | MEDLINE | ID: mdl-20351898

ABSTRACT

Medical knowledge includes frequently occurring temporal patterns in longitudinal patient records. These patterns are not easily detectable by human clinicians. Current knowledge could be extended by automated temporal data mining. However, multivariate time-oriented data are often present at various levels of abstraction and at multiple temporal granularities, requiring a transformation into a more abstract, yet uniform dimension suitable for mining. Temporal abstraction (of both the time and value dimensions) can transform multiple types of point-based data into a meaningful, time-interval-based data representation, in which significant, interval-based temporal patterns can be discovered. We introduce a modular, fast time-interval mining method, KarmaLego, which exploits the transitivity inherent in temporal relations. We demonstrate the usefulness of KarmaLego in finding meaningful temporal patterns within a set of records of diabetic patients; several patterns seem to have a different frequency depending on gender. We also suggest additional uses of the discovered patterns for temporal clustering of the mined population and for classifying multivariate time series.
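The interval relations and the transitivity that such miners exploit can be sketched as follows. This is a simplified illustration, assuming a small subset of Allen's interval relations and invented patient intervals; KarmaLego's actual relation set and enumeration are richer.

```python
# Illustrative subset of Allen's interval relations between two intervals
# a = (start, end) and b = (start, end), assuming a starts no later than b.
def relation(a, b):
    if a[1] < b[0]:
        return "before"
    if a[1] == b[0]:
        return "meets"
    if a[1] < b[1]:
        return "overlaps"
    return "contains"

# Hypothetical abstracted intervals from one diabetic patient's record
# (time units are arbitrary).
glucose_high = (0, 5)
insulin_dose = (6, 8)
glucose_normal = (9, 12)

assert relation(glucose_high, insulin_dose) == "before"
assert relation(insulin_dose, glucose_normal) == "before"
# Transitivity exploited by the miner: "before" composed with "before"
# can only yield "before", so this comparison need not be computed:
assert relation(glucose_high, glucose_normal) == "before"
```

Pruning candidate extensions via such transitivity tables is what keeps the enumeration of longer time-interval patterns tractable.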


Subject(s)
Data Mining/methods , Electronic Health Records , Pattern Recognition, Automated/methods , Algorithms , Diabetes Mellitus/blood , Diabetes Mellitus/drug therapy , Humans , Mathematical Concepts , Time
19.
Stud Health Technol Inform ; 129(Pt 1): 422-6, 2007.
Article in English | MEDLINE | ID: mdl-17911752

ABSTRACT

Many digital libraries use hierarchical indexing schemas, such as MeSH, to enable concept-based search in the retrieval phase. However, improving on, let alone outperforming, traditional full-text search is not trivial. We present an extensive set of experiments using a hierarchical concept-based retrieval method, applied on top of several baselines, within the Vaidurya search and retrieval framework. Concept-based search added to a low baseline significantly outperforms it, especially when queries use concepts at the third level of the hierarchy and disjunction within the hierarchical trees.
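The "disjunction within the hierarchical trees" can be illustrated with a small sketch: a query concept is expanded to itself OR any of its descendants before matching document concepts. The hierarchy and documents below are invented for the example.

```python
# Hypothetical MeSH-like concept hierarchy: parent -> list of children.
hierarchy = {
    "Cardiovascular Diseases": ["Heart Diseases", "Vascular Diseases"],
    "Heart Diseases": ["Arrhythmias", "Heart Failure"],
}

def subtree(concept, hierarchy):
    """A concept plus all its descendants (the disjunctive expansion)."""
    expanded = {concept}
    for child in hierarchy.get(concept, []):
        expanded |= subtree(child, hierarchy)
    return expanded

# Documents pre-indexed with concepts.
doc_concepts = {1: {"Heart Failure"}, 2: {"Vascular Diseases"}, 3: {"Neoplasms"}}

def disjunctive_search(query, doc_concepts, hierarchy):
    """Retrieve documents indexed by the query concept or any descendant."""
    expanded = subtree(query, hierarchy)
    return sorted(d for d, cs in doc_concepts.items() if cs & expanded)

print(disjunctive_search("Cardiovascular Diseases", doc_concepts, hierarchy))  # [1, 2]
print(disjunctive_search("Heart Diseases", doc_concepts, hierarchy))           # [1]
```

Querying at a deeper level (here, "Heart Diseases" rather than the root) narrows the expansion, which is consistent with the observation that third-level concepts gave the best results.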


Subject(s)
Information Storage and Retrieval/methods , Natural Language Processing , Subject Headings , Abstracting and Indexing , Medical Subject Headings , Vocabulary, Controlled
20.
J Am Med Inform Assoc ; 14(2): 164-74, 2007.
Article in English | MEDLINE | ID: mdl-17213502

ABSTRACT

OBJECTIVES: Study comparatively (1) concept-based search, using documents pre-indexed by a conceptual hierarchy; (2) context-sensitive search, using structured, labeled documents; and (3) traditional full-text search. Hypotheses were: (1) more contexts lead to better retrieval accuracy; and (2) adding concept-based search to the other searches would improve upon their baseline performances. DESIGN: Use of our Vaidurya architecture for search-and-retrieval evaluation of structured documents, classified by a conceptual hierarchy, on a clinical-guidelines test collection. MEASUREMENTS: Precision was computed at different levels of recall to assess the contribution of the retrieval methods; precisions were compared at recall 0.5 using t-tests. RESULTS: Performance increased monotonically with the number of query context elements. Adding context-sensitive elements yielded a mean improvement of 11.1% at recall 0.5. With three contexts, mean query precision was 42% +/- 17% (95% confidence interval [CI], 31% to 53%); with two contexts, 32% +/- 13% (95% CI, 27% to 38%); and with one context, 20% +/- 9% (95% CI, 15% to 24%). Adding context-based queries to full-text queries monotonically improved precision beyond the 0.4 level of recall, with a mean improvement of 4.5% at recall 0.5. Adding concept-based search to full-text search improved precision to 19.4% at recall 0.5. CONCLUSIONS: The study demonstrated the usefulness of concept-based and context-sensitive queries for enhancing the precision of retrieval from a digital library of semi-structured clinical guideline documents. Concept-based searches outperformed free-text queries, especially when baseline precision was low. In general, the more ontological elements used in the query, the greater the resulting precision.
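The reported measurement, precision at a fixed recall level of 0.5, can be computed from a ranked result list as sketched below. The ranking and relevant-document count are invented for illustration.

```python
def precision_at_recall(ranked_relevance, total_relevant, target_recall=0.5):
    """Precision at the first rank where recall reaches target_recall.

    ranked_relevance: list of booleans, True where the retrieved document
    at that rank is relevant; total_relevant: relevant docs in the collection.
    """
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        hits += is_relevant
        if hits / total_relevant >= target_recall:
            return hits / rank
    return 0.0  # recall level never reached

# Hypothetical ranked output: True = relevant; 4 relevant docs exist overall.
ranking = [True, False, True, False, False, True, True]
print(precision_at_recall(ranking, total_relevant=4))  # recall reaches 0.5 at rank 3 -> 2/3
```

Comparing such per-query precisions between two retrieval methods at the same recall level (e.g. with a t-test, as in the study) controls for the usual precision/recall trade-off.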


Subject(s)
Information Storage and Retrieval/methods , Vocabulary, Controlled , Abstracting and Indexing , Algorithms , Information Science , Practice Guidelines as Topic