Results 1 - 6 of 6
1.
Genome Biol ; 22(1): 287, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34620211

ABSTRACT

BACKGROUND: The diversity of genomic alterations in cancer poses challenges to fully understanding the etiologies of the disease. Recent interest in infrequent mutations, in genes that reside in the "long tail" of the mutational distribution, has uncovered new genes with significant implications in cancer development. The study of cancer-relevant genes often requires integrative approaches pooling together multiple types of biological data. Network propagation methods demonstrate high efficacy in achieving this integration. Yet, the majority of these methods focus their assessment on detecting known cancer genes or identifying altered subnetworks. In this paper, we introduce a network propagation approach that focuses entirely on prioritizing long-tail genes with potential functional impact on cancer development. RESULTS: We identify sets of often overlooked, rarely to moderately mutated genes whose biological interactions significantly propel their mutation-frequency-based rank upwards during propagation in 17 cancer types. We call these sets "upward mobility genes" and hypothesize that their significant rank improvement indicates functional importance. We report new cancer-pathway associations based on upward mobility genes that were not previously identified using driver genes alone, validate their role in cancer cell survival in vitro using extensive genome-wide RNAi and CRISPR data repositories, and further conduct in vitro functional screenings resulting in the validation of 18 previously unreported genes. CONCLUSION: Our analysis extends the spectrum of cancer-relevant genes and identifies novel potential therapeutic targets.
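The rank-propagation idea described in this abstract can be illustrated with a standard random walk with restart on a gene interaction network. The sketch below is not the authors' implementation; the toy network, gene names, mutation frequencies, and restart probability are all assumptions chosen only to show how a rarely mutated gene's rank can rise once its neighbors' mutation signal is propagated:

    # Sketch of mutation-score propagation on a gene interaction network via
    # random walk with restart. The toy network, mutation frequencies, and
    # restart probability are illustrative assumptions, not the paper's data.
    import numpy as np
    import networkx as nx

    edges = [("TP53", "GENE_X"), ("KRAS", "GENE_X"), ("TP53", "KRAS"),
             ("GENE_Y", "GENE_Z")]
    mutation_freq = {"TP53": 0.45, "KRAS": 0.30, "GENE_X": 0.02,
                     "GENE_Y": 0.05, "GENE_Z": 0.01}

    G = nx.Graph(edges)
    genes = list(G.nodes)
    A = nx.to_numpy_array(G, nodelist=genes)
    W = A / A.sum(axis=0, keepdims=True)      # column-normalized adjacency

    p0 = np.array([mutation_freq[g] for g in genes])
    p0 = p0 / p0.sum()                        # restart vector from mutation frequencies

    alpha, p = 0.4, p0.copy()                 # alpha = restart probability (assumed)
    for _ in range(100):                      # iterate to approximate convergence
        p_next = (1 - alpha) * W @ p + alpha * p0
        if np.abs(p_next - p).sum() < 1e-9:
            break
        p = p_next

    # Genes whose rank improves after propagation are the "upward mobility" candidates.
    rank_before = {g: r for r, g in enumerate(sorted(genes, key=lambda g: -mutation_freq[g]))}
    rank_after = {g: r for r, g in enumerate(sorted(genes, key=lambda g: -p[genes.index(g)]))}
    for g in genes:
        print(f"{g}: rank {rank_before[g]} -> {rank_after[g]}")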


Subject(s)
Genes, Neoplasm , Neoplasms/genetics , Cell Survival , Genes, Neoplasm/drug effects , Humans , Mutation , Neoplasms/metabolism , Protein Interaction Mapping
2.
JAMA Netw Open ; 2(7): e196835, 2019 07 03.
Article in English | MEDLINE | ID: mdl-31290991

ABSTRACT

Importance: Better prediction of major bleeding after percutaneous coronary intervention (PCI) may improve clinical decisions aimed at reducing bleeding risk. Machine learning techniques, bolstered by better selection of variables, hold promise for enhancing prediction. Objective: To determine whether machine learning techniques better predict post-PCI major bleeding compared with the existing National Cardiovascular Data Registry (NCDR) models. Design, Setting, and Participants: This comparative effectiveness study used NCDR CathPCI Registry data, version 4.4 (July 1, 2009, to April 1, 2015). Machine learning techniques (logistic regression with lasso regularization and gradient descent boosting [XGBoost, version 0.71.2]) were applied, and their output was compared with the existing simplified risk score and full NCDR models. The existing models were recreated, and performance was then evaluated with additional techniques and variables in a 5-fold cross-validation in an analysis conducted from October 1, 2015, to October 27, 2017. The setting was retrospective modeling of a nationwide clinical registry of PCI. Participants were all patients undergoing PCI. Percutaneous coronary intervention procedures were excluded if they were not the index PCI of the admission, if the hospital site had missing outcome measures, or if the patient underwent subsequent coronary artery bypass grafting. Exposures: Clinical variables available at admission and diagnostic coronary angiography data were used to determine the severity and complexity of presentation. Main Outcomes and Measures: The main outcome was in-hospital major bleeding within 72 hours after PCI. Results were evaluated by comparing C statistics, calibration, and decision threshold-based metrics, including the F score (harmonic mean of positive predictive value and sensitivity) and the false discovery rate. Results: The post-PCI major bleeding rate among 3 316 465 procedures (patients' median age, 65 years; interquartile range, 56-73 years; 68.1% male) was 4.5%. The existing full model achieved a mean C statistic of 0.78 (95% CI, 0.78-0.78). The use of XGBoost and the full range of selected variables achieved a C statistic of 0.82 (95% CI, 0.82-0.82), with an F score of 0.31 (95% CI, 0.30-0.31). XGBoost correctly identified an additional 3.7% of cases classified as high risk that experienced a bleeding event and an overall improvement of 1.0% in cases classified as low risk that did not experience a bleeding event. The data-driven decision threshold helped improve the false discovery rate of the existing techniques. The existing simplified risk score model improved the false discovery rate from more than 90% to 78.7%. Modifying the model and the data decision threshold improved this rate from 78.7% to 73.4%. Conclusions and Relevance: Machine learning techniques improved the prediction of major bleeding after PCI. These techniques may help to better identify patients who would benefit most from strategies to reduce bleeding risk.
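As a rough illustration of the modeling pipeline in this abstract, the sketch below fits a lasso-regularized logistic regression and a gradient-boosted tree model under 5-fold cross-validation and reports the C statistic (ROC AUC) and F score at a decision threshold. It is not the study code: the data are synthetic, the hyperparameters are assumed, and the top-decile threshold is only a placeholder for the paper's data-driven threshold selection:

    # Sketch of the comparison above: lasso-regularized logistic regression vs.
    # gradient-boosted trees under 5-fold cross-validation, scored by C statistic
    # (ROC AUC) and F score. Synthetic data and assumed hyperparameters only.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score, roc_auc_score
    from sklearn.model_selection import StratifiedKFold
    from xgboost import XGBClassifier

    # ~4.5% event rate, roughly matching the bleeding rate in the abstract.
    X, y = make_classification(n_samples=20000, n_features=40, weights=[0.955],
                               random_state=0)

    models = {
        "lasso_lr": LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
        "xgboost": XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.05,
                                 eval_metric="logloss"),
    }

    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for name, model in models.items():
        aucs, f1s = [], []
        for train_idx, test_idx in cv.split(X, y):
            model.fit(X[train_idx], y[train_idx])
            prob = model.predict_proba(X[test_idx])[:, 1]
            aucs.append(roc_auc_score(y[test_idx], prob))
            # Placeholder for a data-driven decision threshold: flag the top decile of risk.
            threshold = np.percentile(prob, 90)
            f1s.append(f1_score(y[test_idx], prob >= threshold))
        print(f"{name}: C statistic {np.mean(aucs):.3f}, F score {np.mean(f1s):.3f}")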


Subject(s)
Machine Learning , Percutaneous Coronary Intervention/adverse effects , Postoperative Hemorrhage/diagnosis , Registries/statistics & numerical data , Risk Assessment/methods , Aged , Clinical Decision Rules , Comparative Effectiveness Research , Coronary Artery Disease/surgery , Female , Humans , Male , Models, Statistical , Percutaneous Coronary Intervention/methods , Risk Adjustment/methods , United States
3.
Stud Health Technol Inform ; 250: 245-249, 2018.
Article in English | MEDLINE | ID: mdl-29857453

ABSTRACT

Many researchers are working toward the goal of data-driven care by predicting the risk of 30-day readmission for patients with heart failure. Most published predictive models have used only patient-level data from either single-center studies or secondary analyses of randomized controlled trials. This study describes a hierarchical model that captures regional differences in addition to patient-level data from 1778 unique patients across 31 geographically distributed hospitals in one health system. The model was developed using Bayesian techniques operating on a large set of predictors and achieved an area under the curve (AUC) of 0.64 in the validation cohort. We confirmed that regional differences indeed exist in the observed data and verified that our model was able to capture this regional variance when predicting the risk of 30-day readmission for patients in our cohort.
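A minimal sketch of the kind of hierarchical model described here would use a random hospital-level intercept to absorb regional variation and a Bernoulli likelihood for 30-day readmission. The example below is written with PyMC, which the paper does not name, and the data, priors, and predictor count are placeholders rather than the study's specification:

    # Minimal sketch of a hierarchical (random-intercept) logistic model for
    # 30-day readmission, written with PyMC. Data, priors, and predictor count
    # are placeholders, not the study's specification.
    import numpy as np
    import pymc as pm

    rng = np.random.default_rng(0)
    n_patients, n_hospitals, n_predictors = 1778, 31, 10
    X = rng.normal(size=(n_patients, n_predictors))           # patient-level predictors
    hospital = rng.integers(0, n_hospitals, size=n_patients)  # hospital index per patient
    y = rng.integers(0, 2, size=n_patients)                   # placeholder readmission flag

    with pm.Model():
        beta = pm.Normal("beta", 0.0, 1.0, shape=n_predictors)    # patient-level effects
        sigma_h = pm.HalfNormal("sigma_h", 1.0)                   # between-hospital spread
        a_h = pm.Normal("a_h", 0.0, sigma_h, shape=n_hospitals)   # hospital intercepts
        logit_p = a_h[hospital] + pm.math.dot(X, beta)
        pm.Bernoulli("readmit", logit_p=logit_p, observed=y)
        idata = pm.sample(1000, tune=1000, target_accept=0.9)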


Subject(s)
Heart Failure/therapy , Patient Readmission , Risk Assessment , Bayes Theorem , Cohort Studies , Humans , Models, Theoretical
4.
Stud Health Technol Inform ; 250: 250-255, 2018.
Article in English | MEDLINE | ID: mdl-29857454

ABSTRACT

Decades-long research efforts have shown that heart failure (HF) is the most expensive diagnosis for hospitalizations and the most frequent diagnosis for 30-day readmissions. If risk stratification for readmission of HF patients could be carried out at the time of discharge from the index hospitalization, appropriate post-discharge interventions could be arranged to avoid potential readmission. We therefore sought to explore and compare two newer machine learning methods of risk prediction using 56 predictors from electronic health record data of 1778 unique HF patients from 31 hospitals across the United States. We used two approaches, boosted trees and spike-and-slab regression, and found that boosted trees provided better predictive results (AUC: 0.719) than spike-and-slab regression (AUC: 0.621) in our dataset.
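The comparison above can be sketched on synthetic data as follows. scikit-learn has no spike-and-slab regression, so an L1-penalized logistic regression is substituted as the sparse linear baseline, and gradient-boosted trees stand in for the boosted-tree model; the sample size and predictor count mirror the abstract, but nothing else reflects the study data:

    # Sketch of the comparison above on synthetic data. scikit-learn has no
    # spike-and-slab regression, so an L1-penalized logistic regression stands in
    # as the sparse linear baseline; gradient-boosted trees play the boosted-tree role.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # 1778 patients and 56 predictors mirror the abstract; everything else is synthetic.
    X, y = make_classification(n_samples=1778, n_features=56, n_informative=10,
                               weights=[0.8], random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

    boosted = GradientBoostingClassifier(n_estimators=200, max_depth=3).fit(X_tr, y_tr)
    sparse_lr = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X_tr, y_tr)

    for name, m in [("boosted trees", boosted), ("sparse logistic (stand-in)", sparse_lr)]:
        auc = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
        print(f"{name}: AUC {auc:.3f}")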


Subject(s)
Heart Failure/therapy , Machine Learning , Patient Readmission , Forecasting , Hospitalization , Humans , Patient Discharge , Risk Assessment , United States
5.
IEEE J Biomed Health Inform ; 21(6): 1719-1729, 2017 11.
Article in English | MEDLINE | ID: mdl-28287993

ABSTRACT

Electronic health records (EHR) provide opportunities to leverage vast arrays of data to help prevent adverse events, improve patient outcomes, and reduce hospital costs. This paper develops a postoperative complications prediction system by extracting data from the EHR and creating features. The analytic engine then provides model accuracy, calibration, feature ranking, and personalized feature responses. This allows clinicians to interpret the likelihood of an adverse event occurring, general causes of these events, and the contributing factors for each specific patient. The cohort comprised 5214 patients at Yale-New Haven Hospital undergoing major cardiovascular procedures. Cohort-specific models predicted the likelihood of postoperative respiratory failure and infection, achieving an area under the receiver operating characteristic curve of 0.81 for respiratory failure and 0.83 for infection.
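A simplified sketch of this kind of pipeline follows: fit a model on EHR-derived features, then report discrimination, a global feature ranking, and a per-patient contribution breakdown. The features and data are synthetic placeholders, and the contribution method (coefficient times standardized value) is only an illustrative stand-in for the paper's personalized feature responses, which the abstract does not specify:

    # Sketch of this kind of pipeline: fit a model on EHR-derived features, then
    # report discrimination, a global feature ranking, and a per-patient contribution
    # breakdown. Data and feature names are synthetic placeholders.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    feature_names = [f"feature_{i}" for i in range(12)]   # hypothetical labs, vitals, history
    X, y = make_classification(n_samples=5214, n_features=12, n_informative=6,
                               weights=[0.9], random_state=2)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

    scaler = StandardScaler().fit(X_tr)
    clf = LogisticRegression(max_iter=1000).fit(scaler.transform(X_tr), y_tr)
    print("AUC:", roc_auc_score(y_te, clf.predict_proba(scaler.transform(X_te))[:, 1]))

    # Global ranking: largest absolute coefficients first.
    ranking = sorted(zip(feature_names, clf.coef_[0]), key=lambda t: -abs(t[1]))
    print("top features:", [name for name, _ in ranking[:3]])

    # Personalized response for one patient: per-feature contribution to the logit.
    patient = scaler.transform(X_te[:1])[0]
    contributions = patient * clf.coef_[0]
    print("largest contributor:", feature_names[int(np.argmax(np.abs(contributions)))])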


Subject(s)
Cardiac Surgical Procedures/adverse effects , Machine Learning , Models, Statistical , Postoperative Complications/epidemiology , Electronic Health Records , Humans
6.
Circ Cardiovasc Qual Outcomes ; 9(6): 629-640, 2016 11.
Article in English | MEDLINE | ID: mdl-28263938

ABSTRACT

BACKGROUND: The current ability to predict readmissions in patients with heart failure is modest at best. It is unclear whether machine learning techniques that address higher-dimensional, nonlinear relationships among variables would enhance prediction. We sought to compare the effectiveness of several machine learning algorithms for predicting readmissions. METHODS AND RESULTS: Using data from the Telemonitoring to Improve Heart Failure Outcomes trial, we compared the effectiveness of random forests, boosting, random forests combined hierarchically with support vector machines or logistic regression (LR), and Poisson regression against traditional LR to predict 30- and 180-day all-cause readmissions and readmissions because of heart failure. We randomly selected 50% of patients for a derivation set; the remaining patients formed a validation set, which was evaluated over 100 bootstrapped iterations. We compared C statistics for discrimination and distributions of observed outcomes across risk deciles for predictive range. In 30-day all-cause readmission prediction, the best performing machine learning model, random forests, provided a 17.8% improvement over LR (mean C statistics, 0.628 and 0.533, respectively). For readmissions because of heart failure, boosting improved the C statistic by 24.9% over LR (mean C statistics, 0.678 and 0.543, respectively). For 30-day all-cause readmission, the observed readmission rates in the lowest and highest deciles of predicted risk with random forests (7.8% and 26.2%, respectively) showed a much wider separation than with LR (14.2% and 16.4%, respectively). CONCLUSIONS: Machine learning methods improved the prediction of readmission after hospitalization for heart failure compared with LR and provided the greatest predictive range in observed readmission rates.
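A hedged sketch of the evaluation design above: random forest versus logistic regression on a 50/50 derivation/validation split, C statistics averaged over 100 bootstrapped validation samples, and observed event rates in the lowest and highest deciles of predicted risk. The data are synthetic and none of the numbers correspond to the trial cohort:

    # Sketch of the evaluation design above: random forest vs. logistic regression,
    # C statistics over 100 bootstrapped validation samples, and observed event rates
    # in the lowest vs. highest deciles of predicted risk. Synthetic data only.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=4000, n_features=30, weights=[0.8], random_state=3)
    X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.5, stratify=y,
                                                  random_state=3)   # 50/50 split, as above

    models = {"random_forest": RandomForestClassifier(n_estimators=300, random_state=3),
              "logistic_regression": LogisticRegression(max_iter=1000)}

    rng = np.random.default_rng(3)
    for name, model in models.items():
        model.fit(X_dev, y_dev)
        prob = model.predict_proba(X_val)[:, 1]
        # 100 bootstrapped validation iterations for the C statistic.
        aucs = [roc_auc_score(y_val[idx], prob[idx])
                for idx in (rng.integers(0, len(y_val), len(y_val)) for _ in range(100))]
        # Observed outcome rates in the lowest vs. highest predicted-risk deciles.
        deciles = np.digitize(prob, np.quantile(prob, np.linspace(0.1, 0.9, 9)))
        low, high = y_val[deciles == 0].mean(), y_val[deciles == 9].mean()
        print(f"{name}: C statistic {np.mean(aucs):.3f}, "
              f"decile event rates {low:.1%} -> {high:.1%}")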


Subject(s)
Algorithms , Data Mining/methods , Heart Failure/therapy , Patient Readmission , Support Vector Machine , Telemedicine , Aged , Databases, Factual , Female , Heart Failure/diagnosis , Humans , Logistic Models , Male , Middle Aged , Nonlinear Dynamics , Randomized Controlled Trials as Topic , Reproducibility of Results , Risk Assessment , Risk Factors , Time Factors