ABSTRACT
BACKGROUND: Blood component transfusions are a common and often necessary medical practice during dengue epidemics. Transfusions are required when patients develop severe dengue fever or thrombocytopenia of 10×10⁹/L or less. This study therefore investigated the risk factors for blood component transfusion and compared the performance of eight machine-learning algorithms in predicting transfusion requirements in confirmed dengue cases admitted to hospital. The objective was to identify risk factors that help predict blood component transfusion needs. METHODS: Eight predictive models were developed based on retrospective data from a private group of hospitals in India. The Python package SHAP (SHapley Additive exPlanations) was used to explain the output of the XGBoost model. RESULTS: Sixteen vital variables were finally selected as having the most significant effects on blood component transfusion prediction. The XGBoost model presented significantly better predictive performance (area under the curve: 0.793; 95% confidence interval: 0.699-0.795) than the other models. CONCLUSION: Predictive modelling techniques can be used to streamline blood component preparation procedures, to help triage high-risk patients, and to prepare caregivers to provide blood component transfusions when required. This study demonstrates the potential of multilayer algorithms to predict blood component transfusion needs reasonably well, which may help healthcare providers make more informed decisions regarding patient care.
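The abstract reports discrimination as an area under the curve with a 95% confidence interval but does not state how the interval was obtained. As a minimal sketch, assuming a percentile bootstrap (an assumption, not the authors' stated method), the following pure-Python example computes an AUC and its interval on made-up labels and scores. The function names `auc_score` and `bootstrap_auc_ci` and the toy data are hypothetical.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data only: 1 = transfusion required, 0 = not required.
labels = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9, 0.6, 0.3]

auc = auc_score(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

With only ten toy cases the interval is wide; on a real cohort the same resampling would be applied to the held-out predictions of each model being compared.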
ABSTRACT
BACKGROUND: Breast cancer (BC) is one of the most common female cancers. Clinical and histopathological information is used collectively for diagnosis, but it is often not precise. We applied machine learning (ML) methods to identify a valuable gene signature model based on differentially expressed genes (DEGs) for BC diagnosis and prognosis. METHODS: A cohort of 701 samples from 11 GEO BC microarray datasets was used to identify significant DEGs. Seven ML methods, including RFECV-LR, RFECV-SVM, LR-L1, SVC-L1, RF, and Extra-Trees, were applied for gene reduction and for the construction of a diagnostic model for cancer classification. Kaplan-Meier survival analysis was performed for prognostic signature construction. The potential biomarkers were confirmed via qRT-PCR and validated by another set of ML methods including GBDT, XGBoost, AdaBoost, KNN, and MLP. RESULTS: We identified 355 DEGs and predicted BC-associated pathways, including kinetochore metaphase signaling, PTEN, senescence, and phagosome-formation pathways. A hub of 28 DEGs and a novel diagnostic nine-gene signature (COL10A, S100P, ADAMTS5, WISP1, COMP, CXCL10, LYVE1, COL11A1, and INHBA) were identified using stringent filter conditions. Similarly, a novel prognostic eight-gene signature (CCNE2, NUSAP1, TPX2, S100P, ITM2A, LIFR, TNXA, and ZBTB16) was identified using disease-free survival and overall survival analysis. The gene signatures were validated by another set of ML methods. Finally, qRT-PCR results confirmed the expression of the identified gene signatures in BC. CONCLUSION: The ML approach helped construct novel diagnostic and prognostic models based on the expression profiling of BC. The identified nine-gene and eight-gene signatures showed excellent potential in BC diagnosis and prognosis, respectively.
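The prognostic signature above relies on Kaplan-Meier survival analysis. As a rough illustration of that estimator only (not the authors' pipeline), the sketch below computes the product-limit survival curve in pure Python. The function name `kaplan_meier` and the follow-up times are invented for the example; `events` uses 1 for an observed event and 0 for a censored observation.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    Returns a list of (time, survival probability) pairs, one per distinct
    time at which at least one event was observed.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group every subject sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up in months for one expression group.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

In a signature analysis, one curve would typically be estimated per group (for example, high versus low expression of a gene) and the groups compared with a log-rank test, which is not shown here.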
ABSTRACT
Background: Intervention planning to reduce 30-day readmission after acute myocardial infarction (AMI) in an environment of resource scarcity can be improved by a readmission prediction score. The aim of this study is to derive and validate a prediction model, based on routinely collected hospital data, that identifies risk factors for all-cause readmission within zero to 30 days of discharge after AMI. Methods: Our study includes 2,849 AMI patient records (January 2005 to December 2014) from a tertiary care facility in India. EMR data with ICD-10 diagnosis, admission, pathological, procedural and medication information are used for model building. Model performance is analyzed for different combinations of feature groups and for a diabetes sub-cohort. The derived models are evaluated to identify risk factors for readmission. Results: The model derived using all features has the highest discrimination in predicting readmission, with an AUC of 0.62; (95 percent confidence interval) in internal validation with a 70/30 split for derivation and validation. For the sub-cohort of diabetes patients (1359), the discrimination is slightly better, with an AUC of 0.66; (95 percent CI;). Positively associated predictive variables include age group 80-90, medicine classes administered during the index admission (anti-ischemic drugs, alpha-1 blockers, xanthine oxidase inhibitors), and an additional procedure in the index admission (dialysis). Negatively associated predictive variables include patient demography (male gender) and medicine classes administered during the index admission (beta-blockers, anticoagulants, platelet inhibitors, anti-arrhythmics). Conclusions: Routinely collected data in the hospital's clinical and administrative data repository can identify patients at high risk of readmission following AMI, potentially improving the AMI readmission rate.
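The internal validation above uses a 70/30 split into derivation and validation sets. The abstract does not say whether the split was stratified by outcome; the sketch below assumes it was, so that both sets keep the same readmission rate. The function name `stratified_split`, the `readmitted` field, and the synthetic records are all hypothetical.

```python
import random


def stratified_split(records, label_key, train_frac=0.7, seed=42):
    """Split records into derivation and validation sets, preserving the
    proportion of each outcome class (stratification is an assumption)."""
    rng = random.Random(seed)
    by_class = {}
    for rec in records:
        by_class.setdefault(rec[label_key], []).append(rec)

    derivation, validation = [], []
    for group in by_class.values():
        shuffled = group[:]  # copy so the caller's list is not reordered
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_frac)
        derivation.extend(shuffled[:cut])
        validation.extend(shuffled[cut:])
    return derivation, validation


# Synthetic cohort: 100 patients, 20 of whom were readmitted within 30 days.
records = [{"patient_id": i, "readmitted": 1 if i < 20 else 0} for i in range(100)]

derivation, validation = stratified_split(records, "readmitted")
print(len(derivation), len(validation))
print(sum(r["readmitted"] for r in derivation),
      sum(r["readmitted"] for r in validation))
```

Fixing the seed keeps the split reproducible, which matters when models built on different feature groups are compared on the same validation set.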