Results 1 - 20 of 26
1.
Eur J Clin Invest ; 54(5): e14161, 2024 May.
Article in English | MEDLINE | ID: mdl-38239087

ABSTRACT

BACKGROUND: The metabolically healthy obese (MHO) phenotype is associated with an increased risk of coronary heart disease (CHD) in the general population. However, the association of metabolic health and obesity phenotypes with CHD risk in adult cancer survivors remains unclear. We aimed to investigate the associations between different metabolic health and obesity phenotypes and incident CHD in adult cancer survivors. METHODS: We used the National Health Insurance Service (NHIS) database to identify a cohort of 173,951 adult cancer survivors aged over 20 years who were free of cardiovascular complications. Metabolically healthy nonobese (MHN), MHO, metabolically unhealthy nonobese (MUN), and metabolically unhealthy obese (MUO) phenotypes were defined using at least three of five metabolic health criteria together with obesity status (body mass index ≥ 25.0 kg/m2). We used Cox proportional hazards models to assess CHD risk for each metabolic health and obesity phenotype. RESULTS: During 1,376,050 person-years of follow-up, adult cancer survivors with the MHO phenotype had a significantly higher risk of CHD (hazard ratio [HR] = 1.52; 95% confidence interval [CI]: 1.41 to 1.65) compared with those without obesity and metabolic abnormalities. The MUN (HR = 1.81; 95% CI: 1.59 to 2.06) and MUO (HR = 1.92; 95% CI: 1.72 to 2.15) phenotypes were also associated with an increased risk of CHD among adult cancer survivors. CONCLUSIONS: Adult cancer survivors with the MHO phenotype had a higher risk of CHD than those who were MHN. Metabolic health status and obesity were jointly associated with CHD risk in adult cancer survivors.
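The hazard ratios above come from Cox proportional hazards modeling of phenotype groups against an MHN reference; the snippet below is a minimal sketch of that kind of fit using the lifelines library, with file, column, and covariate names chosen for illustration rather than taken from the NHIS data.

```python
# Sketch: Cox proportional hazards model comparing metabolic health/obesity phenotypes.
# File, column, and covariate names are hypothetical, not the actual NHIS variables.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("cancer_survivor_cohort.csv")  # hypothetical analysis file

# One-hot encode the phenotype with MHN (metabolically healthy nonobese) as reference.
dummies = pd.get_dummies(df["phenotype"], prefix="pheno").drop(columns=["pheno_MHN"])
model_df = pd.concat([df[["time_to_chd", "chd_event", "age", "sex"]], dummies], axis=1)

cph = CoxPHFitter()
cph.fit(model_df, duration_col="time_to_chd", event_col="chd_event")
cph.print_summary()  # the exp(coef) column gives hazard ratios vs. the MHN reference
```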


Subject(s)
Cancer Survivors , Cardiovascular Diseases , Coronary Disease , Metabolic Syndrome , Neoplasms , Obesity, Metabolically Benign , Adult , Humans , Risk Factors , Cardiovascular Diseases/epidemiology , Neoplasms/epidemiology , Neoplasms/complications , Obesity/complications , Obesity/epidemiology , Body Mass Index , Coronary Disease/epidemiology , Coronary Disease/complications , Phenotype , Obesity, Metabolically Benign/epidemiology , Metabolic Syndrome/epidemiology , Metabolic Syndrome/complications
2.
Health Care Manag Sci ; 27(1): 114-129, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37921927

ABSTRACT

Overcrowding of emergency departments is a global concern, leading to numerous negative consequences. This study aimed to develop a useful and inexpensive tool, derived from electronic medical records, that supports clinical decision-making and can be easily utilized by emergency department physicians. We present machine learning models that predict the likelihood of hospitalization within 24 hours and estimate waiting times. Moreover, we show that incorporating unstructured text data enhances the performance of these models compared with existing models. Among the evaluated models, the extreme gradient boosting model that incorporated text data yielded the best performance, achieving an area under the receiver operating characteristic curve of 0.922 and an area under the precision-recall curve of 0.687. The mean absolute error of the waiting-time estimate was approximately 3 hours. Using this model, we classified the probability of patients not being admitted within 24 hours as Low, Medium, or High and identified important variables influencing this classification through explainable artificial intelligence. The model results are readily displayed on an electronic dashboard to support the decision-making of emergency department physicians and alleviate overcrowding, thereby yielding socioeconomic benefits for medical facilities.
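One way the unstructured text could be folded into a gradient-boosting classifier is sketched below; the TF-IDF preprocessing, feature names, and file are assumptions for illustration, not the authors' actual pipeline.

```python
# Sketch: combining structured ED variables with TF-IDF features from free-text notes
# in an XGBoost admission classifier. File and column names are hypothetical.
import pandas as pd
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, average_precision_score
from xgboost import XGBClassifier

df = pd.read_csv("ed_visits.csv")
structured = csr_matrix(df[["age", "heart_rate", "systolic_bp", "triage_level"]].values)
text = TfidfVectorizer(max_features=5000).fit_transform(df["triage_note"].fillna(""))
X = hstack([structured, text]).tocsr()
y = df["admitted_within_24h"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
clf = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1, eval_metric="auc")
clf.fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
print("AUROC:", roc_auc_score(y_te, proba))
print("AUPRC:", average_precision_score(y_te, proba))
```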


Subject(s)
Artificial Intelligence , Waiting Lists , Humans , Hospitalization , Emergency Service, Hospital , Machine Learning , Retrospective Studies
3.
Cardiovasc Drugs Ther ; 37(1): 129-140, 2023 02.
Article in English | MEDLINE | ID: mdl-34622354

ABSTRACT

PURPOSE: To estimate the risk of recurrent cardiovascular events in a real-world population of very high-risk Korean patients with prior myocardial infarction (MI), ischemic stroke (IS), or symptomatic peripheral artery disease (sPAD), similar to the Further Cardiovascular Outcomes Research with Proprotein Convertase Subtilisin-Kexin Type 9 Inhibition in Subjects with Elevated Risk (FOURIER) trial population. METHODS: This retrospective study used the Asan Medical Center Heart Registry database, built on electronic medical records (EMRs) from 2000 to 2016. Patients with a history of clinically evident atherosclerotic cardiovascular disease (ASCVD) and multiple risk factors were followed up for 3 years. The primary endpoint was a composite of MI, stroke, hospitalization for unstable angina, coronary revascularization, and all-cause mortality. RESULTS: Among 15,820 patients, the 3-year cumulative incidence of the composite primary endpoint was 15.3%, and the 3-year incidence rate was 5.7 (95% CI 5.5-5.9) per 100 person-years. For the individual endpoints, the rates of death, MI, and IS were 0.4 (0.3-0.4), 0.9 (0.8-0.9), and 0.8 (0.7-0.9) per 100 person-years, respectively. The risk of the primary endpoint did not differ significantly between recipients of different intensities of statin therapy. Low-density lipoprotein cholesterol (LDL-C) goals were achieved in only 24.4% of patients during the first year of follow-up. CONCLUSION: By analyzing EMR data representing routine practice in Korea, we found that patients with very high-risk ASCVD were at substantial risk of further cardiovascular events over 3 years. Given the observed risk of recurrent events under suboptimal lipid management with statins, additional treatment to control LDL-C may be necessary to reduce the burden of further cardiovascular events in very high-risk ASCVD patients.


Subject(s)
Anticholesteremic Agents , Atherosclerosis , Cardiovascular Diseases , Hydroxymethylglutaryl-CoA Reductase Inhibitors , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/drug therapy , Cardiovascular Diseases/epidemiology , Cholesterol, LDL , Anticholesteremic Agents/adverse effects , Electronic Health Records , Retrospective Studies , Proprotein Convertase 9 , Republic of Korea/epidemiology
4.
BMC Med Inform Decis Mak ; 21(1): 29, 2021 01 28.
Article in English | MEDLINE | ID: mdl-33509180

ABSTRACT

BACKGROUND: Cardiovascular diseases (CVDs) are difficult to diagnose early and have risk factors that are easy to overlook. Early prediction and personalization of treatment through the use of artificial intelligence (AI) may help clinicians and patients manage CVDs more effectively. However, to apply AI approaches to CVD data, it is necessary to establish and curate a specialized database based on electronic health records (EHRs) that includes pre-processed unstructured data. METHODS: We aimed to build a database (CardioNet) specialized for CVDs that can leverage AI technology and contribute to the overall care of patients with CVDs. First, we collected the anonymized records of 748,474 patients who had visited the Asan Medical Center (AMC) or Ulsan University Hospital (UUH) because of CVDs. Second, we set clinically plausible criteria to remove errors and duplication. Third, we integrated unstructured data, such as readings of medical examinations, with structured data sourced from EHRs to create CardioNet. We subsequently performed natural language processing to structure the significant variables associated with CVDs, because most results of the principal CVD-related medical examinations are free-text readings. Additionally, to ensure interoperability for convergent multi-center research, we standardized the data using several code systems that correspond to the common data model. Finally, we created a descriptive table (i.e., a dictionary of CardioNet) to simplify access and utilization of the data for clinicians and engineers, and we continuously validated the data to ensure reliability. RESULTS: CardioNet is a comprehensive database that can serve as a training set for AI models and assist in all aspects of clinical management of CVDs. It comprises information extracted from EHRs and the results of readings of CVD-related digital tests. It consists of 27 tables, a code-master table, and a descriptive table. CONCLUSIONS: CardioNet, a database specialized for CVDs, was established, and data collection is ongoing. We are actively supporting multi-center research, which may require further data processing depending on the subject of the study. CardioNet will serve as the fundamental database for future CVD-related research projects.
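Since most CVD-related examination results are stored as free-text readings, structuring them requires text processing; the snippet below is only a generic sketch of how one numeric variable (e.g., an ejection fraction) might be pulled out of an echo reading with a regular expression, not CardioNet's actual NLP pipeline.

```python
# Sketch: extracting a numeric variable from a free-text echocardiography reading.
# The report text and the pattern are hypothetical examples, not CardioNet's rules.
import re

report = "LV systolic function is preserved. Estimated EF = 62 %. No regional wall motion abnormality."

def extract_ejection_fraction(text: str):
    """Return the ejection fraction as a float, or None if no EF value is found."""
    match = re.search(r"\bEF\s*[=:]?\s*(\d{1,3}(?:\.\d+)?)\s*%", text, flags=re.IGNORECASE)
    return float(match.group(1)) if match else None

print(extract_ejection_fraction(report))  # 62.0
```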


Subject(s)
Artificial Intelligence , Cardiovascular Diseases , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Databases, Factual , Humans , Natural Language Processing , Reproducibility of Results
7.
Comput Biol Med ; 168: 107738, 2024 01.
Article in English | MEDLINE | ID: mdl-37995536

ABSTRACT

Electronic medical records (EMRs) have considerable potential to advance healthcare technologies, including medical AI. Nevertheless, because of the privacy issues associated with sharing patients' personal information, it is difficult to utilize them sufficiently. Generative models based on deep learning can address this problem by creating synthetic data similar to real patient data. However, the data used to train these deep learning models are at risk of being leaked through malicious attacks, meaning that traditional deep-learning-based generative models cannot completely resolve the privacy issues. Therefore, we propose a method to prevent the leakage of training data by protecting the model from malicious attacks using local differential privacy (LDP). Our method was evaluated in terms of utility and privacy. Experimental results demonstrate that the proposed method can generate medical data with reasonable performance while protecting the training data from malicious attacks.
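The abstract does not spell out the specific LDP mechanism, so the snippet below is a generic sketch of one classic local differential privacy primitive (randomized response for a binary attribute), to illustrate the kind of perturbation that protects individual records.

```python
# Sketch: randomized response, a standard epsilon-LDP mechanism for a binary attribute.
# This is a generic illustration of local differential privacy, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def randomized_response(bit: int, epsilon: float) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1); otherwise flip it."""
    p_truth = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit

# Perturb a column of sensitive binary values (e.g., presence of a diagnosis).
true_values = rng.integers(0, 2, size=1000)
private_values = np.array([randomized_response(int(b), epsilon=1.0) for b in true_values])

# Unbiased estimate of the true prevalence recovered from the privatized reports.
p = np.exp(1.0) / (np.exp(1.0) + 1.0)
estimated_rate = (private_values.mean() - (1 - p)) / (2 * p - 1)
print(round(estimated_rate, 3), round(true_values.mean(), 3))
```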


Subject(s)
Electronic Health Records , Privacy , Humans , Health Facilities
8.
Sci Rep ; 14(1): 17723, 2024 07 31.
Article in English | MEDLINE | ID: mdl-39085306

ABSTRACT

Loop diuretics are the prevailing drugs for managing fluid overload in heart failure. However, adjusting loop diuretic doses is difficult owing to the lack of a dosing guideline. Accordingly, we developed a novel clinical decision support system for adjusting loop diuretic dosage based on a Long Short-Term Memory (LSTM) algorithm using time-series EMRs. Weight measurements were used as the target to estimate fluid loss during diuretic therapy. We designed the TSFD-LSTM, a bidirectional LSTM model with an attention mechanism, to forecast weight change 48 h after heart failure patients were administered loop diuretics. The model utilized 65 variables, including disease conditions, concurrent medications, laboratory results, vital signs, and physical measurements from EMRs. The framework processed four sequences simultaneously as inputs. An ablation study on the attention mechanism and a comparison with a transformer model as a baseline were conducted. The TSFD-LSTM outperformed the other models, achieving 85% predictive accuracy with MAE and MSE values of 0.56 and 1.45, respectively. Thus, the TSFD-LSTM model can aid in personalized loop diuretic treatment and prevent adverse drug events, contributing to improved healthcare efficacy for heart failure patients.
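A minimal sketch of a bidirectional LSTM regressor with an attention-pooling layer, in the spirit of the model described above, is shown below in tf.keras; the layer sizes and attention wiring are assumptions for illustration and do not reproduce the TSFD-LSTM architecture.

```python
# Sketch: bidirectional LSTM with attention pooling for forecasting 48-h weight change.
# Input shape (48 time steps x 65 variables) follows the abstract; the rest is illustrative.
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(48, 65))                                      # time x variables
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)   # (batch, 48, 128)

# Attention: one score per time step, softmax over time, weighted sum of hidden states.
scores = layers.Dense(1, activation="tanh")(h)     # (batch, 48, 1)
weights = layers.Softmax(axis=1)(scores)           # attention weights over time steps
context = layers.Dot(axes=1)([weights, h])         # (batch, 1, 128) weighted sum
context = layers.Flatten()(context)                # (batch, 128)

output = layers.Dense(1)(context)                  # predicted weight change (kg)
model = Model(inputs, output)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.summary()
```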


Subject(s)
Heart Failure , Humans , Heart Failure/drug therapy , Male , Female , Aged , Algorithms , Middle Aged , Body Weight , Diuretics/administration & dosage , Sodium Potassium Chloride Symporter Inhibitors/administration & dosage , Memory, Short-Term/drug effects
9.
Int J Surg ; 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39116448

ABSTRACT

BACKGROUND: Accurate forecasting of clinical outcomes after kidney transplantation (KT) is essential for improving patient care and increasing the success rate of transplants. Our study employs advanced machine learning (ML) algorithms to identify crucial prognostic indicators for kidney transplantation. By analyzing complex datasets with ML models, we aim to enhance prediction accuracy and provide valuable insights to support clinical decision-making. MATERIALS AND METHODS: Analyzing data from 4077 KT patients (June 1990 - May 2015) at a single center, this research included 27 features encompassing recipient/donor traits and peri-transplant data. The dataset was divided into training (80%) and testing (20%) sets. Four ML models-eXtreme Gradient Boosting (XGBoost), Feedforward Neural Network, Logistic Regression, and Support Vector Machine-were trained on carefully selected features to predict graft survival. Performance was assessed by precision, sensitivity, F1 score, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve. RESULTS: XGBoost emerged as the best model, with an AUROC of 0.828, identifying key survival predictors such as T-cell flow crossmatch positivity, creatinine levels two years post-transplant, and human leukocyte antigen mismatch. The study also examined the prognostic importance of histological features identified by the Banff criteria for renal biopsy, emphasizing the significance of intimal arteritis, interstitial inflammation, and chronic glomerulopathy. CONCLUSION: The study developed ML models that pinpoint clinical factors crucial for KT graft survival, aiding clinicians in making informed post-transplant care decisions. Incorporating these findings with the Banff classification could improve renal pathology diagnosis and treatment, offering a data-driven approach to prioritizing pathology scores.

10.
Heliyon ; 10(2): e24620, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38304832

ABSTRACT

Background and Objective: Although interest in predicting drug-drug interactions is growing, many predictions are not verified with real-world data. This study aimed to confirm whether polypharmacy side effects predicted from public data also occur in data from actual patients. Methods: We utilized a deep learning-based polypharmacy side-effect prediction model to identify the cefpodoxime-chlorpheniramine-lung edema combination, which had a high prediction score and a sizable patient population. This retrospective study analyzed patients over 18 years old who were admitted to the Asan Medical Center between January 2000 and December 2020 and took cefpodoxime or chlorpheniramine orally. Three groups (cefpodoxime-treated, chlorpheniramine-treated, and cefpodoxime & chlorpheniramine-treated) were balanced and compared using inverse probability of treatment weighting (IPTW). Differences between the three groups were analyzed using the Kaplan-Meier method and the Cox proportional hazards model. Results: The study population comprised 54,043 patients with a history of taking cefpodoxime, 203,897 patients with a history of taking chlorpheniramine, and 1,628 patients with a history of taking cefpodoxime and chlorpheniramine simultaneously. After adjustment, the 1-year cumulative incidence of lung edema in the group that took cefpodoxime and chlorpheniramine simultaneously was significantly higher than in the groups that took cefpodoxime or chlorpheniramine only (p=0.001). Patients taking cefpodoxime and chlorpheniramine together had an increased risk of lung edema compared with those taking cefpodoxime alone [hazard ratio (HR) 2.10, 95% CI 1.26-3.52, p<0.005] and compared with those taking chlorpheniramine alone (HR 1.64, 95% CI 0.99-2.69, p=0.05). Conclusions: Validating polypharmacy side-effect predictions with real-world data can aid patient and clinician decision-making before conducting randomized controlled trials. Simultaneous use of cefpodoxime and chlorpheniramine was associated with a higher long-term risk of lung edema compared with the use of cefpodoxime or chlorpheniramine alone.
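A generic sketch of inverse probability of treatment weighting followed by a weighted Cox model is shown below for a binary exposure; the actual study balanced three groups, and the covariates, file, and column names here are hypothetical.

```python
# Sketch: IPTW (stabilized weights) for a binary exposure, then a weighted Cox model.
# Covariate, file, and outcome column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

df = pd.read_csv("cohort.csv")   # exposure: 1 = cefpodoxime + chlorpheniramine, 0 = comparator
covariates = ["age", "sex", "charlson_index", "heart_failure", "ckd"]

# Propensity score: probability of combined exposure given baseline covariates.
ps_model = LogisticRegression(max_iter=1000).fit(df[covariates], df["exposure"])
ps = ps_model.predict_proba(df[covariates])[:, 1]

# Stabilized inverse probability of treatment weights.
p_exposed = df["exposure"].mean()
df["iptw"] = df["exposure"] * p_exposed / ps + (1 - df["exposure"]) * (1 - p_exposed) / (1 - ps)

# Weighted Cox model for time to lung edema.
cph = CoxPHFitter()
cph.fit(df[["time_to_edema", "edema_event", "exposure", "iptw"]],
        duration_col="time_to_edema", event_col="edema_event",
        weights_col="iptw", robust=True)
cph.print_summary()
```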

11.
JMIR Med Inform ; 12: e53400, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38513229

ABSTRACT

BACKGROUND: Predicting the bed occupancy rate (BOR) is essential for efficient hospital resource management, long-term budget planning, and patient care planning. Although macro-level BOR prediction for the entire hospital is crucial, predicting occupancy at a more detailed level, such as specific wards and rooms, is more practical and useful for hospital scheduling. OBJECTIVE: The aim of this study was to develop a web-based support tool that allows hospital administrators to monitor the BOR for each ward and room over different time periods. METHODS: We trained time-series models based on long short-term memory (LSTM), using individual bed data aggregated hourly each day, to predict the BOR for each ward and room in the hospital. Ward-level training involved 2 models with 7- and 30-day time windows, and room-level training involved models with 3- and 7-day time windows for shorter-term planning. To further improve prediction performance, we added 2 models trained by concatenating dynamic data with static data representing room-specific details. RESULTS: We compared the results of a total of 12 models based on bidirectional long short-term memory (Bi-LSTM) and LSTM, and the Bi-LSTM-based models showed better performance. The ward-level prediction model had a mean absolute error (MAE) of 0.067, mean squared error (MSE) of 0.009, root mean squared error (RMSE) of 0.094, and R2 score of 0.544. Among the room-level prediction models, the model that incorporated static data exhibited superior performance, with an MAE of 0.129, MSE of 0.050, RMSE of 0.227, and R2 score of 0.600. Model results can be displayed on an electronic dashboard for easy access via the web. CONCLUSIONS: We have proposed predictive BOR models for individual wards and rooms that demonstrate high performance. The results can be visualized through a web-based dashboard, aiding hospital administrators in bed operation planning. This contributes to resource optimization and the reduction of hospital resource use.
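A small sketch of how hourly occupancy records can be turned into sliding-window training samples for an LSTM forecaster (e.g., a 7-day input window predicting the next 24 hours) is shown below; the window lengths and simulated series are illustrative assumptions.

```python
# Sketch: sliding-window samples from an hourly bed-occupancy-rate series for an LSTM.
# Window lengths (7-day input, 24-hour target) and the simulated series are illustrative.
import numpy as np

def make_windows(series: np.ndarray, input_hours: int = 7 * 24, target_hours: int = 24):
    """series: 1-D array of hourly BOR values for one ward.
    Returns X with shape (n, input_hours, 1) and y with shape (n, target_hours)."""
    X, y = [], []
    for start in range(len(series) - input_hours - target_hours + 1):
        X.append(series[start:start + input_hours])
        y.append(series[start + input_hours:start + input_hours + target_hours])
    return np.asarray(X)[..., None], np.asarray(y)   # add a feature axis for the LSTM

hourly_bor = np.clip(np.random.default_rng(0).normal(0.8, 0.1, size=90 * 24), 0, 1)
X, y = make_windows(hourly_bor)
print(X.shape, y.shape)   # (1969, 168, 1) (1969, 24)
```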

12.
Sci Rep ; 14(1): 23443, 2024 10 08.
Article in English | MEDLINE | ID: mdl-39379478

ABSTRACT

Predicting major adverse cardiovascular events (MACE) is crucial because of their high readmission rate and severe sequelae. Current risk scoring models for MACE are based on a few features of a patient's status at a single time point. We developed a self-attention-based model to predict MACE within 3 years from time-series data, utilizing numerous features in electronic medical records (EMRs). In addition, we demonstrated transfer learning for hospitals with insufficient data through code mapping and feature selection based on importance calculated with XGBoost. We established operational definitions and categories for diagnoses, medications, and laboratory tests to streamline scattered codes, enhancing clinical interpretability across hospitals. This resulted in a reduced feature size and improved data quality for transfer learning. The pre-trained model demonstrated an increase in AUROC after transfer learning, from 0.564 to 0.821. Furthermore, to validate the effectiveness of the predicted scores, we analyzed the data using traditional survival analysis, which confirmed an elevated hazard ratio for the group with high scores.
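One way to realize the XGBoost-based feature selection mentioned above is sketched below: rank the mapped features by importance at the data-rich source hospital and keep the top k before fine-tuning on the smaller target hospital; the column names and cutoff are assumptions for illustration.

```python
# Sketch: ranking candidate EMR features by XGBoost gain importance and keeping the
# top k before transfer learning. Feature names and the cutoff (k = 50) are illustrative.
import pandas as pd
from xgboost import XGBClassifier

source = pd.read_csv("source_hospital_features.csv")   # hypothetical mapped feature table
X, y = source.drop(columns=["mace_within_3y"]), source["mace_within_3y"]

xgb = XGBClassifier(n_estimators=200, max_depth=4, importance_type="gain")
xgb.fit(X, y)

importance = pd.Series(xgb.feature_importances_, index=X.columns).sort_values(ascending=False)
selected = importance.head(50).index.tolist()
print(selected[:10])   # features to retain when fine-tuning on the target hospital's data
```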


Subject(s)
Cardiovascular Diseases , Electronic Health Records , Hospitals , Humans , Cardiovascular Diseases/epidemiology , Male , Female , Aged , Middle Aged , Risk Assessment/methods , Risk Factors
13.
PLoS One ; 18(5): e0286346, 2023.
Article in English | MEDLINE | ID: mdl-37228155

ABSTRACT

BACKGROUND: Dietary sodium intake is a crucial lifestyle factor that should be assessed in adult cancer survivors because of their increased risk of adverse health outcomes compared with the general population. However, its association with impaired fasting glucose (IFG) in adult cancer survivors remains unclear. This study aimed to investigate the association of dietary sodium intake, categorized according to the American Heart Association (AHA) recommendation, with IFG in community-dwelling adult cancer survivors. METHODS: A total of 1,052 adult cancer survivors without diabetes were identified from the sixth and seventh Korea National Health and Nutrition Examination Survey (KNHANES), 2013-2018. Dietary sodium intake was categorized as <1,500 mg/day, 1,500-2,299 mg/day, 2,300-3,999 mg/day, and ≥4,000 mg/day according to the AHA recommendation. A multiple logistic regression model adjusted for demographic, lifestyle, and health status factors was used to compute odds ratios (ORs) and 95% confidence intervals (95% CIs) for IFG across dietary sodium intake categories. RESULTS: After adjusting for confounding variables identified in the KNHANES, the adjusted ORs among adult cancer survivors who consumed 1,500-2,299 mg/day, 2,300-3,999 mg/day, and ≥4,000 mg/day of dietary sodium were 1.16 (95% CI: 0.25-5.27), 1.93 (95% CI: 0.40-9.37), and 2.67 (95% CI: 0.59-12.18), respectively, compared with those who consumed <1,500 mg/day (P value for trend = 0.036). CONCLUSION: Among community-dwelling adult cancer survivors, high dietary sodium intake was marginally associated with increased odds of IFG. Well-designed cohort studies or randomized clinical trials are needed to establish more epidemiologic evidence on this association in adult cancer survivors.
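The odds ratios above come from a multiple logistic regression over sodium-intake categories; a minimal sketch of that kind of model in statsmodels is shown below, with variable names chosen for illustration rather than taken from the KNHANES files (and without the survey weighting a full analysis would use).

```python
# Sketch: logistic regression for impaired fasting glucose (IFG) by sodium-intake category,
# with the lowest category as reference. Variable and file names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("knhanes_survivors.csv")   # hypothetical analysis file
df["sodium_cat"] = pd.Categorical(
    df["sodium_cat"],
    categories=["<1500", "1500-2299", "2300-3999", ">=4000"], ordered=True)

model = smf.logit("ifg ~ C(sodium_cat) + age + sex + bmi + smoking + physical_activity",
                  data=df).fit()
odds_ratios = np.exp(model.params)
conf_int = np.exp(model.conf_int())
print(pd.concat([odds_ratios, conf_int], axis=1))   # ORs with 95% CIs vs. <1,500 mg/day
```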


Subject(s)
Cancer Survivors , Neoplasms , Prediabetic State , Sodium, Dietary , Humans , Adult , Cross-Sectional Studies , Nutrition Surveys , Prediabetic State/epidemiology , Fasting , Glucose , Neoplasms/epidemiology
14.
Sci Rep ; 13(1): 16837, 2023 10 06.
Article in English | MEDLINE | ID: mdl-37803039

ABSTRACT

Adult cancer survivors may have an increased risk of developing ischemic stroke, potentially influenced by cancer treatment-related factors and risk factors shared with stroke. However, the association between gamma-glutamyl transferase (GGT) levels and the risk of ischemic stroke in this population remains understudied. Therefore, our study aimed to examine the relationship between GGT levels and the risk of ischemic stroke in a population-based cohort of adult cancer survivors. The cohort was derived from the National Health Insurance Service-Health Screening Cohort (2003-2005) and comprised individuals who survived after a diagnosis of primary cancer and participated in the biennial national health screening program between 2009 and 2010. A Cox proportional hazards model adjusted for sociodemographic factors, health status and behavior, and clinical characteristics was used to investigate the association between GGT level and ischemic stroke in adult cancer survivors. Among 3,095 adult cancer survivors, 80 (2.58%) incident cases of ischemic stroke occurred over a mean follow-up of 8.2 years. Compared with the lowest GGT quartile, the hazard ratios (HRs) for ischemic stroke were 1.56 (95% CI 0.75-3.26), 2.36 (95% CI 1.12-4.99), and 2.40 (95% CI 1.05-5.46) for the second, third, and fourth sex-specific quartiles, respectively (Ptrend = 0.013). No significant effect modification was observed by sex, insurance premium, or alcohol consumption. A high GGT level was associated with an increased risk of ischemic stroke in adult cancer survivors, independent of sex, insurance premium, and alcohol consumption.


Subject(s)
Cancer Survivors , Ischemic Stroke , Neoplasms , Stroke , Male , Female , Humans , Adult , Ischemic Stroke/complications , Cohort Studies , gamma-Glutamyltransferase , Risk Factors , Stroke/epidemiology , Stroke/etiology , Neoplasms/complications
15.
Sci Rep ; 13(1): 22461, 2023 12 18.
Article in English | MEDLINE | ID: mdl-38105280

ABSTRACT

As warfarin has a narrow therapeutic window and marked response variability among individuals, it is difficult to rapidly determine a personalized warfarin dosage. Adverse drug events (ADEs) resulting from warfarin overdose can be critical, so physicians typically adjust the warfarin dosage through INR monitoring twice a week when starting warfarin. Our study aimed to develop machine learning (ML) models that predict the discharge dosage of warfarin, to be used as the initial warfarin dosage, from clinical data derived from electronic medical records within 2 days of hospitalization. In this retrospective study, adult patients who were prescribed warfarin at Asan Medical Center (AMC) between January 1, 2018, and October 31, 2020, were recruited as the model development cohort (n = 3168). Additionally, we created an external validation dataset (n = 891) from the Medical Information Mart for Intensive Care III (MIMIC-III). Variables for model prediction were selected based on clinical rationale, covering factors associated with warfarin dosage such as bleeding. The discharge dosage of warfarin was used as the study outcome, because we assumed that patients had achieved the target INR at discharge. Four ML models predicting the warfarin discharge dosage were developed. We evaluated model performance using the mean absolute error (MAE) and prediction accuracy. Finally, we compared the accuracy of our models' predictions with the predictions of physicians for 40 data points to verify the clinical relevance of the models. The MAEs obtained using the internal validation set were as follows: XGBoost, 0.9; artificial neural network, 0.9; random forest, 1.0; linear regression, 1.0; and physicians, 1.3. Thus, our models had better prediction accuracy than the physicians, who have difficulty determining the warfarin discharge dosage from clinical information obtained within 2 days of hospitalization. We conducted both internal and external validation. In conclusion, our ML models could help physicians predict the warfarin discharge dosage as the initial warfarin dosage in the Korean population, although successful external validation in further work is required before the models can be applied.
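A minimal sketch of the kind of dose-regression setup described above (predicting the discharge warfarin dose from early-admission features and scoring it by MAE) is given below; the features, file name, and model configuration are assumptions for illustration, not the authors' models.

```python
# Sketch: regressing the warfarin discharge dose on EMR features available within
# 2 days of admission, scored by mean absolute error. Names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("warfarin_admissions.csv")
features = ["age", "weight", "inr_day2", "creatinine", "bleeding_history", "amiodarone"]
X_tr, X_te, y_tr, y_te = train_test_split(df[features], df["discharge_dose_mg"],
                                          test_size=0.2, random_state=42)

model = RandomForestRegressor(n_estimators=300, random_state=42).fit(X_tr, y_tr)
print("MAE (mg/day):", mean_absolute_error(y_te, model.predict(X_te)))
```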


Subject(s)
Patient Discharge , Warfarin , Adult , Humans , Warfarin/adverse effects , Retrospective Studies , Inpatients , Anticoagulants/adverse effects , Machine Learning
16.
Sci Rep ; 12(1): 21152, 2022 12 07.
Article in English | MEDLINE | ID: mdl-36477457

ABSTRACT

Graph representation learning offers a way to effectively construct and learn patient embeddings from electronic medical records. Integrating such representations can support and extend previous methods for predicting patient prognosis with network models. This study addresses the challenge of implementing a complex and highly heterogeneous dataset by (1) demonstrating how to build a multi-attributed and multi-relational graph model and (2) applying it to a downstream disease prediction task for a patient's prognosis using the HinSAGE algorithm. We present a bipartite graph schema and a graph database construction in detail. The constructed graph database illustrates queries over a predictive network that provide analytical insights using a graph representation of a patient's journey. Moreover, we demonstrate an alternative bipartite model to which we apply HinSAGE to perform a link prediction task for predicting event occurrence. The performance evaluation indicated that our heterogeneous graph model predicted outcomes successfully as a baseline model. Overall, our graph database demonstrated efficient real-time query performance, and the HinSAGE implementation predicted cardiovascular disease event outcomes via supervised link prediction learning.
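A minimal sketch of a bipartite patient-diagnosis graph with node and edge attributes, similar in spirit to the schema described above, is given below using networkx; the identifiers and attributes are hypothetical, and the HinSAGE link-prediction step itself is not reproduced here.

```python
# Sketch: a bipartite patient-diagnosis graph with node and edge attributes.
# Node identifiers and attributes are hypothetical examples.
import networkx as nx
from networkx.algorithms import bipartite

G = nx.Graph()
G.add_nodes_from(["patient_001", "patient_002"], bipartite="patient")
G.add_nodes_from(["I21", "E11"], bipartite="diagnosis", vocabulary="ICD-10")

# Edges carry relationship attributes (e.g., date of first diagnosis).
G.add_edge("patient_001", "I21", first_dx="2015-03-02")
G.add_edge("patient_001", "E11", first_dx="2012-11-20")
G.add_edge("patient_002", "E11", first_dx="2018-07-05")

patients = {n for n, d in G.nodes(data=True) if d["bipartite"] == "patient"}
print(bipartite.is_bipartite_node_set(G, patients))   # True if patients form one side
print(bipartite.degrees(G, patients))                 # degree views for the two node sets
```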


Subject(s)
Electronic Health Records , Humans
17.
JMIR Med Inform ; 10(3): e32313, 2022 Mar 07.
Article in English | MEDLINE | ID: mdl-35254275

ABSTRACT

BACKGROUND: Scoring systems developed for predicting survival after allogeneic hematopoietic cell transplantation (HCT) show suboptimal prediction power, and various factors affect posttransplantation outcomes. OBJECTIVE: A prediction model using a machine learning-based algorithm can be an alternative that concurrently incorporates multiple variables and can reduce potential biases. In this regard, the aim of this study was to establish and validate a machine learning-based predictive model for survival after allogeneic HCT in patients with hematologic malignancies. METHODS: Data from 1470 patients with hematologic malignancies who underwent allogeneic HCT between December 1993 and June 2020 at Asan Medical Center, Seoul, South Korea, were retrospectively analyzed. Using the gradient boosting machine algorithm, we evaluated a model predicting 5-year posttransplantation survival through 10-fold cross-validation. RESULTS: The prediction model showed good performance, with a mean area under the receiver operating characteristic curve of 0.788 (SD 0.03). Furthermore, we developed a risk score predicting probabilities of posttransplantation survival in 294 randomly selected patients, and agreement between the predicted and observed risks of overall death, nonrelapse mortality, and relapse incidence was observed according to the risk score. Additionally, the calculated score demonstrated the possibility of predicting survival according to different transplantation-related factors, with visualization of the importance of each variable. CONCLUSIONS: We developed a machine learning-based model for predicting long-term survival after allogeneic HCT in patients with hematologic malignancies. Our model provides a method for making decisions regarding patient and donor candidates or for selecting transplantation-related resources, such as conditioning regimens.

18.
Comput Methods Programs Biomed ; 221: 106866, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35594580

ABSTRACT

BACKGROUND AND OBJECTIVE: With the advent of bioinformatics, biological databases have been constructed to computerize data. Biological systems can be described as interactions and relationships between the elements constituting them, and these are organized in various open biomedical databases. Such open databases have been used to predict functional interactions such as protein-protein interactions (PPI), drug-drug interactions (DDI), and disease-disease relationships (DDR). However, simply combining interaction data has limited effectiveness in predicting the complex relationships occurring in a whole context: each contributing source contains information on its elements within a specific field of knowledge, but inter-disciplinary insight is lacking when combining them. METHODS: In this study, we propose the RWD Integrated platform for Discovering Associations in Biomedical research (RIDAB) to predict interactions between biomedical entities. RIDAB is established as a graph network to construct a platform that predicts the interactions of target entities. Open biomedical databases are combined with EMRs, representing a biomedical network and real-world data, respectively. To integrate databases from different domains and build the platform, vocabulary mapping was required; in addition, the appropriate network structure and graph embedding method had to be selected to fit the tasks. RESULTS: The feasibility of the platform was evaluated using node similarity and link prediction for a drug repositioning task, a commonly used task for biomedical networks. In addition, we compared US Food and Drug Administration (FDA)-approved repositioned drugs with the predicted results. By integrating the EMR database with biomedical networks, the platform showed an increased F1 score in predicting repositioned drugs, from 45.62% to 57.26%, compared with platforms based on biomedical networks alone. CONCLUSIONS: This study demonstrates that biomedical research findings can be reflected by integrating EMR data with open-source biomedical networks. In addition, we showed the feasibility of using the established platform to represent the integration of biomedical networks and to reflect relationships observed in real-world networks.


Subject(s)
Biomedical Research , Electronic Health Records , Databases, Factual
19.
JMIR Med Inform ; 10(5): e26801, 2022 May 11.
Article in English | MEDLINE | ID: mdl-35544292

ABSTRACT

BACKGROUND: Although there is a growing interest in prediction models based on electronic medical records (EMRs) to identify patients at risk of adverse cardiac events following invasive coronary treatment, robust models fully utilizing EMR data are limited. OBJECTIVE: We aimed to develop and validate machine learning (ML) models by using diverse fields of EMR to predict the risk of 30-day adverse cardiac events after percutaneous intervention or bypass surgery. METHODS: EMR data of 5,184,565 records of 16,793 patients at a quaternary hospital between 2006 and 2016 were categorized into static basic (eg, demographics), dynamic time-series (eg, laboratory values), and cardiac-specific data (eg, coronary angiography). The data were randomly split into training, tuning, and testing sets in a ratio of 3:1:1. Each model was evaluated with 5-fold cross-validation and with an external EMR-based cohort at a tertiary hospital. Logistic regression (LR), random forest (RF), gradient boosting machine (GBM), and feedforward neural network (FNN) algorithms were applied. The primary outcome was 30-day mortality following invasive treatment. RESULTS: GBM showed the best performance with area under the receiver operating characteristic curve (AUROC) of 0.99; RF had a similar AUROC of 0.98. AUROCs of FNN and LR were 0.96 and 0.93, respectively. GBM had the highest area under the precision-recall curve (AUPRC) of 0.80, and the AUPRCs of RF, LR, and FNN were 0.73, 0.68, and 0.63, respectively. All models showed low Brier scores of <0.1 as well as highly fitted calibration plots, indicating a good fit of the ML-based models. On external validation, the GBM model demonstrated maximal performance with an AUROC of 0.90, while FNN had an AUROC of 0.85. The AUROCs of LR and RF were slightly lower at 0.80 and 0.79, respectively. The AUPRCs of GBM, LR, and FNN were similar at 0.47, 0.43, and 0.41, respectively, while that of RF was lower at 0.33. Among the categories in the GBM model, time-series dynamic data demonstrated a high AUROC of >0.95, contributing majorly to the excellent results. CONCLUSIONS: Exploiting the diverse fields of the EMR data set, the ML-based 30-day adverse cardiac event prediction models demonstrated outstanding results, and the applied framework could be generalized for various health care prediction models.
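A short sketch of the kind of discrimination and calibration checks reported above (AUROC, AUPRC, Brier score, calibration curve) is shown below; the predictions here are simulated placeholders, not the study's models.

```python
# Sketch: evaluating a binary 30-day outcome model with AUROC, AUPRC, Brier score,
# and a calibration curve. The labels and predictions below are simulated placeholders.
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import roc_auc_score, average_precision_score, brier_score_loss

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=2000)
y_prob = np.clip(0.7 * y_true + rng.normal(0.15, 0.2, size=2000), 0, 1)   # fake predictions

print("AUROC:", roc_auc_score(y_true, y_prob))
print("AUPRC:", average_precision_score(y_true, y_prob))
print("Brier:", brier_score_loss(y_true, y_prob))

prob_true, prob_pred = calibration_curve(y_true, y_prob, n_bins=10)
for p, o in zip(prob_pred, prob_true):
    print(f"mean predicted {p:.2f} -> observed fraction {o:.2f}")
```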

20.
Comput Methods Programs Biomed ; 208: 106281, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34333207

ABSTRACT

BACKGROUND AND OBJECTIVE: Detecting abnormal patterns within an electrocardiogram (ECG) is crucial for diagnosing cardiovascular diseases. We start from two unresolved problems in applying deep-learning-based ECG classification models to clinical practice: first, although multiple cardiac arrhythmia (CA) types may co-occur in real life, the majority of previous detection methods have focused on one-to-one relationships between ECG and CA type; and second, it has been difficult to explain how neural-network-based CA classifiers make decisions. We hypothesize that fine-tuning attention maps with regard to all possible combinations of ground-truth (GT) labels will improve both the detection and interpretability of co-occurring CAs. METHODS: To test our hypothesis, we propose an end-to-end convolutional neural network (CNN), xECGNet, that fine-tunes the attention map to resemble the averaged response maps of GT labels. Fine-tuning is achieved by adding to the objective function a regularization loss between the attention map and the reference (averaged) map. Performance is assessed by F1 score and subset accuracy. RESULTS: The main experiment demonstrates that fine-tuning alone significantly improves the model's multilabel subset accuracy from 75.8% to 84.5% compared with the baseline model. Also, xECGNet shows the highest F1 score of 0.812 and yields a more explainable map that encompasses multiple CA types compared with other baseline methods. CONCLUSIONS: xECGNet tackles the two obstacles to the clinical application of CNN-based CA detection models with a simple solution: adding one additional term to the objective function.
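The regularization idea above (adding a penalty that pulls the attention map toward a reference map) can be written as a two-term loss; the snippet below is a generic sketch of such an objective with arbitrary shapes and weight, not the exact xECGNet loss.

```python
# Sketch: multilabel classification loss plus an attention-map regularization term,
# in the spirit of xECGNet. Shapes, lambda, and the reference map are illustrative.
import tensorflow as tf

def combined_loss(y_true, logits, attention_map, reference_map, lam=0.1):
    """y_true/logits: (batch, n_classes) multilabel targets and raw scores.
    attention_map/reference_map: (batch, time) maps over the ECG signal."""
    cls_loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=logits))
    reg_loss = tf.reduce_mean(tf.square(attention_map - reference_map))
    return cls_loss + lam * reg_loss

# Toy shapes: 8 ECGs, 5 arrhythmia labels, attention over 1000 time steps.
y_true = tf.cast(tf.random.uniform((8, 5)) > 0.5, tf.float32)
logits = tf.random.normal((8, 5))
att = tf.random.uniform((8, 1000))
ref = tf.random.uniform((8, 1000))
print(combined_loss(y_true, logits, att, ref))
```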


Subject(s)
Algorithms , Neural Networks, Computer , Arrhythmias, Cardiac/diagnosis , Attention , Electrocardiography , Humans