Development and validation of a machine learning model to predict venous thromboembolism among hospitalized cancer patients.

Meng, Lingqi; Wei, Tao; Fan, Rongrong; Su, Haoze; Liu, Jiahui; Wang, Lijie; Huang, Xinjuan; Qi, Yi; Li, Xuying

Affiliation

Meng L; Xiangya School of Nursing, Central South University, Changsha, China.
Wei T; Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.
Fan R; Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.
Su H; Xiangya School of Nursing, Central South University, Changsha, China.
Liu J; Nanjing University of Aeronautics and Astronautics, Nanjing, China.
Wang L; Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.
Huang X; Xiangya School of Nursing, Central South University, Changsha, China.
Qi Y; Xiangya School of Nursing, Central South University, Changsha, China.
Li X; Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.

Asia Pac J Oncol Nurs ; 9(12): 100128, 2022 Dec.

Article in En | MEDLINE | ID: mdl-36276886

ABSTRACT

Objective:

Hospitalized cancer patients are at high risk of venous thromboembolism (VTE). However, no predictive model has been specifically developed for this population. Machine learning (ML) is advantageous for model development. This study aimed to develop predictive models of VTE risk among hospitalized cancer patients using three ML algorithms and logistic regression, and to compare their predictive performance.

Methods:

A retrospective case-control study was conducted on hospitalized cancer patients at Hunan Cancer Hospital, China, between October 1, 2021, and February 30, 2022. Patients diagnosed with vein thrombosis before or after admission were excluded. Patient, tumor, treatment, and laboratory indicator information was obtained from the hospital information system. The data were randomly split into a training set (80%) and a testing set (20%). Logistic regression and three ML algorithms (support vector machine, random forest, and extreme gradient boosting [XGBoost]) were used to develop the models. Model performance was compared using F1, G-mean, area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall rate, and specificity. Features were ranked based on the permutation scores of the selected features in the optimal model.
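As an illustration of the threshold-based metrics named above, the following minimal sketch computes accuracy, precision, recall, specificity, F1, and G-mean from confusion-matrix counts. It assumes the common definitions (G-mean as the geometric mean of recall and specificity); the abstract does not state the exact formulas the authors used. The function name and the counts are hypothetical and are not study data. AUROC is omitted because it depends on predicted scores rather than on a single confusion matrix.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Threshold-based metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. there is at least one
    actual positive, one actual negative, and one predicted positive.
    """
    precision = tp / (tp + fp)      # share of predicted positives that are true
    recall = tp / (tp + fn)         # sensitivity: share of actual positives found
    specificity = tn / (tn + fp)    # share of actual negatives correctly rejected
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        # F1: harmonic mean of precision and recall
        "f1": 2 * precision * recall / (precision + recall),
        # G-mean (assumed definition): geometric mean of recall and specificity
        "g_mean": sqrt(recall * specificity),
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

F1 and G-mean are often reported alongside accuracy when classes are imbalanced, because accuracy alone can look favorable for a model that rarely predicts the minority class.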

Results:

A total of 1100 patients (mean [SD] age, 54.75 [11.08] years; 485 [44.09%] male) were included in this study. There were 340 patients (30.9%) in the VTE group. The XGBoost model achieved the best performance, with the following evaluation metrics: F1 (0.750), G-mean (0.816), AUROC (0.818), accuracy (0.845), precision (0.750), recall rate (0.750), and specificity (0.888). D-dimer level, diabetes, hypertension, pleural metastasis, and hematological malignancies were identified as the five most significant features of the XGBoost model.
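The reported figures can be cross-checked with simple arithmetic. The sketch below recomputes F1 and G-mean from the reported precision, recall, and specificity, and recomputes the cohort percentages from the reported counts. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, G-mean as the geometric mean of recall and specificity), which the abstract does not spell out; variable names are illustrative.

```python
from math import sqrt

# Values as reported in the Results for the XGBoost model.
precision, recall, specificity = 0.750, 0.750, 0.888

# Derived metrics under the assumed standard definitions.
f1 = 2 * precision * recall / (precision + recall)
g_mean = sqrt(recall * specificity)

# Cohort proportions from the reported counts.
total_patients = 1100
vte_share = 340 / total_patients    # VTE group
male_share = 485 / total_patients   # male patients

print(f"F1 = {f1:.3f}, G-mean = {g_mean:.3f}")
print(f"VTE group = {vte_share:.1%}, male = {male_share:.2%}")
```

Under these assumptions the recomputed F1 and G-mean round to the reported 0.750 and 0.816, and the proportions round to the reported 30.9% and 44.09%.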

Conclusions:

Four predictive models were developed using ML algorithms. The XGBoost model outperformed the other three models. This study indicates that ML may play an important role in VTE risk estimation among hospitalized patients with cancer and provides a reference for thromboprophylaxis.

Key words

Cancer; Hospitalization; Machine learning; Model; Venous thromboembolism

Full text: 1 Collection: 01-internacional Database: MEDLINE Type of study: Observational_studies / Prognostic_studies / Risk_factors_studies Language: En Journal: Asia Pac J Oncol Nurs Year: 2022 Document type: Article Affiliation country:
