ABSTRACT
PURPOSE: Preoperative prediction of postoperative complications (PCs) in inpatients with cancer is challenging. We developed an explainable machine learning (ML) model to predict PCs in a heterogeneous population of inpatients with cancer undergoing same-hospitalization major operations. METHODS: Consecutive inpatients who underwent same-hospitalization operations from December 2017 to June 2021 at a single institution were retrospectively reviewed. The ML model was developed and tested using electronic health record (EHR) data to predict 30-day PCs of Clavien-Dindo grade 3 or higher (CD 3+). Model performance was assessed using area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and calibration plots. Model explanation was performed using the Shapley additive explanations (SHAP) method at the cohort and individual operation levels. RESULTS: A total of 988 operations in 827 inpatients were included. The ML model was trained using 788 operations and tested using a holdout set of 200 operations. The CD 3+ complication rates were 28.6% and 27.5% in the training and holdout test sets, respectively. In predicting CD 3+ complications, the model achieved AUROCs of 0.77 and 0.73 and AUPRCs of 0.56 and 0.52 in the training and holdout test sets, respectively. Calibration plots demonstrated good reliability. The SHAP method identified the top features and their contributions to the risk of PCs. CONCLUSION: We trained and tested an explainable ML model to predict the risk of developing PCs in patients with cancer. Using patient-specific EHR data, the ML model accurately discriminated the risk of developing CD 3+ complications and displayed top features at the individual operation and cohort levels.
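The two discrimination metrics named in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' evaluation code: the function names and the toy labels and scores are hypothetical, and the AUPRC is computed as average precision, assuming no tied scores. As context for reading the reported values, a classifier with no skill has an AUPRC close to the event rate, which was 27.5% in the holdout set.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (assumes no tied scores).

    Averages the precision observed at the rank of each true positive.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    true_pos = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            precision_sum += true_pos / rank
    if true_pos == 0:
        raise ValueError("no positive cases")
    return precision_sum / true_pos


# Hypothetical toy data: 1 = CD 3+ complication, 0 = no such complication.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
print(f"AUROC={toy_auroc:.3f}  AUPRC={toy_auprc:.3f}")
```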
Subject(s)
Inpatients , Machine Learning , Neoplasms , Postoperative Complications , Humans , Postoperative Complications/etiology , Postoperative Complications/epidemiology , Postoperative Complications/diagnosis , Neoplasms/surgery , Female , Male , Middle Aged , Aged , Retrospective Studies , Electronic Health Records , ROC Curve , Risk Assessment/methods
ABSTRACT
Importance: To date, oncologist and model prognostic performance have been assessed independently and mostly retrospectively; however, how model prognostic performance compares with oncologist prognostic performance prospectively remains unknown. Objective: To compare oncologist performance with a model in predicting 3-month mortality for patients with metastatic solid tumors in an outpatient setting. Design, Setting, and Participants: This prognostic study evaluated prospective predictions for a cohort of patients with metastatic solid tumors seen in outpatient oncology clinics at a National Cancer Institute-designated cancer center and associated satellites between December 6, 2019, and August 6, 2021. Oncologists (57 physicians and 17 advanced practice clinicians) answered a 3-month surprise question (3MSQ) within clinical pathways. A model was trained with electronic health record data from January 1, 2013, to April 24, 2019, to identify patients at high risk of 3-month mortality and deployed silently in October 2019. Analysis was limited to oncologist prognostications with a model prediction within the preceding 30 days. Exposures: Three-month surprise question and gradient-boosting binary classifier. Main Outcomes and Measures: The primary outcome was performance comparison between oncologists and the model to predict 3-month mortality. The primary performance metric was the positive predictive value (PPV) at the sensitivity achieved by the medical oncologists with their 3MSQ answers. Results: A total of 74 oncologists answered 3099 3MSQs for 2041 patients with advanced cancer (median age, 62.6 [range, 18-96] years; 1271 women [62.3%]). In this cohort with a 15% prevalence of 3-month mortality and 30% sensitivity for both oncologists and the model, the PPV of oncologists was 34.8% (95% CI, 30.1%-39.5%) and the PPV of the model was 60.0% (95% CI, 53.6%-66.3%). Area under the receiver operating characteristic curve for the model was 81.2% (95% CI, 79.1%-83.3%). The model significantly outperformed the oncologists in predicting short-term mortality. Conclusions and Relevance: In this prognostic study, the model outperformed oncologists overall and within the breast and gastrointestinal cancer cohorts in predicting 3-month mortality for patients with advanced cancer. These findings suggest that further studies may be useful to examine how model predictions could improve oncologists' prognostic confidence and patient-centered goal-concordant care at the end of life.
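The primary metric, PPV at a matched sensitivity, can be sketched as follows. This is an illustrative reading of the metric, not the study's code: the function names are hypothetical, and the cohort of 1000 predictions is an assumed round number used only to turn the reported prevalence (15%), sensitivity (30%), and PPVs (34.8% and 60.0%) into counts. Under those assumptions, both raters correctly identify 45 deaths, but the oncologists flag roughly 129 patients to do so and the model flags 75.

```python
def ppv_at_sensitivity(y_true, y_score, target_sensitivity):
    """Lower the score threshold until sensitivity reaches the target.

    Returns (ppv, achieved_sensitivity) at the first such threshold.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("no positive cases")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    true_pos = 0
    for flagged, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / n_pos >= target_sensitivity:
            return true_pos / flagged, true_pos / n_pos
    raise ValueError("target sensitivity must be at most 1")


def implied_counts(n, prevalence, sensitivity, ppv):
    """Convert prevalence, sensitivity, and PPV into counts for n predictions."""
    deaths = n * prevalence
    true_positives = deaths * sensitivity
    flagged = true_positives / ppv
    return {
        "deaths": deaths,
        "true_positives": true_positives,
        "flagged": flagged,
        "false_positives": flagged - true_positives,
    }


# Assumed cohort size of 1000; the rates are those reported in the abstract.
oncologist = implied_counts(1000, prevalence=0.15, sensitivity=0.30, ppv=0.348)
model = implied_counts(1000, prevalence=0.15, sensitivity=0.30, ppv=0.600)

for name, counts in (("oncologists", oncologist), ("model", model)):
    print(f"{name}: flagged={counts['flagged']:.1f}, "
          f"false positives={counts['false_positives']:.1f}")

# Hypothetical toy scores showing how the matched operating point is found.
toy_ppv, toy_sens = ppv_at_sensitivity(
    [1, 0, 1, 0, 0, 1, 0, 0],
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2],
    target_sensitivity=0.6,
)
```

Fixing sensitivity before comparing PPV keeps the comparison at the same operating point, so a difference in PPV reflects fewer false alarms rather than a more conservative threshold.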
Subject(s)
Second Primary Neoplasms , Neoplasms , Oncologists , Female , Humans , Machine Learning , Middle Aged , Prospective Studies , Retrospective Studies
ABSTRACT
BACKGROUND AND OBJECTIVES: Post-discharge oncologic surgical complications are costly for patients, families, and healthcare systems. The capacity to predict complications and intervene early can improve postoperative outcomes. In this proof-of-concept study, we used a machine learning approach to explore the potential added value of patient-reported outcomes (PROs) and patient-generated health data (PGHD) in predicting post-discharge complications for gastrointestinal (GI) and lung cancer surgery patients. METHODS: We formulated post-discharge complication prediction as a binary classification task. Features were extracted from clinical variables, PROs (MD Anderson Symptom Inventory [MDASI]), and PGHD (VivoFit) from a cohort of 52 patients with 134 temporal observation points pre- and post-discharge that were collected from two pilot studies. We trained and evaluated supervised learning classifiers via nested cross-validation. RESULTS: A logistic regression model with L2 regularization trained with clinical data, PROs, and PGHD from wearable pedometers achieved an area under the receiver operating characteristic curve of 0.74. CONCLUSIONS: PROs and PGHD captured through remote patient telemonitoring approaches have the potential to improve prediction performance for postoperative complications.
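Nested cross-validation with an L2-regularized logistic regression can be sketched in pure Python as below. This is a sketch under stated assumptions, not the study's pipeline: the gradient-descent fit stands in for whatever library the authors used, accuracy replaces AUROC to keep the code short, and the penalty grid, fold counts, and synthetic two-feature data are all hypothetical. The point it illustrates is that the penalty strength is chosen inside each outer training fold, so the outer test fold never influences model selection.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg_l2(X, y, lam, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent with an L2 penalty on the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, X, y):
    """Fraction of cases where the sign of the linear score matches the label."""
    w, b = model
    hits = sum(
        (b + sum(wj * xj for wj, xj in zip(w, xi)) > 0) == (yi == 1)
        for xi, yi in zip(X, y)
    )
    return hits / len(y)


def k_folds(indices, k):
    """Split a list of indices into k interleaved folds."""
    return [indices[i::k] for i in range(k)]


def subset(X, y, idx):
    return [X[i] for i in idx], [y[i] for i in idx]


def nested_cv(X, y, lambdas, outer_k=4, inner_k=3, seed=0):
    """Return one held-out accuracy per outer fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for test_idx in k_folds(idx, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]

        def inner_score(lam):
            # Inner loop: score one penalty value using outer-training data only.
            scores = []
            for val_idx in k_folds(train_idx, inner_k):
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                model = fit_logreg_l2(*subset(X, y, fit_idx), lam)
                scores.append(accuracy(model, *subset(X, y, val_idx)))
            return sum(scores) / len(scores)

        best_lam = max(lambdas, key=inner_score)
        # Refit on the full outer-training fold, then score the untouched test fold.
        final_model = fit_logreg_l2(*subset(X, y, train_idx), best_lam)
        outer_scores.append(accuracy(final_model, *subset(X, y, test_idx)))
    return outer_scores


# Synthetic data for illustration only (not study data).
rng = random.Random(1)
X_toy = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
y_toy = [1 if x[0] + 0.5 * x[1] > 0 else 0 for x in X_toy]
fold_scores = nested_cv(X_toy, y_toy, lambdas=[0.0, 0.1, 1.0])
print(fold_scores)
```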
Subject(s)
Aftercare/standards , Neoplasms/surgery , Patient Discharge , Patient Outcome Assessment , Patient Reported Outcome Measures , Postoperative Complications/physiopathology , Wireless Technology/instrumentation , Adult , Aged , Aged, 80 and over , Cohort Studies , Female , Follow-Up Studies , Humans , Machine Learning , Male , Middle Aged , Neoplasms/pathology , Predictive Value of Tests , Recovery of Function , Young Adult
ABSTRACT
PURPOSE: Thirty-day unplanned readmission is one of the key components in measuring quality in patient care. Risk of readmission in oncology patients may be associated with a wide variety of specific factors, including laboratory results and diagnoses, and it is hard to include all such features in predictive models using traditional approaches such as one-hot encoding. METHODS: We used clinical embeddings to represent complex medical concepts in lower dimensional spaces. For predictive modeling, we used gradient-boosted trees and adopted the Shapley additive explanations (SHAP) framework to offer consistent individualized predictions. We used retrospective inpatient data between 2013 and 2018 with a temporal split for training and testing. RESULTS: Our best performing model predicting readmission at discharge using clinical embeddings showed a test-set area under the receiver operating characteristic curve of 0.78 (95% CI, 0.77 to 0.80). Use of clinical embeddings led to gains of up to 23.1% in area under the precision-recall curve and 6% in area under the receiver operating characteristic curve. Hematology models showed larger performance gains than the surgery and medical oncology models. CONCLUSION: To our knowledge, our study was the first to develop (1) explainable predictive models for the hematology population and (2) dynamic models to keep track of readmission risk throughout the patient visit.
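The contrast between one-hot encoding and clinical embeddings can be illustrated with a small sketch. Everything in it is hypothetical: the code names, the three-dimensional vectors, and the mean-pooling step are assumptions for illustration, since the abstract does not say how the embeddings were trained or combined per visit. The sketch only shows why embeddings keep the feature width fixed while one indicator column per code grows with the vocabulary.

```python
# Hypothetical embedding table: each medical code maps to a short dense vector.
EMBEDDINGS = {
    "dx_leukemia": [0.9, 0.1, 0.0],
    "dx_lymphoma": [0.8, 0.2, 0.1],
    "lab_low_neutrophils": [0.7, 0.0, 0.6],
    "lab_high_creatinine": [0.0, 0.9, 0.3],
    "proc_colectomy": [0.1, 0.3, 0.9],
}
VOCAB = sorted(EMBEDDINGS)
EMBED_DIM = 3


def one_hot_visit(codes):
    """One indicator column per vocabulary code; width grows with the vocabulary."""
    present = set(codes)
    return [1.0 if code in present else 0.0 for code in VOCAB]


def mean_embedding(codes):
    """Average the vectors of the codes seen in a visit; width stays at EMBED_DIM."""
    vectors = [EMBEDDINGS[code] for code in codes]
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(EMBED_DIM)]


visit_codes = ["dx_leukemia", "lab_low_neutrophils"]
sparse_features = one_hot_visit(visit_codes)
dense_features = mean_embedding(visit_codes)
print(len(sparse_features), len(dense_features))
```

With a real vocabulary of many thousands of diagnosis and laboratory codes, the indicator representation would have one column per code, whereas the pooled vector would keep the same fixed width.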
Subject(s)
Neoplasms , Patient Readmission , Humans , Neoplasms/therapy , Patient Discharge , ROC Curve , Retrospective Studies
ABSTRACT
In this review, we aim to assess the current state of science in relation to the integration of patient-generated health data (PGHD) and patient-reported outcomes (PROs) into routine clinical care with a focus on surgical oncology populations. We will also describe the critical role of artificial intelligence and machine-learning methodology in the efficient translation of PGHD, PROs, and traditional outcome measures into meaningful patient care models.