Supervised learning applied to classifying fallers versus non-fallers among older adults with cancer.
J Geriatr Oncol; 14(4): 101498, 2023 May.
Article | En | MEDLINE | ID: mdl-37084629
INTRODUCTION: Supervised machine learning approaches are increasingly used to analyze clinical data, including in geriatric oncology. This study presents a machine learning approach to understanding falls in a cohort of older adults with advanced cancer starting chemotherapy, covering both fall prediction and identification of contributing factors.

MATERIALS AND METHODS: This was a secondary analysis of prospectively collected data from the GAP 70+ Trial (NCT02054741; PI: Mohile), which enrolled patients aged ≥70 with advanced cancer and ≥1 geriatric assessment domain impairment who planned to start a new cancer treatment regimen. Of ≥2000 baseline variables ("features") collected, 73 were selected based on clinical judgment. Machine learning models to predict falls at three months were developed, optimized, and tested using data from 522 patients. A custom data preprocessing pipeline was implemented to prepare the data for analysis. Both undersampling and oversampling techniques were applied to balance the outcome measure. Ensemble feature selection was applied to identify and select the most relevant features. Four models (logistic regression [LR], k-nearest neighbor [kNN], random forest [RF], and MultiLayer Perceptron [MLP]) were trained and subsequently tested on a holdout set. Receiver operating characteristic (ROC) curves were generated, and the area under the curve (AUC) was calculated for each model. SHapley Additive exPlanations (SHAP) values were used to further understand individual feature contributions to the observed predictions.

RESULTS: Based on the ensemble feature selection algorithm, the top eight features were selected for inclusion in the final models. The selected features aligned with clinical intuition and prior literature. The LR, kNN, and RF models performed similarly in predicting falls in the test set (AUC 0.66-0.67), while the MLP model achieved an AUC of 0.75. Ensemble feature selection resulted in improved AUC values compared with using LASSO alone. SHAP values, a model-agnostic technique, revealed logical associations between the selected features and model predictions.

DISCUSSION: Machine learning techniques can augment hypothesis-driven research, including in older adults, for whom randomized trial data are limited. Interpretable machine learning is particularly important, as understanding which features affect predictions is a critical aspect of decision-making and intervention. Clinicians should understand the philosophy, strengths, and limitations of a machine learning approach applied to patient data.
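ILLUSTRATIVE SKETCH: The abstract does not say which resampling techniques or software were used, so the following is only a minimal, dependency-free Python sketch of two steps it names: balancing a binary outcome by oversampling and computing AUC on held-out predictions. Random oversampling is an assumed stand-in for the study's unspecified technique, and the function names `random_oversample` and `roc_auc` are hypothetical. In general practice, resampling is confined to the training split so that the holdout set keeps its natural class balance.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes are equal in size.

    Assumption: plain random oversampling; the study's actual technique is not stated.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    if not by_class[0] or not by_class[1]:
        raise ValueError("both classes must be present")
    target = max(len(by_class[0]), len(by_class[1]))
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        # Draw extra copies (with replacement) only for the smaller class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher than a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not study data: 1 = faller, 0 = non-faller.
    labels = [0, 0, 0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
    print(roc_auc(labels, scores))  # 0.875

    rows = [[i] for i in range(6)]
    _, balanced_labels = random_oversample(rows, labels)
    print(balanced_labels.count(0), balanced_labels.count(1))  # 4 4
```

On this reading, an AUC of 0.75 means that a randomly chosen faller receives a higher predicted risk than a randomly chosen non-faller about 75% of the time.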
Collection: 01-internacional
Database: MEDLINE
Main subject: Neoplasms
Study type: Clinical_trials / Prognostic_studies / Risk_factors_studies
Limits: Aged / Humans
Language: En
Journal: J Geriatr Oncol
Year: 2023
Document type: Article