Survival analysis for lung cancer patients: A comparison of Cox regression and machine learning models.
Int J Med Inform; 191: 105607, 2024 Nov.
Article in En | MEDLINE | ID: mdl-39208536
ABSTRACT
INTRODUCTION:
Survival analysis based on cancer registry data is of paramount importance for monitoring the effectiveness of health care. As new methods arise, the compendium of statistical tools applicable to cancer registry data grows. In recent years, machine learning approaches for survival analysis have been developed. The aim of this study is to compare the model performance of the well-established Cox regression and novel machine learning approaches on a previously unused dataset.
MATERIAL AND METHODS:
The study is based on lung cancer data from the Schleswig-Holstein Cancer Registry. Four survival analysis models are compared: Cox Proportional Hazard Regression (CoxPH) as the most commonly used statistical model, as well as Random Survival Forests (RSF) and two neural network architectures based on the DeepSurv and TabNet approaches. The models are evaluated using the concordance index (C-I), the Brier score and the AUC-ROC score. In addition, to gain more insight into the decision process of the models, we identified the features that have a higher impact on patient survival using permutation feature importance scores and SHAP values.
RESULTS:
Using a dataset that includes the cancer stage established by the Union for International Cancer Control (UICC), the best-performing model is the CoxPH (C-I 0.698±0.005), while using a dataset that includes the tumor size, lymph node and metastasis status (TNM) makes the RSF the best-performing model (C-I 0.703±0.004). The explainability metrics show that the models rely primarily on the combined UICC stage and the metastasis status, which is consistent with other studies.
DISCUSSION:
The studied methods are highly relevant for epidemiological researchers to create more accurate survival models, which can help physicians make informed decisions about appropriate therapies and management of patients with lung cancer, ultimately improving survival and quality of life.
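The concordance index (C-I) reported above measures how often a model ranks two patients in the correct order of survival. The following is a minimal sketch of Harrell's version of the index, not the implementation used in the study. The function name `concordance_index` and the toy data are hypothetical and are not taken from the registry. A pair of patients counts as comparable only if the one with the shorter observed time actually had an event; ties in predicted risk count as half.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Skip pairs with tied times or where the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk for the earlier event: correct
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data (illustration only, not registry data).
toy_times = [5, 8, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index: {c_index:.3f}")  # 7 of 8 comparable pairs are concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported values of about 0.70 in context. Library implementations typically add refinements such as handling of tied times that this sketch omits.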
Database: MEDLINE
Main subject: Proportional Hazards Models / Machine Learning / Lung Neoplasms
Limits: Aged / Female / Humans / Male / Middle Aged
Language: En
Journal: Int J Med Inform
Journal subject: Medical Informatics
Year: 2024
Document type: Article