Machine Learning and Real-World Data to Predict Lung Cancer Risk in Routine Care.
Chandran, Urmila; Reps, Jenna; Yang, Robert; Vachani, Anil; Maldonado, Fabien; Kalsekar, Iftekhar.
Affiliations
  • Chandran U; Johnson & Johnson Global Epidemiology, Titusville, New Jersey.
  • Reps J; Lung Cancer Initiative, Johnson & Johnson, New Brunswick, New Jersey.
  • Yang R; Johnson & Johnson Global Epidemiology, Titusville, New Jersey.
  • Vachani A; Lung Cancer Initiative, Johnson & Johnson, New Brunswick, New Jersey.
  • Maldonado F; University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania.
  • Kalsekar I; Vanderbilt University, Nashville, Tennessee.
Cancer Epidemiol Biomarkers Prev ; 32(3): 337-343, 2023 03 06.
Article in English | MEDLINE | ID: mdl-36576991
ABSTRACT

BACKGROUND:

This study used machine learning to develop a 3-year lung cancer risk prediction model with large real-world data in a mostly younger population.

METHODS:

Over 4.7 million individuals aged 45 to 65 years with an outpatient visit in 2013 and no history of any cancer or of lung cancer screening, diagnostic, or treatment procedures were identified in Optum's de-identified Electronic Health Record (EHR) dataset. A least absolute shrinkage and selection operator (LASSO) model was fit using all available data from the prior 365 days. Temporal validation was assessed with recent data. External validation was assessed with data from Mercy Health Systems EHR and Optum's de-identified Clinformatics Data Mart Database. Racial inequities in model discrimination were assessed with xAUCs.
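
The abstract does not define xAUC. As a minimal sketch, assuming it denotes a cross-group AUC (the probability that a randomly chosen case from one group receives a higher predicted risk than a randomly chosen non-case from another group), it could be computed as below. The function names `pairwise_auc` and `group_xauc` and the toy scores are hypothetical illustrations, not the study's code or data.

```python
from itertools import product


def pairwise_auc(case_scores, control_scores):
    """Fraction of (case, non-case) pairs in which the case scores higher.

    Ties count as half a win, the usual convention for rank-based AUC.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def group_xauc(scores, labels, groups, case_group, control_group):
    """Cross-group AUC: cases from `case_group` ranked against
    non-cases from `control_group` (assumed definition, see lead-in)."""
    cases = [s for s, y, g in zip(scores, labels, groups)
             if y == 1 and g == case_group]
    controls = [s for s, y, g in zip(scores, labels, groups)
                if y == 0 and g == control_group]
    return pairwise_auc(cases, controls)


# Toy, made-up scores: group "a" cases rank above group "b" non-cases,
# but group "b" cases rank below group "a" non-cases.
scores = [0.9, 0.5, 0.3, 0.7]
labels = [1, 0, 1, 0]
groups = ["a", "a", "b", "b"]
xauc_ab = group_xauc(scores, labels, groups, "a", "b")
xauc_ba = group_xauc(scores, labels, groups, "b", "a")
```

Under this assumed definition, a large gap between the two directions would suggest that the model ranks cases from one group less reliably than cases from the other.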

RESULTS:

The model AUC was 0.76. Top predictors included age, smoking, race, ethnicity, and diagnosis of chronic obstructive pulmonary disease. The model identified a high-risk group with lung cancer incidence 9 times the average cohort incidence, representing 10% of patients with lung cancer. The model performed well temporally and externally, while performance was reduced for Asians and Hispanics.
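
The two high-risk group figures also imply a group size, which the abstract does not report. A rough sketch of the arithmetic follows, assuming incidence is cases divided by persons over the same period in both the group and the cohort, and that "representing 10%" means the group contains 10% of all lung cancer cases. The ratio of incidences then equals the group's share of cases divided by its share of the cohort. The function name `implied_group_share` is hypothetical.

```python
def implied_group_share(case_share, relative_incidence):
    """Share of the cohort in a risk group, derived from the share of
    cases it captures and its incidence relative to the cohort average.

    relative_incidence = (cases_g / n_g) / (cases / n)
                       = case_share / group_share
    so group_share = case_share / relative_incidence.
    """
    if not 0.0 <= case_share <= 1.0:
        raise ValueError("case_share must be between 0 and 1")
    if relative_incidence <= 0.0:
        raise ValueError("relative_incidence must be positive")
    return case_share / relative_incidence


# Figures from the abstract: 10% of cases, incidence 9 times the average.
share = implied_group_share(0.10, 9.0)
print(f"Implied high-risk group size: {share:.1%} of the cohort")
```

Under these assumptions the high-risk group would be about 1.1% of the cohort. This is a derived estimate, not a number stated in the abstract.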

CONCLUSIONS:

A high-dimensional model trained using big data identified a subset of patients with high lung cancer risk. The model demonstrated transportability to EHR and claims data, while underscoring the need to assess racial disparities when using machine learning methods.

IMPACT:

This internally and externally validated real-world data-based lung cancer prediction model is available on an open-source platform for broad sharing and application. Model integration into an EHR system could minimize physician burden by automating identification of high-risk patients.
Subject(s)

Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Pulmonary Disease, Chronic Obstructive / Lung Neoplasms
Type of study: Etiology_studies / Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: Cancer Epidemiol Biomarkers Prev
Journal subject: BIOCHEMISTRY / EPIDEMIOLOGY / NEOPLASMS
Year: 2023
Document type: Article
