An empirical analysis of dealing with patients who are lost to follow-up when developing prognostic models using a cohort design.

Reps, Jenna M; Rijnbeek, Peter; Cuthbert, Alana; Ryan, Patrick B; Pratt, Nicole; Schuemie, Martijn

Affiliation

Reps JM; Janssen Research and Development, Titusville, NJ, USA. jreps@its.jnj.com.
Rijnbeek P; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands.
Cuthbert A; South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia.
Ryan PB; Janssen Research and Development, Titusville, NJ, USA.
Pratt N; Quality Use of Medicines and Pharmacy Research Centre, Sansom Institute, School of Pharmacy and Medical Sciences, University of South Australia, Adelaide, SA, Australia.
Schuemie M; Janssen Research and Development, Titusville, NJ, USA.

BMC Med Inform Decis Mak; 21(1): 43, 2021 02 06.

Article in English | MEDLINE | ID: mdl-33549087

ABSTRACT

BACKGROUND:

Researchers developing prediction models are faced with numerous design choices that may impact model performance. One key decision is how to include patients who are lost to follow-up. In this paper we perform a large-scale empirical evaluation investigating the impact of this decision. In addition, we aim to provide guidelines for how to deal with loss to follow-up.

METHODS:

We generate a partially synthetic dataset with complete follow-up and simulate loss to follow-up based either on random selection or on selection based on comorbidity. In addition to our synthetic data study we investigate 21 real-world data prediction problems. We compare four simple strategies for developing models when using a cohort design that encounters loss to follow-up. Three strategies employ a binary classifier with data that (1) include all patients (including those lost to follow-up), (2) exclude all patients lost to follow-up or (3) only exclude patients lost to follow-up who do not have the outcome before being lost to follow-up. The fourth strategy uses a survival model with data that include all patients. We empirically evaluate the discrimination and calibration performance.
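The four strategies differ only in which patients enter the training data and how their outcome is encoded. The sketch below is illustrative and not the authors' implementation: the `Patient` record, its field names, and the helper functions are hypothetical, and it assumes that a patient counts as lost to follow-up when observation ends before the end of the time-at-risk, and that a lost patient with no observed outcome is labelled as a non-event when kept in a binary dataset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Patient:
    """Hypothetical cohort record; times are days since the index date."""
    patient_id: int
    followup_days: int           # last day the patient is observed
    outcome_day: Optional[int]   # day of first observed outcome, None if never observed


def is_lost(p: Patient, tar_days: int) -> bool:
    # Assumption: lost to follow-up means observation ends before the time-at-risk does.
    return p.followup_days < tar_days


def has_outcome(p: Patient, tar_days: int) -> bool:
    # The outcome only counts if it is observed within both follow-up and time-at-risk.
    return p.outcome_day is not None and p.outcome_day <= min(p.followup_days, tar_days)


def binary_dataset(patients: List[Patient], tar_days: int, strategy: int) -> List[Tuple[int, int]]:
    """Return (patient_id, label) rows for binary-classifier strategies 1, 2 or 3."""
    if strategy not in (1, 2, 3):
        raise ValueError("strategy must be 1, 2 or 3")
    rows = []
    for p in patients:
        lost = is_lost(p, tar_days)
        label = int(has_outcome(p, tar_days))
        if strategy == 2 and lost:
            continue  # (2) exclude every patient lost to follow-up
        if strategy == 3 and lost and label == 0:
            continue  # (3) exclude lost patients only if no outcome was seen first
        rows.append((p.patient_id, label))  # (1) keeps everyone
    return rows


def survival_dataset(patients: List[Patient], tar_days: int) -> List[Tuple[int, int, int]]:
    """Return (patient_id, time, event) rows for strategy 4; lost patients are censored."""
    rows = []
    for p in patients:
        event = has_outcome(p, tar_days)
        time = p.outcome_day if event else min(p.followup_days, tar_days)
        rows.append((p.patient_id, time, int(event)))
    return rows
```

With a 365-day time-at-risk, a patient observed for 200 days with no outcome is kept as a non-event under strategy 1, dropped under strategies 2 and 3, and censored at day 200 under strategy 4.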

RESULTS:

The partially synthetic data study results show that excluding patients who are lost to follow-up can introduce bias when loss to follow-up is common and does not occur at random. However, when loss to follow-up was completely at random, the choice of how to address it had negligible impact on model discrimination performance. Our empirical real-world data results showed that the four design choices investigated to deal with loss to follow-up resulted in comparable performance when the time-at-risk was 1 year but demonstrated differential bias for a 3-year time-at-risk. Removing patients who are lost to follow-up before experiencing the outcome but keeping patients who are lost to follow-up after the outcome can bias a model and should be avoided.

CONCLUSION:

Based on this study we therefore recommend (1) developing models using data that include patients who are lost to follow-up and (2) evaluating the discrimination and calibration of models twice: once on a test set including patients lost to follow-up and once on a test set excluding them.
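The second recommendation can be sketched as a small evaluation routine. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the `(label, predicted_risk, lost)` row layout are hypothetical, discrimination is measured with a pairwise AUC, and calibration is summarised by calibration-in-the-large (mean predicted risk minus observed outcome rate), which is only one of several possible calibration measures.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, float, bool]  # (observed label, predicted risk, lost to follow-up?)


def auc(labels: List[int], scores: List[float]) -> float:
    """Pairwise AUC: share of (event, non-event) pairs ranked correctly, ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(labels: List[int], scores: List[float]) -> float:
    """Mean predicted risk minus observed outcome rate; 0 means no average over/under-prediction."""
    return sum(scores) / len(scores) - sum(labels) / len(labels)


def evaluate(rows: List[Row]) -> Dict[str, float]:
    labels = [y for y, _, _ in rows]
    scores = [s for _, s, _ in rows]
    return {"auc": auc(labels, scores),
            "calibration_in_the_large": calibration_in_the_large(labels, scores)}


def evaluate_twice(rows: List[Row]) -> Dict[str, Dict[str, float]]:
    """Evaluate once on all test patients and once without those lost to follow-up."""
    return {
        "including_lost": evaluate(rows),
        "excluding_lost": evaluate([r for r in rows if not r[2]]),
    }
```

A large gap between the two sets of metrics suggests that loss to follow-up is affecting the apparent performance of the model.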

Subjects

Lost to Follow-Up; Bias; Calibration; Cohort Studies; Humans; Prognosis

Keywords

Best practices; Censoring; Loss to follow-up; Model development; PatientLevelPrediction; Prognostic model

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Lost to Follow-Up Type of study: Etiology_studies / Incidence_studies / Observational_studies / Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Journal: BMC Med Inform Decis Mak Journal subject: MEDICAL INFORMATICS Year of publication: 2021 Document type: Article Affiliation country: United States
