The impact of different imputation methods on estimates and model performance: an example using a risk prediction model for premature mortality.
Hurst, Mackenzie; O'Neill, Meghan; Pagalan, Lief; Diemert, Lori M; Rosella, Laura C.
Affiliations
  • Hurst M; Population Health Analytics Lab, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
  • O'Neill M; ICES, Toronto, ON, Canada.
  • Pagalan L; Population Health Analytics Lab, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
  • Diemert LM; Population Health Analytics Lab, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
  • Rosella LC; Schwartz Reisman Institute for Technology and Society, University of Toronto, Toronto, ON, Canada.
Popul Health Metr ; 22(1): 13, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886744
ABSTRACT

OBJECTIVE:

To compare how different imputation methods affect the estimates and performance of a prediction model for premature mortality.

STUDY DESIGN AND SETTING:

Sex-specific Weibull accelerated failure time survival models were run on four separate datasets, using complete case analysis, mode imputation, single imputation and multiple imputation to handle missing values. Six performance measures were compared to assess predictive accuracy (Nagelkerke R², integrated Brier score), discrimination (Harrell's c-index, discrimination slope) and calibration (calibration in the large, calibration slope).
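
To make the difference between the two simplest approaches concrete, the following is a minimal sketch of complete case analysis versus mode imputation on a toy categorical dataset. It is not the authors' code: the records, the variable names (`smoker`, `bmi_cat`) and the helper functions `complete_case` and `mode_impute` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical survey records; None marks a missing response.
records = [
    {"smoker": "never", "bmi_cat": "normal"},
    {"smoker": "current", "bmi_cat": None},
    {"smoker": None, "bmi_cat": "overweight"},
    {"smoker": "never", "bmi_cat": "normal"},
    {"smoker": "former", "bmi_cat": "normal"},
]


def complete_case(rows):
    """Keep only the rows that have no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mode_impute(rows):
    """Fill each missing value with the most frequent observed value of that variable."""
    modes = {}
    for key in rows[0]:
        observed = [r[key] for r in rows if r[key] is not None]
        modes[key] = Counter(observed).most_common(1)[0][0]
    return [
        {k: (v if v is not None else modes[k]) for k, v in r.items()}
        for r in rows
    ]


cc = complete_case(records)  # drops the two incomplete rows
mi = mode_impute(records)    # keeps every row, filling gaps with the mode

print(len(cc), len(mi))
```

Complete case analysis shrinks the sample, whereas mode imputation retains every respondent at the cost of pulling the imputed variable toward its most common category.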

RESULTS:

The highest proportion of missingness for a single variable was 10.86% for the female model and 8.24% for the male model. Across complete case, mode, single and multiple imputation, the Nagelkerke R² values for the female model were 0.1084, 0.1116, 0.1120 and 0.111-0.1120, respectively; the male model exhibited similar variation, with values of 0.1050, 0.1078, 0.1078 and 0.1078-0.1081. Harrell's c-index also varied little, with values of 0.8666, 0.8719, 0.8719 and 0.8711-0.8719 for the female model and 0.8549, 0.8548, 0.8550 and 0.8550-0.8553 for the male model.
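
For readers unfamiliar with the discrimination measure reported above, here is a minimal sketch of how Harrell's c-index can be computed for right-censored data. It assumes a higher predicted risk should correspond to a shorter survival time and counts ties in predicted risk as half-concordant. The function name `harrell_c` and the toy inputs are hypothetical, not taken from the study.

```python
def harrell_c(times, events, risk):
    """Share of comparable pairs in which the earlier failure has the higher risk.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before subject j's observed time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


# Hypothetical follow-up times, event indicators (1 = died, 0 = censored)
# and predicted risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]

c_index = harrell_c(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why differences in the third or fourth decimal place, as in the results above, indicate very similar discrimination across imputation methods.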

CONCLUSION:

In the scenarios examined in this study, mode imputation performed well compared with single and multiple imputation on population health survey data when predictive performance is the main goal of the model. To generate unbiased hazard ratios, multiple imputation methods were superior. This study shows the need to choose the imputation approach for predictive model development in light of the conditions of missing data and the goals of the analysis.

Full text: 1 Database: MEDLINE Main subject: Mortality, Premature Limits: Adult / Female / Humans / Male / Middle aged Language: En Journal: Popul Health Metr Publication year: 2024 Document type: Article Country of affiliation: Canada
