Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?
Snell, Kym IE; Ensor, Joie; Debray, Thomas PA; Moons, Karel GM; Riley, Richard D.
  • Snell KI; 1 Research Institute for Primary Care and Health Sciences, Keele University, Staffordshire, UK.
  • Ensor J; 1 Research Institute for Primary Care and Health Sciences, Keele University, Staffordshire, UK.
  • Debray TP; 2 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.
  • Moons KG; 3 Cochrane Netherlands, University Medical Center Utrecht, Utrecht, The Netherlands.
  • Riley RD; 2 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.
Stat Methods Med Res; 27(11): 3505-3522, 2018 Nov.
Article in English | MEDLINE | ID: mdl-28480827
ABSTRACT
If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and we therefore recommend these scales be used for meta-analysis. An illustrative example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
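To make the recommended approach concrete, the sketch below (not the authors' code) pools hypothetical study-specific C-statistics on the logit scale with a DerSimonian-Laird random-effects meta-analysis and back-transforms the summary estimate; the C-statistics, standard errors, and the delta-method approximation SE(logit C) ≈ SE(C)/(C(1-C)) are illustrative assumptions, not results from the paper.

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical external-validation results: C-statistic and its SE per study.
c_stat = np.array([0.72, 0.68, 0.80, 0.75, 0.64])
se_c   = np.array([0.02, 0.03, 0.02, 0.04, 0.03])

# Delta-method SE on the logit scale: SE(logit(C)) ~= SE(C) / (C * (1 - C)).
y  = logit(c_stat)
se = se_c / (c_stat * (1 - c_stat))
v  = se ** 2

# DerSimonian-Laird estimate of the between-study variance tau^2.
w_fixed = 1 / v
y_fixed = np.sum(w_fixed * y) / np.sum(w_fixed)
q  = np.sum(w_fixed * (y - y_fixed) ** 2)
df = len(y) - 1
c  = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (q - df) / c)

# Random-effects pooled estimate on the logit scale, then back-transformed.
w_re  = 1 / (v + tau2)
y_re  = np.sum(w_re * y) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))
lo, hi = y_re - 1.96 * se_re, y_re + 1.96 * se_re

print(f"Summary C-statistic: {inv_logit(y_re):.3f} "
      f"(95% CI {inv_logit(lo):.3f} to {inv_logit(hi):.3f}), tau^2 = {tau2:.4f}")
```

The same pooling logic applies to the E/O statistic after a log transformation, with the summary estimate exponentiated back to the original scale.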

Full text: 1 Thematic axes: Clinical_research Database: MEDLINE Main subject: Calibration / Statistical Models / Forecasting Study type: Prognostic_studies / Risk_factors_studies / Systematic_reviews Language: English Year: 2018 Document type: Article