Sample size considerations and predictive performance of multinomial logistic prediction models.

de Jong, Valentijn M T; Eijkemans, Marinus J C; van Calster, Ben; Timmerman, Dirk; Moons, Karel G M; Steyerberg, Ewout W; van Smeden, Maarten

Affiliation

de Jong VMT; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
Eijkemans MJC; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
van Calster B; Department of Development and Regeneration, KU Leuven, Leuven, Belgium.
Timmerman D; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.
Moons KGM; Department of Development and Regeneration, KU Leuven, Leuven, Belgium.
Steyerberg EW; Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium.
van Smeden M; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.

Stat Med; 38(9): 1601-1619, 2019 Apr 30.

Article in English | MEDLINE | ID: mdl-30614028

ABSTRACT

Multinomial logistic regression (MLR) has been advocated for developing clinical prediction models that distinguish between three or more unordered outcomes. We present a full-factorial simulation study to examine the predictive performance of MLR models in relation to the relative size of outcome categories, the number of predictors, and the number of events per variable. We show that MLR estimated by maximum likelihood yields overfitted prediction models in small to medium-sized data sets. In most cases, the calibration and overall predictive performance of the multinomial prediction model are improved by using penalized MLR. Our simulation study also highlights the importance of events per variable in the multinomial context, as well as the total sample size. As expected, our study demonstrates the need for optimism correction of the predictive performance measures when developing a multinomial logistic prediction model. We recommend the use of penalized MLR when prediction models are developed in small data sets or in medium-sized data sets with a small total sample size (i.e., when the sizes of the outcome categories are balanced). Finally, we present a case study in which we illustrate the development and validation of penalized and unpenalized multinomial prediction models for predicting malignancy of ovarian cancer.
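
To make the notion of events per variable (EPV) in a multinomial setting concrete, the following is a minimal sketch. It assumes one possible convention, the size of the smallest outcome category divided by the number of candidate predictors; the abstract does not state which definition the study uses. The function name `events_per_variable`, the category labels, and the counts are all hypothetical.

```python
from collections import Counter


def events_per_variable(outcomes, n_predictors):
    """Smallest outcome-category count divided by the number of predictors.

    Assumption: this smallest-category convention is illustrative only;
    the abstract does not say which EPV definition the study uses.
    """
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    counts = Counter(outcomes)
    if len(counts) < 3:
        raise ValueError("expected three or more outcome categories")
    return min(counts.values()) / n_predictors


# Hypothetical data set: 3 unordered outcomes, 4 candidate predictors.
outcomes = ["benign"] * 120 + ["borderline"] * 20 + ["malignant"] * 60
epv = events_per_variable(outcomes, n_predictors=4)
print(epv)  # 20 / 4 = 5.0
```

Under this assumed convention, holding EPV fixed while making the categories balanced gives the smallest possible total sample size (here, 3 categories of 20 would total 60 rather than 200), which is one way to read the abstract's parenthetical about balanced category sizes.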

Subject(s)

Likelihood Functions; Logistic Models; Sample Size; Computer Simulation; Humans

Keywords

Multinomial Logistic Regression; overfit; prediction models; predictive performance; shrinkage

Full text: 1; Collection: 01-international; Database: MEDLINE; Main subject: Likelihood Functions / Logistic Models / Sample Size; Study type: Prognostic_studies / Risk_factors_studies; Limits: Humans; Language: En; Journal: Stat Med; Year: 2019; Document type: Article; Country of affiliation: Netherlands
