Leave-one-out cross-validation, penalization, and differential bias of some prediction model performance measures: a simulation study.

Geroldinger, Angelika; Lusa, Lara; Nold, Mariana; Heinze, Georg

Affiliation

Geroldinger A; Center for Medical Data Science, Institute of Clinical Biometrics, Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria.
Lusa L; Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Koper, Slovenia.
Nold M; Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia.
Heinze G; Department of Sociology, Friedrich Schiller University Jena, Jena, Germany.

Diagn Progn Res; 7(1): 9, 2023 May 02.

Article in English | MEDLINE | ID: mdl-37127679

ABSTRACT

BACKGROUND:

The performance of models for binary outcomes can be described by measures such as the concordance statistic (c-statistic, area under the curve), the discrimination slope, or the Brier score. At internal validation, data resampling techniques, e.g., cross-validation, are frequently employed to correct for optimism in these model performance criteria. Especially with small samples or rare events, leave-one-out cross-validation is a popular choice.
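As an illustration of the three measures named above, the following minimal sketch computes them from observed binary outcomes and predicted probabilities. It uses the standard textbook definitions and is not taken from the authors' code; the function names and the toy data are hypothetical.

```python
from itertools import product


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; tied pairs count one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in product(events, nonevents)
    )
    return wins / (len(events) * len(nonevents))


def discrimination_slope(y, p):
    """Mean predicted probability in events minus that in non-events."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(events) / len(events) - sum(nonevents) / len(nonevents)


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


# Hypothetical toy data: two non-events followed by two events.
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]

c = c_statistic(y, p)
slope = discrimination_slope(y, p)
brier = brier_score(y, p)
print(c, slope, brier)
```

Higher values of the c-statistic and the discrimination slope indicate better discrimination, whereas a lower Brier score indicates better overall accuracy.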

METHODS:

Using simulations and a real data example, we compared the effect of different resampling techniques on the estimation of c-statistics, discrimination slopes, and Brier scores for three estimators of logistic regression models, including the maximum likelihood and two maximum penalized likelihood estimators.

RESULTS:

Our simulation study confirms earlier studies reporting that leave-one-out cross-validated c-statistics can be strongly biased towards zero. In addition, our study reveals that this bias is even more pronounced for model estimators shrinking estimated probabilities towards the observed event fraction, such as ridge regression. Leave-one-out cross-validation also provided pessimistic estimates of the discrimination slope but nearly unbiased estimates of the Brier score.
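One way to see why pooled leave-one-out predictions can push the c-statistic towards zero is an intercept-only logistic model, which can be viewed as the limiting case of complete shrinkage towards the event fraction. The sketch below is only an illustration of that mechanism, not the authors' simulation design; the function names and the toy outcome vector are hypothetical. Leaving out an event lowers the event fraction in the training data, so every left-out event receives a lower prediction than every left-out non-event.

```python
from itertools import product


def loo_predictions_intercept_only(y):
    """Leave-one-out predictions of an intercept-only logistic model.

    The maximum likelihood fit predicts the event fraction of the training
    data, i.e. of the n - 1 observations that remain after leaving one out.
    """
    n, k = len(y), sum(y)
    return [(k - yi) / (n - 1) for yi in y]


def pooled_c_statistic(y, p):
    """C-statistic computed once on the pooled leave-one-out predictions."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in product(events, nonevents)
    )
    return wins / (len(events) * len(nonevents))


# Hypothetical outcomes: 3 events among 10 observations.
y = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

p_loo = loo_predictions_intercept_only(y)  # events: 2/9, non-events: 3/9
c_loo = pooled_c_statistic(y, p_loo)
print(c_loo)  # 0.0, although a non-informative model has a true c-statistic of 0.5
```

In this extreme case every event/non-event pair is ranked in the wrong order, so the pooled leave-one-out c-statistic is exactly 0 for a model that carries no information at all.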

CONCLUSIONS:

We recommend using leave-pair-out cross-validation, fivefold cross-validation with repetitions, or the enhanced or .632+ bootstrap to estimate c-statistics, and leave-pair-out or fivefold cross-validation to estimate discrimination slopes.

Keywords

Bootstrap; Concordance statistic; Discrimination slope; Logistic regression; Resampling techniques

Full text: 1 | Database: MEDLINE | Study type: Prognostic_studies / Risk_factors_studies | Language: En | Journal: Diagn Progn Res | Publication year: 2023 | Document type: Article | Country of affiliation: Austria
