Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study.

Falconieri, Nora; Van Calster, Ben; Timmerman, Dirk; Wynants, Laure

Affiliation

Falconieri N; Department of Development and Regeneration, KU Leuven, Leuven, Belgium.
Van Calster B; Department of Development and Regeneration, KU Leuven, Leuven, Belgium.
Timmerman D; Department of Biomedical Data Sciences, Leiden University Medical Center (LUMC), Leiden, The Netherlands.
Wynants L; Department of Development and Regeneration, KU Leuven, Leuven, Belgium.

Biom J; 62(4): 932-944, 2020 Jul.

Article in English | MEDLINE | ID: mdl-31957077

ABSTRACT

Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center-specific intercepts, the presence of a center-predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center-specific intercepts were not normally distributed, a center-predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.
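The contrast between conditional and marginal predictions can be illustrated with a small numerical sketch. It is not the authors' simulation design: the intercept, slope, and between-center standard deviation below are hypothetical values chosen for illustration, and center intercepts are assumed to be normally distributed. Under those assumptions, averaging center-specific risks over centers gives the risk a marginal model targets, which differs from the risk in any particular center and implies a flatter predictor effect.

```python
import math
import random


def expit(z):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def marginal_risk(x, beta0, beta1, center_effects):
    """Average the center-specific (conditional) risks over all centers."""
    total = sum(expit(beta0 + a + beta1 * x) for a in center_effects)
    return total / len(center_effects)


random.seed(1)

# Hypothetical conditional model: logit P(y=1 | x, center j) = BETA0 + a_j + BETA1 * x
BETA0, BETA1, SIGMA = -1.0, 1.0, 1.5  # assumed values, not taken from the paper
center_effects = [random.gauss(0.0, SIGMA) for _ in range(5000)]

# Population-averaged risk at two predictor values, and the slope it implies
risk_x0 = marginal_risk(0.0, BETA0, BETA1, center_effects)
risk_x1 = marginal_risk(1.0, BETA0, BETA1, center_effects)
marginal_slope = logit(risk_x1) - logit(risk_x0)

# True risk at x = 1 in a center whose intercept lies one SD above the average
high_risk_center = expit(BETA0 + SIGMA + BETA1 * 1.0)

print(f"conditional slope: {BETA1:.2f}, implied marginal slope: {marginal_slope:.2f}")
print(f"risk at x=1, marginal: {risk_x1:.2f}, high-risk center: {high_risk_center:.2f}")
```

In this sketch the implied marginal slope is smaller than the conditional slope, and the population-averaged risk understates the risk in the high-intercept center. That gap grows with `SIGMA`, which is consistent with the abstract's observation that miscalibration was worse with more heavily clustered data.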

Subjects

Biometry/methods; Models, Statistical; Female; Humans; Logistic Models; Ovarian Neoplasms/diagnosis; Ovarian Neoplasms/epidemiology; Risk Assessment

Keywords

calibration; discrimination; multicenter; random effects; risk prediction model

Full text: 1
Database: MEDLINE
Main subject: Models, Statistical / Biometry
Study type: Clinical trials / Diagnostic studies / Etiology studies / Guideline / Prognostic studies / Risk factors studies
Limits: Female / Humans
Language: English
Publication year: 2020
Document type: Article