Screening for data clustering in multicenter studies: the residual intraclass correlation.

Wynants, Laure; Timmerman, Dirk; Bourne, Tom; Van Huffel, Sabine; Van Calster, Ben

Affiliation

Van Calster B; KU Leuven Department of Development and Regeneration, Leuven, Belgium. ben.vancalster@med.kuleuven.be.

BMC Med Res Methodol; 13: 128, 2013 Oct 23.

Article in English | MEDLINE | ID: mdl-24152372

ABSTRACT

BACKGROUND: In multicenter studies, center-specific variations in measurements may arise for various reasons, such as low interrater reliability, differences in equipment, deviations from the protocol, sociocultural characteristics, and differences in patient populations due to e.g. local referral patterns. The aim of this research is to derive measures for the degree of clustering. We present a method to detect heavily clustered variables and to identify physicians with outlying measurements.
METHODS: We use regression models with fixed effects to account for patient case-mix and a random cluster intercept to study clustering by physicians. We propose to use the residual intraclass correlation (RICC), the proportion of residual variance that is situated at the cluster level, to detect variables that are influenced by clustering. An RICC of 0 indicates that the variance in the measurements is not due to variation between clusters. We further suggest, where appropriate, to evaluate RICC in combination with R², the proportion of variance that is explained by the fixed effects. Variables with a high R² may have benefits that outweigh the disadvantages of clustering in terms of statistical analysis. We apply the proposed methods to a dataset collected for the development of models for ovarian tumor diagnosis. We study the variability in 18 tumor characteristics collected through ultrasound examination, 4 patient characteristics, and the serum marker CA-125 measured by 40 physicians on 2407 patients.
RESULTS: The RICC showed large variation between variables: from 2.2% for age to 25.1% for the amount of fluid in the pouch of Douglas. Seven variables had an RICC above 15%, indicating that a considerable part of the variance is due to systematic differences at the physician level, rather than random differences at the patient level. Accounting for differences in ultrasound machine quality reduced the RICC for a number of blood flow measurements.
CONCLUSIONS: We recommend that the degree of data clustering is addressed during the monitoring and analysis of multicenter studies. The RICC is a useful tool that expresses the degree of clustering as a percentage. Specific applications are data quality monitoring and variable screening prior to the development of a prediction model.
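
The definition of the RICC given in the abstract (the proportion of residual variance situated at the cluster level) can be illustrated with a minimal Python sketch. It assumes a continuous outcome and that the two variance components, the random-intercept variance and the patient-level residual variance, have already been estimated by some mixed-model fit. The function names, the example values, and the use of 15% as a screening cut-off are illustrative assumptions only. They are not the authors' code or data, and the abstract does not present 15% as a formal threshold.

```python
def residual_icc(var_cluster: float, var_residual: float) -> float:
    """Share of the residual variance that sits at the cluster level.

    var_cluster  : variance of the random cluster (physician) intercept
    var_residual : patient-level residual variance
    Both are assumed to come from a model that already adjusts for case-mix.
    """
    if var_cluster < 0 or var_residual < 0:
        raise ValueError("variance components must be non-negative")
    total = var_cluster + var_residual
    if total == 0:
        raise ValueError("total residual variance is zero")
    return var_cluster / total


def flag_clustered(variance_components, threshold=0.15):
    """Return (name, RICC) pairs with RICC above the threshold, highest first.

    The default threshold is an illustrative assumption, not a rule from the paper.
    """
    scores = {
        name: residual_icc(var_cluster, var_residual)
        for name, (var_cluster, var_residual) in variance_components.items()
    }
    flagged = [(name, ricc) for name, ricc in scores.items() if ricc > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical variance components (cluster, residual); NOT study data.
example = {
    "variable_a": (0.5, 9.5),
    "variable_b": (2.0, 6.0),
    "variable_c": (1.0, 4.0),
}
flagged = flag_clustered(example)
```

With these made-up inputs, `variable_b` and `variable_c` are flagged and `variable_a` is not. In practice the variance components would be taken from a fitted random-intercept regression model, and the RICC would be read alongside R² as the abstract suggests.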

Subject(s)

Multicenter Studies as Topic/methods; CA-125 Antigen/blood; Cluster Analysis; Data Interpretation, Statistical; Female; Humans; Models, Statistical; Observer Variation; Ovarian Neoplasms/blood; Ovarian Neoplasms/diagnostic imaging; Regression Analysis; Reproducibility of Results; Ultrasonography/standards

Full text: 1 | Database: MEDLINE | Main subject: Multicenter Studies as Topic | Study type: Clinical_trials / Diagnostic_studies / Guideline / Prognostic_studies / Risk_factors_studies / Screening_studies | Limit: Female / Humans | Language: En | Journal: BMC Med Res Methodol | Journal subject: MEDICINE | Year: 2013 | Document type: Article