Sources of variation in multicenter rectal MRI data and their effect on radiomics feature reproducibility.
Eur Radiol
; 32(3): 1506-1516, 2022 Mar.
Article
en En
| MEDLINE
| ID: mdl-34655313
ABSTRACT
OBJECTIVES:
To investigate sources of variation in a multicenter rectal cancer MRI dataset focusing on hardware and image acquisition, segmentation methodology, and radiomics feature extraction software.
METHODS:
T2W and DWI/ADC MRIs from 649 rectal cancer patients were retrospectively acquired in 9 centers. Fifty-two imaging features (14 first-order/6 shape/32 higher-order) were extracted from each scan using whole-volume (expert/non-expert) and single-slice segmentations with two different software packages (PyRadiomics/CaPTk). Influence of hardware, acquisition, and patient-intrinsic factors (age/gender/cTN-stage) on ADC was assessed using linear regression. Feature reproducibility was assessed between segmentation methods and software packages using the intraclass correlation coefficient (ICC).
RESULTS:
Image features differed significantly (p < 0.001) between centers with more substantial variations in ADC compared to T2W-MRI. In total, 64.3% of the variation in mean ADC was explained by differences in hardware and acquisition, compared to 0.4% by patient-intrinsic factors. Feature reproducibility between expert and non-expert segmentations was good to excellent (median ICC 0.89-0.90). Reproducibility for single-slice versus whole-volume segmentations was substantially poorer (median ICC 0.40-0.58). Between software packages, reproducibility was good to excellent (median ICC 0.99) for most features (first-order/shape/GLCM/GLRLM) but poor for higher-order (GLSZM/NGTDM) features (median ICC 0.00-0.41).
CONCLUSIONS:
Significant variations are present in multicenter MRI data, particularly related to differences in hardware and acquisition, which will likely negatively influence subsequent analysis if not corrected for. Segmentation variations had a minor impact when using whole volume segmentations. Between software packages, higher-order features were less reproducible and caution is warranted when implementing these in prediction models.
KEY POINTS:
• Features derived from T2W-MRI and in particular ADC differ significantly between centers when performing multicenter data analysis.
• Variations in ADC are mainly (> 60%) caused by hardware and image acquisition differences and less so (< 1%) by patient- or tumor-intrinsic variations.
• Features derived using different image segmentations (expert/non-expert) were reproducible, provided that whole-volume segmentations were used. When using different feature extraction software packages with similar settings, higher-order features were less reproducible.
Full text:
1
Collection:
01-internacional
Database:
MEDLINE
Main subject:
Rectal Neoplasms
/
Magnetic Resonance Imaging
Type of study:
Observational_studies
/
Prognostic_studies
Limits:
Humans
Language:
En
Journal:
Eur Radiol
Journal subject:
RADIOLOGY
Year:
2022
Document type:
Article
Affiliation country:
Netherlands
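Illustrative note:
The abstract reports feature reproducibility as an intraclass correlation coefficient but does not state which ICC form was used. The following is a minimal sketch, assuming a two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)). The function name `icc_2_1` and the example feature values are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-measurement ICC.

    `ratings` is a list of rows, one row per subject (e.g. per tumour),
    with one value per method (e.g. per segmentation or software package).
    The choice of this ICC form is an assumption; the abstract does not
    specify which form the authors used.
    """
    n = len(ratings)            # number of subjects
    k = len(ratings[0])         # number of methods being compared
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 subjects and 2 methods, equal row lengths")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares of a two-way ANOVA without replication.
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = max(ss_total - ss_rows - ss_cols, 0.0)  # guard against rounding

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-method mean square
    ms_err = ss_err / ((n - 1) * (k - 1))     # residual mean square

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (ms_rows - ms_err) / denom


if __name__ == "__main__":
    # Hypothetical values of one feature from two segmentations of five tumours.
    expert_vs_non_expert = [
        [1.10, 1.15],
        [0.95, 0.90],
        [1.30, 1.28],
        [1.05, 1.12],
        [0.88, 0.91],
    ]
    print(round(icc_2_1(expert_vs_non_expert), 3))
```

Because this form measures absolute agreement, a constant offset between two methods lowers the ICC even when the methods rank the subjects identically. Computing one ICC per feature and taking the median across features would give a summary comparable in kind to the median ICC values quoted in the abstract.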