ABSTRACT
BACKGROUND AND PURPOSE: Automated volumetric analysis of structural MR imaging allows quantitative assessment of brain atrophy in neurodegenerative disorders. We compared the brain segmentation performance of the AI-Rad Companion brain MR imaging software against an in-house FreeSurfer 7.1.1/Individual Longitudinal Participant pipeline.

MATERIALS AND METHODS: T1-weighted images of 45 participants with de novo memory symptoms were selected from the OASIS-4 database and analyzed with the AI-Rad Companion brain MR imaging tool and the FreeSurfer 7.1.1/Individual Longitudinal Participant pipeline. Correlation, agreement, and consistency between the 2 tools were compared among the absolute, normalized, and standardized volumes. The final reports generated by each tool were used to compare rates of abnormality detection and the compatibility of the radiologic impressions made with each tool against the clinical diagnoses.

RESULTS: We observed strong correlation, moderate consistency, and poor agreement between the absolute volumes of the main cortical lobes and subcortical structures measured by the AI-Rad Companion brain MR imaging tool and those measured by FreeSurfer. The strength of the correlations increased after normalizing the measurements to total intracranial volume. Standardized measurements differed significantly between the 2 tools, likely owing to differences in the normative data sets used to calibrate each tool. With the FreeSurfer 7.1.1/Individual Longitudinal Participant pipeline as the reference standard, the AI-Rad Companion brain MR imaging tool had a specificity of 90.6%-100% and a sensitivity of 64.3%-100% in detecting volumetric abnormalities. There was no difference in the rate of compatibility between radiologic and clinical impressions when using the 2 tools.

CONCLUSIONS: The AI-Rad Companion brain MR imaging tool reliably detects atrophy in cortical and subcortical regions implicated in the differential diagnosis of dementia.
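The normalization step described above (expressing each regional volume relative to total intracranial volume) can be sketched as a minimal example. The function name and all volume values below are hypothetical illustrations, not output of either tool:

```python
# Illustrative sketch only: normalizing regional brain volumes to total
# intracranial volume (TIV). Structure names and volumes are hypothetical.

def normalize_volumes(absolute_ml: dict, tiv_ml: float) -> dict:
    """Express each regional volume (mL) as a percentage of TIV (mL)."""
    return {region: 100.0 * vol / tiv_ml for region, vol in absolute_ml.items()}

# Hypothetical absolute volumes in milliliters for one participant
regions = {"frontal_lobe": 180.0, "hippocampus": 3.5}
normalized = normalize_volumes(regions, tiv_ml=1400.0)
print(normalized)  # {'frontal_lobe': 12.857..., 'hippocampus': 0.25}
```

Normalizing to TIV removes head-size differences between participants, which is consistent with the stronger correlations reported after normalization.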
Subject(s)
Brain, Magnetic Resonance Imaging, Humans, Brain/diagnostic imaging, Brain/pathology, Magnetic Resonance Imaging/methods, Cerebral Cortex, Software, Atrophy/pathology, Image Processing, Computer-Assisted/methods, Reproducibility of Results

ABSTRACT
BACKGROUND AND PURPOSE: Accurate and reliable detection of white matter hyperintensities and quantification of their volume can provide valuable clinical information for assessing neurologic disease progression. In this work, a stacked generalization ensemble of orthogonal 3D convolutional neural networks, StackGen-Net, is explored for improving automated detection of white matter hyperintensities in 3D T2-FLAIR images.

MATERIALS AND METHODS: Individual convolutional neural networks in StackGen-Net were trained on 2.5D patches from orthogonal reformatting of 3D-FLAIR (n = 21) to yield white matter hyperintensity posteriors. A meta convolutional neural network was trained to learn the functional mapping from the orthogonal white matter hyperintensity posteriors to the final white matter hyperintensity prediction. The impact of training data and architecture choices on white matter hyperintensity segmentation performance was systematically evaluated on a test cohort (n = 9). The segmentation performance of StackGen-Net was compared with state-of-the-art convolutional neural network techniques on an independent test cohort from the Alzheimer's Disease Neuroimaging Initiative-3 (n = 20).

RESULTS: StackGen-Net outperformed the individual convolutional neural networks in the ensemble as well as their combination by averaging or majority voting. In a comparison with state-of-the-art white matter hyperintensity segmentation techniques, StackGen-Net achieved a significantly higher Dice score (0.76 [SD, 0.08]), F1-lesion score (0.74 [SD, 0.13]), and area under the precision-recall curve (0.84 [SD, 0.09]), and the lowest absolute volume difference (13.3% [SD, 9.1%]). StackGen-Net performance in Dice scores (median = 0.74) did not differ significantly (P = .22) from the interobserver variability (median = 0.73) between 2 experienced neuroradiologists. We found no significant difference (P = .15) between white matter hyperintensity lesion volumes from StackGen-Net predictions and ground truth annotations.

CONCLUSIONS: A stacked generalization of convolutional neural networks, combining multiplanar lesion information through 2.5D spatial context, substantially improved segmentation performance compared with traditional ensemble techniques and several state-of-the-art deep learning models for 3D-FLAIR.
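The Dice score used above to evaluate segmentation overlap can be illustrated with a minimal sketch. The function and the toy binary masks are hypothetical examples, not data or code from the study:

```python
import numpy as np

# Illustrative sketch: Dice score between a predicted binary segmentation
# mask and a ground-truth mask. Dice = 2|A ∩ B| / (|A| + |B|).
# The toy masks below are hypothetical, not study data.

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice overlap coefficient for two binary masks of the same shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # Convention: two empty masks count as perfect agreement.
    return 2.0 * intersection / denom if denom > 0 else 1.0

pred = np.array([[1, 1, 0], [0, 1, 0]])    # 3 predicted-positive voxels
truth = np.array([[1, 0, 0], [0, 1, 1]])   # 3 true-positive voxels, 2 overlap
print(round(dice_score(pred, truth), 3))   # 2*2/(3+3) -> 0.667
```

A Dice of 1.0 means perfect voxel-wise overlap; the reported 0.76 mean indicates substantial, though imperfect, agreement with expert annotations.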