ABSTRACT
BACKGROUND: The Prostate Imaging Quality (PI-QUAL) score is the first step toward image quality assessment in multi-parametric prostate MRI (mpMRI). Previous studies have demonstrated moderate to excellent inter-rater agreement among expert readers; however, the inter-reader agreement of PI-QUAL scoring among basic prostate readers still needs to be assessed. OBJECTIVES: To assess the inter-reader agreement of the PI-QUAL score among basic prostate readers on multi-center prostate mpMRI. METHODS: Five basic prostate readers from different centers independently assigned PI-QUAL scores using T2-weighted (T2W) images, diffusion-weighted imaging (DWI) including apparent diffusion coefficient (ADC) maps, and dynamic contrast-enhanced (DCE) images on mpMRI data obtained from five different centers following Prostate Imaging-Reporting and Data System version 2.1. Inter-reader agreement on the PI-QUAL scores was evaluated using weighted Cohen's kappa. In addition, the absolute agreement in assessing the diagnostic adequacy of each mpMRI sequence was calculated. RESULTS: A total of 355 men with a median age of 71 years (IQR, 60-78) were enrolled in the study. The pair-wise kappa scores for the PI-QUAL scores ranged from 0.656 to 0.786, indicating good inter-reader agreement. The pair-wise absolute agreement ranged from 0.75 to 0.88 for T2W imaging, from 0.74 to 0.83 for the ADC maps, and from 0.77 to 0.86 for DCE images. CONCLUSIONS: Basic prostate radiologists from different institutions showed good inter-reader agreement on multi-center data for the PI-QUAL scores.
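The pair-wise agreement statistics described in the methods can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the abstract does not state whether linear or quadratic weights were used, so the weighting scheme is exposed as a parameter (defaulting to linear as an assumption), and the function names and the dictionary layout of the reader scores are hypothetical.

```python
from itertools import combinations


def weighted_kappa(a, b, categories, weights="linear"):
    """Weighted Cohen's kappa for two readers scoring the same cases.

    `categories` is the ordered list of possible scores (e.g. 1..5 for PI-QUAL).
    `weights` is "linear" or "quadratic"; the abstract does not say which was used.
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint distribution of the two readers' scores.
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1.0 / n

    # Marginal distributions, used for the chance-expected disagreement.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2

    def disagreement(i, j):
        return (abs(i - j) / (k - 1)) ** power

    pairs = [(i, j) for i in range(k) for j in range(k)]
    d_obs = sum(disagreement(i, j) * observed[i][j] for i, j in pairs)
    d_exp = sum(disagreement(i, j) * row[i] * col[j] for i, j in pairs)
    if d_exp == 0.0:
        # Both readers used one identical category for every case.
        return 1.0
    return 1.0 - d_obs / d_exp


def absolute_agreement(a, b):
    """Fraction of cases on which two readers made the same call."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def pairwise(scores_by_reader, metric):
    """Apply `metric` to every pair of readers; returns {(r1, r2): value}."""
    return {
        (r1, r2): metric(scores_by_reader[r1], scores_by_reader[r2])
        for r1, r2 in combinations(sorted(scores_by_reader), 2)
    }
```

With five readers, `pairwise` returns ten values, and the reported ranges correspond to the minimum and maximum of those values. For the kappa statistic, `metric` can be a small wrapper that fixes `categories` to the PI-QUAL scale.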
Subjects
Prostate, Prostatic Neoplasms, Male, Humans, Middle Aged, Aged, Prostate/diagnostic imaging, Prostatic Neoplasms/diagnostic imaging, Retrospective Studies, Magnetic Resonance Imaging/methods, Diffusion Magnetic Resonance Imaging/methods
ABSTRACT
BACKGROUND: Although systems such as Prostate Imaging Quality (PI-QUAL) have been proposed for quality assessment, visual evaluations by human readers remain somewhat inconsistent, particularly among less-experienced readers. OBJECTIVES: To assess the feasibility of deep learning (DL) for the automated assessment of image quality in bi-parametric MRI scans and to compare its performance with that of less-experienced readers. METHODS: We used bi-parametric prostate MRI scans from the PI-CAI dataset in this study. A 3-point Likert scale, consisting of poor, moderate, and excellent, was used to assess image quality. Three expert readers established the ground-truth labels for the development (500) and testing (100) sets. We trained a 3D DL model on the development set using probabilistic prostate masks and an ordinal loss function. Four less-experienced readers scored the testing set for performance comparison. RESULTS: The kappa scores between the DL model and the expert consensus for T2W images and ADC maps were 0.42 and 0.61, representing moderate and good levels of agreement, respectively. The kappa scores between the less-experienced readers and the expert consensus ranged from 0.39 to 0.56 (fair to moderate) for T2W images and from 0.39 to 0.62 (fair to good) for ADC maps. CONCLUSIONS: DL can offer performance comparable to that of less-experienced readers when assessing image quality in bi-parametric prostate MRI, making it a viable option for an automated quality assessment tool. We suggest that DL models trained on more representative datasets, annotated by a larger group of experts, could yield reliable image quality assessment and potentially replace or assist visual evaluations by human readers.
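The abstract mentions an ordinal loss function without specifying its form. One common formulation, shown here purely as an illustrative assumption and not as the authors' method, encodes a 3-point grade (0 = poor, 1 = moderate, 2 = excellent) as two cumulative binary targets ("grade > 0", "grade > 1") and averages the binary cross-entropy over those thresholds. All function names below are hypothetical.

```python
import math


def ordinal_targets(label, num_classes=3):
    """Cumulative binary encoding: target k is 1 when label > k."""
    return [1.0 if label > k else 0.0 for k in range(num_classes - 1)]


def ordinal_bce_loss(logits, label):
    """Mean binary cross-entropy over the ordinal thresholds.

    `logits` holds one raw model output per threshold (num_classes - 1 values).
    """
    targets = ordinal_targets(label, num_classes=len(logits) + 1)
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form of BCE with logits.
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict_grade(logits):
    """Predicted grade = number of thresholds the model considers exceeded."""
    return sum(1 for z in logits if z > 0.0)
```

Compared with plain categorical cross-entropy, this encoding penalizes a prediction of "poor" for an "excellent" scan more heavily than a prediction of "moderate", because two threshold targets are wrong rather than one.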