ABSTRACT
OBJECTIVE. The objective of our study was to compare the performance of radiologic-radiomic machine learning (ML) models with that of expert-level radiologists for differentiating benign from malignant solid renal masses on contrast-enhanced CT examinations.
MATERIALS AND METHODS. This retrospective study included a cohort of 254 renal cell carcinomas (RCCs) (190 clear cell RCCs [ccRCCs], 38 chromophobe RCCs [chrRCCs], and 26 papillary RCCs [pRCCs]), 26 fat-poor angioleiomyolipomas, and 10 oncocytomas with preoperative CT examinations. Lesions identified by four expert-level radiologists (> 3000 genitourinary CT and MRI studies) were manually segmented for radiologic-radiomic analysis. Disease-specific support vector machine radiologic-radiomic ML models for classification of renal masses were trained and validated using 10-fold cross-validation. Performance values for the expert-level radiologists and the radiologic-radiomic ML models were compared using the McNemar test.
RESULTS. The performance values for the four radiologists were as follows: sensitivity of 73.7-96.8% (median, 84.5%; variance, 122.7%) and specificity of 48.4-71.9% (median, 61.8%; variance, 161.6%) for differentiating ccRCCs from pRCCs and chrRCCs; sensitivity of 73.7-96.8% (median, 84.5%; variance, 122.7%) and specificity of 52.8-88.9% for differentiating ccRCCs from fat-poor angioleiomyolipomas and oncocytomas (median, 80.6%; variance, 269.1%); and sensitivity of 28.1-60.9% (median, 84.5%; variance, 122.7%) and specificity of 75.0-88.9% for differentiating pRCCs and chrRCCs from fat-poor angioleiomyolipomas and oncocytomas (median, 50.0%; variance, 191.1%).
After 10-fold cross-validation, the radiologic-radiomic ML model yielded the following performance values for differentiating ccRCCs from pRCCs and chrRCCs, ccRCCs from fat-poor angioleiomyolipomas and oncocytomas, and pRCCs and chrRCCs from fat-poor angioleiomyolipomas and oncocytomas: sensitivities of 90.0%, 86.3%, and 73.4% and specificities of 89.1%, 83.3%, and 91.7%, respectively.
CONCLUSION. Expert-level radiologists showed large variances in performance for differentiating benign from malignant solid renal masses. Radiologic-radiomic ML may offer a way to improve interreader concordance and performance.
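As an illustration of the paired comparison named in the methods, the following is a minimal sketch of a McNemar test applied to a reader and a model that classified the same lesions. The abstract does not say whether the exact or the chi-square form of the test was used, so the exact (binomial) form shown here is an assumption. The function name `mcnemar_exact` and the 20-lesion data set are hypothetical and are not study data.

```python
from math import comb


def mcnemar_exact(reader_correct, model_correct):
    """Two-sided exact McNemar test on paired correct/incorrect calls.

    Each argument is a list of booleans, one entry per lesion, that is
    True when that rater classified the lesion correctly.
    Returns (b, c, p_value), where b and c are the discordant counts.
    """
    if len(reader_correct) != len(model_correct):
        raise ValueError("both raters must classify the same lesions")

    pairs = list(zip(reader_correct, model_correct))
    # Only discordant pairs carry information about a difference.
    b = sum(1 for r, m in pairs if r and not m)  # reader right, model wrong
    c = sum(1 for r, m in pairs if m and not r)  # model right, reader wrong

    n = b + c
    if n == 0:
        return b, c, 1.0  # no disagreement, so no evidence of a difference

    # Under the null hypothesis, b follows Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical example (not study data): 20 lesions.
# 8 both correct, 2 reader only, 8 model only, 2 both wrong.
reader = [True] * 8 + [True] * 2 + [False] * 8 + [False] * 2
model = [True] * 8 + [False] * 2 + [True] * 8 + [False] * 2

b, c, p = mcnemar_exact(reader, model)
print(f"discordant pairs: b={b}, c={c}; p={p:.4f}")
```

Because the test uses only the discordant pairs, lesions on which the reader and the model agree do not affect the p value, which suits a design in which every lesion is assessed by both.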