Weakly Supervised MRI Slice-Level Deep Learning Classification of Prostate Cancer Approximates Full Voxel- and Slice-Level Annotation: Effect of Increasing Training Set Size.

Weißer, Cedric; Netzer, Nils; Görtz, Magdalena; Schütz, Viktoria; Hielscher, Thomas; Schwab, Constantin; Hohenfellner, Markus; Schlemmer, Heinz-Peter; Maier-Hein, Klaus H; Bonekamp, David

Affiliation

Weißer C; Division of Radiology, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Netzer N; Heidelberg University Medical School, Heidelberg, Germany.
Görtz M; Division of Radiology, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Schütz V; Heidelberg University Medical School, Heidelberg, Germany.
Hielscher T; Department of Urology, University of Heidelberg Medical Center, Heidelberg, Germany.
Schwab C; Junior Clinical Cooperation Unit, Multiparametric Methods for Early Detection of Prostate Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Hohenfellner M; Department of Urology, University of Heidelberg Medical Center, Heidelberg, Germany.
Schlemmer HP; Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Maier-Hein KH; Institute of Pathology, University of Heidelberg Medical Center, Heidelberg, Germany.
Bonekamp D; Department of Urology, University of Heidelberg Medical Center, Heidelberg, Germany.

J Magn Reson Imaging ; 59(4): 1409-1422, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37504495

ABSTRACT

BACKGROUND:

Weakly supervised learning promises reduced annotation effort while maintaining performance.

PURPOSE:

To compare weakly supervised training with full slice-wise annotated training of a deep convolutional classification network (CNN) for prostate cancer (PC).

STUDY TYPE:

Retrospective.

SUBJECTS:

One thousand four hundred eighty-nine consecutive institutional prostate MRI examinations from men with suspicion for PC (65 ± 8 years) between January 2015 and November 2020 were split into a training set (N = 794, enriched with 204 PROSTATEx examinations) and a test set (N = 695).

FIELD STRENGTH/SEQUENCE:

1.5 and 3 T, T2-weighted turbo-spin-echo and diffusion-weighted echo-planar imaging.

ASSESSMENT:

Histopathological ground truth was provided by targeted and extended systematic biopsy. Reference training was performed using slice-level annotation (SLA) and compared to iterative training utilizing patient-level annotations (PLAs) with supervised feedback of CNN estimates into the next training iteration at three incremental training set sizes (N = 200, 500, 998). Model performance was assessed by comparing specificity at a fixed sensitivity of 0.97 [254/262], emulating PI-RADS ≥ 3 decisions, and of 0.88-0.90 [231-236/262], emulating PI-RADS ≥ 4 decisions.

STATISTICAL TESTS:

Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were compared using the DeLong and Obuchowski tests. Sensitivity and specificity were compared using the McNemar test. The statistical significance threshold was P = 0.05.
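The operating-point comparison and the paired test named above can be illustrated with a minimal sketch. It is not the authors' code: the function names `specificity_at_sensitivity` and `mcnemar_exact`, the tie handling, and the toy scores are assumptions made for illustration. The first function picks the highest score threshold that still reaches a target sensitivity and reports the specificity at that threshold; the second computes a two-sided exact (binomial) McNemar P value from the two discordant-pair counts of a paired comparison.

```python
from math import ceil, comb


def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Return (threshold, sensitivity, specificity) at the highest threshold
    whose sensitivity reaches the target. A case counts as test-positive
    when its score is >= threshold. Labels are 1 (cancer) or 0 (no cancer)."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # Smallest number of positive cases that must be detected to reach the
    # target; the small epsilon guards against floating-point overshoot.
    needed = max(1, ceil(target_sensitivity * len(positives) - 1e-9))
    threshold = positives[needed - 1]
    true_pos = sum(s >= threshold for s in positives)
    true_neg = sum(s < threshold for s in negatives)
    return threshold, true_pos / len(positives), true_neg / len(negatives)


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value.

    b and c are the discordant-pair counts: cases classified correctly by
    model A only and by model B only, respectively."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial tail probability under the null hypothesis p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data (hypothetical, not from the study).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
threshold, sensitivity, specificity = specificity_at_sensitivity(
    toy_scores, toy_labels, target_sensitivity=0.75
)
p_value = mcnemar_exact(3, 5)
```

Fixing the sensitivity first and then comparing specificities puts both models at the same clinical operating point, so the McNemar test can be applied to the paired per-exam decisions of the two models on the same negative cases.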

RESULTS:

Test set (N = 695) ROC-AUC performance of SLA (trained with 200/500/998 exams) was 0.75/0.80/0.83, respectively. PLA achieved lower ROC-AUC of 0.64/0.72/0.78. Both increased performance significantly with increasing training set size. ROC-AUC for SLA at 500 exams was comparable to PLA at 998 exams (P = 0.28). ROC-AUC was significantly different between SLA and PLA at the same training set sizes; however, the ROC-AUC difference decreased significantly from 200 to 998 training exams. Emulating PI-RADS ≥ 3 decisions, the difference between PLA specificity of 0.12 [51/433] and SLA specificity of 0.13 [55/433] became undetectable (P = 1.0) at 998 exams. Emulating PI-RADS ≥ 4 decisions, at 998 exams, SLA specificity of 0.51 [221/433] remained higher than PLA specificity at 0.39 [170/433]. However, PLA specificity at 998 exams became comparable to SLA specificity of 0.37 [159/433] at 200 exams (P = 0.70).
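As a reading aid, the bracketed counts in the abstract can be checked against the rounded proportions they accompany. The sketch below uses only numbers quoted in the abstract; the dictionary keys and the helper name `matches_reported` are illustrative. It also notes that the two denominators, 262 and 433, sum to the test set size of 695, which is consistent with 262 cancer-positive and 433 cancer-negative test examinations (an inference from the counts, not a statement made in the abstract).

```python
# (numerator, denominator, proportion as reported in the abstract)
reported = {
    "sensitivity, PI-RADS >= 3 emulation": (254, 262, 0.97),
    "sensitivity, PI-RADS >= 4 emulation (low)": (231, 262, 0.88),
    "sensitivity, PI-RADS >= 4 emulation (high)": (236, 262, 0.90),
    "PLA specificity, PI-RADS >= 3, 998 exams": (51, 433, 0.12),
    "SLA specificity, PI-RADS >= 3, 998 exams": (55, 433, 0.13),
    "SLA specificity, PI-RADS >= 4, 998 exams": (221, 433, 0.51),
    "PLA specificity, PI-RADS >= 4, 998 exams": (170, 433, 0.39),
    "SLA specificity, PI-RADS >= 4, 200 exams": (159, 433, 0.37),
}


def matches_reported(numerator, denominator, proportion):
    """True when the count ratio rounds to the reported two-decimal value."""
    return abs(round(numerator / denominator, 2) - proportion) < 1e-9


checks = {name: matches_reported(*values) for name, values in reported.items()}
all_consistent = all(checks.values())

# The two denominators together account for the whole test set.
denominators_sum_to_test_set = (262 + 433 == 695)
```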

CONCLUSION:

Weakly supervised training of a classification CNN using patient-level-only annotation had lower performance compared to training with slice-wise annotations, but improved significantly faster with additional training data.

EVIDENCE LEVEL:

3

TECHNICAL EFFICACY:

Stage 2.

Subjects

Deep Learning; Prostatic Neoplasms; Male; Humans; Magnetic Resonance Imaging/methods; Prostatic Neoplasms/diagnostic imaging; Prostatic Neoplasms/pathology; Retrospective Studies; Polyesters

Keywords

MRI; PI-RADS; deep learning; prostate cancer; weakly supervised training


Full text: 1 Database: MEDLINE Main subject: Prostatic Neoplasms / Deep Learning Study type: Prognostic_studies Limit: Humans / Male Language: En Publication year: 2024 Document type: Article
