FDA-approved deep learning software application versus radiologists with different levels of expertise: detection of intracranial hemorrhage in a retrospective single-center study.

Kau, Thomas; Ziurlys, Mindaugas; Taschwer, Manuel; Kloss-Brandstätter, Anita; Grabner, Günther; Deutschmann, Hannes

Affiliation

Kau T; Department of Radiology, Landeskrankenhaus Villach, Nikolaigasse 43, 9500, Villach, Austria. thomas.kau@kabeg.at.
Ziurlys M; Division of Pediatric Radiology, Department of Radiology, Medical University of Graz, Auenbruggerplatz 9, 8036, Graz, Austria. thomas.kau@kabeg.at.
Taschwer M; Department of Radiology, Landeskrankenhaus Villach, Nikolaigasse 43, 9500, Villach, Austria.
Kloss-Brandstätter A; Department of Radiology, Landeskrankenhaus Villach, Nikolaigasse 43, 9500, Villach, Austria.
Grabner G; Carinthia University of Applied Sciences, Europastrasse 4, 9500, Villach, Austria.
Deutschmann H; Department of Medical Engineering, Carinthia University of Applied Sciences, Primoschgasse 8, 9020, Klagenfurt, Austria.

Neuroradiology; 64(5): 981-990, 2022 May.

Article in English | MEDLINE | ID: mdl-34988593

ABSTRACT

PURPOSE:

To compare the performance of an FDA-approved and CE-certified deep learning (DL) software application with that of human radiologists in detecting intracranial hemorrhage (ICH).

METHODS:

Within a 20-week trial from January to May 2020, 2210 adult non-contrast head CT scans were performed in a single center and automatically analyzed by an artificial intelligence (AI) solution with workflow integration. After excluding 22 scans due to severe motion artifacts, images were retrospectively assessed for the presence of ICH by a second-year resident and a certified radiologist under simulated time pressure. Disagreements were resolved by a subspecialized neuroradiologist serving as the reference standard. We calculated interrater agreement and diagnostic performance parameters, and applied the Breslow-Day and Cochran-Mantel-Haenszel tests.

RESULTS:

An ICH was present in 214 out of 2188 scans. The interrater agreement between the resident and the certified radiologist was very high (κ = 0.89) and even higher (κ = 0.93) between the resident and the reference standard. The software delivered 64 false-positive and 68 false-negative results, giving an overall sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of 68.2%, 96.8%, 69.5%, 96.6%, and 94.0%, respectively. Corresponding values for the resident were 94.9%, 99.2%, 93.1%, 99.4%, and 98.8%. The accuracy of the DL application was inferior (p < 0.001) to that of both the resident and the certified neuroradiologist.
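The software's performance figures can be reproduced from the counts reported above (2188 scans, 214 with ICH, 64 false positives, 68 false negatives). The following is a minimal sketch of that arithmetic, not the authors' analysis code; the names `ConfusionCounts`, `counts_from_report`, and `diagnostic_metrics` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 confusion matrix for a binary detection task."""
    tp: int  # ICH present, software flagged it
    fp: int  # no ICH, software flagged it
    fn: int  # ICH present, software missed it
    tn: int  # no ICH, software did not flag it


def counts_from_report(total: int, positives: int,
                       false_pos: int, false_neg: int) -> ConfusionCounts:
    """Derive the full 2x2 table from the four numbers given in the abstract."""
    negatives = total - positives
    return ConfusionCounts(
        tp=positives - false_neg,
        fp=false_pos,
        fn=false_neg,
        tn=negatives - false_pos,
    )


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard diagnostic performance parameters as fractions in [0, 1]."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Counts reported for the DL software in the RESULTS paragraph.
counts = counts_from_report(total=2188, positives=214, false_pos=64, false_neg=68)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these inputs the derived table is 146 true positives and 1910 true negatives, and the printed percentages round to the 68.2%, 96.8%, 69.5%, 96.6%, and 94.0% stated in the abstract.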

CONCLUSION:

A resident under time pressure outperformed an FDA-approved DL program in detecting ICH in CT scans. Our results underline the importance of thoughtful workflow integration and post-approval validation of AI applications in various clinical environments.

Subject(s)

Artificial Intelligence; Deep Learning; Adult; Humans; Intracranial Hemorrhages/diagnostic imaging; Radiologists; Retrospective Studies; Software

Keywords

Artificial intelligence; Computed tomography; Deep learning; Diagnostic accuracy; Intracranial hemorrhage

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Artificial Intelligence / Deep Learning | Study type: Diagnostic studies / Observational studies / Risk factor studies | Limits: Adult / Humans | Language: English | Journal: Neuroradiology | Year: 2022 | Document type: Article | Country of affiliation: Austria
