Diagnostic test accuracy of externally validated convolutional neural network (CNN) artificial intelligence (AI) models for emergency head CT scans - A systematic review.

Mäenpää, Saana M; Korja, Miikka

Affiliation

Mäenpää SM; Department of Neurosurgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland. Electronic address: saana.maenpaa@helsinki.fi.
Korja M; Department of Neurosurgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland. Electronic address: miikka.korja@hus.fi.

Int J Med Inform; 189: 105523, 2024 Sep.

Article in English | MEDLINE | ID: mdl-38901270

ABSTRACT

BACKGROUND:

The surge in emergency head CT imaging, together with advances in artificial intelligence (AI), especially deep learning (DL) and convolutional neural networks (CNN), has accelerated the development of computer-aided diagnosis (CADx) for emergency imaging. External validation assesses model generalizability, providing preliminary evidence of clinical potential.

OBJECTIVES:

This study systematically reviews externally validated CNN-CADx models for emergency head CT scans, critically appraises diagnostic test accuracy (DTA), and assesses adherence to reporting guidelines.

METHODS:

Studies comparing CNN-CADx model performance to a reference standard were eligible. The review was registered in PROSPERO (CRD42023411641) and conducted on Medline, Embase, EBM-Reviews and Web of Science following the PRISMA-DTA guideline. DTA reporting was systematically extracted and appraised using standardised checklists (STARD, CHARMS, CLAIM, TRIPOD, PROBAST, QUADAS-2).

RESULTS:

Six of 5636 identified studies were eligible. The common target condition was intracranial haemorrhage (ICH), and the intended workflow roles were auxiliary to experts. Due to methodological and clinical between-study variation, meta-analysis was inappropriate. The scan-level sensitivity exceeded 90 % in 5/6 studies, while specificities ranged from 58.0 % to 97.7 %. The SROC 95 % predictive region was markedly broader than the confidence region, ranging above 50 % sensitivity and 20 % specificity. All studies had unclear or high risk of bias and concern for applicability (QUADAS-2, PROBAST), and reporting adherence was below 50 % in 20 of 32 TRIPOD items.
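
As a reading aid, the scan-level accuracy measures reported above can be sketched from confusion-matrix counts. This is a minimal illustration, not the reviewed studies' method: the function names and the four counts are hypothetical and are not taken from any included study. Only the 6 of 5636 eligibility figure comes from the abstract.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of scans with the target condition that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of scans without the target condition that the model clears."""
    return tn / (tn + fp)


# Hypothetical scan-level counts for one external validation set.
# These numbers are illustrative only and do NOT come from the review.
tp, fn, tn, fp = 90, 10, 160, 40

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)

# Share of identified studies that were eligible (figures from the abstract).
eligible_share = 6 / 5636

print(f"sensitivity={sens:.1%} specificity={spec:.1%} eligible={eligible_share:.2%}")
```

Both measures are computed against the reference standard, so a model can combine high sensitivity with low specificity, which matches the pattern of results described above.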

CONCLUSION:

Only about 0.1 % of identified studies (6 of 5636) met the eligibility criteria. The evidence on the DTA of CNN-CADx models for emergency head CT scans remains limited in the scope of this review, as the reviewed studies were scarce, unsuitable for meta-analysis and undermined by inadequate methodological conduct and reporting. Properly conducted, external validation remains preliminary for evaluating the clinical potential of AI-CADx models, but prospective and pragmatic clinical validation in comparative trials remains most crucial. In conclusion, future AI-CADx research processes should be methodologically standardized and reported in a clinically meaningful way to avoid research waste.

Subject(s)

Neural Networks, Computer; Tomography, X-Ray Computed; Humans; Tomography, X-Ray Computed/standards; Artificial Intelligence; Deep Learning; Head/diagnostic imaging; Diagnosis, Computer-Assisted; Intracranial Hemorrhages/diagnostic imaging

Keywords

Artificial intelligence; Computer-Aided Diagnosis (CADx); Convolutional neural network; Deep learning; Emergency Head Computed Tomography (CT)

Full text: 1 Database: MEDLINE Main subject: Tomography, X-Ray Computed / Neural Networks, Computer Limit: Humans Language: En Journal: Int J Med Inform Journal subject: MEDICAL INFORMATICS Year: 2024 Document type: Article
