
Can the generalizability issue of artificial intelligence be overcome? Pneumothorax detection algorithm.

Verdi, Elvan Burak; Yilmaz, Muhammed; Dogan Mülazimoglu, Deniz; Türker, Abdussamet; Gürün Kaya, Aslihan; Isik, Özlem; Bostanoglu Karaçin, Asli; Velioglu Yakut, Övgü; Yenigün, Bülent Mustafa; Uzun, Çaglar; Elhan, Atilla Halil; Özdemir Kumbasar, Özlem; Kaya, Akin; Kayi Cangir, Ayten; Tasçi, Cantürk; Özbayoglu, Ahmet Murat; Erol, Serhat.

J Investig Med; 72(1): 88-99, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37840192

ABSTRACT

The generalizability of artificial intelligence (AI) models is a major issue in the field of AI applications. Therefore, we aimed to overcome the generalizability problem of an AI model developed for a particular center for pneumothorax detection using a small dataset for external validation. Chest radiographs of patients diagnosed with pneumothorax (n = 648) and those without pneumothorax (n = 650) who visited the Ankara University Faculty of Medicine (AUFM; center 1) were obtained. A deep learning-based pneumothorax detection algorithm (PDA-Alpha) was developed using the AUFM dataset. For implementation at the Health Sciences University (HSU; center 2), PDA-Beta was developed through external validation of PDA-Alpha using 50 radiographs with pneumothorax obtained from HSU. Both PDA algorithms were assessed using the HSU test dataset (n = 200) containing 50 pneumothorax and 150 non-pneumothorax radiographs. We compared the results generated by the algorithms with those of physicians to demonstrate the reliability of the results. The areas under the curve for PDA-Alpha and PDA-Beta were 0.993 (95% confidence interval (CI): 0.985-1.000) and 0.986 (95% CI: 0.962-1.000), respectively. Both algorithms successfully detected the presence of pneumothorax on 49/50 radiographs; however, PDA-Alpha had seven false-positive predictions, whereas PDA-Beta had one. The positive predictive value increased from 0.525 to 0.886 after external validation (p = 0.041). The physicians' sensitivity and specificity for detecting pneumothorax were 0.585 and 0.988, respectively. The performance scores of the algorithms were increased with a small dataset; however, further studies are required to determine the optimal amount of external validation data to fully address the generalizability issue.
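The reported positive predictive values (0.525 and 0.886) are not the raw ratios TP / (TP + FP) from the 200-image test set, which would be 49/56 and 49/50. The abstract does not say how they were computed. The sketch below rebuilds the confusion counts from the figures given in the abstract and shows that a prevalence-adjusted PPV at an assumed pneumothorax prevalence of 5% is consistent with the reported values. The 5% prevalence, the class name `ConfusionCounts`, and the method names are assumptions made for illustration; they do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion counts for one algorithm on the HSU test set."""
    tp: int  # pneumothorax radiographs flagged as pneumothorax
    fn: int  # pneumothorax radiographs that were missed
    fp: int  # non-pneumothorax radiographs flagged as pneumothorax
    tn: int  # non-pneumothorax radiographs correctly cleared

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def raw_ppv(self) -> float:
        # PPV at the test set's own prevalence (50/200 = 25%).
        return self.tp / (self.tp + self.fp)

    def adjusted_ppv(self, prevalence: float) -> float:
        # Bayes' rule: PPV at a target prevalence, from sensitivity and specificity.
        true_pos = self.sensitivity * prevalence
        false_pos = (1.0 - self.specificity) * (1.0 - prevalence)
        return true_pos / (true_pos + false_pos)


# Counts derived from the abstract: 50 pneumothorax and 150 non-pneumothorax
# radiographs, 49/50 detected by both algorithms, 7 vs. 1 false positives.
PDA_ALPHA = ConfusionCounts(tp=49, fn=1, fp=7, tn=143)
PDA_BETA = ConfusionCounts(tp=49, fn=1, fp=1, tn=149)

# ASSUMPTION: not stated in the abstract; chosen because it matches the reported PPVs.
ASSUMED_PREVALENCE = 0.05

if __name__ == "__main__":
    for name, counts in (("PDA-Alpha", PDA_ALPHA), ("PDA-Beta", PDA_BETA)):
        print(
            f"{name}: sensitivity={counts.sensitivity:.3f} "
            f"specificity={counts.specificity:.3f} "
            f"raw PPV={counts.raw_ppv:.3f} "
            f"adjusted PPV={counts.adjusted_ppv(ASSUMED_PREVALENCE):.3f}"
        )
```

Under this assumption the adjusted PPV rounds to 0.525 for PDA-Alpha and 0.886 for PDA-Beta, which matches the abstract. The agreement suggests, but does not prove, that the authors reported PPV at a 5% prevalence.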

Subject(s)

Deep Learning, Pneumothorax, Humans, Artificial Intelligence, Pneumothorax/diagnostic imaging, Reproducibility of Results, Retrospective Studies, Algorithms
