Evaluating GPT-V4 (GPT-4 with Vision) on Detection of Radiologic Findings on Chest Radiographs.
Zhou, Yiliang; Ong, Hanley; Kennedy, Patrick; Wu, Carol C; Kazam, Jacob; Hentel, Keith; Flanders, Adam; Shih, George; Peng, Yifan.
Author affiliations: From the Departments of Population Health Sciences (Y.Z., Y.P.) and Radiology (H.O., P.K., J.K., K.H., G.S.), Weill Cornell Medicine, 425 E 61st St, Ste 301, New York, NY 10065; Department of Thoracic Imaging, University of Texas MD Anderson Cancer Center, Houston, Tex (C.C.W.); and Department of Ra
Radiology; 311(2): e233270, 2024 May.
Article in English | MEDLINE | ID: mdl-38713028
ABSTRACT
Background: Generating radiologic findings from chest radiographs is pivotal in medical image analysis. The emergence of OpenAI's generative pretrained transformer, GPT-4 with vision (GPT-4V), has opened new perspectives on the potential for automated image-text pair generation. However, the application of GPT-4V to real-world chest radiography is yet to be thoroughly examined.

Purpose: To investigate the capability of GPT-4V to generate radiologic findings from real-world chest radiographs.

Materials and Methods: In this retrospective study, 100 chest radiographs with free-text radiology reports were annotated by a cohort of radiologists (two attending physicians and three residents) to establish a reference standard. Of the 100 chest radiographs, 50 were randomly selected from the National Institutes of Health (NIH) chest radiographic data set, and 50 were randomly selected from the Medical Imaging and Data Resource Center (MIDRC). The performance of GPT-4V at detecting imaging findings from each chest radiograph was assessed in the zero-shot setting (where it operates without prior examples) and the few-shot setting (where it operates with two examples). Its outputs were compared with the reference standard with regard to clinical conditions and their corresponding codes in the International Statistical Classification of Diseases, Tenth Revision (ICD-10), including the anatomic location (hereafter, laterality).

Results: In the zero-shot setting, in the task of detecting ICD-10 codes alone, GPT-4V attained an average positive predictive value (PPV) of 12.3%, average true-positive rate (TPR) of 5.8%, and average F1 score of 7.3% on the NIH data set, and an average PPV of 25.0%, average TPR of 16.8%, and average F1 score of 18.2% on the MIDRC data set. When both the ICD-10 codes and their corresponding laterality were considered, GPT-4V produced an average PPV of 7.8%, average TPR of 3.5%, and average F1 score of 4.5% on the NIH data set, and an average PPV of 10.9%, average TPR of 4.9%, and average F1 score of 6.4% on the MIDRC data set. With few-shot learning, GPT-4V showed improved performance on both data sets. Compared with the zero-shot setting, the few-shot setting yielded improved average TPRs and F1 scores, but there was not a substantial increase in the average PPV.

Conclusion: Although GPT-4V has shown promise in understanding natural images, it had limited effectiveness in interpreting real-world chest radiographs.

© RSNA, 2024. Supplemental material is available for this article.
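As a rough illustration of the metrics reported in the Results, the sketch below computes PPV, TPR, and F1 score for each image from sets of predicted and reference labels, then averages them across images. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the abstract does not say how the averages were formed, so per-image averaging is assumed here, and the function names, ICD-10 codes, and laterality values in the sample data are hypothetical.

```python
from statistics import mean


def ppv_tpr_f1(predicted, reference):
    """Return (PPV, TPR, F1) for one image, given two sets of labels."""
    true_positives = len(predicted & reference)
    ppv = true_positives / len(predicted) if predicted else 0.0
    tpr = true_positives / len(reference) if reference else 0.0
    f1 = 2 * ppv * tpr / (ppv + tpr) if (ppv + tpr) else 0.0
    return ppv, tpr, f1


def average_scores(cases, with_laterality):
    """Average per-image scores (assumed averaging scheme).

    Each case is (predicted, reference), both lists of (code, laterality).
    """
    rows = []
    for predicted, reference in cases:
        if with_laterality:
            # A prediction counts only if code AND laterality both match.
            pred_set, ref_set = set(predicted), set(reference)
        else:
            # Codes alone: ignore laterality.
            pred_set = {code for code, _ in predicted}
            ref_set = {code for code, _ in reference}
        rows.append(ppv_tpr_f1(pred_set, ref_set))
    return tuple(mean(column) for column in zip(*rows))


# Hypothetical sample data: (model output, reference standard) per image.
cases = [
    ([("J90", "left"), ("J18.9", "right")], [("J90", "right")]),
    ([("J93.9", "left")], [("J93.9", "left"), ("J98.11", "bilateral")]),
]

codes_only = average_scores(cases, with_laterality=False)
codes_and_side = average_scores(cases, with_laterality=True)
print("codes only (PPV, TPR, F1):", codes_only)
print("codes + laterality (PPV, TPR, F1):", codes_and_side)
```

In this toy data, requiring laterality to match turns the first image's correct code into a miss, which mirrors why the code-plus-laterality scores in the abstract are lower than the code-only scores.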
Subject(s)

Full text: 1 Database: MEDLINE Main subject: Radiography, Thoracic Limits: Adult / Aged / Female / Humans / Male / Middle aged Language: English Year: 2024 Document type: Article
