Use of Large Language Models to Predict Neuroimaging.

Nazario-Johnson, Lleayem; Zaki, Hossam A; Tung, Glenn A

Affiliation

Nazario-Johnson L; Department of Diagnostic Imaging, The Warren Alpert Medical School of Brown University/Rhode Island Hospital, Providence, Rhode Island.
Zaki HA; Department of Diagnostic Imaging, The Warren Alpert Medical School of Brown University/Rhode Island Hospital, Providence, Rhode Island. Electronic address: hossam_zaki@brown.edu.
Tung GA; Associate Dean for Clinical Affairs, Department of Diagnostic Imaging, The Warren Alpert Medical School of Brown University/Rhode Island Hospital, Providence, Rhode Island.

J Am Coll Radiol; 20(10): 1004-1009, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37423349

ABSTRACT

PURPOSE:

Large language models (LLMs) have demonstrated a level of competency within the medical field. The aim of this study was to explore the ability of LLMs to predict the best neuroradiologic imaging modality given specific clinical presentations. In addition, the authors sought to determine whether LLMs can outperform an experienced neuroradiologist in this regard.

METHODS:

ChatGPT and Glass AI, a health care-based LLM by Glass Health, were used. ChatGPT was prompted to rank the three best neuroimaging modalities while taking the best responses from Glass AI and the neuroradiologist. The responses were compared with the ACR Appropriateness Criteria for 147 conditions. Clinical scenarios were passed into each LLM twice to account for stochasticity. Each output was scored out of 3 on the basis of the criteria. Partial scores were given for nonspecific answers.

RESULTS:

ChatGPT and Glass AI scored 1.75 and 1.83, respectively, with no statistically significant difference. The neuroradiologist scored 2.20, significantly outperforming both LLMs. ChatGPT was also found to be the more inconsistent of the two LLMs, with a statistically significant difference in score between its two outputs. Additionally, the differences in scores between the different ranks output by ChatGPT were statistically significant.

CONCLUSIONS:

LLMs perform well in selecting appropriate neuroradiologic imaging procedures when prompted with specific clinical scenarios. ChatGPT performed the same as Glass AI, suggesting that with medical text training, ChatGPT could significantly improve its function in this application. LLMs did not outperform an experienced neuroradiologist, indicating the need for continued improvement in the medical context.

Subject(s)

Language; Neuroimaging; Humans; Radiologists

Keywords

Artificial intelligence; ChatGPT; clinical decision making

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Neuroimaging / Language | Study type: Prognostic_studies / Risk_factors_studies | Limit: Humans | Language: En | Journal: J Am Coll Radiol | Journal subject: RADIOLOGY | Year: 2023 | Document type: Article
