Use of Large Language Models to Predict Neuroimaging.
Nazario-Johnson, Lleayem; Zaki, Hossam A; Tung, Glenn A.
Affiliation
  • Nazario-Johnson L; Department of Diagnostic Imaging, The Warren Alpert Medical School of Brown University/Rhode Island Hospital, Providence, Rhode Island.
  • Zaki HA; Department of Diagnostic Imaging, The Warren Alpert Medical School of Brown University/Rhode Island Hospital, Providence, Rhode Island. Electronic address: hossam_zaki@brown.edu.
  • Tung GA; Associate Dean for Clinical Affairs, Department of Diagnostic Imaging, The Warren Alpert Medical School of Brown University/Rhode Island Hospital, Providence, Rhode Island.
J Am Coll Radiol; 20(10): 1004-1009, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37423349
ABSTRACT

PURPOSE:

Large language models (LLMs) have demonstrated a level of competency within the medical field. The aim of this study was to explore the ability of LLMs to predict the best neuroradiologic imaging modality given specific clinical presentations. In addition, the authors seek to determine if LLMs can outperform an experienced neuroradiologist in this regard.

METHODS:

ChatGPT and Glass AI, a health care-based LLM by Glass Health, were used. ChatGPT was prompted to rank the three best neuroimaging modalities, whereas only the single best response was taken from Glass AI and from the neuroradiologist. The responses were compared with the ACR Appropriateness Criteria for 147 conditions. Each clinical scenario was passed into each LLM twice to account for stochasticity. Each output was scored out of 3 on the basis of the criteria. Partial scores were given for nonspecific answers.

RESULTS:

ChatGPT and Glass AI scored 1.75 and 1.83, respectively, with no statistically significant difference between them. The neuroradiologist scored 2.20, significantly outperforming both LLMs. ChatGPT was also found to be the more inconsistent of the two LLMs, with a statistically significant difference in score between its two outputs. Additionally, the differences in score between the different ranks output by ChatGPT were statistically significant.

CONCLUSIONS:

LLMs perform well in selecting appropriate neuroradiologic imaging procedures when prompted with specific clinical scenarios. ChatGPT performed the same as Glass AI, suggesting that with medical text training, ChatGPT could significantly improve its function in this application. LLMs did not outperform an experienced neuroradiologist, indicating the need for continued improvement in the medical context.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Neuroimaging / Language Study type: Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Journal: J Am Coll Radiol Journal subject: Radiology Year: 2023 Document type: Article