Artificial intelligence model GPT4 narrowly fails simulated radiological protection exam.
Roemer, G; Li, A; Mahmood, U; Dauer, L; Bellamy, M.
Affiliation
  • Roemer G; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
  • Li A; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
  • Mahmood U; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
  • Dauer L; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
  • Bellamy M; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
J Radiol Prot; 44(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38232401
ABSTRACT
This study assesses the efficacy of Generative Pre-Trained Transformers (GPT) published by OpenAI in the specialised domains of radiological protection and health physics. Utilising a set of 1064 surrogate questions designed to mimic a health physics certification exam, we evaluated the models' ability to accurately respond to questions across five knowledge domains. Our results indicated that neither model met the 67% passing threshold, with GPT-3.5 achieving a 45.3% weighted average and GPT-4 attaining 61.7%. Although GPT-4, with its significant parameter increase and multimodal capabilities, demonstrated superior performance in all categories, it still fell short of a passing score. The study's methodology involved a simple, standardised prompting strategy without employing prompt engineering or in-context learning, both of which can enhance performance. The analysis revealed that GPT-3.5 formatted its answers correctly more often, despite GPT-4's higher overall accuracy. The findings suggest that while GPT-3.5 and GPT-4 show promise in handling domain-specific content, their application in the field of radiological protection should be approached with caution, emphasising the need for human oversight and verification.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Radiation Protection / Artificial Intelligence Type of study: Prognostic_studies Limits: Humans Language: En Journal: J Radiol Prot Journal subject: RADIOLOGY Year: 2024 Document type: Article Affiliation country: United States