Artificial intelligence model GPT4 narrowly fails simulated radiological protection exam.

Roemer, G; Li, A; Mahmood, U; Dauer, L; Bellamy, M

Affiliation

Roemer G; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
Li A; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
Mahmood U; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
Dauer L; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.
Bellamy M; MSKCC, 1275 York Avenue, New York, NY 10065, United States of America.

J Radiol Prot; 44(1), 2024 Jan 29.

Article in English | MEDLINE | ID: mdl-38232401

ABSTRACT

This study assesses the efficacy of Generative Pre-Trained Transformers (GPT) published by OpenAI in the specialised domains of radiological protection and health physics. Utilising a set of 1064 surrogate questions designed to mimic a health physics certification exam, we evaluated the ability of GPT-3.5 and GPT-4 to accurately respond to questions across five knowledge domains. Our results indicated that neither model met the 67% passing threshold, with GPT-3.5 achieving a 45.3% weighted average and GPT-4 attaining 61.7%. GPT-4, with its significant parameter increase and multimodal capabilities, demonstrated superior performance in all categories yet still fell short of a passing score. The study's methodology involved a simple, standardised prompting strategy without employing prompt engineering or in-context learning, which are known to potentially enhance performance. The analysis revealed that GPT-3.5 formatted answers more correctly, despite GPT-4's higher overall accuracy. The findings suggest that while GPT-3.5 and GPT-4 show promise in handling domain-specific content, their application in the field of radiological protection should be approached with caution, emphasising the need for human oversight and verification.

Subject(s)

Artificial Intelligence; Radiation Protection; Humans; Health Physics; Electric Power Supplies

Keywords

GPT4; artificial; intelligence

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Radiation Protection / Artificial Intelligence
Type of study: Prognostic_studies
Limits: Humans
Language: En
Journal: J Radiol Prot
Journal subject: RADIOLOGY
Year: 2024
Document type: Article
Country of affiliation: United States
