Can a Machine Ace the Test? Assessing GPT-4.0's Precision in Plastic Surgery Board Examinations.
Plast Reconstr Surg Glob Open; 11(12): e5448, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38111723
ABSTRACT
Background:
As artificial intelligence makes rapid inroads across various fields, its value in medical education is becoming increasingly evident. This study evaluates the performance of the GPT-4.0 large language model in responding to plastic surgery board examination questions and explores its potential as a learning tool.
Methods:
We used a selection of 50 questions from 19 chapters of a widely used plastic surgery reference. Responses generated by the GPT-4.0 model were assessed on four parameters: accuracy, clarity, completeness, and conciseness. Correlation analyses were conducted to ascertain the relationship between these parameters and the overall performance of the model.
Results:
GPT-4.0 showed strong performance, with high mean scores for accuracy (2.88), clarity (3.00), completeness (2.88), and conciseness (2.92) on a three-point scale. Completeness of the model's responses was significantly correlated with accuracy (P < 0.0001), whereas no significant correlation was found between accuracy and clarity or between accuracy and conciseness. Performance variability across chapters indicates potential limitations of the model in dealing with certain complex topics in plastic surgery.
Conclusions:
The GPT-4.0 model exhibits considerable potential as an auxiliary tool for preparation for plastic surgery board examinations. Despite a few identified limitations, the generally high scores on key parameters suggest the model's ability to provide responses that are accurate, clear, complete, and concise. Future research should focus on enhancing the performance of artificial intelligence models in complex medical topics, further improving their applicability in medical education.
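The correlation analysis described in the Methods could be carried out along the following lines. This is a minimal illustrative sketch: the score lists and the `pearson_r` helper are invented for demonstration and are not the study's actual data or code.

```python
# Hypothetical sketch of correlating two per-question rating parameters
# (e.g. accuracy vs. completeness, each scored 1-3 per question).
# All scores below are invented illustration data.
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

accuracy     = [3, 3, 2, 3, 1, 3, 2, 3]  # invented example ratings
completeness = [3, 3, 2, 3, 1, 3, 3, 3]  # invented example ratings

r = pearson_r(accuracy, completeness)
print(f"Pearson r = {r:.3f}")
```

In practice, a significance test (e.g. via `scipy.stats.pearsonr`) would accompany the coefficient to obtain the P value reported in the Results.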