Comparison of Artificial Intelligence to Resident Performance on Upper-Extremity Orthopaedic In-Training Examination Questions.

Ozdag, Yagiz; Hayes, Daniel S; Makar, Gabriel S; Manzar, Shahid; Foster, Brian K; Shultz, Mason J; Klena, Joel C; Grandizio, Louis C

Affiliation

Ozdag Y; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Hayes DS; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Makar GS; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Manzar S; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Foster BK; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Shultz MJ; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Klena JC; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
Grandizio LC; Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

J Hand Surg Glob Online; 6(2): 164-168, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38903829

ABSTRACT

Purpose:

Few studies have examined applications of artificial intelligence (AI) in upper-extremity (UE) surgical education. The purpose of this investigation was to assess the performance of a novel AI tool (ChatGPT) on UE questions from the Orthopaedic In-Training Examination (OITE). We aimed to compare the performance of ChatGPT with the examination performance of hand surgery residents.

Methods:

We selected questions from the 2020-2022 OITEs in the hand and UE and the shoulder and elbow content domains. These questions were divided into two categories: those with text-only prompts (text-only questions) and those that included supplementary images or videos (media questions). Two authors (B.K.F. and G.S.M.) converted the accompanying media into text-based descriptions. Included questions were entered into ChatGPT (version 3.5) to generate responses. Each OITE question was entered into ChatGPT three times: (1) open-ended response, which requested a free-text response; (2) multiple-choice response without asking for justification; and (3) multiple-choice response with justification. We referred to the OITE scoring guide for each year to compare the percentage of correct AI responses with the percentage of correct resident responses.

Results:

A total of 102 UE OITE questions were included; 59 were text-only questions, and 43 were media questions. ChatGPT correctly answered 46 (45%) of 102 questions using the Multiple Choice No Justification prompt requirement (42% for text-only and 44% for media questions). By comparison, postgraduate year 1 orthopaedic residents achieved an average score of 51% correct, and postgraduate year 5 residents answered 76% of the same questions correctly.
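The headline comparison above is simple arithmetic, and a minimal sketch may make it easier to check. The function name `percent_correct` and the variable names are illustrative assumptions, not part of the study; only the counts (46 of 102) and the resident averages (51% and 76%) come from the abstract.

```python
def percent_correct(correct: int, total: int) -> int:
    """Return the share of correct answers as a whole-number percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total)


# Counts reported in the Results: 46 correct out of 102 questions
# (Multiple Choice No Justification prompt).
chatgpt_pct = percent_correct(46, 102)

# Resident averages as reported in the abstract (already percentages).
pgy1_pct = 51
pgy5_pct = 76

# Gap, in percentage points, between each resident group and ChatGPT.
pgy1_gap = pgy1_pct - chatgpt_pct
pgy5_gap = pgy5_pct - chatgpt_pct

print(f"ChatGPT: {chatgpt_pct}%")
print(f"PGY-1 gap: {pgy1_gap} points, PGY-5 gap: {pgy5_gap} points")
```

Under these figures, the gap is 6 percentage points for postgraduate year 1 residents and 31 for postgraduate year 5 residents.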

Conclusions:

ChatGPT answered fewer UE OITE questions correctly than hand surgery residents at all training levels.

Clinical relevance:

Further development of novel AI tools may be necessary if this technology is to have a role in UE education.

Keywords

Artificial intelligence; ChatGPT; Orthopaedic In-Training Examination; Resident education; Upper extremity
