Can generative artificial intelligence pass the orthopaedic board examination?
Isleem, Ula N; Zaidat, Bashar; Ren, Renee; Geng, Eric A; Burapachaisri, Aonnicha; Tang, Justin E; Kim, Jun S; Cho, Samuel K.
Affiliation
  • Isleem UN; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Zaidat B; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Ren R; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Geng EA; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Burapachaisri A; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Tang JE; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Kim JS; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Cho SK; Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
J Orthop ; 53: 27-33, 2024 Jul.
Article in En | MEDLINE | ID: mdl-38450060
ABSTRACT

Background:

Resident training programs in the US use the Orthopaedic In-Training Examination (OITE), developed by the American Academy of Orthopaedic Surgeons (AAOS), to assess residents' current knowledge and to identify those at risk of failing the American Board of Orthopaedic Surgery (ABOS) examination. Optimal strategies for OITE preparation are constantly being explored, and Large Language Models (LLMs) may have a role in orthopaedic resident education. ChatGPT, an LLM launched in late 2022, has demonstrated the ability to produce accurate, detailed answers, potentially enabling it to aid in medical education and clinical decision-making. The purpose of this study was to evaluate the performance of ChatGPT on Orthopaedic In-Training Examinations, using Self-Assessment Examination (SAE) questions from the AAOS database and approved literature as a proxy for the orthopaedic board examination.

Methods:

301 SAE questions from the AAOS database, together with associated AAOS literature, were input into ChatGPT's interface in question-and-multiple-choice format. A new chat was used for every question. ChatGPT's responses were then analyzed to determine which answer choice it selected; all answers were recorded, categorized, and compared to the answer keys of the OITE and SAE exams, noting whether each answer was right or wrong.
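
The study used ChatGPT's chat interface manually. Purely as an illustration of the protocol (one fresh conversation per question, record the selected choice, compare against the key), a programmatic approximation might look like the sketch below; the OpenAI Python client, the model name, the input file format, and the naive letter-extraction step are all assumptions, not details from the study.

```python
# Hypothetical sketch of the protocol: one fresh chat per question,
# record the selected choice, and compare it to the answer key.
import json
import re

from openai import OpenAI  # assumes the official OpenAI Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_question(stem: str, choices: dict[str, str]) -> str:
    """Send one question in question-and-multiple-choice format in a new chat."""
    options = "\n".join(f"{letter}. {text}" for letter, text in choices.items())
    prompt = f"{stem}\n\n{options}\n\nAnswer with the single best choice (A-D)."
    # Each API call is a brand-new conversation, mirroring
    # "a new chat was used for every question".
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed; the study evaluated ChatGPT, exact model unstated
        messages=[{"role": "user", "content": prompt}],
    )
    reply = response.choices[0].message.content
    match = re.search(r"\b([A-D])\b", reply)  # naive choice extraction (assumption)
    return match.group(1) if match else "?"


def score(questions: list[dict]) -> float:
    """Compare model answers to the answer key and return overall accuracy."""
    correct = sum(ask_question(q["stem"], q["choices"]) == q["key"] for q in questions)
    return correct / len(questions)


if __name__ == "__main__":
    with open("sae_questions.json") as f:  # hypothetical local file of SAE items
        print(f"Accuracy: {score(json.load(f)):.1%}")
```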

Results:

Of the 301 questions asked, ChatGPT correctly answered 183 (60.8%). The subjects with the highest percentage of correct answers were basic science (81%), oncology (72.7%), shoulder and elbow (71.9%), and sports (71.4%). The questions were further subdivided into three groups: management, diagnosis, and knowledge recall. Of 86 management questions, 47 were answered correctly (54.7%); of 45 diagnosis questions, 32 were correct (71.1%); and of 168 knowledge recall questions, 102 were correct (60.7%).
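
The subgroup percentages follow directly from the reported counts; a minimal check, using only the counts stated above (variable names are illustrative):

```python
# Recompute the question-type accuracies from the counts reported in the results.
breakdown = {
    "management": (47, 86),
    "diagnosis": (32, 45),
    "knowledge recall": (102, 168),
}
for category, (correct, total) in breakdown.items():
    print(f"{category}: {correct}/{total} = {correct / total:.1%}")
# management: 47/86 = 54.7%; diagnosis: 32/45 = 71.1%; knowledge recall: 102/168 = 60.7%
```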

Conclusions:

ChatGPT has the potential to provide orthopaedic educators and trainees with accurate clinical conclusions for the majority of board-style questions, although its reasoning should be carefully analyzed for accuracy and clinical validity. As such, its usefulness in a clinical educational context is currently limited but rapidly evolving.

Clinical relevance:

ChatGPT can access a multitude of medical data and may help provide accurate answers to clinical questions.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: J Orthop Year: 2024 Type: Article Affiliation country: United States