ChatGPT-4 Performs Clinical Information Retrieval Tasks Using Consistently More Trustworthy Resources Than Does Google Search for Queries Concerning the Latarjet Procedure.

Oeding, Jacob F; Lu, Amy Z; Mazzucco, Michael; Fu, Michael C; Taylor, Samuel A; Dines, David M; Warren, Russell F; Gulotta, Lawrence V; Dines, Joshua S; Kunze, Kyle N

Affiliation

Oeding JF; School of Medicine, Mayo Clinic Alix School of Medicine, Rochester, Minnesota, U.S.A.
Lu AZ; Weill Cornell College of Medicine, New York, New York, U.S.A.
Mazzucco M; Weill Cornell College of Medicine, New York, New York, U.S.A.
Fu MC; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Taylor SA; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Dines DM; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Warren RF; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Gulotta LV; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Dines JS; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A.
Kunze KN; Sports Medicine and Shoulder Service, Hospital for Special Surgery, New York, New York, U.S.A.; Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, U.S.A. Electronic address: kylekunze7@gmail.com.

Arthroscopy; 2024 Jun 25.

Article in English | MEDLINE | ID: mdl-38936557

ABSTRACT

PURPOSE:

To assess the ability of ChatGPT-4, an automated chatbot powered by artificial intelligence, to answer common patient questions concerning the Latarjet procedure for patients with anterior shoulder instability and to compare this performance with that of Google Search Engine.

METHODS:

Using previously validated methods, a Google search was first performed using the query "Latarjet." Subsequently, the top 10 frequently asked questions (FAQs) and associated sources were extracted. ChatGPT-4 was then prompted to provide the top 10 FAQs and answers concerning the procedure. This process was repeated to identify additional FAQs requiring discrete-numeric answers to allow for a comparison between ChatGPT-4 and Google. Discrete, numeric answers were subsequently assessed for accuracy on the basis of the clinical judgment of 2 fellowship-trained sports medicine surgeons who were blinded to search platform.

RESULTS:

Mean (± standard deviation) accuracy of numeric answers was 2.9 ± 0.9 for ChatGPT-4 versus 2.5 ± 1.4 for Google (P = .65). ChatGPT-4 derived information for these answers only from academic sources, a significantly different source mix from that of Google Search Engine (P = .003), which drew 30% of its sources from academic resources, 50% from websites of individual surgeons, and 20% from larger medical practices. For general FAQs, 40% of FAQs were identical between ChatGPT-4 and Google Search Engine. To answer these questions, ChatGPT-4 again used 100% academic resources, whereas Google Search Engine used 60% academic resources, 20% surgeon personal websites, and 20% medical practices (P = .087).
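
The source-type comparisons above can be checked with a short calculation. The abstract does not name the statistical test or give raw counts, so the sketch below rests on two assumptions: that each platform contributed 10 sources per comparison (consistent with the "top 10" FAQs and percentages in multiples of 10%), and that sources were collapsed to academic versus non-academic and compared with a two-sided Fisher's exact test. The function name and the counts are illustrative, not taken from the paper. Under those assumptions the reported P values are reproduced.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Assumed counts (academic, non-academic), 10 sources per platform.
# Numeric FAQs: ChatGPT-4 10/0, Google 3/7 (30% academic).
p_numeric = fisher_exact_two_sided(10, 0, 3, 7)
# General FAQs: ChatGPT-4 10/0, Google 6/4 (60% academic).
p_general = fisher_exact_two_sided(10, 0, 6, 4)

print(f"numeric FAQs: P = {p_numeric:.3f}")  # prints 0.003
print(f"general FAQs: P = {p_general:.3f}")  # prints 0.087
```

The match with the reported values (P = .003 and P = .087) suggests, but does not confirm, that this is how the comparisons were made.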

CONCLUSIONS:

ChatGPT-4 demonstrated the ability to provide accurate and reliable information about the Latarjet procedure in response to patient queries, using multiple academic sources in all cases. This was in contrast to Google Search Engine, which more frequently used single-surgeon and large medical practice websites. Despite differences in the resources accessed to perform information retrieval tasks, the clinical relevance and accuracy of information provided did not significantly differ between ChatGPT-4 and Google Search Engine.

CLINICAL RELEVANCE:

Commercially available large language models (LLMs), such as ChatGPT-4, can perform diverse information retrieval tasks on-demand. An important medical information retrieval application for LLMs consists of the ability to provide comprehensive, relevant, and accurate information for various use cases such as investigation about a recently diagnosed medical condition or procedure. Understanding the performance and abilities of LLMs for use cases has important implications for deployment within health care settings.

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Arthroscopy Journal subject: ORTHOPEDICS Year: 2024 Document type: Article Affiliation country: United States