Evaluating ChatGPT responses in the context of a 53-year-old male with a femoral neck fracture: a qualitative analysis.
Zhou, Yushy; Moon, Charles; Szatkowski, Jan; Moore, Derek; Stevens, Jarrad.
Affiliation
  • Zhou Y; Department of Surgery, The University of Melbourne, St. Vincent's Hospital Melbourne, 29 Regent Street, Clinical Sciences Block Level 2, Melbourne, VIC, 3010, Australia. yushy.zhou@student.unimelb.edu.au.
  • Moon C; Department of Orthopaedic Surgery, St. Vincent's Hospital, Melbourne, Australia.
  • Szatkowski J; Department of Orthopaedic Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
  • Moore D; Department of Orthopaedic Surgery, Indiana University Health Methodist Hospital, Indianapolis, IN, USA.
  • Stevens J; Santa Barbara Orthopedic Associates, Santa Barbara, CA, USA.
Eur J Orthop Surg Traumatol ; 34(2): 927-955, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37776392
ABSTRACT

PURPOSE:

The integration of artificial intelligence (AI) tools, such as ChatGPT, in clinical medicine and medical education has gained significant attention due to their potential to support decision-making and improve patient care. However, there is a need to evaluate the benefits and limitations of these tools in specific clinical scenarios.

METHODS:

This study used a case study approach within the field of orthopaedic surgery. A clinical case report featuring a 53-year-old male with a femoral neck fracture was used as the basis for evaluation. ChatGPT, a large language model, was asked to respond to clinical questions related to the case. The responses generated by ChatGPT were evaluated qualitatively, considering their relevance, justification, and alignment with the responses of real clinicians. Alternative dialogue protocols were also employed to assess the impact of additional prompts and contextual information on ChatGPT responses.
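The alternative dialogue protocols described above can be pictured as variations on the message list sent to the model. The sketch below is a hypothetical illustration of that setup, not the authors' actual materials: the protocol names, the clinical question, the context string, and the follow-up prompt are all assumptions for demonstration.

```python
# Hypothetical sketch of the dialogue-protocol variations described in the
# study: a baseline question, a protocol adding contextual information, and a
# protocol adding a follow-up prompt. All strings here are illustrative.

def build_messages(question, context=None, follow_up=None):
    """Assemble a ChatGPT-style message list for one dialogue protocol."""
    messages = []
    if context:
        # Protocols with contextual information prepend a system message.
        messages.append({"role": "system", "content": context})
    messages.append({"role": "user", "content": question})
    if follow_up:
        # Protocols with additional prompts append a follow-up user turn.
        messages.append({"role": "user", "content": follow_up})
    return messages

# Example question drawn from the case scenario (wording is an assumption).
question = ("What are the treatment options for a displaced femoral neck "
            "fracture in a 53-year-old male?")

protocols = {
    "baseline": build_messages(question),
    "with_context": build_messages(
        question,
        context="You are advising on an orthopaedic trauma case.",
    ),
    "with_follow_up": build_messages(
        question,
        follow_up="Please justify your recommendation.",
    ),
}
```

Each message list would then be submitted to the model (e.g. via a chat-completion API call), with the study repeating submissions across protocols and on separate days to probe consistency.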

RESULTS:

ChatGPT generally provided clinically appropriate responses to the questions posed in the clinical case report. However, the level of justification and explanation varied across responses, and occasional clinically inappropriate answers and inconsistencies were observed across different dialogue protocols and on separate days.

CONCLUSIONS:

The findings of this study highlight both the potential and limitations of using ChatGPT in clinical practice. While ChatGPT demonstrated the ability to provide relevant clinical information, the lack of consistent justification and occasional clinically inappropriate responses raise concerns about its reliability. These results underscore the importance of careful consideration and validation when using AI tools in healthcare. Further research and clinician training are necessary to effectively integrate AI tools like ChatGPT, ensuring their safe and reliable use in clinical decision-making.
Subject(s)
Keywords

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Orthopedic Procedures / Femoral Neck Fractures Study type: Guideline / Prognostic_studies / Qualitative_research Limits: Humans / Male / Middle aged Language: En Journal: Eur J Orthop Surg Traumatol Year: 2024 Document type: Article Country of affiliation: Australia
