Chat Generative Pretraining Transformer Answers Patient-focused Questions in Cervical Spine Surgery.
Subramanian, Tejas; Araghi, Kasra; Amen, Troy B; Kaidi, Austin; Sosa, Branden; Shahi, Pratyush; Qureshi, Sheeraz; Iyer, Sravisht.
Affiliation
  • Subramanian T; Department of Orthopedic Surgery, Hospital for Special Surgery.
  • Araghi K; Weill Cornell Medicine, New York, NY.
  • Amen TB; Department of Orthopedic Surgery, Hospital for Special Surgery.
  • Kaidi A; Department of Orthopedic Surgery, Hospital for Special Surgery.
  • Sosa B; Department of Orthopedic Surgery, Hospital for Special Surgery.
  • Shahi P; Weill Cornell Medicine, New York, NY.
  • Qureshi S; Department of Orthopedic Surgery, Hospital for Special Surgery.
  • Iyer S; Department of Orthopedic Surgery, Hospital for Special Surgery.
Clin Spine Surg ; 37(6): E278-E281, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38531823
ABSTRACT
STUDY DESIGN:

Review of Chat Generative Pretraining Transformer (ChatGPT) responses to selected patient-focused questions.

OBJECTIVE:

We aimed to examine the quality of ChatGPT responses to cervical spine questions.

BACKGROUND:

The use of artificial intelligence to improve the patient experience across medicine is seeing remarkable growth. One such use is patient education: for the first time on a large scale, patients can ask targeted questions and receive similarly targeted answers. Although patients may use these resources to assist in decision-making, little data exist regarding their accuracy, especially within orthopedic surgery and, more specifically, spine surgery.

METHODS:

We compiled 9 questions frequently asked of cervical spine surgeons in the clinic to test the ability of ChatGPT version 3.5 to answer questions on a nuanced topic. Responses were reviewed by 2 independent reviewers on a Likert scale for accuracy of the information presented (0-5 points), appropriateness in giving a specific answer (0-3 points), and readability for a layperson (0-2 points). Readability was also assessed with the Flesch-Kincaid grade level analysis for the original prompt and for a second prompt asking for rephrasing at a sixth-grade reading level.
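For context, the Flesch-Kincaid grade level mentioned above is a standard readability formula: 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. The abstract does not state which tool the authors used; the sketch below is only a minimal Python illustration of that published formula, and its vowel-group syllable counter and sample sentence are assumptions for demonstration, not the study's actual analysis pipeline.

```python
import re


def count_syllables(word: str) -> int:
    # Crude approximation: count contiguous vowel groups; real tools use
    # dictionary-based syllable counts.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    # Apply the standard Flesch-Kincaid grade level formula to a text passage.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)


if __name__ == "__main__":
    # Hypothetical patient-education sentence, used only to show the calculation.
    sample = ("Anterior cervical discectomy and fusion removes a damaged disc "
              "to relieve pressure on the spinal cord or nerve roots.")
    print(f"Approximate grade level: {flesch_kincaid_grade(sample):.1f}")
```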

RESULTS:

On average, ChatGPT's responses scored 7.1/10. Accuracy averaged 4.1/5, appropriateness 1.8/3, and readability 1.2/2. The responses were written at a 13.5 grade level originally and at an 11.2 grade level after prompting for simpler language.

CONCLUSIONS:

ChatGPT has the capacity to be a powerful means for patients to gain important and specific information about their pathologies and surgical options. Its responses, however, are limited in accuracy, and their readability is not optimal for the average patient. Despite these limitations in answering nuanced questions, the technology is impressive, and surgeons should be aware that patients will likely rely on it increasingly.
Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Cervical Vertebrae Limits: Humans Language: English Journal: Clin Spine Surg Year: 2024 Document type: Article
