An analysis of ChatGPT recommendations for the diagnosis and treatment of cervical radiculopathy.
Hoang, Timothy; Liou, Lathan; Rosenberg, Ashley M; Zaidat, Bashar; Duey, Akiro H; Shrestha, Nancy; Ahmed, Wasil; Tang, Justin; Kim, Jun S; Cho, Samuel K.
Affiliation
  • Hoang T; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Liou L; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Rosenberg AM; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Zaidat B; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Duey AH; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Shrestha N; Chicago Medical School, Rosalind Franklin University, North Chicago, Illinois.
  • Ahmed W; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Tang J; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Kim JS; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
  • Cho SK; Department of Orthopaedics, Icahn School of Medicine at Mount Sinai, New York, New York.
J Neurosurg Spine; 41(3): 385-395, 2024 Sep 01.
Article in En | MEDLINE | ID: mdl-38941643
ABSTRACT

OBJECTIVE:

The objective of this study was to assess the safety and accuracy of ChatGPT recommendations in comparison to the evidence-based guidelines from the North American Spine Society (NASS) for the diagnosis and treatment of cervical radiculopathy.

METHODS:

ChatGPT was prompted with questions from the 2011 NASS clinical guidelines for cervical radiculopathy and evaluated for concordance. Selected key phrases within the NASS guidelines were identified. Completeness was measured as the number of overlapping key phrases between ChatGPT responses and NASS guidelines divided by the total number of key phrases. A senior spine surgeon evaluated the ChatGPT responses for safety and accuracy. ChatGPT responses were further evaluated on their readability, similarity, and consistency. Flesch Reading Ease scores and Flesch-Kincaid reading levels were measured to assess readability. The Jaccard Similarity Index was used to assess agreement between ChatGPT responses and NASS clinical guidelines.
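The completeness and Jaccard measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how phrases were matched or how text was tokenized, so the case-insensitive substring matching, the word-level tokenization, and the function names `completeness` and `jaccard_similarity` are all hypothetical choices. The example strings are invented and are not taken from the NASS guidelines or from any ChatGPT output.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of unique word tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| over the unique word tokens of two texts."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0  # two empty texts: define the similarity as 0
    return len(tokens_a & tokens_b) / len(union)


def completeness(response: str, key_phrases: list[str]) -> float:
    """Share of key phrases found in the response (assumed: case-insensitive substring match)."""
    if not key_phrases:
        return 0.0
    text = response.lower()
    found = sum(1 for phrase in key_phrases if phrase.lower() in text)
    return found / len(key_phrases)


if __name__ == "__main__":
    # Invented placeholder strings, for illustration only.
    guideline = "imaging is suggested when symptoms persist after initial care"
    response = "imaging is often considered when symptoms persist"
    phrases = ["imaging", "symptoms persist", "initial care", "six weeks"]

    print(f"completeness: {completeness(response, phrases):.2f}")
    print(f"jaccard:      {jaccard_similarity(guideline, response):.2f}")
```

Substring matching is the simplest option but can over-count short phrases that occur inside longer words; a manual or token-boundary match would be stricter.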

RESULTS:

A total of 100 key phrases were identified across 14 NASS clinical guidelines. The mean completeness was 46% for ChatGPT-4 and 34% for ChatGPT-3.5, a difference of 12 percentage points in favor of ChatGPT-4. ChatGPT-4 outputs had a mean Flesch Reading Ease score of 15.24, which is very difficult to read and requires a college graduate education to understand. ChatGPT-3.5 outputs had a lower mean Flesch Reading Ease score of 8.73, indicating that they are even more difficult to read and require a professional education level. However, both versions of ChatGPT were more accessible than the NASS guidelines, which had a mean Flesch Reading Ease score of 4.58. With the NASS guidelines as a reference, ChatGPT-3.5 registered a mean ± SD Jaccard Similarity Index of 0.20 ± 0.078, and ChatGPT-4 registered 0.18 ± 0.068. Based on physician evaluation, outputs from both ChatGPT-3.5 and ChatGPT-4 were safe 100% of the time. Thirteen of 14 (92.8%) ChatGPT-3.5 responses and 14 of 14 (100%) ChatGPT-4 responses agreed with current best clinical practices for cervical radiculopathy according to a senior spine surgeon.
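For context on the readability figures, the standard Flesch formulas are reproduced here. The abstract does not state which tool or syllable-counting method the authors used, so this is background on the conventional definitions, not a description of their exact procedure.

```latex
\[
\text{Flesch Reading Ease} = 206.835
  - 1.015\,\frac{\text{total words}}{\text{total sentences}}
  - 84.6\,\frac{\text{total syllables}}{\text{total words}}
\]
\[
\text{Flesch-Kincaid Grade Level} = 0.39\,\frac{\text{total words}}{\text{total sentences}}
  + 11.8\,\frac{\text{total syllables}}{\text{total words}}
  - 15.59
\]
```

A higher Reading Ease score means easier text. On the conventional scale, scores below 30 fall in the "very difficult" band, which is consistent with the interpretation given for all three sources above.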

CONCLUSIONS:

ChatGPT models were able to provide safe and accurate but incomplete responses to NASS clinical guideline questions about cervical radiculopathy. Although the authors' results suggest that improvements are required before ChatGPT can be reliably deployed in a clinical setting, future versions of the LLM hold promise as an updated reference for guidelines on cervical radiculopathy. Future versions must prioritize accessibility and comprehensibility for a diverse audience.

Full text: 1 Database: MEDLINE Main subject: Radiculopathy Limit: Humans Language: En Journal: J Neurosurg Spine Journal subject: NEUROSURGERY Publication year: 2024 Document type: Article