ChatGPT's adherence to otolaryngology clinical practice guidelines.

Tessler, Idit; Wolfovitz, Amit; Alon, Eran E; Gecel, Nir A; Livneh, Nir; Zimlichman, Eyal; Klang, Eyal

Affiliations

Tessler I; Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel. idit.tessler@gmail.com.
Wolfovitz A; School of Medicine, Tel Aviv University, Tel Aviv, Israel.
Alon EE; ARC Innovation Center, Sheba Medical Center, Ramat Gan, Israel.
Gecel NA; Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel.
Livneh N; School of Medicine, Tel Aviv University, Tel Aviv, Israel.
Zimlichman E; Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel.
Klang E; School of Medicine, Tel Aviv University, Tel Aviv, Israel.

Eur Arch Otorhinolaryngol; 281(7): 3829-3834, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38647684

ABSTRACT

OBJECTIVES:

Large language models, including ChatGPT, have the potential to transform the way we approach medical knowledge, yet accuracy in clinical topics is critical. Here we assessed ChatGPT's performance in adhering to the American Academy of Otolaryngology-Head and Neck Surgery guidelines.

METHODS:

We presented ChatGPT with 24 clinical otolaryngology questions based on the guidelines of the American Academy of Otolaryngology. Each question was posed three times (N = 72 responses) to test the model's consistency. Two otolaryngologists evaluated the responses for accuracy and relevance to the guidelines. Cohen's Kappa was used to measure evaluator agreement, and Cronbach's alpha was used to assess the consistency of ChatGPT's responses.
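The two statistics named in the methods can be illustrated with a minimal sketch. The function names, the data layout, and the toy scores below are assumptions made for illustration only; the abstract does not describe the authors' scoring scale or software.

```python
from statistics import variance


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of items on which the raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def cronbach_alpha(rows):
    """Cronbach's alpha; each row is one question, each column one repeated run."""
    k = len(rows[0])
    # Variance of the scores within each run (column), across questions.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the per-question totals across runs.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data, not taken from the study.
toy_rater_a = [1, 1, 2, 2]
toy_rater_b = [1, 1, 2, 1]
toy_runs = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

kappa = cohens_kappa(toy_rater_a, toy_rater_b)
alpha = cronbach_alpha(toy_runs)
```

In this layout, kappa compares the two evaluators on the same responses, while alpha treats the three repeated runs as "items" and the 24 questions as "subjects"; whether the authors arranged their data this way is not stated in the abstract.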

RESULTS:

Results were mixed: 59.7% (43/72) of ChatGPT's responses were highly accurate, while only 2.8% (2/72) directly contradicted the guidelines. The model showed 100% accuracy in Head and Neck, but lower accuracy in Rhinology and Otology/Neurotology (66%), Laryngology (50%), and Pediatrics (8%). The model's responses were consistent for 17 of 24 questions (70.8%), with a Cronbach's alpha of 0.87, indicating reasonable consistency across tests.
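As a quick arithmetic check, the overall percentages can be recomputed from the counts given in the results. This is only a sketch: the dictionary keys and the helper name `pct` are labels chosen here, not terms from the study.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts as reported in the abstract: (numerator, denominator).
reported_counts = {
    "highly accurate responses": (43, 72),
    "responses contradicting guidelines": (2, 72),
    "questions answered consistently": (17, 24),
}

percentages = {label: pct(n, d) for label, (n, d) in reported_counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```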

CONCLUSIONS:

Using a guideline-based set of structured questions, ChatGPT demonstrates consistency but variable accuracy in otolaryngology. Its lower performance in some areas, especially Pediatrics, suggests that further rigorous evaluation is needed before considering real-world clinical use.

Subjects

Guideline Adherence; Otolaryngology; Practice Guidelines as Topic; Otolaryngology/standards; Humans; United States

Keywords

Artificial intelligence; ChatGPT; Clinical practice guidelines; Medical knowledge; Otolaryngology

Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Otolaryngology / Practice Guidelines as Topic / Guideline Adherence
Limits: Humans
Country/Region as subject: North America
Language: English
Journal: Eur Arch Otorhinolaryngol
Journal subject: OTOLARYNGOLOGY
Publication year: 2024
Document type: Article
Affiliation country: Israel