A comparative study of English and Japanese ChatGPT responses to anaesthesia-related medical questions.
Ando, Kazuo; Sato, Masaki; Wakatsuki, Shin; Nagai, Ryotaro; Chino, Kumiko; Kai, Hinata; Sasaki, Tomomi; Kato, Rie; Nguyen, Teresa Phuongtram; Guo, Nan; Sultan, Pervez.
Affiliation
  • Ando K; Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA.
  • Sato M; Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA.
  • Wakatsuki S; Private Practice Group, Pacific Anesthesia Inc., Honolulu, HI, USA.
  • Nagai R; Private Practice Group, Pacific Anesthesia Inc., Honolulu, HI, USA.
  • Chino K; University of Pittsburgh Medical Center, Magee-Women's Hospital, Pittsburgh, PA, USA.
  • Kai H; Department of Anesthesiology, Indiana University School of Medicine, Indianapolis, IN, USA.
  • Sasaki T; Department of Anesthesiology, Showa University School of Medicine, Tokyo, Japan.
  • Kato R; Department of Anesthesiology, Showa University School of Medicine, Tokyo, Japan.
  • Nguyen TP; Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA.
  • Guo N; Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA.
  • Sultan P; Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA.
BJA Open; 10: 100296, 2024 Jun.
Article in En | MEDLINE | ID: mdl-38975242
ABSTRACT

Background:

The expansion of artificial intelligence (AI) through large language models (LLMs) has the potential to streamline healthcare delivery. Despite the increased use of LLMs, disparities in their performance, particularly across languages, remain underexplored. This study examines the quality of ChatGPT responses in English and Japanese to questions related to anaesthesiology.

Methods:

Anaesthesiologists proficient in both languages were recruited as expert evaluators for this study. Ten frequently asked questions in anaesthesia were selected and translated for evaluation. Three non-sequential responses from ChatGPT were assessed by the expert evaluators for content quality (accuracy, comprehensiveness, and safety) and communication quality (understanding, empathy/tone, and ethics).

Results:

Eight anaesthesiologists evaluated English and Japanese LLM responses. The overall quality for all questions combined was higher in English than in Japanese responses. Content and communication quality were significantly higher in English than in Japanese LLM responses (both P<0.001) across all three responses. Comprehensiveness, safety, and understanding scored higher in English LLM responses. For all three responses, more than half of the evaluators rated the English responses as better overall than the Japanese responses.

Conclusions:

English LLM responses to anaesthesia-related frequently asked questions were superior in quality to Japanese responses when assessed by bilingual anaesthesia experts. This study highlights the potential for language-related disparities in healthcare information and the need to improve the quality of AI responses in underrepresented languages. Future studies are needed to explore these disparities in other commonly spoken languages and to compare the performance of different LLMs.
Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: BJA Open Publication year: 2024 Document type: Article Affiliation country: United States Publication country: United Kingdom