Clinical application potential of large language model: a study based on thyroid nodules.
Xia, Shujun; Hua, Qing; Mei, Zihan; Xu, Wenwen; Lai, Limei; Wei, Minyan; Qin, Yu; Luo, Lin; Wang, Changhua; Huo, ShengNan; Fu, Lijun; Zhou, Feidu; Wu, Jiang; Zhang, Li; Lv, De; Li, Jianxin; Wang, Xin; Li, Ning; Song, Yanyan; Zhou, Jianqiao.
Affiliations
  • Xia S; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Hua Q; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Mei Z; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Xu W; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Lai L; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Wei M; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Qin Y; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Luo L; Department of Ultrasound, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Wang C; Department of Endocrinology, Kongjiang Hospital, Yangpu District, Shanghai, China.
  • Huo S; Department of Thyroid and Breast Surgery, Xianning NO.1 People's Hospital, Xianning, China.
  • Fu L; Department of Thyroid, Handan Hangang Hospital, Hanshan District, Handan City, Hebei, China.
  • Zhou F; Department of Thyroid Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
  • Wu J; Thyroid and Breast Surgery, LiuYang People's Hospital, Changsha, China.
  • Zhang L; Department of Thyroid, Breast and Vascular Surgery, Xijing Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China.
  • Lv D; Department of Head and Neck Surgery, Shanxi Province Cancer Hospital, Taiyuan, China.
  • Li J; Department of Endocrinology, Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, China.
  • Wang X; Department of Surgery, Mazhanghuiwen Hospital, Ma Zhang District, Zhanjiang, Guangdong, China.
  • Li N; Endocrine Department, Lianshui People's Hospital, Huaian, Jiangsu, China.
  • Song Y; Department of Ultrasound, Anning First People's Hospital, Affiliated to Kunming University of Science and Technology, Anning, Yunnan Province, China.
  • Zhou J; Department of Biostatistics, Institute of Medical Sciences, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
Endocrine; 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080210
ABSTRACT

BACKGROUND:

Limited data are available on the performance of large language models (LLMs) taking on the role of doctors. We aimed to investigate the potential of ChatGPT-3.5 and New Bing Chat to act as doctors, using thyroid nodules as an example.

METHODS:

A total of 145 patients with thyroid nodules were included for generating questions. Each question was entered into ChatGPT-3.5 and New Bing Chat five times, yielding five responses from each chatbot. These responses were compared with answers given by five junior doctors. Responses from five senior doctors were regarded as the gold standard. The accuracy and reproducibility of the responses from ChatGPT-3.5 and New Bing Chat were evaluated.

RESULTS:

The accuracy of ChatGPT-3.5 and New Bing Chat in answering Q2, Q3, and Q5 was lower than that of junior doctors (all P < 0.05), while both LLMs were comparable to junior doctors in answering Q4 and Q6. In terms of "high reproducibility and accuracy", ChatGPT-3.5 outperformed New Bing Chat in Q1 and Q5 (P < 0.001 and P = 0.008, respectively), with no significant difference between the two in Q2, Q3, Q4, and Q6 (P > 0.05 for all). In decision making for thyroid nodules, New Bing Chat was more accurate than ChatGPT-3.5 (72.41% vs 58.62%, P = 0.003), and both were less accurate than junior doctors (89.66%, P < 0.001 for both).
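
As a reading aid, the decision-making accuracies can be converted back into case counts. The sketch below assumes that each percentage is a share of the 145 included patients; the abstract does not state the denominator, so this is an assumption, and the names `N_PATIENTS`, `reported_accuracy`, and `implied_correct` are illustrative only. No significance test is reproduced, because the abstract does not name the test used.

```python
# Assumption (not stated in the abstract): every decision-making
# accuracy is a proportion of the 145 included patients.
N_PATIENTS = 145

# Percentages as reported in the RESULTS paragraph.
reported_accuracy = {
    "ChatGPT-3.5": 58.62,
    "New Bing Chat": 72.41,
    "junior doctors": 89.66,
}


def implied_correct(percent: float, n: int = N_PATIENTS) -> int:
    """Return the whole-number count closest to `percent` % of `n`."""
    return round(percent / 100 * n)


implied_counts = {
    name: implied_correct(pct) for name, pct in reported_accuracy.items()
}

for name, count in implied_counts.items():
    # Recompute the percentage from the implied count as a consistency check.
    recomputed = round(100 * count / N_PATIENTS, 2)
    print(f"{name}: {count}/{N_PATIENTS} = {recomputed}%")
```

Under that assumption the reported figures correspond to 85, 105, and 130 of 145 cases, and each recomputed percentage matches the reported value to two decimal places.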

CONCLUSIONS:

The exploration of ChatGPT-3.5 and New Bing Chat in the diagnosis and management of thyroid nodules illustrates that LLMs currently demonstrate the potential for medical applications, but do not yet reach the clinical decision-making capacity of doctors.

Full text: 1 Database: MEDLINE Language: En Publication year: 2024 Document type: Article