Assessing the role of GPT-4 in thyroid ultrasound diagnosis and treatment recommendations: enhancing interpretability with a chain of thought approach.

Wang, Zhixiang; Zhang, Zhen; Traverso, Alberto; Dekker, Andre; Qian, Linxue; Sun, Pengfei

Affiliation

Wang Z; Department of Ultrasound, Beijing Friendship Hospital, Capital Medical University, Beijing, China.
Zhang Z; Department of Radiation Oncology (Maastro), GROW-School for Oncology, Maastricht University Medical Centre+, Maastricht, The Netherlands.
Traverso A; Department of Radiation Oncology (Maastro), GROW-School for Oncology, Maastricht University Medical Centre+, Maastricht, The Netherlands.
Dekker A; Department of Radiation Oncology (Maastro), GROW-School for Oncology, Maastricht University Medical Centre+, Maastricht, The Netherlands.
Qian L; Department of Radiation Oncology (Maastro), GROW-School for Oncology, Maastricht University Medical Centre+, Maastricht, The Netherlands.
Sun P; Department of Ultrasound, Beijing Friendship Hospital, Capital Medical University, Beijing, China.

Quant Imaging Med Surg ; 14(2): 1602-1615, 2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38415150

ABSTRACT

Background:

As artificial intelligence (AI) becomes increasingly prevalent in the medical field, the effectiveness of AI-generated medical reports in disease diagnosis remains to be evaluated. ChatGPT is a large language model developed by OpenAI with a notable capacity for text abstraction and comprehension. This study aimed to explore the capabilities, limitations, and potential of Generative Pre-trained Transformer (GPT)-4 in analyzing thyroid cancer ultrasound reports, providing diagnoses, and recommending treatment plans.

Methods:

Using 109 diverse thyroid cancer cases, we evaluated GPT-4's performance by comparing its generated reports to those from doctors with various levels of experience. We also conducted a Turing Test and a consistency analysis. To enhance the interpretability of the model, we applied the Chain of Thought (CoT) method to deconstruct the decision-making chain of the GPT model.

Results:

GPT-4 demonstrated proficiency in report structuring, professional terminology, and clarity of expression, but showed limitations in diagnostic accuracy. In addition, our consistency analysis highlighted certain discrepancies in the AI's performance. The CoT method effectively enhanced the interpretability of the AI's decision-making process.

Conclusions:

GPT-4 exhibits potential as a supplementary tool in healthcare, especially for generating thyroid gland diagnostic reports. Our proposed online platform, "ThyroAIGuide", alongside the CoT method, underscores the potential of AI to augment diagnostic processes, elevate healthcare accessibility, and advance patient education. However, the journey towards fully integrating AI into healthcare is ongoing, requiring continuous research, development, and careful monitoring by medical professionals to ensure patient safety and quality of care.

Keywords

ChatGPT; artificial intelligence (AI); diagnosis; thyroid cancer; ultrasound

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: English Journal: Quant Imaging Med Surg Year: 2024 Document type: Article Affiliation country: China
