Performance of AI chatbots on controversial topics in oral medicine, pathology, and radiology.

Mohammad-Rahimi, Hossein; Khoury, Zaid H; Alamdari, Mina Iranparvar; Rokhshad, Rata; Motie, Parisa; Parsa, Azin; Tavares, Tiffany; Sciubba, James J; Price, Jeffery B; Sultan, Ahmed S

Affiliation

Mohammad-Rahimi H; Division of Artificial Intelligence Research, University of Maryland School of Dentistry, Baltimore, MD, USA; Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany.
Khoury ZH; Department of Oral Diagnostic Sciences and Research, Meharry Medical College School of Dentistry, Nashville, TN, USA.
Alamdari MI; Department of Oral and Maxillofacial Radiology, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Rokhshad R; Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany.
Motie P; Medical Image and Signal Processing Research Center, Medical University of Isfahan, Isfahan, Iran.
Parsa A; Department of Oncology and Diagnostic Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA.
Tavares T; Department of Comprehensive Dentistry, UT Health San Antonio School of Dentistry, San Antonio, TX, USA.
Sciubba JJ; Department of Otolaryngology, Head & Neck Surgery, The Johns Hopkins University, Baltimore, MD, USA.
Price JB; Division of Artificial Intelligence Research, University of Maryland School of Dentistry, Baltimore, MD, USA; Department of Oncology and Diagnostic Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA.
Sultan AS; Division of Artificial Intelligence Research, University of Maryland School of Dentistry, Baltimore, MD, USA; Department of Oncology and Diagnostic Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA; University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Cen

Oral Surg Oral Med Oral Pathol Oral Radiol ; 137(5): 508-514, 2024 05.

Article in English | MEDLINE | ID: mdl-38553304

ABSTRACT

OBJECTIVES:

In this study, we assessed the responses of 6 artificial intelligence (AI) chatbots (Bing, GPT-3.5, GPT-4, Google Bard, Claude, Sage) to controversial and difficult questions in oral pathology, oral medicine, and oral radiology.

STUDY DESIGN:

The chatbots' answers were evaluated by board-certified specialists using a modified version of the global quality score on a 5-point Likert scale. The quality and validity of chatbot citations were evaluated.

RESULTS:

Claude had the highest mean score for oral pathology and medicine (4.341 ± 0.582), and Bing had the lowest (3.447 ± 0.566). In oral radiology, GPT-4 had the highest mean score (3.621 ± 1.009) and Bing the lowest (2.379 ± 0.978). Across all disciplines, GPT-4 achieved the highest mean score (4.066 ± 0.825). Of the 349 citations generated by the chatbots, 82 (23.50%) were fake.

CONCLUSIONS:

GPT-4 was the best-performing chatbot in providing high-quality information on controversial topics in various dental disciplines. Although the majority of chatbots performed well, the relatively high number of fabricated citations suggests that developers of AI medical chatbots should incorporate scientific citation authenticators to validate the citations their chatbots output.

Subjects

Artificial Intelligence; Oral Medicine; Humans; Radiology; Oral Pathology

Full text: 1; Collections: 01-international; Database: MEDLINE; Main subject: Artificial Intelligence / Oral Medicine; Limit: Humans; Language: English; Journal: Oral Surg Oral Med Oral Pathol Oral Radiol; Publication year: 2024; Document type: Article; Affiliation country: Germany