Evaluating GPT-4V's performance in the Japanese national dental examination: A challenge explored.

Morishita, Masaki; Fukuda, Hikaru; Muraoka, Kosuke; Nakamura, Taiji; Hayashi, Masanari; Yoshioka, Izumi; Ono, Kentaro; Awano, Shuji

Affiliations

Morishita M; Division of Clinical Education Development and Research, Department of Oral Function, Kyushu Dental University, Kitakyushu, Japan.
Fukuda H; Health Information Management Office, Kyushu Dental University Hospital, Kitakyushu, Japan.
Muraoka K; Division of Maxillofacial Surgery, Department of Physical Function, Kyushu Dental University, Kitakyushu, Japan.
Nakamura T; Division of Clinical Education Development and Research, Department of Oral Function, Kyushu Dental University, Kitakyushu, Japan.
Hayashi M; Division of Periodontology, Department of Oral Function, Kyushu Dental University, Kitakyushu, Japan.
Yoshioka I; Administration Department, Kyushu Dental University Hospital, Kitakyushu, Japan.
Ono K; Division of Oral Medicine, Department of Physical Function, Kitakyushu, Japan.
Awano S; Division of Physiology, Department of Health Promotion, Kyushu Dental University, Kitakyushu, Japan.

J Dent Sci; 19(3): 1595-1600, 2024 Jul.

Article in English | MEDLINE | ID: mdl-39035269

ABSTRACT

Background/purpose:

Rapid advancements in AI technology have led to significant interest in its application across various fields, including medicine and dentistry. This study aimed to assess the capabilities of ChatGPT-4V with image recognition in answering image-based questions from the Japanese National Dental Examination (JNDE) to explore its potential as an educational support tool for dental students.

Materials and methods:

The dataset used questions from the JNDE, which was conducted in January 2023, with a focus on image-related queries. ChatGPT-4V was utilized, and standardized prompts, question texts, and images were input. Data and statistical analyses were conducted using Qlik Sense® and GraphPad Prism.

Results:

The overall correct response rate of ChatGPT-4V for image-based JNDE questions was 35.0 %. The correct response rates were 57.1 % for compulsory questions, 43.6 % for general questions, and 28.6 % for clinical practical questions. In specialties like Dental Anesthesiology and Endodontics, ChatGPT-4V achieved correct response rates above 70 %, while response rates for Orthodontics and Oral Surgery were lower. A higher number of images in questions was correlated with lower accuracy, suggesting an impact of the number of images on correct and incorrect responses.
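
The correct response rates reported above are simple proportions of correctly answered questions within a group. As a minimal sketch of that arithmetic, the Python below groups question records by category and by number of images. The record layout, the field names (`category`, `n_images`, `correct`), and the helper `correct_rate_by` are assumptions made for illustration; the toy records are hypothetical and are not the study's data, and the authors' actual analysis was done in Qlik Sense and GraphPad Prism.

```python
from collections import defaultdict


def correct_rate_by(records, key):
    """Return {group: percent correct} for records grouped by `key`."""
    totals = defaultdict(int)  # questions seen per group
    hits = defaultdict(int)    # correctly answered questions per group
    for rec in records:
        group = rec[key]
        totals[group] += 1
        if rec["correct"]:
            hits[group] += 1
    return {g: round(100.0 * hits[g] / totals[g], 1) for g in totals}


# Hypothetical toy records for illustration only, NOT the study's data.
records = [
    {"category": "compulsory", "n_images": 1, "correct": True},
    {"category": "compulsory", "n_images": 1, "correct": False},
    {"category": "general", "n_images": 1, "correct": True},
    {"category": "clinical practical", "n_images": 2, "correct": False},
    {"category": "clinical practical", "n_images": 3, "correct": False},
    {"category": "clinical practical", "n_images": 2, "correct": True},
]

by_category = correct_rate_by(records, "category")
by_images = correct_rate_by(records, "n_images")
print(by_category)
print(by_images)
```

Grouping by `n_images` is one simple way to inspect whether accuracy falls as the number of images per question rises; it does not replace a formal statistical test.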

Conclusion:

While innovative, ChatGPT-4V's image recognition feature exhibited limitations, especially in handling image-intensive and complex clinical practical questions, and is not yet fully suitable as an educational support tool for dental students at its current stage. Further technological refinement and re-evaluation with a broader dataset are recommended.

Keywords

ChatGPT-4V; Image recognition; Medical image analysis; National dental examination

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Publication year: 2024 Document type: Article