Comparative evaluation of a language model and human specialists in the application of European guidelines for the management of inflammatory bowel diseases and malignancies.

Ghersin, Itai; Weisshof, Roni; Koifman, Eduard; Bar-Yoseph, Haggai; Ben Hur, Dana; Maza, Itay; Hasnis, Erez; Nasser, Roni; Ovadia, Baruch; Dror Zur, Dikla; Waterman, Matti; Gorelik, Yuri

Affiliations

Ghersin I; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Weisshof R; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Koifman E; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Bar-Yoseph H; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Ben Hur D; Rappaport Faculty of Medicine, Technion, Israel Institute of Technology, Haifa, Israel.
Maza I; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Hasnis E; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Nasser R; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Ovadia B; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.
Dror Zur D; Department of Gastroenterology and Hepatology, Hillel Yaffe Medical Center, Hadera, Israel.
Waterman M; Department of Gastroenterology, Galilee Medical Center, Nahariya, Israel.
Gorelik Y; Department of Gastroenterology, Rambam Health Care Campus, Haifa, Israel.

Endoscopy; 2024 Apr 18.

Article in English | MEDLINE | ID: mdl-38499197

ABSTRACT

BACKGROUND:

Society guidelines on colorectal dysplasia screening, surveillance, and endoscopic management in inflammatory bowel disease (IBD) are complex, and physician adherence to them is suboptimal. We aimed to evaluate the use of ChatGPT, a large language model, in generating accurate guideline-based recommendations for colorectal dysplasia screening, surveillance, and endoscopic management in IBD in line with European Crohn's and Colitis Organization (ECCO) guidelines.

METHODS:

Thirty free-text clinical scenarios were prepared and presented to three separate sessions of ChatGPT and to eight gastroenterologists (four IBD specialists and four non-IBD gastroenterologists). Two additional IBD specialists then assessed all responses from ChatGPT and the eight gastroenterologists, judging their accuracy against ECCO guidelines.

RESULTS:

ChatGPT had a mean correct response rate of 87.8%. Among the eight gastroenterologists, the mean correct response rates were 85.8% for IBD experts and 89.2% for non-IBD experts. No statistically significant differences in accuracy were observed between ChatGPT and all gastroenterologists (P=0.95), or between ChatGPT and the IBD experts and non-IBD expert gastroenterologists, respectively (P=0.82).

CONCLUSIONS:

This study highlights the potential of language models in enhancing guideline adherence regarding colorectal dysplasia in IBD. Further investigation of additional resources and prospective evaluation in real-world settings are warranted.

Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: En | Journal: Endoscopy | Publication year: 2024 | Document type: Article | Country of affiliation: Israel