Clinical artificial intelligence: teaching a large language model to generate recommendations that align with guidelines for the surgical management of GERD.
Huo, Bright; Marfo, Nana; Sylla, Patricia; Calabrese, Elisa; Kumar, Sunjay; Slater, Bethany J; Walsh, Danielle S; Vosburg, Wesley.
Affiliation
  • Huo B; Division of General Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada.
  • Marfo N; Ross University School of Medicine, Miramar, FL, USA.
  • Sylla P; Division of Colon and Rectal Surgery, Department of Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  • Calabrese E; University of Adelaide, Adelaide, SA, Australia.
  • Kumar S; Department of General Surgery, Thomas Jefferson University Hospital, Philadelphia, PA, USA.
  • Slater BJ; Department of Surgery, University of Chicago, Chicago, IL, USA.
  • Walsh DS; Department of Surgery, University of Kentucky, Lexington, KY, USA.
  • Vosburg W; Department of Surgery, Mount Auburn Hospital, Harvard Medical School, Cambridge, MA, USA. wesvosburg@gmail.com.
Surg Endosc; 38(10): 5668-5677, 2024 Oct.
Article in En | MEDLINE | ID: mdl-39134725
ABSTRACT

BACKGROUND:

Large Language Models (LLMs) provide clinical guidance with inconsistent accuracy due to limitations with their training dataset. LLMs are "teachable" through customization. We compared the ability of the generic ChatGPT-4 model and a customized version of ChatGPT-4 to provide recommendations for the surgical management of gastroesophageal reflux disease (GERD) to both surgeons and patients.

METHODS:

Sixty patient cases were developed using eligibility criteria from the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) & United European Gastroenterology (UEG)-European Association of Endoscopic Surgery (EAES) guidelines for the surgical management of GERD. Standardized prompts were engineered for physicians as the end-user, with separate layperson prompts for patients. A customized GPT, called the GERD Tool for Surgery (GTS), was developed to generate recommendations based on the guidelines. Both the GTS and generic ChatGPT-4 were queried on July 21st, 2024. Model performance was evaluated by comparing responses to SAGES & UEG-EAES guideline recommendations. Outcome data were presented using descriptive statistics including counts and percentages.

RESULTS:

The GTS provided accurate recommendations for the surgical management of GERD for 60/60 (100.0%) surgeon inquiries and 40/40 (100.0%) patient inquiries based on guideline recommendations. The generic ChatGPT-4 model generated accurate guidance for 40/60 (66.7%) surgeon inquiries and 19/40 (47.5%) patient inquiries. The GTS produced recommendations based on the 2021 SAGES & UEG-EAES guidelines on the surgical management of GERD, while the generic ChatGPT-4 model generated guidance without citing evidence to support its recommendations.
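
The reported percentages follow directly from the counts. A minimal Python sketch of that arithmetic is given here for illustration only; the dictionary layout and the helper name `accuracy_percent` are assumptions introduced for this example and are not part of the study's methods.

```python
# Counts as reported in the abstract: (accurate responses, total inquiries).
# The data structure and function name are illustrative assumptions.
reported = {
    ("GTS", "surgeon"): (60, 60),
    ("GTS", "patient"): (40, 40),
    ("generic ChatGPT-4", "surgeon"): (40, 60),
    ("generic ChatGPT-4", "patient"): (19, 40),
}


def accuracy_percent(correct, total):
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Recompute each percentage from its count.
percentages = {key: accuracy_percent(*counts) for key, counts in reported.items()}

for (model, audience), pct in percentages.items():
    correct, total = reported[(model, audience)]
    print(f"{model:18s} {audience:8s} {correct}/{total} ({pct:.1f}%)")
```

Recomputing the figures this way is a quick consistency check between the counts and the percentages quoted in the text.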

CONCLUSION:

ChatGPT-4 can be customized to overcome limitations with its training dataset to provide recommendations for the surgical management of GERD with reliable accuracy and consistency. The training of LLMs can be used to help integrate this efficient technology into the creation of robust and accurate information for both surgeons and patients. Prospective data are needed to assess its effectiveness in a pragmatic clinical environment.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Artificial Intelligence / Gastroesophageal Reflux / Practice Guidelines as Topic Limits: Female / Humans / Male Language: En Journal: Surg Endosc Publication year: 2024 Document type: Article