Accuracy, readability, and understandability of large language models for prostate cancer information to the public.
Hershenhouse, Jacob S; Mokhtar, Daniel; Eppler, Michael B; Rodler, Severin; Storino Ramacciotti, Lorenzo; Ganjavi, Conner; Hom, Brian; Davis, Ryan J; Tran, John; Russo, Giorgio Ivan; Cocci, Andrea; Abreu, Andre; Gill, Inderbir; Desai, Mihir; Cacciamani, Giovanni E.
Affiliation
  • Hershenhouse JS; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Mokhtar D; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Eppler MB; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Rodler S; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Storino Ramacciotti L; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Ganjavi C; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Hom B; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Davis RJ; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Tran J; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Russo GI; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Cocci A; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Abreu A; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Gill I; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
  • Desai M; Artificial Intelligence Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
  • Cacciamani GE; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
Article in English | MEDLINE | ID: mdl-38744934
ABSTRACT

BACKGROUND:

Generative Pre-trained Transformer (GPT) chatbots have gained popularity since the public release of ChatGPT. Studies have evaluated the ability of different GPT models to provide information about medical conditions. To date, no study has assessed the quality of ChatGPT outputs for prostate cancer-related questions from both the physician and the public perspective while optimizing those outputs for patient consumption.

METHODS:

Nine prostate cancer-related questions, identified through Google Trends (Global), were categorized into diagnosis, treatment, and postoperative follow-up. These questions were submitted to ChatGPT 3.5, and the responses were recorded. The responses were then fed back into ChatGPT to create simplified summaries understandable at a sixth-grade level. Readability of both the original ChatGPT responses and the layperson summaries was evaluated using validated readability tools. Urology providers (urologists and urologists in training) were surveyed to rate the original ChatGPT responses for accuracy, completeness, and clarity on a 5-point Likert scale. In addition, two independent reviewers evaluated the layperson summaries on a correctness trifecta: accuracy, completeness, and decision-making sufficiency. Public assessment of the simplified summaries' clarity and understandability was carried out through Amazon Mechanical Turk (MTurk). Participants rated the clarity of each summary and demonstrated their understanding by answering a multiple-choice question.

RESULTS:

GPT-generated output was deemed correct by 71.7% to 94.3% of raters (36 urologists, 17 urology residents) across 9 scenarios. GPT-generated simplified layperson summaries of this output were rated as accurate in 8 of 9 (88.9%) scenarios and sufficient for a patient to make a decision in 8 of 9 (88.9%) scenarios. Mean readability of the layperson summaries was higher than that of the original GPT outputs ([original ChatGPT v. simplified ChatGPT, mean (SD), p-value] Flesch Reading Ease 36.5 (9.1) v. 70.2 (11.2), p < 0.0001; Gunning Fog 15.8 (1.7) v. 9.5 (2.0), p < 0.0001; Flesch Grade Level 12.8 (1.2) v. 7.4 (1.7), p < 0.0001; Coleman-Liau 13.7 (2.1) v. 8.6 (2.4), p = 0.0002; SMOG Index 11.8 (1.2) v. 6.7 (1.8), p < 0.0001; Automated Readability Index 13.1 (1.4) v. 7.5 (2.1), p < 0.0001). MTurk workers (n = 514) rated the layperson summaries as correct (89.5-95.7%) and correctly understood the content (63.0-87.4%).
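
For readers unfamiliar with the readability indices reported above, the following is a minimal Python sketch of two of them, Flesch Reading Ease and the Flesch-Kincaid grade level, using their standard published formulas. It is not the tooling used in the study, which the abstract does not name. The function names and the syllable counter are assumptions introduced here for illustration; the syllable counter is a rough vowel-group heuristic, so scores will differ somewhat from those of validated tools.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, not exact)."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' (but keep endings such as "-le").
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for non-empty text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores indicate easier text."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. school grade level needed to read the text."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the study.
    original = "Radical prostatectomy necessitates comprehensive postoperative surveillance."
    simplified = "After surgery, your doctor will check on you often."
    for label, text in (("original", original), ("simplified", simplified)):
        print(label, round(flesch_reading_ease(text), 1), round(flesch_kincaid_grade(text), 1))
```

Both formulas depend only on average sentence length and average syllables per word, which is why shorter sentences and simpler vocabulary raise the Reading Ease score and lower the grade level, the direction of change reported for the simplified summaries.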

CONCLUSION:

GPT shows promise for providing correct patient education on prostate cancer-related content, but the technology is not designed for delivering information to patients. Prompting the model to respond with accuracy, completeness, clarity, and readability may enhance its utility when used in GPT-powered medical chatbots.

Full text: 1 Database: MEDLINE Language: En Publication year: 2024 Document type: Article
