Evaluation of ChatGPT for Pelvic Floor Surgery Counseling.
Johnson, Colin M; Bradley, Catherine S; Kenne, Kimberly A; Rabice, Sarah; Takacs, Elizabeth; Vollstedt, Annah; Kowalski, Joseph T.
Affiliation
  • Johnson CM; From the Division of Urogynecology and Reconstructive Pelvic Surgery, Department of Obstetrics and Gynecology.
  • Bradley CS; From the Division of Urogynecology and Reconstructive Pelvic Surgery, Department of Obstetrics and Gynecology.
  • Kenne KA; From the Division of Urogynecology and Reconstructive Pelvic Surgery, Department of Obstetrics and Gynecology.
  • Rabice S; From the Division of Urogynecology and Reconstructive Pelvic Surgery, Department of Obstetrics and Gynecology.
  • Takacs E; Department of Urology, University of Iowa Hospitals and Clinics, Iowa City, IA.
  • Vollstedt A; Department of Urology, University of Iowa Hospitals and Clinics, Iowa City, IA.
  • Kowalski JT; From the Division of Urogynecology and Reconstructive Pelvic Surgery, Department of Obstetrics and Gynecology.
Urogynecology (Phila); 30(3): 245-250, 2024 Mar 1.
Article in English | MEDLINE | ID: mdl-38484238
ABSTRACT
IMPORTANCE:

Large language models are artificial intelligence applications that can comprehend and produce human-like text and language. ChatGPT is one such model. Recent advances have increased interest in the utility of large language models in medicine. Urogynecology counseling is complex and time-consuming. Therefore, we evaluated ChatGPT as a potential adjunct for patient counseling.

OBJECTIVE:

Our primary objective was to compare the accuracy and completeness of ChatGPT responses to information in standard patient counseling leaflets regarding common urogynecological procedures.

STUDY DESIGN:

Seven urogynecologists compared the accuracy and completeness of ChatGPT responses to standard patient leaflets using 5-point Likert scales, with a score of 3 being "equally accurate" or "equally complete" and a score of 5 being "much more accurate" or "much more complete," respectively. This was repeated 3 months later to evaluate the consistency of ChatGPT. Two authors additionally assessed understandability and actionability using the Patient Education Materials Assessment Tool. Analysis was primarily descriptive. First and second ChatGPT queries were compared with the Wilcoxon signed rank test, as sketched below.
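For readers interested in reproducing this type of analysis, the following is a minimal sketch (in Python with NumPy and SciPy; not the authors' code) of how paired Likert ratings from the two queries could be summarized as median (interquartile range) and compared with the Wilcoxon signed rank test. All rating values are hypothetical.

```python
# Minimal sketch (not the authors' code) of the paired comparison
# described above; all rating values below are hypothetical.
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical 5-point Likert accuracy ratings from seven reviewers
# (3 = "equally accurate") for the first and second ChatGPT queries.
ratings_q1 = np.array([3, 2, 3, 2, 3, 3, 2])
ratings_q2 = np.array([3, 3, 3, 3, 4, 3, 3])

# Descriptive summary: median (interquartile range), matching the
# abstract's reporting style.
for label, r in [("Query 1", ratings_q1), ("Query 2", ratings_q2)]:
    q25, med, q75 = np.percentile(r, [25, 50, 75])
    print(f"{label}: median {med:g} (IQR {q25:g}-{q75:g})")

# Wilcoxon signed rank test on the paired ratings; SciPy's default
# zero_method="wilcox" discards pairs with zero difference.
stat, p = wilcoxon(ratings_q1, ratings_q2)
print(f"Wilcoxon signed rank: statistic = {stat:g}, p = {p:.3f}")
```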

RESULTS:

The median (interquartile range) accuracy was 3 (2-3) and completeness 3 (2-4) for the first ChatGPT query, and 3 (3-3) and 4 (3-4), respectively, for the second query. Accuracy and completeness were significantly higher in the second query (P < 0.01). Understandability and actionability of ChatGPT responses were lower than those of the standard leaflets.
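For context, the Patient Education Materials Assessment Tool reports understandability and actionability as the percentage of applicable items rated "agree." A minimal sketch of that scoring scheme, with hypothetical ratings rather than the study's data:

```python
# Minimal sketch of PEMAT-style scoring: the score is the percentage
# of applicable items rated "agree" (1) vs "disagree" (0); items
# marked not applicable (None) are excluded. Ratings are hypothetical.
def pemat_score(item_ratings):
    applicable = [r for r in item_ratings if r is not None]
    return 100.0 * sum(applicable) / len(applicable)

# Hypothetical understandability ratings for one ChatGPT response.
understandability = [1, 1, 0, 1, 1, 0, 1, 1, None, 1, 0, 1, 1]
print(f"Understandability: {pemat_score(understandability):.0f}%")  # 75%
```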

CONCLUSIONS:

ChatGPT is similarly accurate and complete when compared with standard patient information leaflets for common urogynecological procedures. Large language models may be a helpful adjunct to direct patient-provider counseling. Further research to determine the efficacy and patient satisfaction of ChatGPT for patient counseling is needed.
Subjects

Full text: 1 Database: MEDLINE Main subject: Artificial Intelligence / Medicine Language: English Year of publication: 2024 Document type: Article