ChatGPT's Ability to Assess Quality and Readability of Online Medical Information: Evidence From a Cross-Sectional Study.
Golan, Roei; Ripps, Sarah J; Reddy, Raghuram; Loloi, Justin; Bernstein, Ari P; Connelly, Zachary M; Golan, Noa S; Ramasamy, Ranjith.
Affiliation
  • Golan R; Department of Clinical Sciences, Florida State University College of Medicine, Tallahassee, USA.
  • Ripps SJ; Department of Clinical Sciences, Florida State University College of Medicine, Tallahassee, USA.
  • Reddy R; Herbert Wertheim College of Medicine, Florida International University, Miami, USA.
  • Loloi J; Department of Urology, Montefiore Medical Center, Bronx, USA.
  • Bernstein AP; Department of Urology, New York University Langone Health, New York, USA.
  • Connelly ZM; Department of Surgery, Louisiana State University Health Shreveport, Shreveport, USA.
  • Golan NS; Department of Psychology, University of Florida, Gainesville, USA.
  • Ramasamy R; Department of Urology, Desai Sethi Urology Institute, Miami, USA.
Cureus; 15(7): e42214, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37484787
ABSTRACT
Introduction: Artificial intelligence (AI) platforms have gained widespread attention for their distinct ability to generate automated responses to various prompts. However, their role in assessing the quality and readability of a provided text remains unclear. The purpose of this study was therefore to evaluate the proficiency of the conversational generative pre-trained transformer (ChatGPT) in applying the DISCERN tool to evaluate the quality of online content regarding shock wave therapy for erectile dysfunction.
Methods: Websites were identified through a Google search of "shock wave therapy for erectile dysfunction" with location filters disabled. Readability was analyzed using Readable software (Readable.com, Horsham, United Kingdom). Quality was assessed independently by three reviewers using the DISCERN tool. The same plain-text files were then input into ChatGPT to determine whether it produced comparable metrics for readability and quality.
Results: The results revealed a notable disparity between ChatGPT's readability assessment and that obtained from an established tool, Readable.com (p<0.05), indicating a lack of alignment between ChatGPT's output and that of established readability software. Similarly, the DISCERN score generated by ChatGPT differed significantly from the scores generated manually by human evaluators (p<0.05), suggesting that ChatGPT may not be capable of accurately identifying poor-quality information sources regarding shock wave therapy as a treatment for erectile dysfunction.
Conclusion: ChatGPT's evaluation of the quality and readability of online text regarding shock wave therapy for erectile dysfunction differs from that of human raters and trusted tools. ChatGPT's current capabilities were therefore not sufficient for reliably assessing the quality and readability of textual content. Further research is needed to elucidate the role of AI in the objective evaluation of online medical content in other fields. Continued development of AI and the incorporation of tools such as DISCERN into AI software may improve the way patients navigate the web in search of high-quality medical content in the future.
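The abstract does not state which readability metrics Readable.com reported for this study, but a common formula in the online-health-information literature is the Flesch Reading Ease score, computed from average sentence length and average syllables per word. As a minimal illustrative sketch (the syllable counter is a rough vowel-group heuristic, not the proprietary algorithm used by Readable.com or any tool named in the study):

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per vowel group, minimum one."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent 'e' as non-syllabic when possible.
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text.

    FRE = 206.835 - 1.015 * (words / sentences)
                  - 84.6  * (syllables / words)
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

For example, a short sentence of monosyllables such as "The cat sat on the mat." scores above 100 (very easy), while dense clinical prose full of polysyllabic terms scores far lower, which is the kind of gap readability tools are designed to surface in patient-facing medical text.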
Full text: 1 Collection: 01-international Database: MEDLINE Study type: Observational_studies / Prevalence_studies / Prognostic_studies Language: En Journal: Cureus Year: 2023 Document type: Article Country of affiliation: United States