Differentiating ChatGPT-Generated and Human-Written Medical Texts: Quantitative Study.

Liao, Wenxiong; Liu, Zhengliang; Dai, Haixing; Xu, Shaochen; Wu, Zihao; Zhang, Yiyang; Huang, Xiaoke; Zhu, Dajiang; Cai, Hongmin; Li, Quanzheng; Liu, Tianming; Li, Xiang

Affiliations

Liao W; School of Computer Science and Engineering, South China University of Technology, Guangzhou, China.
Liu Z; School of Computing, University of Georgia, Athens, GA, United States.
Dai H; School of Computing, University of Georgia, Athens, GA, United States.
Xu S; School of Computing, University of Georgia, Athens, GA, United States.
Wu Z; School of Computing, University of Georgia, Athens, GA, United States.
Zhang Y; School of Computer Science and Engineering, South China University of Technology, Guangzhou, China.
Huang X; School of Computer Science and Engineering, South China University of Technology, Guangzhou, China.
Zhu D; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX, United States.
Cai H; School of Computer Science and Engineering, South China University of Technology, Guangzhou, China.
Li Q; Department of Radiology, Massachusetts General Hospital, Boston, MA, United States.
Liu T; School of Computing, University of Georgia, Athens, GA, United States.
Li X; Department of Radiology, Massachusetts General Hospital, Boston, MA, United States.

JMIR Med Educ; 9: e48904, 2023 Dec 28.

Article in English | MEDLINE | ID: mdl-38153785

ABSTRACT

BACKGROUND:

Large language models, such as ChatGPT, are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the internet. However, medical texts, such as clinical notes and diagnoses, require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to health care and the general public.

OBJECTIVE:

This study is among the first on responsible artificial intelligence-generated content in medicine. We focus on analyzing the differences between medical texts written by human experts and those generated by ChatGPT and designing machine learning workflows to effectively detect and differentiate medical texts generated by ChatGPT.

METHODS:

We first constructed a suite of data sets containing medical texts written by human experts and generated by ChatGPT. We then analyzed the linguistic features of these 2 types of content and uncovered differences in vocabulary, parts of speech, dependency, sentiment, perplexity, and other aspects. Finally, we designed and implemented machine learning methods to detect medical text generated by ChatGPT. The data and code used in this paper are published on GitHub.
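
As a rough illustration of the kind of linguistic features mentioned above, the following Python sketch computes simple vocabulary statistics and a perplexity value. It is not the authors' published code: the function names `tokenize`, `lexical_features`, and `perplexity` are hypothetical, the regular-expression tokenizer is a simplifying assumption, and the per-token probabilities passed to `perplexity` are assumed to come from some language model that is not shown here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplified, assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def lexical_features(text):
    """Return basic vocabulary statistics for one text."""
    tokens = tokenize(text)
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(tokens)
    n = len(tokens)
    return {
        "tokens": n,
        "types": len(counts),
        # Type-token ratio: distinct words divided by total words.
        "type_token_ratio": len(counts) / n if n else 0.0,
        "mean_sentence_length": n / len(sentences) if sentences else 0.0,
    }


def perplexity(token_probs):
    """Perplexity from per-token probabilities: exp of the mean negative log probability."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the study's data sets.
    print(lexical_features("The patient reports chest pain. The pain began today."))
    print(perplexity([0.25, 0.25, 0.25]))
```

Comparing such statistics between a set of human-written and a set of generated texts gives a first, coarse view of differences in vocabulary diversity; the study itself reports a broader analysis, including parts of speech, dependency, and sentiment.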

RESULTS:

Medical texts written by humans were more concrete, more diverse, and typically contained more useful information, whereas medical texts generated by ChatGPT emphasized fluency and logic and usually used general terminology rather than information specific to the context of the problem. A model based on bidirectional encoder representations from transformers (BERT) effectively detected medical texts generated by ChatGPT, with an F1 score exceeding 95%.
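
For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The sketch below shows one way to compute it for a binary detector; the function name `f1_score`, the label convention (1 = ChatGPT-generated, 0 = human-written), and the example labels are assumptions made for illustration and are not results from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical labels: 1 = ChatGPT-generated, 0 = human-written.
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(f1_score(y_true, y_pred))
```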

CONCLUSIONS:

Although text generated by ChatGPT is grammatically perfect and human-like, the linguistic characteristics of generated medical texts were different from those written by human experts. Medical text generated by ChatGPT could be effectively detected by the proposed machine learning algorithms. This study provides a pathway toward trustworthy and accountable use of large language models in medicine.

Subjects

Algorithms; Artificial Intelligence; Humans; Disinformation; Electric Power Supplies; Health Facilities

Keywords

ChatGPT; artificial intelligence; linguistic analysis; machine learning; medical ethics; medical texts; text classification

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Artificial Intelligence | Language: English | Year of publication: 2023 | Document type: Article