Your browser doesn't support javascript.
loading
Unveiling ChatGPT text using writing style.
Berriche, Lamia; Larabi-Marie-Sainte, Souad.
Afiliação
  • Berriche L; College of Computer & Information Sciences, Prince Sultan University, Saudi Arabia.
  • Larabi-Marie-Sainte S; College of Computer & Information Sciences, Prince Sultan University, Saudi Arabia.
Heliyon ; 10(12): e32976, 2024 Jun 30.
Article em En | MEDLINE | ID: mdl-38984302
ABSTRACT
Extensive use of AI-generated texts culminated recently after the advent of large language models. Although the use of AI text generators, such as ChatGPT, is beneficial, it also threatens the academic level as students may resort to it. In this work, we propose a technique leveraging the intrinsic stylometric features of documents to detect ChatGPT-based plagiarism. The stylometric features were normalized and fed to classical classifiers, such as k-Nearest Neighbors, Decision Tree, and Naïve Bayes, as well as ensemble classifiers, such as XGBoost and Stacking. A thorough examination of the classifier was conducted by using Cross-Fold validation, hyperparameters tuning, and multiple training iterations. The results show the efficacy of both classical and ensemble learning classifiers in distinguishing between human and ChatGPT writing styles with a noteworthy performance of XGBoost where 100 % was achieved for accuracy, recall, and precision metrics. Moreover, the proposed XGBoost classifier outperformed the state-of-the-art result on the same dataset and same classifier highlighting the superiority of the proposed feature style extraction method over TF-IDF techniques. The ensemble learning classifiers were also applied to the generated dataset with mixed texts, where paragraphs are written by ChatGPT and humans. The results show that 98 % of the documents were classified correctly as either mixed or human. The last contribution consists in the authorship attribution of the paragraphs of a single document where the accuracy reached 92.3 %.
Palavras-chave

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Idioma: En Revista: Heliyon Ano de publicação: 2024 Tipo de documento: Article

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Idioma: En Revista: Heliyon Ano de publicação: 2024 Tipo de documento: Article