Identification of ChatGPT-Generated Abstracts Within Shoulder and Elbow Surgery Poses a Challenge for Reviewers.
Stadler, Ryan D; Sudah, Suleiman Y; Moverman, Michael A; Denard, Patrick J; Duralde, Xavier A; Garrigues, Grant E; Klifto, Christopher S; Levy, Jonathan C; Namdari, Surena; Sanchez-Sotelo, Joaquin; Menendez, Mariano E.
Affiliation
  • Stadler RD; Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA. Electronic address: ryanstadler23@gmail.com.
  • Sudah SY; Department of Orthopaedic Surgery, Monmouth Medical Center, Monmouth, NJ, USA.
  • Moverman MA; Department of Orthopaedics, University of Utah School of Medicine, Salt Lake City, UT, USA.
  • Denard PJ; Oregon Shoulder Institute, Medford, OR, USA.
  • Duralde XA; Peachtree Orthopedics, Atlanta, GA, USA.
  • Garrigues GE; Midwest Orthopaedics at Rush University Medical Center, Chicago, IL, USA.
  • Klifto CS; Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, NC, USA.
  • Levy JC; Levy Shoulder Center at Paley Orthopedic & Spine Institute, Boca Raton, FL, USA.
  • Namdari S; Rothman Orthopaedic Institute at Thomas Jefferson University Hospitals, Philadelphia, PA, USA.
  • Sanchez-Sotelo J; Mayo Clinic Department of Orthopedic Surgery, Rochester, MN, USA.
  • Menendez ME; Department of Orthopaedics, University of California Davis, Sacramento, CA, USA.
Arthroscopy; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38992513
ABSTRACT

PURPOSE:

To evaluate the extent to which experienced reviewers can accurately discern between AI-generated and original research abstracts published in the field of shoulder and elbow surgery and compare this to the performance of an AI-detection tool.

METHODS:

Twenty-five shoulder- and elbow-related articles published in high-impact journals in 2023 were randomly selected. ChatGPT was prompted with only the abstract title to create an AI-generated version of each abstract. The resulting 50 abstracts were randomly distributed to and evaluated by 8 blinded peer reviewers with at least 5 years of experience. Reviewers were tasked with distinguishing between original and AI-generated text. Reviewer confidence in each interpretation was rated on a Likert scale, and the primary reason guiding each assessment of generated text was recorded. AI-output detection scores (0-100%) and plagiarism scores (0-100%) were obtained using GPTZero.

RESULTS:

Reviewers correctly identified 62% of AI-generated abstracts and misclassified 38% of original abstracts as being AI-generated. GPTZero reported a significantly higher probability of AI output among generated abstracts (median 56%, IQR 51-77%) compared to original abstracts (median 10%, IQR 4-37%; p < 0.01). Generated abstracts scored significantly lower on the plagiarism detector (median 7%, IQR 5-14%) relative to original abstracts (median 82%, IQR 72-92%; p < 0.01). Correct identification of AI-generated abstracts was predominantly attributed to the presence of unrealistic data/values. The primary reason given for misidentifying original abstracts as AI-generated was writing style.
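
As an illustrative reading of the reviewer figures above (not an analysis reported by the authors), the two stated rates can be treated as the true-positive and false-positive rates of a binary classifier in which "AI-generated" is the positive class. The function name and the framing as sensitivity/specificity are assumptions made here for illustration; only the 62% and 38% values come from the abstract.

```python
def detection_metrics(true_positive_rate, false_positive_rate):
    """Derive standard classifier metrics from two reported rates.

    Assumption (not from the abstract): "AI-generated" is the positive class,
    so the 62% figure is sensitivity and the 38% figure is the false-positive rate.
    """
    sensitivity = true_positive_rate
    specificity = 1.0 - false_positive_rate
    # Balanced accuracy averages the per-class rates and is independent of class sizes.
    balanced_accuracy = (sensitivity + specificity) / 2.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Values as reported in the RESULTS section.
metrics = detection_metrics(true_positive_rate=0.62, false_positive_rate=0.38)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under this reading, sensitivity and specificity coincide, so overall reviewer accuracy would be the same value regardless of how many abstracts of each type a reviewer saw; chance performance on this two-way task would be 50%.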

CONCLUSIONS:

Experienced reviewers faced difficulties in distinguishing between human and AI-generated research content within shoulder and elbow surgery. The presence of unrealistic data facilitated correct identification of AI abstracts, whereas misidentification of original abstracts was often ascribed to writing style.

Full text: 1 Database: MEDLINE Language: En Year: 2024 Document type: Article
