Human versus artificial intelligence-generated arthroplasty literature: A single-blinded analysis of perceived communication, quality, and authorship source.

Lawrence, Kyle W; Habibi, Akram A; Ward, Spencer A; Lajam, Claudette M; Schwarzkopf, Ran; Rozell, Joshua C

Affiliation

Lawrence KW; Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.
Habibi AA; Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.
Ward SA; Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.
Lajam CM; Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.
Schwarzkopf R; Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.
Rozell JC; Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.

Int J Med Robot ; 20(1): e2621, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38348740

ABSTRACT

BACKGROUND:

Large language models (LLMs) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and compared their perceived quality.

METHODS:

The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to originally published abstracts (human-written). Six blinded orthopaedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.

RESULTS:

Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (p = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (p = 0.999). Overall abstract quality was higher for human-written abstracts (p = 0.019).

CONCLUSIONS:

Absolute authorship-confidence ratings for AI-generated abstracts indicated that raters had difficulty discerning authorship, but AI-generated abstracts did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.

Subjects

Artificial Intelligence; Authorship; Humans; Communication; Language; Arthroplasty

Keywords

ChatGPT; artificial intelligence; large language models; medical literature; total hip arthroplasty; total knee arthroplasty

Full text: 1 Collections: 01-international Topics: General Database: MEDLINE Main subject: Authorship / Artificial Intelligence Study type: Clinical_trials / Prognostic_studies Limit: Humans Language: En Journal: Int J Med Robot Publication year: 2024 Document type: Article Affiliation country: United States