Human versus artificial intelligence-generated arthroplasty literature: A single-blinded analysis of perceived communication, quality, and authorship source.
Int J Med Robot; 20(1): e2621, 2024 Feb.
Article in En | MEDLINE | ID: mdl-38348740
ABSTRACT
BACKGROUND:
Large language models (LLMs) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and compared their perceived quality.
METHODS:
The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to originally published abstracts (human-written). Six blinded orthopaedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.
RESULTS:
Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (p = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (p = 0.999). Overall abstract quality was higher for human-written abstracts (p = 0.019).
CONCLUSIONS:
Absolute authorship-confidence ratings for AI-generated abstracts indicated difficulty in discerning authorship, but these abstracts did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.
Full text: 1
Collections: 01-internacional
Topics: General
Database: MEDLINE
Main subject: Authorship / Artificial Intelligence
Study type: Clinical_trials / Prognostic_studies
Limits: Humans
Language: En
Journal: Int J Med Robot
Publication year: 2024
Document type: Article
Affiliation country: United States