ABSTRACT
PURPOSE: To evaluate the extent to which experienced reviewers can accurately distinguish between artificial intelligence (AI)-generated and original research abstracts published in the field of shoulder and elbow surgery, and to compare their performance with that of an AI detection tool. METHODS: Twenty-five shoulder- and elbow-related articles published in high-impact journals in 2023 were randomly selected. ChatGPT was prompted with only the abstract title to create an AI-generated version of each abstract. The resulting 50 abstracts were randomly distributed to and evaluated by 8 blinded peer reviewers with at least 5 years of experience. Reviewers were tasked with distinguishing between original and AI-generated text. A Likert scale assessed reviewer confidence for each interpretation, and the primary reason guiding assessment of generated text was collected. AI output detector (0%-100%) and plagiarism (0%-100%) scores were evaluated using GPTZero. RESULTS: Reviewers correctly identified 62% of AI-generated abstracts and misclassified 38% of original abstracts as being AI generated. GPTZero reported a significantly higher probability of AI output among generated abstracts (median, 56%; interquartile range [IQR], 51%-77%) compared with original abstracts (median, 10%; IQR, 4%-37%; P < .01). Generated abstracts scored significantly lower on the plagiarism detector (median, 7%; IQR, 5%-14%) relative to original abstracts (median, 82%; IQR, 72%-92%; P < .01). Correct identification of AI-generated abstracts was predominantly attributed to the presence of unrealistic data/values. Misidentification of original abstracts as AI generated was primarily attributed to writing style. CONCLUSIONS: Experienced reviewers faced difficulties in distinguishing between human and AI-generated research content within shoulder and elbow surgery. The presence of unrealistic data facilitated correct identification of AI abstracts, whereas misidentification of original abstracts was often ascribed to writing style. CLINICAL RELEVANCE: With rapidly increasing AI advancements, it is paramount that ethical standards of scientific reporting are upheld. It is therefore helpful to understand the ability of reviewers to identify AI-generated content.
ABSTRACT
Background: Vestibular loss and dysfunction have been associated with cognitive deficits, including decreased spatial navigation, spatial memory, visuospatial ability, attention, executive function, and processing speed, among others. Superior semicircular canal dehiscence (SSCD) is a vestibular-cochlear disorder in humans in which a pathological third mobile window of the otic capsule alters the flow of sound pressure energy through the perilymph/endolymph. The primary symptoms include sound-induced dizziness/vertigo, inner ear conductive hearing loss, autophony, headaches, and visual problems; however, individuals also experience depression and measurable deficits in basic decision-making, short-term memory, concentration, and spatial cognition. These findings suggest that central mechanisms of impairment are associated with vestibular disorders; therefore, we directly tested this hypothesis using both an auditory and a visual decision-making task of varying difficulty levels in our model of SSCD. Methods: Adult Mongolian gerbils (n = 33) were trained on one of four versions of a Go-NoGo stimulus presentation rate discrimination task that included standard ("easy") or more difficult ("hard") auditory and visual stimuli. After 10 days of training, preoperative ABR and c+VEMP testing was followed by a surgical fenestration of the left superior semicircular canal. Animals with persistent circling or head tilt were excluded to minimize effects from acute vestibular injury. Testing recommenced at postoperative day 5 and continued through postoperative day 15, at which point final ABR and c+VEMP testing was carried out. Results: Behavioral data (d-primes) were compared between preoperative performance (training days 8-10) and postoperative days 6-8 and 13-15. Behavioral performance was thus measured both during the peak of SSCD-induced ABR and c+VEMP impairment and during the return towards baseline as the dehiscence began to resurface by osteoneogenesis. There were significant differences in behavioral performance (d-prime) and its behavioral components (Hits, Misses, False Alarms, and Correct Rejections). These changes were highly correlated with persistent deficits in c+VEMPs at the end of training (postoperative day 15). The controls demonstrated additional learning after the procedure that was absent in the SSCD group. Conclusion: These results suggest that aberrant asymmetric vestibular output results in decision-making impairments in these discrimination tasks and could be associated with the other cognitive impairments resulting from vestibular dysfunction.
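For readers unfamiliar with the d-prime measure referred to above, the following is a minimal sketch of how a sensitivity index is commonly derived from the four Go-NoGo outcome counts (Hits, Misses, False Alarms, Correct Rejections). It is not taken from the study: the function name `d_prime`, the example counts, and the log-linear correction for extreme rates are all illustrative assumptions, and the authors may have computed d-prime differently.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Hypothetical helper for illustration only. The log-linear correction
    (add 0.5 to each count, 1 to each total) is an assumption made here so
    that rates of exactly 0 or 1 do not produce infinite z-scores; the
    abstract does not say which correction, if any, was used.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)


# Made-up counts, not data from the study.
chance_level = d_prime(10, 10, 10, 10)   # equal hit and false-alarm rates
good_session = d_prime(45, 5, 5, 45)     # mostly hits, few false alarms
```

Under this definition, a session in which the hit rate equals the false-alarm rate yields a d-prime of zero, and larger values indicate better discrimination between Go and NoGo stimuli.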