Results 1 - 10 of 10

1.
Arthroscopy ; 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39173690

ABSTRACT

PURPOSE: To determine whether leading, commercially available LLMs provide treatment recommendations concordant with evidence-based clinical practice guidelines (CPGs) developed by the American Academy of Orthopaedic Surgeons (AAOS). METHODS: All CPGs concerning the management of rotator cuff tears (n=33) and anterior cruciate ligament (ACL) injuries (n=15) were extracted from the AAOS. Treatment recommendations from Chat Generative Pre-trained Transformer version 4 (ChatGPT-4; OpenAI), Gemini (Google), Mistral-7B (Mistral AI), and Claude-3 (Anthropic) were graded by two blinded physicians as being "concordant," "discordant," or "indeterminate" (i.e., neutral response without definitive recommendation) with respect to AAOS CPGs. The overall concordance between LLM and AAOS recommendations was quantified, while the comparative overall concordance of recommendations among the four LLMs was evaluated with Fisher's exact test. RESULTS: Overall, 135 (70.3%) responses were concordant, 43 (22.4%) indeterminate, and 14 (7.3%) discordant. Inter-rater reliability for concordance classification was excellent (Kappa=0.92). Concordance with AAOS CPGs was most frequently observed with ChatGPT-4 (n=38, 79.2%), and least frequently with Mistral-7B (n=28, 58.3%). Indeterminate recommendations were most frequently observed with Mistral-7B (n=17, 35.4%) and least frequently with Claude-3 (n=8, 16.7%). Discordant recommendations were most frequently observed with Gemini (n=6, 12.5%) and least frequently with ChatGPT-4 (n=1, 2.1%). Overall, no statistically significant difference in concordant recommendations was observed across LLMs (p=0.12). Only 20 (10.4%) of all recommendations were transparent and provided references with full bibliographic details or links to specific peer-reviewed content to support recommendations. 
CONCLUSION: Among leading commercially available LLMs, more than one in four recommendations concerning the evaluation and management of rotator cuff and ACL injuries do not reflect current evidence-based CPGs. Although ChatGPT-4 demonstrated the highest performance, clinically significant rates of recommendations without concordance or supporting evidence were observed. Only 10% of responses by LLMs were transparent, precluding users from fully interpreting the sources from which recommendations were provided. CLINICAL RELEVANCE: While leading LLMs generally provide recommendations concordant with CPGs, a substantial error rate exists, and the proportion of recommendations that do not align with these CPGs suggests that LLMs are not trustworthy clinical support tools at this time. Each off-the-shelf, closed-source LLM has strengths and weaknesses. Future research should evaluate and compare multiple LLMs to avoid bias associated with narrow evaluation of few models as observed in current literature.
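
The reported percentages can be cross-checked from the counts in the abstract. The sketch below assumes that each of the four LLMs answered every one of the 33 + 15 = 48 guideline prompts, which gives 192 graded responses in total; the abstract does not state this denominator explicitly, and the helper name `percent` is illustrative.

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Assumption: 33 rotator cuff + 15 ACL prompts were posed to each of 4 LLMs.
PROMPTS_PER_MODEL = 33 + 15
TOTAL_RESPONSES = 4 * PROMPTS_PER_MODEL

# Overall grading counts as reported in the abstract.
overall = {"concordant": 135, "indeterminate": 43, "discordant": 14}
overall_pct = {grade: percent(n, TOTAL_RESPONSES) for grade, n in overall.items()}

# Per-model concordance counts as reported in the abstract.
chatgpt4_concordant_pct = percent(38, PROMPTS_PER_MODEL)
mistral_concordant_pct = percent(28, PROMPTS_PER_MODEL)

print(overall_pct)
print(chatgpt4_concordant_pct, mistral_concordant_pct)
```

Under that assumption the three overall counts sum to exactly 192, and the recomputed values match the reported 70.3%, 22.4%, 7.3%, 79.2%, and 58.3%.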

2.
Orthop J Sports Med ; 12(7): 23259671241257516, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39139744

ABSTRACT

Background: The consumer availability and automated response functions of Chat Generative Pre-trained Transformer (ChatGPT-4), a large language model, poise this application to be utilized for patient health queries and may have a role in serving as an adjunct to minimize administrative and clinical burden. Purpose: To evaluate the ability of ChatGPT-4 to respond to patient inquiries concerning ulnar collateral ligament (UCL) injuries and compare these results with the performance of Google. Study Design: Cross-sectional study. Methods: Google Web Search was used as a benchmark, as it is the most widely used search engine worldwide and the only search engine that generates frequently asked questions (FAQs) when prompted with a query, allowing comparisons through a systematic approach. The query "ulnar collateral ligament reconstruction" was entered into Google, and the top 10 FAQs, answers, and their sources were recorded. ChatGPT-4 was prompted to perform a Google search of FAQs with the same query and to record the sources of answers for comparison. This process was then repeated to obtain 10 new questions requiring numeric instead of open-ended responses. Finally, responses were graded independently for clinical accuracy (grade 0 = inaccurate, grade 1 = somewhat accurate, grade 2 = accurate) by 2 fellowship-trained sports medicine surgeons (D.W.A., J.S.D.) blinded to the search engine and answer source. Results: ChatGPT-4 used a greater proportion of academic sources than Google to provide answers to the top 10 FAQs, although this difference was not statistically significant (90% vs 50%; P = .14). In terms of question overlap, 40% of the most common questions on Google and ChatGPT-4 were the same. When comparing FAQs with numeric responses, 20% of answers were completely overlapping, 30% demonstrated partial overlap, and the remaining 50% did not demonstrate any overlap. 
All sources used by ChatGPT-4 to answer these FAQs were academic, while only 20% of sources used by Google were academic (P = .0007). The remaining Google sources included social media (40%), medical practices (20%), single-surgeon websites (10%), and commercial websites (10%). The mean (± standard deviation) accuracy for answers given by ChatGPT-4 was significantly greater compared with Google for the top 10 FAQs (1.9 ± 0.2 vs 1.2 ± 0.6; P = .001) and top 10 questions with numeric answers (1.8 ± 0.4 vs 1 ± 0.8; P = .013). Conclusion: ChatGPT-4 is capable of providing responses with clinically relevant content concerning UCL injuries and reconstruction. ChatGPT-4 utilized a greater proportion of academic websites to provide responses to FAQs representative of patient inquiries compared with Google Web Search and provided significantly more accurate answers. Moving forward, ChatGPT has the potential to be used as a clinical adjunct when answering queries about UCL injuries and reconstruction, but further validation is warranted before integrated or autonomous use in clinical settings.
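
The two source-type comparisons (90% vs 50%, and 100% vs 20%) can be reproduced with a two-sided Fisher's exact test. This is a minimal sketch under two assumptions that the abstract does not confirm: that each engine contributed 10 sources per question set (one per FAQ), and that Fisher's exact test was the test used. The function name `fisher_exact_two_sided` is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k "successes" in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Assumed counts: academic vs non-academic sources out of 10 per engine.
p_top_faqs = fisher_exact_two_sided(9, 1, 5, 5)   # ChatGPT-4 90% vs Google 50%
p_numeric = fisher_exact_two_sided(10, 0, 2, 8)   # ChatGPT-4 100% vs Google 20%

print(f"{p_top_faqs:.2f}", f"{p_numeric:.4f}")
```

With these assumed counts the sketch gives values that round to the reported P = .14 and P = .0007.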

3.
Article in English | MEDLINE | ID: mdl-39126271

ABSTRACT

PURPOSE: To define the minimal clinically important difference (MCID) for measures of pain and function at 2, 5 and 10 years after osteochondral autograft transplantation (OAT). METHODS: Patients undergoing OAT of the knee were identified from a prospectively maintained cartilage surgery registry. Baseline demographic, injury and surgical factors were collected. Patient-reported outcome measures (PROMs) were collected at baseline, 2-, 5- and 10-year follow-up, including the International Knee Documentation Committee (IKDC) score, Knee Outcome Survey Activities of Daily Living Scale (KOS-ADLS), Marx activity scale and Visual Analogue Scale (VAS) for pain. The MCIDs were quantified for each metric utilizing a distribution-based method equivalent to one-half the standard deviation of the mean change in outcome score. The percentage of patients achieving MCID as a function of time was assessed. RESULTS: Of 63 consecutive patients who underwent OAT, 47 (74.6%) patients were eligible for follow-up (surgical date before October 2021) and had fully completed preoperative PROMs. A total of 39 patients (83%) were available for a minimum 2-year follow-up, with a mean (±standard deviation) follow-up of 5.8 ± 3.4 years. The MCIDs were determined to be 9.3 for IKDC, 2.5 for Marx, 7.4 for KOS-ADLS and 12.9 for pain. At 2 years, 78.1% of patients achieved MCID for IKDC, 77.8% for Marx, 75% for KOS-ADLS and 57.9% for pain. These results were generally maintained through 10-year follow-up, with 75% of patients achieving MCID for IKDC, 80% for Marx, 80% for KOS-ADLS and 69.8% for pain. CONCLUSIONS: The majority of patients achieved a clinically relevant outcome improvement after OAT of the knee, with results sustained through 10-year follow-up. Patients who experience clinically relevant outcome improvement after OAT in the short term continue to experience sustained benefits at longer-term follow-up. 
These data provide valuable prognostic information when discussing patient candidacy and the expected trajectory of recovery. LEVEL OF EVIDENCE: Level III.
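
The distribution-based MCID described in the methods (one-half the standard deviation of the change score) can be sketched as follows. The scores are hypothetical and are not registry data; the use of the sample standard deviation and the convention that a higher score means a better outcome are assumptions (for a pain scale, where lower is better, the sign of the change would be flipped).

```python
from statistics import stdev


def distribution_based_mcid(baseline, follow_up):
    """Return (MCID, per-patient changes), with MCID = 0.5 * SD of the changes."""
    changes = [post - pre for pre, post in zip(baseline, follow_up)]
    return 0.5 * stdev(changes), changes


def percent_achieving_mcid(changes, mcid):
    """Percentage of patients whose improvement meets or exceeds the MCID."""
    achieved = sum(1 for change in changes if change >= mcid)
    return 100 * achieved / len(changes)


# Hypothetical IKDC-style scores for six patients (higher = better).
baseline_scores = [40, 55, 48, 60, 35, 52]
two_year_scores = [62, 70, 50, 85, 58, 57]

mcid, changes = distribution_based_mcid(baseline_scores, two_year_scores)
achieved_pct = percent_achieving_mcid(changes, mcid)

print(f"MCID = {mcid:.1f}, achieved by {achieved_pct:.1f}% of patients")
```

In this made-up sample the MCID is about 4.9 points, and five of the six patients reach it.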

4.
Arthrosc Sports Med Rehabil ; 6(3): 100940, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39006790

ABSTRACT

Purpose: To develop a deep learning model for the detection of Segond fractures on anteroposterior (AP) knee radiographs and to compare model performance to that of trained human experts. Methods: AP knee radiographs were retrieved from the Hospital for Special Surgery ACL Registry, which enrolled patients between 2009 and 2013. All images corresponded to patients who underwent anterior cruciate ligament reconstruction by 1 of 23 surgeons included in the registry data. Images were categorized into 1 of 2 classes based on radiographic evidence of a Segond fracture and manually annotated. Seventy percent of the images were used to populate the training set, while 20% and 10% were reserved for the validation and test sets, respectively. Images from the test set were used to compare model performance to that of expert human observers, including an orthopaedic surgery sports medicine fellow and a fellowship-trained orthopaedic sports medicine surgeon with over 10 years of experience. Results: A total of 324 AP knee radiographs were retrieved, of which 34 (10.4%) images demonstrated evidence of a Segond fracture. The overall mean average precision (mAP) was 0.985, and this was maintained on the Segond fracture class (mAP = 0.978, precision = 0.844, recall = 1). The model demonstrated 100% accuracy with perfect sensitivity and specificity when applied to the independent testing set and the ability to meet or exceed human sensitivity and specificity in all cases. Compared to an orthopaedic surgery sports medicine fellow, the model required 0.3% of the total time needed to evaluate and classify images in the independent test set. Conclusions: A deep learning model was developed and internally validated for Segond fracture detection on AP radiographs and demonstrated perfect accuracy, sensitivity, and specificity on a small test set of radiographs with and without Segond fractures. The model demonstrated superior performance compared with expert human observers. 
Clinical Relevance: Deep learning can be used for automated Segond fracture identification on radiographs, leading to improved diagnosis of easily missed concomitant injuries, including lateral meniscus tears. Automated identification of Segond fractures can also enable large-scale studies on the incidence and clinical significance of these fractures, which may lead to improved management and outcomes for patients with knee injuries.
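
For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall (sensitivity), specificity, and accuracy follow from a confusion matrix. The counts are hypothetical, chosen only so that precision is close to 0.844 and recall equals 1; they are not taken from the study, and the mean average precision (mAP) is a separate detection metric that this sketch does not compute.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts."""
    return {
        "precision": tp / (tp + fp),    # share of flagged images that are true fractures
        "recall": tp / (tp + fn),       # sensitivity: share of fractures that were flagged
        "specificity": tn / (tn + fp),  # share of normal images correctly cleared
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts (not from the study): no missed fractures, 5 false alarms.
metrics = classification_metrics(tp=27, fp=5, fn=0, tn=68)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A recall of 1 with precision below 1, as in this example, means that no fracture is missed but some normal radiographs are flagged.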

5.
Curr Rev Musculoskelet Med ; 17(8): 313-320, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38833135

ABSTRACT

PURPOSE OF REVIEW: Management of meniscal injuries in the elite athlete is a difficult problem secondary to the high demands of athletic competition, the need for a timely return to sport, and the desire to maximize performance over time. The purpose of this review is to provide an up-to-date summary of the current literature and trends regarding the management of meniscus injuries, with special consideration for elite athletes. RECENT FINDINGS: Historically, partial meniscectomy has been the primary treatment option for meniscus injuries. However, in recent years there has been an increased emphasis on meniscus preservation due to the increased risk of cartilage degeneration over time. Moreover, while partial meniscectomy still provides a quicker return to sport (RTS), recent literature has demonstrated similar rates of RTS and return to pre-injury levels between partial meniscectomy and meniscus repair. In the setting of symptomatic meniscal deficiency, meniscus allograft transplantation has become an increasingly utilized salvage procedure with promising yet variable outcomes on the ability to withstand elite competition. Currently, there is no uniform approach to treating meniscal injuries in elite athletes. Therefore, an individualized approach is required with consideration of the meniscus tear type, location, concomitant injuries, athlete expectations, rehabilitation timeline, and desire to prevent or delay knee osteoarthritis. In athletes with anatomically repairable tears, meniscus repair should be performed given the ability to restore native anatomy, provide high rates of RTS, and mitigate long-term chondral damage. However, partial meniscectomy can be indicated for unrepairable tears.

6.
Orthop J Sports Med ; 12(6): 23259671241253591, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38867918

ABSTRACT

Background: Primary anterior cruciate ligament (ACL) repair has gained renewed interest in select centers for patients with proximal or midsubstance ACL tears. Therefore, it is important to reassess contemporary clinical outcomes of ACL repair to determine whether a clinical benefit exists over the gold standard of ACL reconstruction (ACLR). Purpose: To (1) perform a meta-analysis of comparative trials to determine whether differences in clinical outcomes and adverse events exist between ACL repair and ACLR and (2) synthesize the midterm outcomes of available trials. Study Design: Systematic review; Level of evidence, 3. Methods: The PubMed, OVID/Medline, and Cochrane databases were queried in August 2023 for prospective and retrospective clinical trials comparing ACL repair and ACLR. Data pertaining to tear location, surgical technique, adverse events, and clinical outcome measures were recorded. DerSimonian-Laird random-effects models were constructed to quantitatively evaluate the association between ACL repair/ACLR, adverse events, and clinical outcomes. A subanalysis of minimum 5-year outcomes was performed. Results: Twelve studies (893 patients; 464 ACLR and 429 ACL repair) were included. Random-effects models demonstrated a higher relative risk (RR) of recurrent instability/clinical failure (RR = 1.64; 95% confidence interval [CI], 1.04-2.57; P = .032), revision ACLR (RR = 1.63; 95% CI, 1.03-2.59; P = .039), and hardware removal (RR = 4.94; 95% CI, 2.10-11.61; P = .0003) in patients who underwent primary ACL repair versus ACLR. The RRs of reoperation and knee-related complications were not significantly different between groups. No significant differences were observed when comparing patient-reported outcome scores. In studies with minimum 5-year outcomes, no significant differences in adverse events or Lysholm scores were observed. 
Conclusion: In contemporary comparative trials of ACL repair versus ACLR, the RR of clinical failure, revision surgery due to ACL rerupture, and hardware removal was greater for primary ACL repair compared with ACLR. There were no observed differences in patient-reported outcome scores, reoperations, or knee-related complications between approaches. In the limited literature reporting on minimum 5-year outcomes, significant differences in adverse events or the International Knee Documentation Committee score were not observed.
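
The DerSimonian-Laird random-effects pooling named in the methods can be sketched as below. The three studies are hypothetical event counts, not data from the included trials, and the function names are illustrative; the sketch pools log relative risks and uses the standard large-sample variance of a log RR.

```python
from math import exp, log, sqrt


def log_rr_and_variance(events_a, total_a, events_b, total_b):
    """Log relative risk (group a vs group b) and its approximate variance."""
    log_rr = log((events_a / total_a) / (events_b / total_b))
    variance = 1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    return log_rr, variance


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, tau^2) under a random-effects model."""
    k = len(effects)
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance tau^2.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, sqrt(1 / sum(re_weights)), tau2


# Hypothetical studies: (failures after repair, n repair, failures after ACLR, n ACLR).
studies = [(8, 40, 4, 40), (6, 50, 5, 50), (10, 60, 5, 60)]
effects, variances = zip(*(log_rr_and_variance(*s) for s in studies))

pooled_log_rr, se, tau2 = dersimonian_laird(effects, variances)
ci_low = exp(pooled_log_rr - 1.96 * se)
ci_high = exp(pooled_log_rr + 1.96 * se)
print(f"RR = {exp(pooled_log_rr):.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), tau^2 = {tau2:.3f}")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect inverse-variance weights.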

7.
Arthroscopy ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38925234

ABSTRACT

PURPOSE: To provide a proof-of-concept analysis of the appropriateness and performance of ChatGPT-4 to triage, synthesize differential diagnoses, and generate treatment plans concerning common presentations of knee pain. METHODS: Twenty knee complaints warranting triage and expanded scenarios were input into ChatGPT-4, with memory cleared prior to each new input to mitigate bias. For the 10 triage complaints, ChatGPT-4 was asked to generate a differential diagnosis that was graded for accuracy and suitability in comparison to a differential created by 2 orthopaedic sports medicine physicians. For the 10 clinical scenarios, ChatGPT-4 was prompted to provide treatment guidance for the patient, which was again graded. To test the higher-order capabilities of ChatGPT-4, further inquiry into these specific management recommendations was performed and graded. RESULTS: All ChatGPT-4 diagnoses were deemed appropriate within the spectrum of potential pathologies on a differential. The top diagnosis on the differential was identical between surgeons and ChatGPT-4 for 70% of scenarios, and the top diagnosis provided by the surgeon appeared as either the first or second diagnosis in 90% of scenarios. Overall, 16 of 30 diagnoses (53.3%) in the differential were identical. When provided with 10 expanded vignettes with a single diagnosis, the accuracy of ChatGPT-4 increased to 100%, with the suitability of management graded as appropriate in 90% of cases. Specific information pertaining to conservative management, surgical approaches, and related treatments was appropriate and accurate in 100% of cases. CONCLUSIONS: ChatGPT-4 provided clinically reasonable diagnoses to triage patient complaints of knee pain due to various underlying conditions that were generally consistent with differentials provided by sports medicine physicians. 
Diagnostic performance was enhanced when providing additional information, allowing ChatGPT-4 to reach high predictive accuracy for recommendations concerning management and treatment options. However, ChatGPT-4 may show clinically important error rates for diagnosis depending on prompting strategy and information provided; therefore, further refinements are necessary prior to implementation into clinical workflows. CLINICAL RELEVANCE: Although ChatGPT-4 is increasingly being used by patients for health information, the potential for ChatGPT-4 to serve as a clinical support tool is unclear. In this study, we found that ChatGPT-4 was frequently able to diagnose and triage knee complaints appropriately as rated by sports medicine surgeons, suggesting that it may eventually be a useful clinical support tool.

8.
Surg Open Sci ; 18: 62-69, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38419945

ABSTRACT

Background: There is a lack of physician ethnic and gender diversity amongst surgical specialties. This study analyzes the literature that promotes diversity amongst surgical trainees. Specifically, this study sought to answer (i) how the number of publications regarding diversity in orthopaedic surgery compares to other surgical specialties, (ii) how the number of publications amongst all surgical subspecialties trends over time and (iii) which specific topics regarding diversity are discussed in the surgical literature. Methods: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were used to query articles from PubMed, Web of Science, Embase and the Cumulative Index to Nursing and Allied Health Literature. Broad inclusion criteria for both ethnic and gender diversity of any surgical specialty were utilized. Results: Our query returned 1429 publications, of which 408 duplicates were removed and 701 were excluded on title and abstract screening, leaving 320 to be included. The highest number of related publications was in orthopaedic surgery (n = 73), followed by general surgery (n = 56). Out of 320 total articles, 260 (81.3%) were published after 2015, and 56 of 73 (76.7%) orthopaedic-specific articles were published after 2015. Conclusion: Orthopaedic surgery published the most about ethnic and gender diversity, yet it remains one of the least diverse surgical specialties. With the recent increase in publications on diversity in surgical training, close attention should be paid to ethnic and gender diversity amongst surgical trainees over the coming years. Should diversity remain stagnant, diversification efforts may need to be restructured to achieve a diverse surgeon workforce. Key message: Orthopaedic surgery is the surgical subspecialty that publishes the most about trainee ethnic and gender diversity, followed by general surgery. 
With most of this literature being published over the last eight years, it is imperative to pay close attention to the ethnic and gender landscape of the surgeon workforce over the coming years.
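
The screening counts and percentages in the results can be checked with simple arithmetic. The sketch below uses only numbers stated in the abstract; the helper name `percent_half_up` is illustrative. It uses `decimal` with half-up rounding because 260/320 is exactly 81.25%, which Python's built-in `round` would take to 81.2 rather than the reported 81.3.

```python
from decimal import Decimal, ROUND_HALF_UP


def percent_half_up(count: int, total: int) -> Decimal:
    """Percentage rounded to one decimal place, with ties rounded up."""
    value = Decimal(100 * count) / Decimal(total)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Screening flow as reported in the abstract.
identified = 1429
duplicates_removed = 408
excluded_on_screening = 701
included = identified - duplicates_removed - excluded_on_screening

after_2015_pct = percent_half_up(260, included)
ortho_after_2015_pct = percent_half_up(56, 73)

print(included, after_2015_pct, ortho_after_2015_pct)
```

The computed values agree with the 320 included articles and the reported 81.3% and 76.7%.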

9.
Nat Rev Dis Primers ; 10(1): 8, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38332156

ABSTRACT

Rotator cuff tears are the most common upper extremity condition seen by primary care and orthopaedic surgeons, with a spectrum ranging from tendinopathy to full-thickness tears with arthritic change. Some tears are traumatic, but most rotator cuff problems are degenerative. Not all tears are symptomatic and not all progress, and many patients in whom tears become more extensive do not experience symptom worsening. Hence, a standard algorithm for managing patients is challenging. The pathophysiology of rotator cuff tears is complex and encompasses an interplay between the tendon, bone and muscle. Rotator cuff tears begin as degenerative changes within the tendon, with matrix disorganization and inflammatory changes. Subsequently, tears progress to partial-thickness and then full-thickness tears. Muscle quality, as evidenced by the overall size of the muscle and intramuscular fatty infiltration, also influences symptoms, tear progression and the outcomes of surgery. Treatment depends primarily on symptoms, with non-operative management sufficient for most patients with rotator cuff problems. Modern arthroscopic repair techniques have improved recovery, but outcomes are still limited by a lack of understanding of how to improve tendon to bone healing in many patients.


Subject(s)
Rotator Cuff Injuries , Humans , Rotator Cuff Injuries/surgery , Arthroscopy/methods , Rotator Cuff/surgery , Treatment Outcome