1.
Arthroscopy ; 2024 Sep 14.
Article in English | MEDLINE | ID: mdl-39278424

ABSTRACT

There is no shortage of literature surrounding ChatGPT and whether this large language model can provide accurate and clinically relevant information in response to simulated patient queries. Unfortunately, there is a shortage of literature addressing important considerations beyond these experimental and entertaining uses. Indeed, a trend for redundancy has emerged where most of the literature has applied ChatGPT to the same tasks while simply swapping the subject matter, resulting in a failure to expand the impact and reach of this potentially transformational artificial intelligence (AI) solution. Instead, research addressing pressing health care challenges and a renewed focus on novel use cases will allow for more meaningful research initiatives, product development, and tangible changes at both the system and point-of-care levels. Current target areas of interest in medicine that remain obstacles to patient care include prior authorization, administrative burden, documentation generation, medical triage and diagnosis, and patient communication efficiency. To advance this area of research toward such meaningful applications, a structured framework is necessary. Such frameworks should include problem identification; definition of key performance indicators; multidisciplinary and multi-institutional collaboration of those with domain expertise, including AI engineers and information technology specialists; policy and strategy development driven by executive-level personnel; institutional financial support and investment from key stakeholders for AI infrastructure and maintenance; and critical assessment of AI performance, bias, and equity.

2.
Arthroscopy ; 40(2): 579-580, 2024 02.
Article in English | MEDLINE | ID: mdl-38296452

ABSTRACT

An important domain of artificial intelligence is deep learning, which comprises computer vision tasks used for recognizing complex patterns in orthopaedic imaging, thus automating the identification of pathology. Purported benefits include an expedited clinical workflow; improved performance and consistency in diagnostic tasks; decreased time allocation burden; augmentation of diagnostic performance; decreased inter-reader discrepancies in measurements and diagnosis as a function of reducing subjectivity in the setting of differences in imaging quality, resolution, penetrance, or orientation; and the ability to function autonomously without rest (unlike human observers). Detection may include the presence or absence of an entity or identification of a specific landmark. Within the field of musculoskeletal health, such capabilities have been shown across a wide range of tasks such as detecting the presence or absence of a rotator cuff tear or automatically identifying the center of the hip joint. The clinical relevance and success of these research endeavors have led to a plethora of novel algorithms. However, few of these algorithms have been externally validated, and evidence remains inconclusive as to whether they provide a diagnostic benefit when compared with the current, human gold standard.


Subject(s)
Orthopedics, Rotator Cuff Injuries, Humans, Rotator Cuff, Artificial Intelligence, Algorithms
3.
Arthroscopy ; 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38325497

ABSTRACT

PURPOSE: To (1) review definitions and concepts necessary to interpret applications of deep learning (DL; a domain of artificial intelligence that leverages neural networks to make predictions on media inputs such as images) and (2) identify knowledge and translational gaps in the literature to provide insight into specific areas for improvement as adoption of this technology continues. METHODS: A comprehensive search of the literature was performed in December 2023 for articles regarding the use of DL in sports medicine. For each study, information regarding the joint of focus, specific anatomic structure/pathology to which DL was applied, imaging modality utilized, source of images used for model training and testing, data set size, model performance, and whether the DL model was externally validated was recorded. A numerical scale was used to rate each DL model's clinical impact, with 1 corresponding to proof-of-concept studies with little to no direct clinical impact and 5 corresponding to practice-changing clinical impact and ready for clinical deployment. RESULTS: Fifty-five studies were identified, all of which were published within the past 5 years, while 82% were published within the past 3 years. Of the DL models identified, 84% were developed for classification tasks, 9% for automated measurements, and 7% for segmentation. A total of 62% of studies utilized magnetic resonance imaging as the imaging modality, 25% radiographs, and 7% ultrasound, while 1 study each used computed tomography, arthroscopic images, or arthroscopic video. Sixty-five percent of studies focused on the detection of tears (anterior cruciate ligament [ACL], rotator cuff [RC], and meniscus). Diagnostic performance, as determined by the area under the receiver operating characteristic curve (AUROC), ranged from 0.81 to 0.99 for ACL tears (excellent to near perfect), from 0.83 to 0.94 for RC tears (excellent), and from 0.75 to 0.96 for meniscus tears (acceptable to excellent).
In addition, 3 studies focused on detection of cartilage lesions had AUROC values ranging from 0.90 to 0.92 (excellent performance). However, only 4 (7%) studies externally validated their models, suggesting that they may not be generalizable or may not perform well when applied to populations other than that used to develop the model. Finally, the mean clinical impact score was 2 (range, 1-3) on a scale of 1 to 5, corresponding to limited clinical applicability. CONCLUSIONS: DL models in orthopaedic sports medicine show generally excellent performance (high internal validity) but require external validation to facilitate clinical deployment. In addition, current models have low clinical applicability and fail to advance the field due to a focus on routine tasks and a narrow conceptual framework. LEVEL OF EVIDENCE: Level IV, scoping review of Level I to IV studies.

4.
Arthroscopy ; 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39173690

ABSTRACT

PURPOSE: To determine whether several leading, commercially available large language models (LLMs) provide treatment recommendations concordant with evidence-based clinical practice guidelines (CPGs) developed by the American Academy of Orthopaedic Surgeons (AAOS). METHODS: All CPGs concerning the management of rotator cuff tears (n = 33) and anterior cruciate ligament injuries (n = 15) were extracted from the AAOS. Treatment recommendations from Chat-Generative Pretrained Transformer version 4 (ChatGPT-4), Gemini, Mistral-7B, and Claude-3 were graded by 2 blinded physicians as being concordant, discordant, or indeterminate (i.e., neutral response without definitive recommendation) with respect to AAOS CPGs. The overall concordance between LLM and AAOS recommendations was quantified, and the comparative overall concordance of recommendations among the 4 LLMs was evaluated through the Fisher exact test. RESULTS: Overall, 135 responses (70.3%) were concordant, 43 (22.4%) were indeterminate, and 14 (7.3%) were discordant. Inter-rater reliability for concordance classification was excellent (κ = 0.92). Concordance with AAOS CPGs was most frequently observed with ChatGPT-4 (n = 38, 79.2%) and least frequently observed with Mistral-7B (n = 28, 58.3%). Indeterminate recommendations were most frequently observed with Mistral-7B (n = 17, 35.4%) and least frequently observed with Claude-3 (n = 8, 16.7%). Discordant recommendations were most frequently observed with Gemini (n = 6, 12.5%) and least frequently observed with ChatGPT-4 (n = 1, 2.1%). Overall, no statistically significant difference in concordant recommendations was observed across LLMs (P = .12). Of all recommendations, only 20 (10.4%) were transparent and provided references with full bibliographic details or links to specific peer-reviewed content to support recommendations.
CONCLUSIONS: Among leading commercially available LLMs, more than 1-in-4 recommendations concerning the evaluation and management of rotator cuff and anterior cruciate ligament injuries do not reflect current evidence-based CPGs. Although ChatGPT-4 showed the highest performance, clinically significant rates of recommendations without concordance or supporting evidence were observed. Only 10% of responses by LLMs were transparent, precluding users from fully interpreting the sources from which recommendations were provided. CLINICAL RELEVANCE: Although leading LLMs generally provide recommendations concordant with CPGs, a substantial error rate exists, and the proportion of recommendations that do not align with these CPGs suggests that LLMs are not trustworthy clinical support tools at this time. Each off-the-shelf, closed-source LLM has strengths and weaknesses. Future research should evaluate and compare multiple LLMs to avoid bias associated with narrow evaluation of few models as observed in the current literature.

5.
Arthroscopy ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38925234

ABSTRACT

PURPOSE: To provide a proof-of-concept analysis of the appropriateness and performance of ChatGPT-4 to triage, synthesize differential diagnoses, and generate treatment plans concerning common presentations of knee pain. METHODS: Twenty knee complaints warranting triage and expanded scenarios were input into ChatGPT-4, with memory cleared prior to each new input to mitigate bias. For the 10 triage complaints, ChatGPT-4 was asked to generate a differential diagnosis that was graded for accuracy and suitability in comparison to a differential created by 2 orthopaedic sports medicine physicians. For the 10 clinical scenarios, ChatGPT-4 was prompted to provide treatment guidance for the patient, which was again graded. To test the higher-order capabilities of ChatGPT-4, further inquiry into these specific management recommendations was performed and graded. RESULTS: All ChatGPT-4 diagnoses were deemed appropriate within the spectrum of potential pathologies on a differential. The top diagnosis on the differential was identical between surgeons and ChatGPT-4 for 70% of scenarios, and the top diagnosis provided by the surgeon appeared as either the first or second diagnosis in 90% of scenarios. Overall, 16 of 30 diagnoses (53.3%) in the differential were identical. When provided with 10 expanded vignettes with a single diagnosis, the accuracy of ChatGPT-4 increased to 100%, with the suitability of management graded as appropriate in 90% of cases. Specific information pertaining to conservative management, surgical approaches, and related treatments was appropriate and accurate in 100% of cases. CONCLUSIONS: ChatGPT-4 provided clinically reasonable diagnoses to triage patient complaints of knee pain due to various underlying conditions that were generally consistent with differentials provided by sports medicine physicians. 
Diagnostic performance was enhanced when providing additional information, allowing ChatGPT-4 to reach high predictive accuracy for recommendations concerning management and treatment options. However, ChatGPT-4 may show clinically important error rates for diagnosis depending on prompting strategy and information provided; therefore, further refinements are necessary prior to implementation into clinical workflows. CLINICAL RELEVANCE: Although ChatGPT-4 is increasingly being used by patients for health information, the potential for ChatGPT-4 to serve as a clinical support tool is unclear. In this study, we found that ChatGPT-4 was frequently able to diagnose and triage knee complaints appropriately as rated by sports medicine surgeons, suggesting that it may eventually be a useful clinical support tool.

6.
Arthroscopy ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38513878

ABSTRACT

PURPOSE: To (1) compare the efficacy of immersive virtual reality (iVR) to nonimmersive virtual reality (non-iVR) training in hip arthroscopy on procedural and knowledge-based skills acquisition and (2) evaluate the relative cost of each platform. METHODS: Fourteen orthopaedic surgery residents were randomized to simulation training utilizing an iVR Hip Arthroscopy Simulator (n = 7; PrecisionOS) or non-iVR simulator (n = 7; ArthroS Hip VR; VirtaMed). After training, performance was assessed on a cadaver by 4 expert hip arthroscopists through arthroscopic video review of a diagnostic hip arthroscopy. Performance was assessed using the Objective Structured Assessment of Technical Skills (OSATS) and Arthroscopic Surgery Skill Evaluation Tool (ASSET) scores. A cost analysis was performed using the transfer effectiveness ratio (TER) and a direct cost comparison of iVR to non-iVR. RESULTS: Demographic characteristics did not differ between treatment arms or by training level, hip arthroscopy experience, or prior simulator use. No significant differences were observed in OSATS and ASSET scores between iVR and non-iVR cohorts (OSATS: iVR 19.6 ± 4.4, non-iVR 21.0 ± 4.1, P = .55; ASSET: iVR 23.7 ± 4.5, non-iVR 25.8 ± 4.8, P = .43). The absolute TER was 0.06 and there was a 132-fold cost difference of iVR to non-iVR. CONCLUSIONS: Hip arthroscopy simulator training with iVR had similar performance results to non-iVR for technical skill and procedural knowledge acquisition after expert arthroscopic video assessment. The iVR platform had similar effectiveness in transfer of skill compared to non-iVR with a 132 times cost differential. CLINICAL RELEVANCE: Due to the accessibility, effectiveness, and relative affordability, iVR training may be beneficial in the future of safe arthroscopic hip training.

7.
Arthroscopy ; 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38936557

ABSTRACT

PURPOSE: To assess the ability of ChatGPT-4, an automated Chatbot powered by artificial intelligence, to answer common patient questions concerning the Latarjet procedure for patients with anterior shoulder instability and compare this performance with Google Search Engine. METHODS: Using previously validated methods, a Google search was first performed using the query "Latarjet." Subsequently, the top 10 frequently asked questions (FAQs) and associated sources were extracted. ChatGPT-4 was then prompted to provide the top 10 FAQs and answers concerning the procedure. This process was repeated to identify additional FAQs requiring discrete-numeric answers to allow for a comparison between ChatGPT-4 and Google. Discrete, numeric answers were subsequently assessed for accuracy on the basis of the clinical judgment of 2 fellowship-trained sports medicine surgeons who were blinded to search platform. RESULTS: Mean (± standard deviation) accuracy to numeric-based answers was 2.9 ± 0.9 for ChatGPT-4 versus 2.5 ± 1.4 for Google (P = .65). ChatGPT-4 derived information for answers only from academic sources, which was significantly different from Google Search Engine (P = .003), which used only 30% academic sources and websites from individual surgeons (50%) and larger medical practices (20%). For general FAQs, 40% of FAQs were found to be identical when comparing ChatGPT-4 and Google Search Engine. In terms of sources used to answer these questions, ChatGPT-4 again used 100% academic resources, whereas Google Search Engine used 60% academic resources, 20% surgeon personal websites, and 20% medical practices (P = .087). CONCLUSIONS: ChatGPT-4 demonstrated the ability to provide accurate and reliable information about the Latarjet procedure in response to patient queries, using multiple academic sources in all cases. This was in contrast to Google Search Engine, which more frequently used single-surgeon and large medical practice websites. 
Despite differences in the resources accessed to perform information retrieval tasks, the clinical relevance and accuracy of information provided did not significantly differ between ChatGPT-4 and Google Search Engine. CLINICAL RELEVANCE: Commercially available large language models (LLMs), such as ChatGPT-4, can perform diverse information retrieval tasks on-demand. An important medical information retrieval application for LLMs consists of the ability to provide comprehensive, relevant, and accurate information for various use cases such as investigation about a recently diagnosed medical condition or procedure. Understanding the performance and abilities of LLMs for use cases has important implications for deployment within health care settings.

8.
Article in English | MEDLINE | ID: mdl-39126271

ABSTRACT

PURPOSE: To define the minimal clinically important difference (MCID) for measures of pain and function at 2, 5 and 10 years after osteochondral autograft transplantations (OATs). METHODS: Patients undergoing OATs of the knee were identified from a prospectively maintained cartilage surgery registry. Baseline demographic, injury and surgical factors were collected. Patient-reported outcome scores (PROMs) were collected at baseline, 2-, 5- and 10-year follow-up, including the International Knee Documentation Committee (IKDC) score, Knee Outcome Survey Activities of Daily Living Scale (KOS-ADLS), Marx activity scale and Visual Analogue Scale (VAS) for pain. The MCIDs were quantified for each metric utilizing a distribution-based method equivalent to one-half the standard deviation of the mean change in outcome score. The percentage of patients achieving MCID as a function of time was assessed. RESULTS: Of 63 consecutive patients who underwent OATs, 47 (74.6%) patients were eligible for follow-up (surgical date before October 2021) and had fully completed preoperative PROMs. A total of 39 patients (83%) were available for a minimum 2-year follow-up, with a mean (±standard deviation) follow-up of 5.8 ± 3.4 years. The MCIDs were determined to be 9.3 for IKDC, 2.5 for Marx, 7.4 for KOS-ADLS and 12.9 for pain. At 2 years, 78.1% of patients achieved MCID for IKDC, 77.8% for Marx, 75% for KOS-ADLS and 57.9% for pain. These results were generally maintained through 10-year follow-ups, with 75% of patients achieving MCID for IKDC, 80% for Marx, 80% for KOS-ADLS and 69.8% for pain. CONCLUSIONS: The majority of patients achieved a clinically relevant outcome improvement after OATs of the knee, with results sustained through 10-year follow-up. Patients who experience clinically relevant outcome improvement after OATs in the short term continue to experience sustained benefits at longer-term follow-up. 
These data provide valuable prognostic information when discussing patient candidacy and the expected trajectory of recovery. LEVEL OF EVIDENCE: Level III.
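The distribution-based MCID described in this abstract (one-half the standard deviation of the change in outcome score) can be illustrated with a minimal Python sketch. The function names and the sign convention are assumptions made for illustration and do not come from the study: the sketch treats a higher score as better, so a scale such as VAS pain, where lower is better, would need its change scores negated first.

```python
from statistics import stdev


def change_scores(baseline, follow_up):
    """Per-patient change (follow-up minus baseline); higher is assumed better."""
    return [post - pre for pre, post in zip(baseline, follow_up)]


def mcid_half_sd(baseline, follow_up):
    """Distribution-based MCID: half the sample standard deviation of the change scores."""
    return 0.5 * stdev(change_scores(baseline, follow_up))


def proportion_achieving_mcid(baseline, follow_up, mcid):
    """Fraction of patients whose improvement meets or exceeds the MCID."""
    changes = change_scores(baseline, follow_up)
    return sum(change >= mcid for change in changes) / len(changes)
```

Computing the threshold once from the cohort and then reusing it at each follow-up point mirrors the idea of reporting the percentage of patients achieving MCID as a function of time; any input data supplied to these functions would be hypothetical.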

9.
J Hand Surg Am ; 49(5): 411-422, 2024 May.
Article in English | MEDLINE | ID: mdl-38551529

ABSTRACT

PURPOSE: To review the existing literature to (1) determine the diagnostic efficacy of artificial intelligence (AI) models for detecting scaphoid and distal radius fractures and (2) compare the efficacy to human clinical experts. METHODS: PubMed, OVID/Medline, and Cochrane libraries were queried for studies investigating the development, validation, and analysis of AI for the detection of scaphoid or distal radius fractures. Data regarding study design, AI model development and architecture, prediction accuracy/area under the receiver operator characteristic curve (AUROC), and imaging modalities were recorded. RESULTS: A total of 21 studies were identified, of which 12 (57.1%) used AI to detect fractures of the distal radius, and nine (42.9%) used AI to detect fractures of the scaphoid. AI models demonstrated good diagnostic performance on average, with AUROC values ranging from 0.77 to 0.96 for scaphoid fractures and from 0.90 to 0.99 for distal radius fractures. Accuracy of AI models ranged from 72.0% to 90.3% for scaphoid fractures and from 89.0% to 98.0% for distal radius fractures. When compared to clinical experts, 13 of 14 (92.9%) studies reported that AI models demonstrated comparable or better performance. The type of fracture influenced model performance, with worse overall performance on occult scaphoid fractures; however, models trained specifically on occult fractures demonstrated substantially improved performance when compared to humans. CONCLUSIONS: AI models demonstrated excellent performance for detecting scaphoid and distal radius fractures, with the majority demonstrating comparable or better performance compared with human experts. Worse performance was demonstrated on occult fractures. However, when trained specifically on difficult fracture patterns, AI models demonstrated improved performance.
CLINICAL RELEVANCE: AI models can help detect commonly missed occult fractures while enhancing workflow efficiency for distal radius and scaphoid fracture diagnoses. As performance varies based on fracture type, future studies focused on wrist fracture detection should clearly define whether the goal is to (1) identify difficult-to-detect fractures or (2) improve workflow efficiency by assisting in routine tasks.


Subject(s)
Artificial Intelligence, Radius Fractures, Scaphoid Bone, Wrist Fractures, Humans, Radius Fractures/diagnostic imaging, Scaphoid Bone/injuries, Wrist Fractures/diagnostic imaging
10.
J Arthroplasty ; 39(3): 701-707, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37793507

ABSTRACT

BACKGROUND: Interpreting the clinical relevance of randomized clinical trials (RCTs) is challenging when P-values are marginally above or below the P = .05 threshold. This study examined the robustness of statistically nonsignificant mortality events from RCTs comparing hemiarthroplasty femoral fixation for displaced intracapsular hip fractures through the reverse fragility index (RFI). METHODS: RCTs were identified using PubMed, OVID/Medline, and Cochrane databases. Mortality endpoints were stratified into 3 categories: (1) within 30 days, (2) within 90 days, and (3) at latest follow-up. The RFI was derived by manipulating reported mortality events utilizing a contingency table while maintaining a constant number of participants. The reverse fragility quotient (RFQ) was quantified by dividing the RFI by the study sample. RESULTS: Eight RCTs (2,494 participants) were included. The median RFI and RFQ within 30 days were 3.0 (interquartile range [IQR]: 3.0 to 6.0) and 0.016 (IQR: 0.015 to 0.021), suggesting nonsignificant findings were contingent on 1.6 mortality events/100 participants. The median RFI and RFQ within 90 days were 6.0 (IQR: 4.0 to 7.0) and 0.028 (IQR: 0.024 to 0.038), suggesting nonsignificant findings were contingent on 2.8 mortality events/100 participants. At latest follow-up, the median RFI and RFQ were 7.0 (IQR: 6.0 to 12.0) and 0.038 (IQR: 0.029 to 0.054), suggesting nonsignificant findings were contingent on only 3.8 mortality events/100 participants. Median loss to follow-up was 16.0 (IQR: 11.0 to 58.0; 228% greater than RFI), and exceeded the RFI in 6/7 (85.7%) studies. CONCLUSIONS: A small number of events (median of 7) was required to convert a statistically nonsignificant finding to one that is significant for the endpoint of mortality. The median loss to follow-up exceeded the median RFI by greater than 200%, suggesting methodological limitations such as patient allocation could alter conclusions.
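The RFI and RFQ calculation summarized in this abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes a two-sided Fisher exact test at alpha = 0.05 and one common convention for the RFI (converting events to non-events in the arm with fewer events, one at a time, while keeping each arm's size constant). All function names are hypothetical, and the Fisher test is implemented directly with `math.comb` so the block has no third-party dependencies.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


def reverse_fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of event-to-non-event flips needed to reach P < alpha (None if never)."""
    if fisher_exact_two_sided(events_a, n_a - events_a,
                              events_b, n_b - events_b) < alpha:
        raise ValueError("result is already significant; the RFI is not defined")
    # Assumed convention: modify the arm that has fewer events.
    if events_a <= events_b:
        low_e, low_n, high_e, high_n = events_a, n_a, events_b, n_b
    else:
        low_e, low_n, high_e, high_n = events_b, n_b, events_a, n_a
    flips = 0
    while low_e > 0:
        low_e -= 1
        flips += 1
        if fisher_exact_two_sided(low_e, low_n - low_e,
                                  high_e, high_n - high_e) < alpha:
            return flips
    return None


def reverse_fragility_quotient(rfi, total_participants):
    """RFQ: the RFI divided by the total study sample size."""
    return rfi / total_participants
```

Under this convention, an RFQ of 0.016 corresponds to the abstract's reading of 1.6 mortality events per 100 participants.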


Subject(s)
Hip Replacement Arthroplasty, Femoral Neck Fractures, Hemiarthroplasty, Hip Fractures, Humans, Bone Cements/therapeutic use, Randomized Controlled Trials as Topic, Hip Fractures/surgery, Femoral Neck Fractures/surgery
11.
J Arthroplasty ; 2024 Oct 16.
Article in English | MEDLINE | ID: mdl-39424245

ABSTRACT

BACKGROUND: Total joint arthroplasty (TJA) is well recognized for improving quality of life and functional outcomes of patients with osteoarthritis; however, TJA's impact on body weight remains unclear. Recent trends have demonstrated a shift among TJA patients, such that patients who have higher body mass indices (BMIs) are undergoing this common surgery. Given this trend, it is critical to characterize the impact TJA may have on body weight/BMI. This meta-analysis aimed to quantitatively assess whether patients lose, gain, or maintain body weight/BMI after TJA. METHODS: This study followed the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Ovid MEDLINE, Embase, and the Cochrane Central Register of Controlled Trials databases were queried from inception through July 2022. Included studies: (1) reported on weight or BMI after elective, primary total hip arthroplasty (THA) or total knee arthroplasty (TKA); (2) weight/BMI change was deemed to be associated with THA/TKA. Excluded studies: (1) included weight/BMI interventions; (2) reported on unicompartmental/partial arthroplasty, revision arthroplasty, or joint arthroscopy. Meta-analyses for weight change, BMI change, and proportion of patients achieving clinically significant change were performed using random-effects models. Factors associated with clinically significant change were systematically reported. A total of 60,837 patients from 39 studies were included. RESULTS: No significant differences existed between preoperative and postoperative weights (P = 1.0; P = 0.28) or BMIs (P = 1.0; P = 1.0) after THA or TKA, respectively. Overall, 66% of THA patients (P < 0.01) and 65% of TKA patients (P < 0.01) did not experience clinically significant weight change. Age, preoperative BMI, and sex were most often reported to be associated with clinically significant weight/BMI change.
CONCLUSIONS: Two-thirds of patients undergoing TJA maintained their preoperative body weight/BMI after surgery. With these results, orthopaedic surgeons can better manage patient expectations of TJA.

12.
J Arthroplasty ; 39(5): 1191-1198.e2, 2024 May.
Article in English | MEDLINE | ID: mdl-38007206

ABSTRACT

BACKGROUND: The radiographic assessment of bone morphology impacts implant selection and fixation type in total hip arthroplasty (THA) and is important to minimize the risk of periprosthetic femur fracture (PFF). We utilized a deep-learning algorithm to automate femoral radiographic parameters and determined which automated parameters were associated with early PFF. METHODS: Radiographs from a publicly available database and from patients undergoing primary cementless THA at a high-volume institution (2016 to 2020) were obtained. A U-Net algorithm was trained to segment femoral landmarks for bone morphology parameter automation. Automated parameters were compared against those of a fellowship-trained surgeon and compared in an independent cohort of 100 patients who underwent THA (50 with early PFF and 50 controls matched by femoral component, age, sex, body mass index, and surgical approach). RESULTS: On the independent cohort, the algorithm generated 1,710 unique measurements for 95 images (5% lesser trochanter identification failure) in 22 minutes. Medullary canal width, femoral cortex width, canal flare index, morphological cortical index, canal bone ratio, and canal calcar ratio had good-to-excellent correlation with surgeon measurements (Pearson's correlation coefficient: 0.76 to 0.96). Canal calcar ratios (0.43 ± 0.08 versus 0.40 ± 0.07) and canal bone ratios (0.39 ± 0.06 versus 0.36 ± 0.06) were higher (P < .05) in the PFF cohort when comparing the automated parameters. CONCLUSIONS: Deep-learning automated parameters demonstrated differences in patients who had and did not have early PFF after cementless primary THA. This algorithm has the potential to complement and improve patient-specific PFF risk-prediction tools.

13.
Clin Orthop Relat Res ; 481(9): 1745-1759, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37256278

ABSTRACT

BACKGROUND: Unplanned hospital readmissions after total joint arthroplasty (TJA) represent potentially serious adverse events and remain a critical measure of hospital quality. Predicting the risk of readmission after TJA may provide patients and clinicians with valuable information for preoperative decision-making. QUESTIONS/PURPOSES: (1) Can nonlinear machine-learning models integrating preoperatively available patient, surgeon, hospital, and county-level information predict 30-day unplanned hospital readmissions in a large cohort of nationwide Medicare beneficiaries undergoing TJA? (2) Which predictors are the most important in predicting 30-day unplanned hospital readmissions? (3) What specific information regarding population-level associations can we obtain from interpreting partial dependency plots (plots describing, given our modeling choice, the potentially nonlinear shape of associations between predictors and readmissions) of the most important predictors of 30-day readmission? METHODS: National Medicare claims data (chosen because this database represents a large proportion of patients undergoing TJA annually) were analyzed for patients undergoing inpatient TJA between October 2016 and September 2018. A total of 679,041 TJAs (239,391 THAs [61.3% women, 91.9% White, 52.6% between 70 and 79 years old] and 439,650 TKAs [63.3% women, 90% White, 55.2% between 70 and 79 years old]) were included. Model features included demographics, county-level social determinants of health, prior-year (365-day) hospital and surgeon TJA procedure volumes, and clinical classification software-refined diagnosis and procedure categories summarizing each patient's Medicare claims 365 days before TJA. 
Machine-learning models, namely generalized additive models with pairwise interactions (prediction models consisting of both univariate predictions and pairwise interaction terms that allow for nonlinear effects), were trained and evaluated for predictive performance using the area under the receiver operating characteristic curve (AUROC; 1.0 = perfect discrimination, 0.5 = no better than random chance) and the area under the precision-recall curve (AUPRC; equivalent to the average positive predictive value, which does not give credit for guessing "no readmission" when this is true most of the time, interpretable relative to the base rate of readmissions) on two holdout samples. All admissions (except the last 2 months' worth) were collected and split randomly 80%/20%. The training cohort was formed with the random 80% sample, which was downsampled (so it included all readmissions and a random, equal number of nonreadmissions). The random 20% sample served as the first test cohort ("random holdout"). The last 2 months of admissions (originally held aside) served as the second test cohort ("2-month holdout"). Finally, feature importances (the degree to which each variable contributed to the predictions) and partial dependency plots were investigated to answer the second and third research questions. RESULTS: For the random holdout sample, model performance values in terms of AUROC and AUPRC were 0.65 and 0.087, respectively, for THA and 0.66 and 0.077, respectively, for TKA. For the 2-month holdout sample, these numbers were 0.66 and 0.087 and 0.65 and 0.075. Thus, our nonlinear models incorporating a wide variety of preoperative features from Medicare claims data could not adequately predict the individual likelihood of readmissions (that is, the models performed poorly and are not appropriate for clinical use).
The most predictive features (in terms of mean absolute scores) and their partial dependency graphs still confer information about population-level associations with increased risk of readmission, namely with older patient age, low prior 365-day surgeon and hospital TJA procedure volumes, being a man, patient history of cardiac diagnoses and lack of oncologic diagnoses, and higher county-level rates of hospitalizations for ambulatory-care sensitive conditions. Further inspection of partial dependency plots revealed nonlinear population-level associations specifically for surgeon and hospital procedure volumes. The readmission risk for THA and TKA decreased as surgeons performed more procedures in the prior 365 days, up to approximately 75 TJAs (odds ratio [OR] = 1.2 for TKA and 1.3 for THA), but no further risk reduction was observed for higher annual surgeon procedure volumes. For THA, the readmission risk decreased as hospitals performed more procedures, up to approximately 600 TJAs (OR = 1.2), but no further risk reduction was observed for higher annual hospital procedure volumes. CONCLUSION: A large dataset of Medicare claims and machine learning were inadequate to provide a clinically useful individual prediction model for 30-day unplanned readmissions after TKA or THA, suggesting that other factors that are not routinely collected in claims databases are needed for predicting readmissions. Nonlinear population-level associations between low surgeon and hospital procedure volumes and increased readmission risk were identified, including specific volume thresholds above which the readmission risk no longer decreases, which may still be indirectly clinically useful in guiding policy as well as patient decision-making when selecting a hospital or surgeon for treatment. LEVEL OF EVIDENCE: Level III, therapeutic study.
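The cohort construction described in this abstract (a temporal holdout, a random 80%/20% split of the remaining admissions, and downsampling of the training portion to a 1:1 class balance) can be sketched in Python. The record layout (`day`, `readmitted`), the function names, and the seed are assumptions chosen for illustration, not details reported by the study.

```python
import random


def split_cohorts(records, cutoff_day, seed=0):
    """Return (train, random_holdout, temporal_holdout).

    records: list of dicts with a numeric 'day' and a boolean 'readmitted'.
    Admissions on or after cutoff_day form the temporal holdout.
    """
    rng = random.Random(seed)
    temporal_holdout = [r for r in records if r["day"] >= cutoff_day]
    earlier = [r for r in records if r["day"] < cutoff_day]
    rng.shuffle(earlier)

    # Random 80%/20% split of the earlier admissions.
    n_train = (len(earlier) * 4) // 5
    train_pool, random_holdout = earlier[:n_train], earlier[n_train:]

    # Downsample: keep every readmission plus an equal number of nonreadmissions.
    positives = [r for r in train_pool if r["readmitted"]]
    negatives = [r for r in train_pool if not r["readmitted"]]
    sampled = rng.sample(negatives, min(len(positives), len(negatives)))
    return positives + sampled, random_holdout, temporal_holdout


def base_rate(cohort):
    """Share of readmissions in a cohort; the reference level for reading an AUPRC."""
    return sum(r["readmitted"] for r in cohort) / len(cohort)
```

Only the training cohort is rebalanced in this sketch; both holdouts keep their natural readmission rate, so an AUPRC measured on them is read against `base_rate` of the holdout, not against the balanced training set.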


Subject(s)
Arthroplasty, Replacement, Hip , Arthroplasty, Replacement, Knee , Male , Humans , Female , Aged , United States , Arthroplasty, Replacement, Hip/adverse effects , Patient Readmission , Medicare , Arthroplasty, Replacement, Knee/adverse effects , Machine Learning , Risk Factors , Retrospective Studies
14.
Arthroscopy ; 39(12): 2454-2455, 2023 12.
Article in English | MEDLINE | ID: mdl-37981387

ABSTRACT

The evolution of social media and related online sources has substantially increased the ability of patients to query and access publicly available information that may have relevance to a potential musculoskeletal condition of interest. Although increased accessibility to information has several purported benefits, including encouragement of patients to become more invested in their care through self-teaching, a downside to the existence of a vast number of unregulated resources remains the risk of misinformation. As health care providers, we have a moral and ethical obligation to mitigate this risk by directing patients to high-quality resources for medical information and to be aware of resources that are unreliable. To this end, a growing body of evidence has suggested that YouTube lacks reliability and quality in terms of medical information concerning a variety of musculoskeletal conditions.


Subject(s)
Musculoskeletal Diseases , Humans , Reproducibility of Results
15.
Arthroscopy ; 39(2): 151-158, 2023 02.
Article in English | MEDLINE | ID: mdl-35561871

ABSTRACT

With the plethora of machine learning (ML) analyses published in the orthopaedic literature within the last 5 years, several attempts have been made to enhance our understanding of what exactly ML means and how it is used. At its most fundamental level, ML comprises a branch of artificial intelligence that uses algorithms to analyze and learn from patterns in data without explicit programming or human intervention. In contrast, traditional statistics require a user to specifically choose variables of interest to create a model capable of predicting an outcome, the output of which (1) may be falsely influenced by the variables chosen to be included by the user and (2) does not allow for optimization of performance. Early publications have served as succinct editorials or reviews intended to ease audiences unfamiliar with ML into the complexities that accompany the subject. Most commonly, the focus of these studies concerns the terminology and concepts surrounding ML because it is important to understand the rationale behind performing such studies. Unfortunately, these publications only touch on the most basic aspects of ML and are too frequently repetitive. Indeed, the conclusions of these articles reiterate that the potential clinical utility of these algorithms remains tangential at best in their current form and caution against premature adoption without external validation. As a result, our perspective and ability to draw our own conclusions from these studies have not advanced, and we are left concluding with each subsequent study that a new algorithm is published for an outcome of interest that cannot be used until further validation. What readers now need is to return to and embrace the principles of the scientific method that they have used to critically assess vast numbers of publications before this wave of newly applied statistical methodology: a guide to interpret results such that their own conclusions can be drawn. 
LEVEL OF EVIDENCE: Level V, expert opinion.


Subject(s)
Artificial Intelligence , Machine Learning , Humans , Algorithms , Upper Extremity
16.
Arthroscopy ; 39(5): 1330-1344, 2023 05.
Article in English | MEDLINE | ID: mdl-36649827

ABSTRACT

PURPOSE: To assess the relationship between pitch velocity and throwing arm kinetics, injury, and ulnar collateral ligament reconstruction (UCLr) among high school, collegiate, and professional baseball pitchers. METHODS: The Cochrane Database of Systematic Reviews, the Cochrane Central Register of Controlled Trials, PubMed (2008-2019), and OVID/MEDLINE (2008-2019) were queried for articles that reported on pitch velocity predicting throwing arm kinetics, injury, or UCLr. The Methodological Index for Non-randomized Studies checklist was used to evaluate the quality of all included studies. Descriptive statistics with ranges were used to quantify data where appropriate. RESULTS: A total of 24 studies (Level of Evidence II-V) examining 2,896 pitchers were included. Intergroup analysis noted pitch velocity was significantly correlated with elbow varus torque in high school (R² = 0.36), collegiate (R² = 0.29), and professional (R² = 0.076) pitchers. Elbow distraction force was positively associated with ball velocity in interpitcher analyses of high school (R² = 0.373), professional (R² = 0.175), and mixed-cohort evaluations (R² = 0.624). Intragroup analysis demonstrated a strong association between pitch velocity and elbow varus torque (R² = 0.922-0.957) and elbow distraction force (R² = 0.910) in professional pitchers. Faster ball velocity was positively associated with a history of throwing arm injury (R² = 0.194) in nonadult pitchers. In 2 studies evaluating professionals, injured pitchers had faster pitch velocity before injury compared with uninjured controls (P = .014; P = .0354). The need for UCLr was positively correlated with pitch velocity (R² = 0.036) in professional pitchers. Pitch velocity showed little to no decrease after UCLr. 
CONCLUSIONS: Professional baseball pitchers with faster pitch velocity may be at the greatest risk of elbow injury and subsequent UCLr, potentially through the mechanism of increased distractive forces on the medial elbow complex. When a pitcher ultimately undergoes UCLr, decreases in pitching performance are unlikely but may occur, which should encourage pitchers to be cautious about maximizing pitch velocity. LEVEL OF EVIDENCE: Level IV, systematic review of Level II-IV studies.
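As a reading aid for the coefficients of determination reported in this abstract, the following worked relation may help. It assumes each value comes from a simple linear regression with a single predictor (pitch velocity), which the abstract does not state explicitly; under that assumption the magnitude of the correlation coefficient is the square root of the coefficient of determination.

```latex
% Assumption: one-predictor linear regression, so |r| = \sqrt{R^2}
\[
|r| = \sqrt{R^{2}}
\]
\[
R^{2} = 0.36 \;\Rightarrow\; |r| = 0.60, \qquad
R^{2} = 0.076 \;\Rightarrow\; |r| \approx 0.28, \qquad
R^{2} = 0.036 \;\Rightarrow\; |r| \approx 0.19
\]
```

Read this way, pitch velocity accounts for 36% of the variance in elbow varus torque among high school pitchers in the intergroup analysis, but only about 4% of the variance associated with the need for UCLr among professionals.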


Subject(s)
Arm , Baseball , Collateral Ligament, Ulnar , Ulnar Collateral Ligament Reconstruction , Adolescent , Humans , Arm/physiology , Arm/surgery , Baseball/injuries , Biomechanical Phenomena , Collateral Ligament, Ulnar/injuries , Collateral Ligament, Ulnar/surgery , Elbow Joint/surgery
17.
Arthroscopy ; 39(3): 592-599, 2023 03.
Article in English | MEDLINE | ID: mdl-36575108

ABSTRACT

PURPOSE: To determine the incidence of ramp lesions and posteromedial tibial plateau (PMTP) bone bruising on magnetic resonance imaging (MRI) in patients with multiligament knee injuries (MLKIs) and an intact anterior cruciate ligament (ACL). METHODS: A retrospective review of consecutive patients surgically treated for MLKIs at 2 level I trauma centers between January 2001 and March 2021 was performed. Only MLKIs with an intact ACL that received MRI scans within 90 days of the injury were included. All MLKIs were diagnosed on MRI and confirmed with operative reports. Two musculoskeletal radiologists retrospectively rereviewed preoperative MRIs for evidence of medial meniscus ramp lesions (MMRLs) and PMTP bone bruises using previously established classification systems. Intraclass correlation coefficients were used to calculate the reliability between the radiologists. The incidence of MMRLs and PMTP bone bruises was quantified using descriptive statistics. RESULTS: A total of 221 MLKIs were identified, of which 32 (14.5%) had an intact ACL (87.5% male; mean age of 29.9 ± 8.6 years) and were included. The most common MLKI pattern was combined injury to the posterior cruciate ligament and posterolateral corner (n = 27, 84.4%). PMTP bone bruises were observed in 12 of 32 (37.5%) patients. Similarly, MMRLs were diagnosed in 12 of 32 (37.5%) patients. A total of 8 of 12 (66.7%) patients with MMRLs demonstrated evidence of PMTP bone bruising. CONCLUSIONS: Over one-third of MLKI patients with an intact ACL were diagnosed with MMRLs on MRI in this series. PMTP bone bruising was observed in 66.7% of patients with MMRLs, suggesting that increased vigilance for identifying MMRLs at the time of ligament reconstruction should be practiced in patients with this bone bruising pattern. LEVEL OF EVIDENCE: Level IV, retrospective case series.
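The percentages in this abstract follow directly from the reported counts, as the short check below shows. The helper name `percent` is introduced here for illustration; the counts are taken from the abstract.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in the abstract.
intact_acl_share = percent(32, 221)   # MLKIs with an intact ACL
pmtp_bruise_rate = percent(12, 32)    # PMTP bone bruises among included patients
mmrl_rate = percent(12, 32)           # ramp lesions among included patients
bruise_among_mmrl = percent(8, 12)    # PMTP bruising among patients with MMRLs

print(intact_acl_share, pmtp_bruise_rate, mmrl_rate, bruise_among_mmrl)
```

Because 12 patients had MMRLs and 12 had PMTP bone bruises, the 8 patients with both findings also make up 8 of 12 of the patients with bone bruising. That conditional proportion is the one most relevant to the recommendation about vigilance in patients with this bruising pattern.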


Subject(s)
Anterior Cruciate Ligament Injuries , Contusions , Knee Injuries , Humans , Male , Young Adult , Adult , Female , Anterior Cruciate Ligament/diagnostic imaging , Anterior Cruciate Ligament/surgery , Menisci, Tibial/surgery , Retrospective Studies , Anterior Cruciate Ligament Injuries/diagnostic imaging , Anterior Cruciate Ligament Injuries/surgery , Anterior Cruciate Ligament Injuries/complications , Reproducibility of Results , Knee Injuries/diagnostic imaging , Knee Injuries/epidemiology , Knee Injuries/surgery , Contusions/diagnostic imaging , Contusions/epidemiology , Contusions/etiology , Magnetic Resonance Imaging
18.
Arthroscopy ; 39(2): 245-252, 2023 02.
Article in English | MEDLINE | ID: mdl-36049587

ABSTRACT

PURPOSE: To compare complication rates and 5-year reoperation rates between open debridement (OD) and arthroscopic debridement (AD) for lateral epicondylitis. METHODS: The PearlDiver MUExtr database (2010-2019) was reviewed for patients diagnosed with lateral epicondylitis (queried by International Classification of Diseases, Ninth Revision and International Classification of Diseases, Tenth Revision [ICD-10] codes) undergoing OD or AD of the common extensor tendon without repair (queried by Current Procedural Terminology codes). Patients were stratified into 2 cohorts: those who underwent AD and those who underwent OD. Nonoperative treatment modalities used within 1 year before the index procedure were reported for both groups. The rates of 90-day postoperative complications were compared, and multivariate logistic regression analysis was used to identify risk factors for complications. The 5-year reoperation rates, using laterality-specific ICD-10 codes, were also compared between the 2 groups. RESULTS: In total, 19,280 patients (OD = 17,139, AD = 2,141) were analyzed in this study. The most common nonoperative treatments for patients who underwent OD or AD were corticosteroid injections (49.5% vs 43.2%), physical therapy (24.8% vs 25.7%), bracing (2.8% vs 3.2%), and platelet-rich plasma injections (1.3% vs 1.0%). There were no significant differences in radial nerve injuries, hematomas, surgical site infections, wound dehiscence, and sepsis events between the 2 procedures (P = .50). The 5-year reoperation rate was not significantly different between the AD (5.0%) and OD (3.9%) cohorts (P = .10). CONCLUSIONS: For lateral epicondylitis, both AD and OD of the extensor carpi radialis brevis (without repair) were found to have low rates of 90-day adverse events, with no significant differences between the 2 approaches. Similarly, the 5-year reoperation rate was low and not statistically different for those treated with OD or AD. 
LEVEL OF EVIDENCE: Level III, cross-sectional study.


Subject(s)
Tennis Elbow , Humans , Tennis Elbow/surgery , Tennis Elbow/complications , Reoperation , Debridement/methods , Cross-Sectional Studies , Muscle, Skeletal/surgery , Arthroscopy/methods , Retrospective Studies
19.
Knee Surg Sports Traumatol Arthrosc ; 31(3): 725-732, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36581682

ABSTRACT

A meta-analysis is the quantitative synthesis of data from two or more individual studies and is, as a rule, an important method of obtaining a more accurate estimate of the direction and magnitude of a treatment effect. However, it is imperative that the meta-analysis be performed with proper, rigorous methodology to ensure validity of the results and their interpretation. In this article, the authors review the most important questions researchers should consider when planning a meta-analysis to ensure proper indications and methodologies, minimize the risk of bias, and avoid misleading conclusions.
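To make "quantitative synthesis" concrete, here is a minimal sketch of one common pooling approach, fixed-effect inverse-variance weighting. The abstract does not prescribe this or any other method; the function name `pool_fixed_effect` and the three study estimates are hypothetical and chosen only to show the mechanics.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted mean of study estimates and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors from three studies.
    log_ors = [math.log(0.5), math.log(0.8), math.log(0.6)]
    ses = [0.30, 0.20, 0.25]
    pooled, se = pool_fixed_effect(log_ors, ses)
    low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
    print("pooled OR:", math.exp(pooled), "95% CI:", (low, high))
```

More precise studies (smaller standard errors) receive larger weights, so the pooled standard error is smaller than that of any single study. A random-effects model would be the usual alternative when the studies are heterogeneous.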


Subject(s)
Bias , Meta-Analysis as Topic , Research Design , Humans
20.
Knee Surg Sports Traumatol Arthrosc ; 31(8): 3339-3352, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37000243

ABSTRACT

PURPOSE: To perform a meta-analysis of randomized controlled trials (RCTs) evaluating donor-site morbidity after bone-patellar tendon-bone (BTB), hamstring tendon (HT) and quadriceps tendon (QT) autograft harvest for anterior cruciate ligament reconstruction (ACLR). METHODS: PubMed, OVID/Medline and Cochrane databases were queried in July 2022. All Level I articles reporting the frequency of specific donor-site morbidity were included. Frequentist model network meta-analyses with P-scores were conducted to compare the prevalence of donor-site morbidity, complications, all-cause reoperations and revision ACLR among the three treatment groups. RESULTS: Twenty-one RCTs comprising the outcomes of 1726 patients were included. The overall pooled rate of donor-site morbidity (defined as anterior knee pain, difficulty/impossibility kneeling, or combination) was 47.3% (range, 3.8-86.7%). Compared with BTB autograft, the odds of incurring donor-site morbidity were 69% lower with HT autograft (odds ratio [OR] 95% confidence interval [95% CI]: 0.18-0.56) and 88% lower with QT autograft (OR 95% CI: 0.04-0.33) (p < 0.0001 for both). QT autograft was associated with a reduction in donor-site morbidity compared with HT autograft that was not statistically significant (OR: 0.37, 95% CI: 0.14-1.03, n.s.). Treatment rankings (ordered from best-to-worst autograft choice with respect to donor-site morbidity) were as follows: (1) QT (P-score = 0.99), (2) HT (P-score = 0.51) and (3) BTB (P-score = 0.00). No statistically significant associations were observed between autograft and complications (n.s.), reoperations (n.s.) or revision ACLR (n.s.). CONCLUSION: ACLR using HT and QT autograft tissue was associated with a significant reduction in donor-site morbidity compared to BTB autograft. Autograft selection was not associated with complications, all-cause reoperations, or revision ACLR. 
Based on the current data, there is sufficient evidence to recommend that autograft selection should be personalized through considering differential rates of donor-site morbidity in the context of patient expectations and activity level without concern for a clinically important change in the rate of adverse events. LEVEL OF EVIDENCE: Level I.
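The relation between the "percent lower odds" figures and the odds-ratio confidence intervals in this abstract can be sketched as follows. The point estimates 0.31 and 0.12 are not printed in the abstract; they are inferred here from the reported 69% and 88% reductions and should be read as an assumption. The function names are likewise introduced for illustration.

```python
def percent_lower_odds(odds_ratio):
    """Convert an odds ratio below 1 into a percentage reduction in odds."""
    return (1 - odds_ratio) * 100


def ci_excludes_null(low, high, null=1.0):
    """True when the confidence interval does not contain the null value."""
    return not (low <= null <= high)


# Confidence intervals and the QT-vs-HT odds ratio are from the abstract;
# the first two odds ratios are inferred from the reported percentages.
comparisons = {
    "HT vs BTB": (0.31, 0.18, 0.56),
    "QT vs BTB": (0.12, 0.04, 0.33),
    "QT vs HT": (0.37, 0.14, 1.03),
}

for name, (odds_ratio, low, high) in comparisons.items():
    reduction = round(percent_lower_odds(odds_ratio))
    print(name, f"{reduction}% lower odds,", "CI excludes 1:", ci_excludes_null(low, high))
```

The QT versus HT interval spans 1, which is consistent with the abstract reporting that comparison as not statistically significant even though the point estimate favors QT.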


Subject(s)
Anterior Cruciate Ligament Injuries , Anterior Cruciate Ligament Reconstruction , Hamstring Tendons , Patellar Ligament , Humans , Autografts/surgery , Patellar Ligament/surgery , Network Meta-Analysis , Anterior Cruciate Ligament Injuries/surgery , Randomized Controlled Trials as Topic , Tendons/transplantation , Anterior Cruciate Ligament Reconstruction/methods , Transplantation, Autologous , Hamstring Tendons/transplantation , Morbidity , Bone-Patellar Tendon-Bone Grafting/adverse effects , Bone-Patellar Tendon-Bone Grafting/methods