Results 1 - 20 of 182
1.
Article in English | MEDLINE | ID: mdl-38470976

ABSTRACT

BACKGROUND: Estimating the risk of revision after arthroplasty could inform patient and surgeon decision-making. However, there is a lack of well-performing prediction models assisting in this task, which may be due to current conventional modeling approaches such as traditional survivorship estimators (such as Kaplan-Meier) or competing risk estimators. Recent advances in machine learning survival analysis might improve decision support tools in this setting. Therefore, this study aimed to assess the performance of machine learning compared with that of conventional modeling to predict revision after arthroplasty. QUESTION/PURPOSE: Does machine learning perform better than traditional regression models for estimating the risk of revision for patients undergoing hip or knee arthroplasty? METHODS: Eleven datasets from published studies from the Dutch Arthroplasty Register reporting on factors associated with revision or survival after partial or total knee and hip arthroplasty between 2018 and 2022 were included in our study. The 11 datasets were observational registry studies, with a sample size ranging from 3038 to 218,214 procedures. We developed a set of time-to-event models for each dataset, leading to 11 comparisons. A set of predictors (factors associated with revision surgery) was identified based on the variables that were selected in the included studies. We assessed the predictive performance of two state-of-the-art statistical time-to-event models for 1-, 2-, and 3-year follow-up: a Fine and Gray model (which models the cumulative incidence of revision) and a cause-specific Cox model (which models the hazard of revision). These were compared with a machine-learning approach (a random survival forest model, which is a decision tree-based machine-learning algorithm for time-to-event analysis). 
Performance was assessed according to discriminative ability (time-dependent area under the receiver operating characteristic curve), calibration (slope and intercept), and overall prediction error (scaled Brier score). Discrimination, known as the area under the receiver operating characteristic curve, measures the model's ability to distinguish patients who achieved the outcomes from those who did not and ranges from 0.5 to 1.0, with 1.0 indicating the highest discrimination score and 0.5 the lowest. Calibration plots the predicted versus the observed probabilities; a perfect plot has an intercept of 0 and a slope of 1. The Brier score calculates a composite of discrimination and calibration, with 0 indicating perfect prediction and 1 the poorest. A scaled version of the Brier score, 1 - (model Brier score/null model Brier score), can be interpreted as the amount of overall prediction error. RESULTS: Using machine learning survivorship analysis, we found no differences between the competing risks estimator and traditional regression models for patients undergoing arthroplasty in terms of discriminative ability (patients who received a revision compared with those who did not). We found no consistent differences between the validated performance (time-dependent area under the receiver operating characteristic curve) of different modeling approaches because the differences in these values ranged between -0.04 and 0.03 across the 11 datasets (the time-dependent area under the receiver operating characteristic curve of the models across 11 datasets ranged from 0.52 to 0.68). In addition, the calibration metrics and scaled Brier scores produced comparable estimates, showing no advantage of machine learning over traditional regression models. CONCLUSION: Machine learning did not outperform traditional regression models.
CLINICAL RELEVANCE: Neither machine learning modeling nor traditional regression methods were sufficiently accurate to offer prognostic information when predicting revision arthroplasty. The benefit of these modeling approaches may be limited in this context.
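
The discrimination measure described in this abstract can be illustrated with a minimal sketch. It computes an ordinary (not time-dependent) area under the receiver operating characteristic curve as the share of event/non-event patient pairs in which the patient with the event received the higher predicted risk. It ignores censoring and competing risks, which the study's time-dependent version accounts for, and the function name and sample data are hypothetical.

```python
def auc_pairwise(risks, events):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    pos = [r for r, e in zip(risks, events) if e == 1]
    neg = [r for r, e in zip(risks, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # event patient ranked above non-event patient
            elif p == n:
                concordant += 0.5   # tie: counted as half-concordant
    return concordant / (len(pos) * len(neg))


# Hypothetical predicted revision risks and observed outcomes (1 = revised).
risks = [0.02, 0.05, 0.06, 0.08, 0.04, 0.01]
events = [0, 1, 0, 1, 0, 0]
print(auc_pairwise(risks, events))
```

A value near 0.5 means the ranking is no better than chance, which is the region where the models in this study (0.52 to 0.68) largely sat.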

2.
Clin Orthop Relat Res ; 481(12): 2309-2315, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37707789

ABSTRACT

BACKGROUND: In the setting of a suspected scaphoid fracture, MRI may result in overdiagnosis and potential overtreatment. This is in part because of the low prevalence of true fractures among suspected fractures, but also because of potentially misleading variations in signal that may be more common than fracture-related signal changes. To better understand the risk of overdiagnosis, we first need insight into the relative prevalence of useful and potentially distracting signal changes among patients with a suspected scaphoid fracture. QUESTION/PURPOSE: What is the proportion of signal changes representing definite and possible scaphoid fractures relative to other types of signal changes on MRI among patients with a suspected scaphoid fracture? METHODS: In a retrospective study in an orthopaedic trauma clinic associated with a Level I trauma center, we evaluated MR images of patients 16 years and older with a clinically suspected scaphoid fracture. At our institution, patients with symptoms and signs of a possible scaphoid fracture and negative radiographs undergo MRI scanning. Between January 1, 2012, and September 1, 2019, a total of 310 patients 16 years or older had an MRI to evaluate a suspected scaphoid fracture. Exclusion criteria included a scaphoid fracture that was visible on radiographs before MRI as reported by the radiologist (four patients), no available radiographs before MRI (two), MRI more than 3 weeks after injury (28), unknown date of injury (nine), and repeat or bilateral MRI scans (11), leaving 256 MR images for analysis. Sixty percent (153) of patients were women, and the median age was 34 years (IQR 21 to 50 years). The images were taken a median of 8 days (IQR 2 to 12 days) after injury. MR images were screened for the presence of scaphoid signal changes. 
We identified the following patterns of signal change with a reliability of kappa 0.62: definite scaphoid fracture, possible scaphoid fracture, signal in the waist area other than possible or definite fractures, and other signal changes. A definite scaphoid fracture was defined as a linear, focal, and bicortical signal abnormality, with adjacent edema and a relatively transverse orientation relative to the scaphoid long axis. The transverse linear signal was visible on more than one cut in multiple planes. A possible scaphoid fracture had a transverse linear signal on more than one cut on sagittal or coronal planes, with or without adjacent edema. RESULTS: Six percent (16 of 256) of MR images were categorized as revealing definite (2% [four of 256]) or possible (5% [12 of 256]) scaphoid fractures, whereas 29% (74 of 256) were categorized as revealing nonspecific signal changes at the waist (14% [35 of 256]) and other areas (15% [39 of 256]). Of the 51 patients with scaphoid waist signal changes, 69% (35) were categorized as having distracting and potentially misleading MRI findings. CONCLUSION: The high prevalence of signal changes that are distracting and potentially misleading, the low prevalence of signal changes that clearly represent a scaphoid fracture, and the low pretest odds of a true fracture among patients with a suspected scaphoid fracture illustrate that routine MRI of suspected scaphoid fractures carries a notable risk of overdiagnosis and potential overtreatment. Two alternative strategies are supported by preliminary evidence and merit additional attention: more-selective use of MRI in people deemed at higher risk according to a clinical prediction rule and strategies for involving the patient in decisions regarding how to manage the notably small risk of future symptomatic nonunion. LEVEL OF EVIDENCE: Level IV, diagnostic study.


Subject(s)
Bone Fractures, Hand Injuries, Scaphoid Bone, Wrist Injuries, Humans, Female, Adult, Male, Bone Fractures/diagnostic imaging, Bone Fractures/epidemiology, Scaphoid Bone/diagnostic imaging, Scaphoid Bone/injuries, Overdiagnosis, Retrospective Studies, Reproducibility of Results, Magnetic Resonance Imaging, Wrist Injuries/diagnostic imaging, Wrist Injuries/epidemiology, Edema
3.
Clin Orthop Relat Res ; 481(12): 2419-2430, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37229565

ABSTRACT

BACKGROUND: The ability to predict survival accurately in patients with osseous metastatic disease of the extremities is vital for patient counseling and guiding surgical intervention. We, the Skeletal Oncology Research Group (SORG), previously developed a machine-learning algorithm (MLA) based on data from 1999 to 2016 to predict 90-day and 1-year survival of surgically treated patients with extremity bone metastasis. As treatment regimens for oncology patients continue to evolve, this SORG MLA-driven probability calculator requires temporal reassessment of its accuracy. QUESTION/PURPOSE: Does the SORG-MLA accurately predict 90-day and 1-year survival in patients who receive surgical treatment for a metastatic long-bone lesion in a more recent cohort of patients treated between 2016 and 2020? METHODS: Between 2017 and 2021, we identified 674 patients 18 years and older through the ICD codes for secondary malignant neoplasm of bone and bone marrow and CPT codes for completed pathologic fractures or prophylactic treatment of an impending fracture. We excluded 40% (268 of 674) of patients, including 18% (118) who did not receive surgery; 11% (72) who had metastases in places other than the long bones of the extremities; 3% (23) who received treatment other than intramedullary nailing, endoprosthetic reconstruction, or dynamic hip screw; 3% (23) who underwent revision surgery, 3% (17) in whom there was no tumor, and 2% (15) who were lost to follow-up within 1 year. Temporal validation was performed using data on 406 patients treated surgically for bony metastatic disease of the extremities from 2016 to 2020 at the same two institutions where the MLA was developed. Variables used to predict survival in the SORG algorithm included perioperative laboratory values, tumor characteristics, and general demographics. 
To assess the models' discrimination, we computed the c-statistic, commonly referred to as the area under the receiver operating characteristic curve (AUC) for binary classification. This value ranged from 0.5 (representing chance-level performance) to 1.0 (indicating excellent discrimination). Generally, an AUC of 0.75 is considered high enough for use in clinical practice. To evaluate the agreement between predicted and observed outcomes, a calibration plot was used, and the calibration slope and intercept were calculated. Perfect calibration would result in a slope of 1 and intercept of 0. For overall performance, the Brier score and null-model Brier score were determined. The Brier score can range from 0 (representing perfect prediction) to 1 (indicating the poorest prediction). Proper interpretation of the Brier score necessitates a comparison with the null-model Brier score, which represents the score for an algorithm that predicts a probability equal to the population prevalence of the outcome for each patient. Finally, a decision curve analysis was conducted to compare the potential net benefit of the algorithm with other decision-support methods, such as treating all or none of the patients. Overall, 90-day and 1-year mortality were lower in the temporal validation cohort than in the development cohort (90-day: 23% versus 28%; p < 0.001, and 1-year: 51% versus 59%; p < 0.001). RESULTS: Mortality was lower in the validation cohort than in the cohort on which the model was trained: 23% versus 28% at the 90-day timepoint and 51% versus 59% at the 1-year timepoint. The AUC was 0.78 (95% CI 0.72 to 0.82) for 90-day survival and 0.75 (95% CI 0.70 to 0.79) for 1-year survival, indicating the model could distinguish the two outcomes reasonably.
For the 90-day model, the calibration slope was 0.71 (95% CI 0.53 to 0.89), and the intercept was -0.66 (95% CI -0.94 to -0.39), suggesting the predicted risks were overly extreme, and that in general, the risk of the observed outcome was overestimated. For the 1-year model, the calibration slope was 0.73 (95% CI 0.56 to 0.91) and the intercept was -0.67 (95% CI -0.90 to -0.43). With respect to overall performance, the model's Brier scores for the 90-day and 1-year models were 0.16 and 0.22. These scores were higher than the Brier scores from internal validation in the development study (0.13 and 0.14), indicating the models' performance has declined over time. CONCLUSION: The SORG MLA to predict survival after surgical treatment of extremity metastatic disease showed decreased performance on temporal validation. Moreover, in patients undergoing innovative immunotherapy, the mortality risk was overestimated to varying degrees. Clinicians should be aware of this overestimation and discount the prediction of the SORG MLA according to their own experience with this patient population. Generally, these results show that temporal reassessment of these MLA-driven probability calculators is of paramount importance because the predictive performance may decline over time as treatment regimens evolve. The SORG-MLA is available as a freely accessible internet application at https://sorg-apps.shinyapps.io/extremitymetssurvival/. LEVEL OF EVIDENCE: Level III, prognostic study.
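
The comparison between a model's Brier score and the null-model Brier score can be made concrete with a short sketch. A null model that predicts the prevalence p for every patient has a Brier score of p(1 - p), which follows directly from the definition given in the abstract. The abstract does not report null-model scores, so the values below are derived here from the reported validation-cohort mortality (23% and 51%) and should be read as an illustration rather than as figures from the study. The scaled form, 1 - (model score / null score), and the function names are likewise assumptions of this sketch.

```python
def null_brier(prevalence):
    """Brier score of a model that predicts the prevalence for every patient."""
    return prevalence * (1.0 - prevalence)


def scaled_brier(model_brier, prevalence):
    """1 - model/null: 0 means no better than the null model, 1 means perfect."""
    return 1.0 - model_brier / null_brier(prevalence)


# Mortality and model Brier scores as quoted in the abstract; null scores are derived.
reported = [("90-day", 0.23, 0.16), ("1-year", 0.51, 0.22)]
for label, mortality, brier in reported:
    print(label,
          "null:", round(null_brier(mortality), 3),
          "scaled:", round(scaled_brier(brier, mortality), 3))
```

Under these assumptions both models score only slightly better than their null models (derived null scores of roughly 0.18 and 0.25 against reported scores of 0.16 and 0.22), which fits the abstract's conclusion that performance has declined.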


Subject(s)
Bone Neoplasms, Humans, Prognosis, Bone Neoplasms/therapy, Algorithms, Extremities, Machine Learning, Retrospective Studies
4.
BMC Med Inform Decis Mak ; 23(1): 108, 2023 06 13.
Article in English | MEDLINE | ID: mdl-37312177

ABSTRACT

BACKGROUND: Unplanned hospital readmissions are serious medical adverse events, stressful to patients, and expensive for hospitals. This study aims to develop a probability calculator to predict unplanned readmissions (PURE) within 30 days after discharge from the department of Urology, and to evaluate the diagnostic performance characteristics of the PURE probability calculator developed with machine learning (ML) algorithms, comparing regression with classification algorithms. METHODS: Eight ML models (i.e. logistic regression, LASSO regression, RIDGE regression, decision tree, bagged trees, boosted trees, XGBoost trees, RandomForest) were trained on 5,323 unique patients with 52 different features, and evaluated on diagnostic performance of PURE within 30 days of discharge from the department of Urology. RESULTS: Our main findings were that both classification and regression algorithms achieved good AUC scores (0.62-0.82), and classification algorithms showed a stronger overall performance as compared to models trained with regression algorithms. Tuning the best model, XGBoost, resulted in an accuracy of 0.83, sensitivity of 0.86, specificity of 0.57, AUC of 0.81, PPV of 0.95, and an NPV of 0.31. CONCLUSIONS: Classification models showed stronger performance than regression models with reliable prediction for patients with high probability of readmission, and should be considered as first choice. The tuned XGBoost model shows performance that indicates safe clinical application for discharge management in order to prevent an unplanned readmission at the department of Urology.
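
As a reminder of how the reported accuracy, sensitivity, specificity, PPV, and NPV relate to one another, the sketch below derives all five from the four cells of a confusion matrix. The abstract gives neither the underlying counts nor which outcome was coded as positive, so the counts here are hypothetical and are not meant to reproduce the study's figures.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # all correct calls
        "sensitivity": tp / (tp + fn),      # positives correctly flagged
        "specificity": tn / (tn + fp),      # negatives correctly cleared
        "ppv": tp / (tp + fp),              # flagged cases that are truly positive
        "npv": tn / (tn + fn),              # cleared cases that are truly negative
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=80, fp=10, tn=30, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

PPV and NPV depend on how common the positive class is, so a model can pair a high PPV with a low NPV, as in the abstract, without that pattern carrying over to a population with a different readmission rate.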


STUDY NEED AND IMPORTANCE: Unplanned readmissions form a consistent problem for many hospitals. Unplanned readmission rates can be as high as 35%, and may differ significantly between hospital departments. In addition, in the field of Urology, readmission rates can be greatly influenced by the type of surgery performed, and unplanned readmission rates can be as high as 26%. Although predicting unplanned readmissions for individual patients is often complex, due to multiple factors that need to be taken into account (e.g. functional disability, poor overall condition), there is evidence that these can be prevented when discharge management is evaluated with an objective measuring tool that facilitates risk stratification between high- and low-risk patients. However, to the best of our knowledge, such risk stratification using ML-driven probability calculators in the field of Urology has not been evaluated to date. Using ML, calculated risk scores based on analysing complex data patterns at the patient level can support safe discharge and inform clinicians about the risk of an unplanned readmission. WHAT WE FOUND: Eight ML models were trained on 5,323 unique patients with 52 different features, and evaluated on diagnostic performance. Classification models showed stronger performance than regression models with reliable prediction for patients with high probability of readmission, and should be considered as first choice. The tuned XGBoost model shows performance that indicates safe clinical application for discharge management in order to prevent an unplanned readmission at the department of Urology. Limitations of our study were the quality and presence of patient data on features, and the question of how to implement these findings in a clinical setting to transition from predicting to preventing unplanned readmissions.
INTERPRETATION FOR CLINICIANS: ML models based on classification should be first choice to predict unplanned readmissions, and the XGBoost model showed the strongest results.


Subject(s)
Patient Readmission, Urology, Humans, Algorithms, Hospitals, Machine Learning
5.
J Shoulder Elbow Surg ; 32(12): 2508-2518, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37327989

ABSTRACT

BACKGROUND: Although reverse total shoulder arthroplasty (RTSA) is considered a viable treatment strategy for proximal humeral fractures, there is an ongoing discussion of how its revision rate compares with indications performed in the elective setting. First, this study evaluated whether RTSA for fractures conveyed a higher revision rate than RTSA for degenerative conditions (osteoarthritis, rotator cuff arthropathy, rotator cuff tear, or rheumatoid arthritis). Second, this study assessed whether there was a difference in patient-reported outcomes between these 2 groups following primary replacement. Finally, the results of conventional stem designs were compared with those of fracture-specific designs within the fracture group. MATERIALS AND METHODS: This was a retrospective comparative cohort study with registry data from the Netherlands, generated prospectively between 2014 and 2020. Patients (aged ≥ 18 years) were included if they underwent primary RTSA for a fracture (<4 weeks after trauma), osteoarthritis, rotator cuff arthropathy, rotator cuff tear, or rheumatoid arthritis, with follow-up until first revision, death, or the end of the study period. The primary outcome was the revision rate. The secondary outcomes were the Oxford Shoulder Score, EuroQol 5 Dimensions (EQ-5D) score, numerical rating scale score (pain at rest and during activity), recommendation score, and scores assessing change in daily functioning and change in pain. RESULTS: This study included 8753 patients in the degenerative condition group (mean age, 74.3 ± 7.2 years) and 2104 patients in the fracture group (mean age, 74.3 ± 7.8 years). RTSA performed for fractures showed an early steep decline in survivorship: Adjusted for time, age, sex, and arthroplasty brand, the revision risk after 1 year was significantly higher in these patients than in those with degenerative conditions (hazard ratio [HR], 2.50; 95% confidence interval, 1.66-3.77). 
Over time, the HR steadily decreased, with an HR of 0.98 at year 6. Apart from the recommendation score (which was slightly better within the fracture group), there were no clinically relevant differences in the patient-reported outcome measures after 12 months. Patients who received conventional stems (n = 1137) did not have a higher likelihood of undergoing a revision procedure than those who received fracture-specific stems (n = 675) (HR, 1.70; 95% confidence interval, 0.91-3.17). CONCLUSION: Patients undergoing primary RTSA for fractures have a substantially higher likelihood of undergoing revision within the first year following the procedure than patients with degenerative conditions preoperatively. Although RTSA is regarded as a reliable and safe treatment option for fractures, surgeons should inform patients accordingly and incorporate this information in decision making when opting for head replacement surgery. There were no differences in patient-reported outcomes between the 2 groups and no differences in revision rates between conventional and fracture-specific stem designs.


Subject(s)
Rheumatoid Arthritis, Shoulder Replacement Arthroplasty, Osteoarthritis, Rotator Cuff Injuries, Shoulder Fractures, Shoulder Joint, Humans, Aged, Aged 80 and over, Shoulder Replacement Arthroplasty/adverse effects, Rotator Cuff Injuries/surgery, Rotator Cuff Injuries/etiology, Retrospective Studies, Cohort Studies, Treatment Outcome, Osteoarthritis/surgery, Osteoarthritis/etiology, Shoulder Fractures/surgery, Shoulder Fractures/etiology, Rheumatoid Arthritis/surgery, Pain/etiology, Shoulder Joint/surgery, Articular Range of Motion
6.
Arch Orthop Trauma Surg ; 143(1): 213-223, 2023 Jan.
Article in English | MEDLINE | ID: mdl-34226981

ABSTRACT

INTRODUCTION: The three-dimensional (3D) microstructure of the cortical and trabecular bone of the proximal ulna has not yet been described by means of high-resolution 3D imaging. An improved characterization can provide a better understanding of their relative contribution to resisting impact load. The aim of this study is to describe the proximal ulna bone microstructure using micro-computed tomography (micro-CT) and relate it to gross morphology and function. MATERIALS AND METHODS: Five dry cadaveric human ulnae were scanned by micro-CT (17 µm/voxel, isotropic). Both qualitative and quantitative assessments were performed on sagittal image stacks. The cortical thickness of the trochlear notch and the trabecular bone microstructure were measured in the olecranon, bare area and coronoid. RESULTS: Groups of trabecular struts starting in the bare area, spanning towards the anterior and posterior side of the proximal ulna, were observed; within the coronoid, the trabeculae were orthogonal to the joint surface. Consistently among the ulnae, the coronoid showed the highest cortical thickness (1.66 ± 0.59 mm, p = 0.04) and the olecranon the lowest (0.33 ± 0.06 mm, p = 0.04). The bare area exhibited the highest bone volume fraction (BV/TV = 43.7 ± 22.4%), the highest trabecular thickness (Tb.Th = 0.40 ± 0.09 mm) and the lowest structure model index (SMI = -0.28 ± 2.20, indicating plate-like structure), compared to the other regions (p = 0.04). CONCLUSIONS: Our microstructural results suggest that the bare area is the region where most of the loading of the proximal ulna is concentrated, whereas the coronoid, together with its anteromedial facet, is the most important bony stabilizer of the elbow joint. Studying the proximal ulna bone microstructure helps in understanding its possible everyday mechanical loading conditions and potential fractures. LEVEL OF EVIDENCE: N.A.


Subject(s)
Bone Fractures, Olecranon, Humans, X-Ray Microtomography/methods, Cancellous Bone/diagnostic imaging, Ulna/diagnostic imaging, Three-Dimensional Imaging/methods
7.
Arch Orthop Trauma Surg ; 143(6): 3119-3128, 2023 Jun.
Article in English | MEDLINE | ID: mdl-35840714

ABSTRACT

INTRODUCTION: It is unclear if collar and cuff treatment improves alignment in displaced surgical neck fractures of the proximal humerus. Therefore, this study evaluated if the neck-shaft angle and extent of displacement would improve between trauma and onset of radiographically visible callus in non-operatively treated surgical neck fractures (Boileau type A, B, C). MATERIALS AND METHODS: A consecutive series of patients (≥ 18 years old) were retrospectively evaluated from a level 1 trauma center in Australia (inclusion period: 2016-2020) and a level 2 trauma center in the Netherlands (inclusion period: 2004-2018). Patients were included if they sustained a Boileau-type fracture and underwent initial non-operative treatment. The first radiograph had to be obtained within 24 h after the initial injury and the follow-up radiograph(s) 1 week after trauma and before the start of radiographically visible callus. On each radiograph, the maximal medial gap (MMG), maximal lateral gap (MLG), and neck-shaft angle (NSA) were measured. Linear mixed modelling was performed to evaluate if these measurements would improve over time. RESULTS: Sixty-seven patients were included: 25 type A, 11 type B, and 31 type C fractures. The mean age (range) was 68 years (24-93), and the mean number (range) of follow-up radiographs per patient was 1 (1-4). Linear mixed modelling on both MMG and MLG revealed no improvement during follow-up among the three groups. Mean NSA of type A fractures improved significantly from 161° at trauma to 152° at last follow-up (p-value = 0.004). CONCLUSIONS: Apart from humeral head angulation improvement in type A, there is neither an increase nor a reduction in displacement among the three fracture patterns. Therefore, it is advised that surgical decision-making should be performed immediately after trauma. LEVEL OF CLINICAL EVIDENCE: Level IV, retrospective case series.


Subject(s)
Humeral Fractures, Shoulder Fractures, Humans, Aged, Adolescent, Retrospective Studies, Shoulder Fractures/surgery, Internal Fracture Fixation, Radiography, Humeral Head, Treatment Outcome, Humeral Fractures/diagnostic imaging, Humeral Fractures/surgery
8.
Clin Orthop Relat Res ; 480(6): 1170-1177, 2022 06 01.
Article in English | MEDLINE | ID: mdl-35230277

ABSTRACT

BACKGROUND: Tibial plateau fractures are often complex, and they can be challenging to treat. Classifying fractures is often part of the treatment process, but intra- and interobserver reliability of fracture classification systems often is inadequate to the task, and classifications that lack reliability can mislead providers and result in harm to patients. Three-dimensionally (3D)-printed models might help in this regard, but whether that is the case for the classification of tibial plateau fractures, and whether the utility of such models might vary by the experience of the individual classifying the fractures, is unknown. QUESTIONS/PURPOSES: (1) Does the overall interobserver agreement improve when fractures are classified with 3D-printed models compared with conventional radiology? (2) Does interobserver agreement vary among attending and consultant trauma surgeons, senior surgical residents, and junior surgical residents? (3) Do surgeons' and surgical residents' confidence and accuracy improve when tibial plateau fractures are classified with an additional 3D model compared with conventional radiology? METHODS: Between 2012 and 2020, 113 patients with tibial plateau fractures were treated at a Level 1 trauma center. Forty-four patients were excluded based on the presence of bone diseases (such as osteoporosis) and the absence of a CT scan. To increase the chance to detect an improvement or deterioration and to prevent observers from losing focus during the classification, we decided to include 40 patients with tibial plateau fractures. 
Nine trauma surgeons, eight senior surgical residents, and eight junior surgical residents (none of whom underwent any study-specific pretraining) classified these fractures according to three often-used classification systems (Schatzker, AO/OTA, and the Luo three-column concept), with and without 3D-printed models, and they indicated their overall confidence on a 10-point Likert scale, with 0 meaning not confident at all and 10 absolute certainty. To set the gold standard, a panel of three experienced trauma surgeons who had special expertise in knee surgery and 10 years to 25 years of experience in practice also classified the fractures until consensus was reached. The Fleiss kappa was used to determine interobserver agreement for fracture classification. Differences in confidence in assessing fractures with and without the 3D-printed model were compared using a paired t-test. Accuracy was calculated by comparing the participants' observations with the gold standard. RESULTS: The overall interobserver agreement improved minimally for fracture classification according to two of three classification systems (Schatzker: κconv = 0.514 versus κ3Dprint = 0.539; p = 0.005; AO/OTA: κconv = 0.359 versus κ3Dprint = 0.372; p = 0.03). However, none of the classification systems, even when used by our most experienced group of trauma surgeons, achieved more than moderate interobserver agreement, meaning that a large proportion of fractures were misclassified by at least one observer. Overall, there was no improvement in self-assessed confidence in classifying fractures or accuracy with 3D-printed models; confidence was high (about 7 points on a 10-point scale) as rated by all observers, despite moderate or worse accuracy and interobserver agreement. CONCLUSION: Although 3D-printed models minimally improved the overall interobserver agreement for two of three classification systems, none of the classification systems achieved more than moderate interobserver agreement.
This suggests that even with 3D-printed models, many fractures would be misclassified, which could result in misleading communication, inaccurate prognostic assessments, unclear research, and incorrect treatment choices. Therefore, we cannot recommend the use of 3D-printed models in practice and research for classification of tibial plateau fractures. LEVEL OF EVIDENCE: Level III, diagnostic study.
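
For readers unfamiliar with the agreement statistic used in this study, the sketch below shows one common way to compute the Fleiss kappa from a table of rating counts. It follows the textbook formula and is not the authors' analysis code. The function name and the sample ratings (four fractures, five raters, three categories) are hypothetical.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who put subject i in category j.
    Every subject must be rated by the same number of raters (at least two)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number of ratings")
    n_categories = len(counts[0])
    # Share of all ratings that fell in each category.
    p_cat = [sum(row[j] for row in counts) / (n_subjects * n_raters)
             for j in range(n_categories)]
    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    p_obs = sum((sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
                for row in counts) / n_subjects
    # Agreement expected by chance (undefined kappa if every rating is identical).
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical ratings: rows are fractures, columns are classification categories.
ratings = [
    [5, 0, 0],
    [3, 2, 0],
    [1, 3, 1],
    [0, 1, 4],
]
print(round(fleiss_kappa(ratings), 3))
```

Kappa is 1 when all raters agree on every subject and around 0 when agreement is no better than chance; values such as the 0.36 to 0.54 reported above fall well short of full agreement.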


Subject(s)
Surgeons, Tibial Fractures, Humans, Observer Variation, Three-Dimensional Printing, Reproducibility of Results, Tibial Fractures/diagnostic imaging, Tibial Fractures/surgery
9.
Clin Orthop Relat Res ; 480(11): 2205-2213, 2022 11 01.
Article in English | MEDLINE | ID: mdl-35561268

ABSTRACT

BACKGROUND: Postoperative delirium in patients aged 60 years or older with hip fractures adversely affects clinical and functional outcomes. The economic cost of delirium is estimated to be as high as USD 25,000 per patient, with a total budgetary impact between USD 6.6 billion and USD 82.4 billion annually in the United States alone. Forty percent of delirium episodes are preventable, and accurate risk stratification can decrease the incidence and improve clinical outcomes in patients. A previously developed clinical prediction model (the SORG Orthopaedic Research Group hip fracture delirium machine-learning algorithm) is highly accurate on internal validation (in 28,207 patients with hip fractures aged 60 years or older in a US cohort) in identifying at-risk patients, and it can facilitate the best use of preventive interventions; however, it has not been tested in an independent population. For an algorithm to be useful in real life, it must be valid externally, meaning that it must perform well in a patient cohort different from the cohort used to "train" it. Among the many promising machine-learning prediction models and delirium models, only a few have been externally validated, and even fewer in international validation studies. QUESTION/PURPOSE: Does the SORG hip fracture delirium algorithm, initially trained on a database from the United States, perform well on external validation in patients aged 60 years or older in Australia and New Zealand? METHODS: We previously developed a model in 2021 for assessing risk of delirium in hip fracture patients using records of 28,207 patients obtained from the American College of Surgeons National Surgical Quality Improvement Program.
Variables in the original model included age, American Society of Anesthesiologists (ASA) class, functional status (independent or partially or totally dependent for any activities of daily living), preoperative dementia, preoperative delirium, and preoperative need for a mobility aid. To assess whether this model could be applied elsewhere, we used records from an international hip fracture registry. Between June 2017 and December 2018, 6672 patients older than 60 years of age in Australia and New Zealand were treated surgically for a femoral neck, intertrochanteric hip, or subtrochanteric hip fracture and entered into the Australian & New Zealand Hip Fracture Registry. Patients were excluded if they had a pathological hip fracture or septic shock. Of all patients, 6% (402 of 6672) did not meet the inclusion criteria, leaving 94% (6270 of 6672) of patients available for inclusion in this retrospective analysis. Seventy-one percent (4249 of 5986) of patients were aged 80 years or older, after accounting for 5% (284 of 6270) of missing values; 68% (4292 of 6266) were female, after accounting for 0.06% (4 of 6270) of missing values, and 83% (4690 of 5661) of patients were classified as ASA III/IV, after accounting for 10% (609 of 6270) of missing values. Missing data were imputed using the missForest methodology. In total, 39% (2467 of 6270) of patients developed postoperative delirium. The performance of the SORG hip fracture delirium algorithm on the validation cohort was assessed by discrimination, calibration, Brier score, and a decision curve analysis. Discrimination, measured as the area under the receiver operating characteristic curve (c-statistic), reflects the model's ability to distinguish patients who achieved the outcomes from those who did not and ranges from 0.5 to 1.0, with 1.0 indicating the highest discrimination score and 0.5 the lowest. 
Calibration plots the predicted versus the observed probabilities; a perfect plot has an intercept of 0 and a slope of 1. The Brier score calculates a composite of discrimination and calibration, with 0 indicating perfect prediction and 1 the poorest. RESULTS: The SORG hip fracture algorithm, when applied to an external patient cohort, distinguished between patients at low risk and patients at moderate to high risk of developing postoperative delirium. The SORG hip fracture algorithm performed with a c-statistic of 0.74 (95% confidence interval 0.73 to 0.76). The calibration plot showed high accuracy in the lower predicted probabilities (intercept -0.28, slope 0.52) and a Brier score of 0.22 (the null model Brier score was 0.24). The decision curve analysis showed that the model can be beneficial compared with no model or compared with characterizing all patients as at risk for developing delirium. CONCLUSION: Algorithms developed with machine learning are a potential tool for refining treatment of at-risk patients. If high-risk patients can be reliably identified, resources can be appropriately directed toward their care. Although the current iteration of SORG should not be relied on for patient care, it suggests potential utility in assessing risk. Further assessment in different populations, made easier by international collaborations and standardization of registries, would be useful in the development of universally valid prediction models. The model can be freely accessed at: https://sorg-apps.shinyapps.io/hipfxdelirium/ . LEVEL OF EVIDENCE: Level III, therapeutic study.
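
The performance measures defined in this abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the four toy predictions are assumptions made purely for illustration. The only figure taken from the abstract is the event count (2467 of 6270), which is used to show where a null-model Brier score of about 0.24 comes from when every patient is assigned the observed prevalence.

```python
from itertools import product


def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def c_statistic(probs, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    score = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in product(events, non_events)
    )
    return score / (len(events) * len(non_events))


def null_brier(outcomes):
    """Brier score of a model that predicts the observed prevalence for everyone."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence * (1 - prevalence)


# Hypothetical predictions for four patients (illustrative values only).
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_outcomes = [0, 0, 1, 1]
print(c_statistic(toy_probs, toy_outcomes))
print(brier_score(toy_probs, toy_outcomes))

# Event count reported in the abstract: 2467 of 6270 patients with delirium.
cohort_outcomes = [1] * 2467 + [0] * (6270 - 2467)
print(round(null_brier(cohort_outcomes), 2))
```

A model is only informative on this scale if its Brier score falls below the null value, which is why the abstract reports both numbers side by side.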


Subject(s)
Delirium , Hip Fractures , Orthopedics , Activities of Daily Living , Algorithms , Australia , Delirium/diagnosis , Delirium/epidemiology , Delirium/etiology , Female , Hip Fractures/surgery , Humans , Male , Middle Aged , Models, Statistical , Prognosis , Retrospective Studies
10.
Clin Orthop Relat Res ; 480(1): 150-159, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34427569

ABSTRACT

BACKGROUND: Reliably recognizing the overall pattern and specific characteristics of proximal humerus fractures may aid in surgical decision-making. With conventional onscreen imaging modalities, there is considerable and undesired interobserver variability, even when observers receive training in the application of the classification systems used. It is unclear whether three-dimensional (3D) models, which now can be fabricated with desktop printers at relatively little cost, can decrease interobserver variability in fracture classification. QUESTIONS/PURPOSES: Do 3D-printed handheld models of proximal humerus fractures improve agreement among residents and attending surgeons regarding (1) specific fracture characteristics and (2) patterns according to the Neer and Hertel classification systems? METHODS: Plain radiographs, as well as two-dimensional (2D) and 3D CT images, were collected from 20 patients (aged 18 years or older) who sustained a three-part or four-part proximal humerus fracture treated at a Level I trauma center between 2015 and 2019. The included images were chosen to comprise images from patients whose fractures were considered as difficult-to-classify, displaced fractures. Consequently, the images were assessed for eight fracture characteristics and categorized according to the Neer and Hertel classifications by four orthopaedic residents and four attending orthopaedic surgeons during two separate sessions. In the first session, the assessment was performed with conventional onscreen imaging (radiographs and 2D and 3D CT images). In the second session, 3D-printed handheld models were used for assessment, while onscreen imaging was also available. 
Although proximal humerus classifications such as the Neer classification have, in the past, been shown to have low interobserver reliability, we theorized that by receiving direct tactile and visual feedback from 3D-printed handheld fracture models, clinicians would be able to recognize the complex 3D aspects of classification systems reliably. Interobserver agreement was determined with the multirater Fleiss kappa and scored according to the categorical rating by Landis and Koch. To determine whether there was a difference between the two sessions, we calculated the delta (difference in the) kappa value with 95% confidence intervals and a two-tailed p value. Post hoc power analysis revealed that with the current sample size, a delta kappa value of 0.40 could be detected with 80% power at alpha = 0.05. RESULTS: Using 3D-printed models in addition to conventional imaging did not improve interobserver agreement of the following fracture characteristics: more than 2 mm medial hinge displacement, more than 8 mm metaphyseal extension, surgical neck fracture, anatomic neck fracture, displacement of the humeral head, more than 10 mm lesser tuberosity displacement, and more than 10 mm greater tuberosity displacement. Agreement regarding the presence of a humeral head-splitting fracture was improved but only to a level that was insufficient for clinical or scientific use (fair to substantial, delta kappa = 0.33 [95% CI 0.02 to 0.64]). Assessing 3D-printed handheld models adjunct to onscreen conventional imaging did not improve the interobserver agreement for pattern recognition according to Neer (delta kappa = 0.02 [95% CI -0.11 to 0.07]) and Hertel (delta kappa = 0.01 [95% CI -0.11 to 0.08]). There were no differences between residents and attending surgeons in terms of whether 3D models helped them classify the fractures, but there were a few differences in identifying fracture characteristics. 
However, none of the identified differences improved to almost perfect agreement (kappa value above 0.80), so even those few differences are unlikely to be clinically useful. CONCLUSION: Using 3D-printed handheld fracture models in addition to conventional onscreen imaging of three-part and four-part proximal humerus fractures does not improve agreement among residents and attending surgeons on specific fracture characteristics and patterns. Therefore, we do not recommend that clinicians expend the time and costs needed to create these models if the goal is to classify or describe patients' fracture characteristics or pattern, since doing so is unlikely to improve clinicians' abilities to select treatment or estimate prognosis. LEVEL OF EVIDENCE: Level III, diagnostic study.
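
The agreement statistic used in this abstract, the multirater Fleiss kappa with Landis and Koch categories, can be sketched in a few lines of Python. This is an illustrative sketch, not the study's analysis code: the function names and the toy rating table are assumptions, and the ratings are invented. Each row is one fracture, each column one category, and each cell the number of raters who chose that category.

```python
def fleiss_kappa(counts):
    """Fleiss kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j (same number of raters per row)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Overall share of assignments that went to each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs, per subject.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch(kappa):
    """Landis and Koch verbal label for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings: 4 fractures, 3 raters, 2 categories (present / absent).
toy_counts = [[3, 0], [2, 1], [1, 2], [0, 3]]
kappa = fleiss_kappa(toy_counts)
print(kappa, landis_koch(kappa))
```

A delta kappa, as reported above, would be the difference between two such values computed for the two assessment sessions; the confidence interval around it needs a separate variance estimate that this sketch does not attempt.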


Subject(s)
Shoulder Fractures , Tomography, X-Ray Computed , Humans , Humeral Head , Observer Variation , Printing, Three-Dimensional , Reproducibility of Results , Shoulder Fractures/diagnostic imaging , Shoulder Fractures/surgery
11.
Clin Orthop Relat Res ; 480(12): 2288-2295, 2022 12 01.
Article in English | MEDLINE | ID: mdl-35638902

ABSTRACT

BACKGROUND: Gap and stepoff measurements provide information about fracture displacement and are used for clinical decision-making when choosing either operative or nonoperative management of tibial plateau fractures. However, there is no consensus about the maximum size of gaps and stepoffs on CT images and their relation to functional outcome in skeletally mature patients with tibial plateau fractures who were treated without surgery. Because this is important for patient counseling regarding treatment and prognosis, it is critical to identify the limits of gaps and stepoffs that are well tolerated. QUESTIONS/PURPOSES: (1) In patients treated nonoperatively for tibial plateau fractures, what is the association between initial fracture displacement, as measured by gaps and stepoffs at the articular surface on a CT image, and functional outcome? (2) What is the survivorship of the native joint, free from conversion to a total knee prosthesis, among patients with tibial plateau fractures who were treated without surgery? METHODS: A multicenter cross-sectional study was performed in all patients who were treated nonoperatively for a tibial plateau fracture between 2003 and 2018 in four trauma centers. All patients had a diagnostic CT scan, and a gap and/or stepoff more than 2 mm was an indication for recommending surgery. Some patients with gaps and/or stepoffs exceeding 2 mm might not have had surgery based on shared decision-making. Between 2003 and 2018, 530 patients were treated nonoperatively for tibial plateau fractures, of which 45 had died at follow-up, 30 were younger than 18 years at the time of injury, and 10 had isolated tibial eminence avulsions, leaving 445 patients for follow-up analysis. 
All patients were asked to complete the validated Knee Injury and Osteoarthritis Outcome Score (KOOS) questionnaire consisting of five subscales: symptoms, pain, activities of daily living (ADL), function in sports and recreation, and knee-related quality of life (QOL). The score for each subscale ranged from 0 to 100, with higher scores indicating better function. A total of 46% (203 of 445) of patients participated at a mean follow-up of 6 ± 3 years since injury. All knee radiographs and CT images were reassessed, fractures were classified, and gap and stepoff measurements were taken. Nonresponders did not differ much from responders in terms of age (53 ± 16 years versus 54 ± 20 years; p = 0.89), gender (70% [142 of 203] women versus 59% [142 of 242] women; p = 0.01), fracture classifications (Schatzker types and three-column concept), gaps (2.1 ± 1.3 mm versus 1.7 ± 1.6 mm; p = 0.02), and stepoffs (2.1 ± 2.2 mm versus 1.9 ± 1.7 mm; p = 0.13). In our study population, the mean gap was 2.1 ± 1.3 mm and stepoff was 2.1 ± 2.2 mm. The participating patients were divided into groups with increasing fracture displacement based on gap and/or stepoff (< 2 mm, 2 to 4 mm, or > 4 mm), as measured on CT images. ANOVA was used to assess whether an increase in the initial fracture displacement was associated with poorer functional outcome. We estimated the survivorship of the knee free from conversion to total knee prosthesis at a mean follow-up of 5 years using a Kaplan-Meier survivorship estimator. RESULTS: KOOS scores in patients with a less than 2 mm, 2 to 4 mm, or greater than 4 mm gap did not differ (symptoms: 83 versus 83 versus 82; p = 0.98, pain: 85 versus 83 versus 86; p = 0.69, ADL: 87 versus 84 versus 89; p = 0.44, sport: 65 versus 64 versus 66; p = 0.95, QOL: 70 versus 71 versus 74; p = 0.85). 
The KOOS scores in patients with a less than 2 mm, 2 to 4 mm, or greater than 4 mm stepoff did not differ (symptoms: 84 versus 83 versus 77; p = 0.32, pain: 85 versus 85 versus 81; p = 0.66, ADL: 86 versus 87 versus 82; p = 0.54, sport: 65 versus 68 versus 56; p = 0.43, QOL: 71 versus 73 versus 61; p = 0.19). Survivorship of the knee free from conversion to total knee prosthesis at mean follow-up of 5 years was 97% (95% CI 94% to 99%). CONCLUSION: Patients with minimally displaced tibial plateau fractures who opt for nonoperative fracture treatment should be told that fracture gaps or stepoffs up to 4 mm, as measured on CT images, could result in good functional outcome. Therefore, the arbitrary 2-mm limit of gaps and stepoffs for tibial plateau fractures could be revisited. The survivorship of the native knee free from conversion to a total knee prosthesis was high. Large prospective cohort studies with high response rates are needed to learn more about the relationship between the degree of fracture displacement and functional recovery after tibial plateau fractures. LEVEL OF EVIDENCE: Level III, prognostic study.
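
The Kaplan-Meier survivorship estimator named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's analysis: the function name and the six follow-up records are hypothetical, times are in years, and an event flag of 1 means conversion to a total knee prosthesis was observed while 0 means the knee was censored (still native at last follow-up).

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one observed event.

    times[i]  : follow-up time of knee i
    events[i] : 1 if conversion was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        conversions = sum(same_time)
        if conversions:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - conversions / at_risk
            curve.append((t, survival))
        # Events and censored knees both leave the risk set after t.
        at_risk -= len(same_time)
        i += len(same_time)
    return curve


# Hypothetical follow-up data for six knees (illustrative values only).
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored knees lower the number at risk without lowering the survival estimate, which is what separates this estimator from a simple proportion of knees not yet converted.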


Subject(s)
Tibial Fractures , Tibial Plateau Fractures , Humans , Female , Adult , Middle Aged , Aged , Treatment Outcome , Quality of Life , Activities of Daily Living , Prospective Studies , Cross-Sectional Studies , Tibial Fractures/diagnostic imaging , Tibial Fractures/therapy , Tibial Fractures/complications , Pain/complications , Fracture Fixation, Internal/adverse effects , Fracture Fixation, Internal/methods , Retrospective Studies
12.
Clin Orthop Relat Res ; 480(12): 2350-2360, 2022 12 01.
Article in English | MEDLINE | ID: mdl-35767811

ABSTRACT

BACKGROUND: Femoral neck fractures are common and are frequently treated with internal fixation. A major disadvantage of internal fixation is the substantially high number of conversions to arthroplasty because of nonunion, malunion, avascular necrosis, or implant failure. A clinical prediction model identifying patients at high risk of conversion to arthroplasty may help clinicians in selecting patients who could have benefited from arthroplasty initially. QUESTION/PURPOSE: What is the predictive performance of a machine-learning (ML) algorithm to predict conversion to arthroplasty within 24 months after internal fixation in patients with femoral neck fractures? METHODS: We included 875 patients from the Fixation using Alternative Implants for the Treatment of Hip fractures (FAITH) trial. The FAITH trial consisted of patients with low-energy femoral neck fractures who were randomly assigned to receive a sliding hip screw or cancellous screws for internal fixation. Of these patients, 18% (155 of 875) underwent conversion to THA or hemiarthroplasty within the first 24 months. All patients were randomly divided into a training set (80%) and test set (20%). First, we identified 27 potential patient and fracture characteristics that may have been associated with our primary outcome, based on biomechanical rationale and previous studies. Then, random forest algorithms (a decision tree-based ML algorithm that selects variables) identified 10 predictors of conversion: BMI, cardiac disease, Garden classification, use of cardiac medication, use of pulmonary medication, age, lung disease, osteoarthritis, sex, and the level of the fracture line. Based on these variables, five different ML algorithms were trained to identify patterns related to conversion. 
The predictive performance of these trained ML algorithms was assessed on the training and test sets based on the following performance measures: (1) discrimination (the model's ability to distinguish patients who had conversion from those who did not; expressed with the area under the receiver operating characteristic curve [AUC]), (2) calibration (the plotted estimated versus the observed probabilities; expressed with the calibration curve intercept and slope), and (3) the overall model performance (Brier score: a composite of discrimination and calibration). RESULTS: None of the five ML algorithms performed well in predicting conversion to arthroplasty in the training set and the test set; AUCs of the algorithms in the training set ranged from 0.57 to 0.64, slopes of calibration plots ranged from 0.53 to 0.82, calibration intercepts ranged from -0.04 to 0.05, and Brier scores ranged from 0.14 to 0.15. The algorithms were further evaluated in the test set; AUCs ranged from 0.49 to 0.73, calibration slopes ranged from 0.17 to 1.29, calibration intercepts ranged from -1.28 to 0.34, and Brier scores ranged from 0.13 to 0.15. CONCLUSION: The predictive performance of the trained algorithms was poor, despite the use of one of the best datasets available worldwide on this subject. If the current dataset consisted of different variables or more patients, the performance may have been better. Also, various reasons for conversion to arthroplasty were pooled in this study, but the separate prediction of underlying pathology (such as avascular necrosis or nonunion) may be more precise. Finally, it may be possible that it is inherently difficult to predict conversion to arthroplasty based on preoperative variables alone. Therefore, future studies should aim to include more variables and to differentiate between the various reasons for arthroplasty. LEVEL OF EVIDENCE: Level III, prognostic study.


Subject(s)
Arthroplasty, Replacement, Hip , Femoral Neck Fractures , Humans , Prognosis , Models, Statistical , Femoral Neck Fractures/surgery , Arthroplasty, Replacement, Hip/adverse effects , Fracture Fixation, Internal/adverse effects , Algorithms , Machine Learning , Necrosis/etiology , Necrosis/surgery , Retrospective Studies , Treatment Outcome
13.
Clin Orthop Relat Res ; 480(9): 1766-1775, 2022 09 01.
Article in English | MEDLINE | ID: mdl-35412473

ABSTRACT

BACKGROUND: Incidental durotomy is an intraoperative complication in spine surgery that can lead to postoperative complications, increased length of stay, and higher healthcare costs. Natural language processing (NLP) is an artificial intelligence method that assists in understanding free-text notes that may be useful in the automated surveillance of adverse events in orthopaedic surgery. A previously developed NLP algorithm is highly accurate in the detection of incidental durotomy on internal validation and external validation in an independent cohort from the same country. External validation in a cohort with linguistic differences, referred to as geographical validation, is required to assess the transportability of the developed algorithm. Ideally, the performance of a prediction model, the NLP algorithm, is constant across geographic regions to ensure reproducibility and model validity. QUESTION/PURPOSE: Can we geographically validate an NLP algorithm for the automated detection of incidental durotomy across three independent cohorts from two continents? METHODS: Patients 18 years or older undergoing a primary procedure of (thoraco)lumbar spine surgery were included. In Massachusetts, between January 2000 and June 2018, 1000 patients were included from two academic and three community medical centers. In Maryland, between July 2016 and November 2018, 1279 patients were included from one academic center, and in Australia, between January 2010 and December 2019, 944 patients were included from one academic center. The authors retrospectively studied the free-text operative notes of included patients for the primary outcome that was defined as intraoperative durotomy. Incidental durotomy occurred in 9% (93 of 1000), 8% (108 of 1279), and 6% (58 of 944) of the patients, respectively, in the Massachusetts, Maryland, and Australia cohorts. No missing reports were observed. 
Three datasets (Massachusetts, Australian, and combined Massachusetts and Australian) were divided into training and holdout test sets in an 80:20 ratio. An extreme gradient boosting (an efficient and flexible tree-based algorithm) NLP algorithm was individually trained on each training set, and the performance of the three NLP algorithms (respectively American, Australian, and combined) was assessed by discrimination via area under the receiver operating characteristic curves (AUC-ROC; this measures the model's ability to distinguish patients who obtained the outcomes from those who did not), calibration metrics (which plot the predicted and the observed probabilities) and Brier score (a composite of discrimination and calibration). In addition, the sensitivity (true-positive rate, recall), specificity (true-negative rate), positive predictive value (also known as precision), negative predictive value, F1-score (composite of precision and recall), positive likelihood ratio, and negative likelihood ratio were calculated. RESULTS: The combined NLP algorithm (the combined Massachusetts and Australian data) achieved excellent performance on independent testing data from Australia (AUC-ROC 0.97 [95% confidence interval 0.87 to 0.99]), Massachusetts (AUC-ROC 0.99 [95% CI 0.80 to 0.99]) and Maryland (AUC-ROC 0.95 [95% CI 0.93 to 0.97]). The NLP algorithm developed on the Massachusetts cohort had excellent performance in the Maryland cohort (AUC-ROC 0.97 [95% CI 0.95 to 0.99]) but worse performance in the Australian cohort (AUC-ROC 0.74 [95% CI 0.70 to 0.77]). CONCLUSION: We demonstrated the clinical utility and reproducibility of an NLP algorithm with combined datasets retaining excellent performance in individual countries relative to algorithms developed in the same country alone for detection of incidental durotomy. 
Further multi-institutional, international collaborations can facilitate the creation of universal NLP algorithms that improve the quality and safety of orthopaedic surgery globally. The combined NLP algorithm has been incorporated into a freely accessible web application that can be found at https://sorg-apps.shinyapps.io/nlp_incidental_durotomy/ . Clinicians and researchers can use the tool to help incorporate the model in evaluating spine registries or quality and safety departments to automate detection of incidental durotomy and optimize prevention efforts. LEVEL OF EVIDENCE: Level III, diagnostic study.
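
The threshold-based measures listed in this abstract (sensitivity, specificity, predictive values, F1-score, and likelihood ratios) all derive from the four cells of a confusion matrix. The Python sketch below shows those relationships; the function name and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Derive common diagnostic accuracy measures from confusion-matrix counts.

    tp / fn : notes with a durotomy that the classifier flagged / missed
    fp / tn : notes without a durotomy that it flagged / correctly passed
    Assumes every denominator is nonzero (e.g., at least one false positive
    for the positive likelihood ratio).
    """
    sensitivity = tp / (tp + fn)          # recall, true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    ppv = tp / (tp + fp)                  # precision
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical test set of 100 operative notes (illustrative counts only).
for name, value in diagnostic_metrics(tp=9, fp=2, fn=1, tn=88).items():
    print(f"{name}: {value:.3f}")
```

Because incidental durotomy is uncommon (6% to 9% in the cohorts above), precision and F1 are generally more sensitive to false positives than specificity alone, which is one reason abstracts like this one report both sets of measures.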


Subject(s)
Artificial Intelligence , Natural Language Processing , Algorithms , Australia , Humans , Reproducibility of Results , Retrospective Studies
14.
Arch Orthop Trauma Surg ; 142(1): 165-174, 2022 Jan.
Article in English | MEDLINE | ID: mdl-33170354

ABSTRACT

INTRODUCTION: A characterization of the internal bone microstructure of the radial head could provide a better understanding of commonly occurring fracture patterns frequently involving the (antero)lateral quadrant, for which a clear explanation is still lacking. The aim of this study is to describe the radial head bone microstructure using micro-computed tomography (micro-CT) and to relate it to gross morphology, function and possible fracture patterns. MATERIALS AND METHODS: Dry cadaveric human radii were scanned by micro-CT (17 µm/pixel, isotropic). The trabecular bone microstructure was quantified on axial image stacks in four quadrants: the anterolateral (AL), posterolateral (PL), posteromedial (PM) and anteromedial (AM) quadrant. RESULTS: The AL and PL quadrants displayed significantly lower bone volume fraction and trabecular number (BV/TV range 12.3-25.1%, Tb.N range 0.73-1.16 mm⁻¹) and higher trabecular separation (Tb.Sp range 0.59-0.82 mm) than the PM and AM quadrants (BV/TV range 19.9-36.9%, Tb.N range 0.96-1.61 mm⁻¹, Tb.Sp range 0.45-0.74 mm) (p = 0.03). CONCLUSIONS: Our microstructural results suggest that the lateral side is the "weaker side", exhibiting lower bone volume fraction, fewer trabeculae and higher trabecular separation, compared to the medial side. As the forearm is pronated during most falls, the underlying bone microstructure could explain commonly observed fracture patterns of the radial head, which more often involve the AL quadrant. If screw fixation in radial head fractures is considered, surgeons should take advantage of the "stronger" bone microstructure of the medial side of the radial head, should the fracture line allow this.


Subject(s)
Elbow Joint , Radius Fractures , Bone Screws , Humans , Radius/diagnostic imaging , Radius Fractures/diagnostic imaging , Radius Fractures/surgery , X-Ray Microtomography
15.
J Hand Surg Am ; 46(8): 685-694, 2021 08.
Article in English | MEDLINE | ID: mdl-34052040

ABSTRACT

PURPOSE: The decision to continue immobilization of a nondisplaced scaphoid waist fracture is often based on radiographic appearance (despite evidence that radiographs are unreliable and inaccurate for diagnosing scaphoid union 6-12 weeks after fracture) and fracture tenderness (even though it is influenced by cognitive biases on pain). This may result in unhelpful additional immobilization. We studied nondisplaced scaphoid waist fractures to determine the factors associated with (1) the surgeon's decision to continue cast or splint immobilization at the first visit when cast removal was being considered; (2) greater pain on examination; and (3) the surgeon's concern about radiographic consolidation. METHODS: We prospectively included 46 patients with a nondisplaced scaphoid waist fracture treated nonoperatively. At the first visit when cast removal was considered - after an average of 6 weeks of immobilization - patients rated pain during 4 examination maneuvers. The treating surgeon assessed union on radiographs and decided whether to continue or discontinue immobilization. Patients completed measures of the following: (1) the degree to which pain limits activities (Patient-Reported Outcome Measure Interactive System [PROMIS] Pain Interference Computer Adaptive Test [CAT], Pain Self-Efficacy Questionnaire-2); (2) symptoms of depression (PROMIS Depression CAT); and (3) upper extremity function (PROMIS Upper Extremity Function CAT). We used multivariable regression analysis to investigate the factors associated with each outcome. RESULTS: Perceived inadequate radiographic healing and greater symptoms of depression were independently associated with continued immobilization. Pain during the examination was not associated with continued immobilization. Patient age was associated with pain on examination. Shorter immobilization duration was the only factor associated with the surgeon's perception of inadequate radiographic consolidation. 
CONCLUSIONS: Inadequate radiographic healing and greater symptoms of depression are associated with a surgeon's decision to continue cast or splint immobilization of a nondisplaced scaphoid waist fracture. CLINICAL RELEVANCE: Overreliance on radiographs and inadequate accounting for psychological distress may hinder the adoption of shorter immobilization times for nondisplaced waist fractures.


Subject(s)
Fractures, Bone , Scaphoid Bone , Fractures, Bone/diagnostic imaging , Fractures, Bone/therapy , Humans , Prospective Studies , Radiography , Scaphoid Bone/diagnostic imaging , Splints
16.
Arch Orthop Trauma Surg ; 141(4): 561-568, 2021 Apr.
Article in English | MEDLINE | ID: mdl-32285189

ABSTRACT

BACKGROUND AND PURPOSE: Humeral shaft fractures are often associated with radial nerve palsy (RNP) (8-16%). The primary aim of this systematic review was to assess the incidence of primary and secondary RNP in closed humeral shaft fractures. The secondary aim was to compare the recovery rate of primary RNP and the incidence of secondary RNP between operative and non-operative treatment. METHODS: A systematic literature search was performed in 'Trip Database', 'Embase' and 'PubMed' to identify original studies reporting on RNP in closed humeral shaft fractures. The Coleman Methodology Score was used to grade the quality of the studies. The incidence and recovery of RNP, fracture characteristics and treatment characteristics were extracted. Chi-square and Fisher exact tests were used to compare operative versus non-operative treatment. RESULTS: Forty studies reporting on 1758 patients with closed humeral shaft fractures were included. The incidence of primary RNP was 10%. There was no difference in the recovery rate of primary RNP when comparing operative treatment with radial nerve exploration (98%) versus non-operative treatment (91%) (p = 0.29). The incidence of secondary RNP after operative and non-operative treatment was 4% and 0.4%, respectively (p < 0.01). INTERPRETATION: One in ten patients with a closed humeral shaft fracture has an associated primary RNP, of which > 90% recover without the need for (re-)intervention. No beneficial effect of early exploration on the recovery of primary RNP could be demonstrated when comparing patients managed non-operatively with those explored early. Patients managed operatively for closed humeral shaft fractures have a higher risk of developing secondary RNP. LEVEL OF EVIDENCE: Level IV; Systematic Review.


Subject(s)
Humeral Fractures , Radial Neuropathy , Humans , Humeral Fractures/complications , Humeral Fractures/epidemiology , Humeral Fractures/therapy , Incidence , Radial Neuropathy/epidemiology , Radial Neuropathy/etiology
17.
Arch Orthop Trauma Surg ; 141(11): 2011-2018, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34302522

ABSTRACT

INTRODUCTION: Data from clinical trials suggest that CT-confirmed nondisplaced scaphoid waist fractures heal with less than the conventional 8-12 weeks of immobilization. Barriers to adopting shorter immobilization times in clinical practice may include a strong influence of fracture tenderness and radiographic appearance on decision-making. This study aimed to (1) investigate the degree to which surgeons use fracture tenderness and radiographic appearance of union, among other factors, to decide whether or not to recommend additional cast immobilization after 8 or 12 weeks of immobilization; and (2) identify surgeon factors associated with the decision to continue cast immobilization after 8 or 12 weeks. MATERIALS AND METHODS: In a survey-based study, 218 surgeons reviewed 16 patient scenarios of CT-confirmed nondisplaced waist fractures treated with cast immobilization for 8 or 12 weeks and recommended for or against additional cast immobilization. Clinical variables included patient sex, age, a description of radiographic fracture consolidation, fracture tenderness and duration of cast immobilization completed (8 versus 12 weeks). To assess the impact of clinical factors on the recommendation to continue immobilization, we calculated posterior probabilities and determined variable importance using a random forest algorithm. Multilevel logistic mixed regression analysis was used to identify surgeon characteristics associated with recommendation for additional cast immobilization. RESULTS: Unclear fracture healing on radiographs, fracture tenderness and 8 (versus 12) weeks of completed cast immobilization were the most important factors influencing surgeons' decision to recommend continued cast immobilization. 
Women surgeons (OR 2.96; 95% CI 1.28-6.81, p = 0.011), surgeons not specialized in orthopedic trauma, hand and wrist or shoulder and elbow surgery (categorized as 'other') (OR 2.64; 95% CI 1.31-5.33, p = 0.007) and surgeons practicing in the United States (OR 6.53, 95% CI 2.18-19.52, p = 0.01 versus Europe) were more likely to recommend continued immobilization. CONCLUSION: Adoption of shorter immobilization times for CT-confirmed nondisplaced scaphoid waist fractures may be hindered by surgeon attention to fracture tenderness and radiographic appearance.


Subject(s)
Fractures, Bone , Scaphoid Bone , Surgeons , Casts, Surgical , Female , Fracture Fixation, Internal , Fractures, Bone/diagnostic imaging , Fractures, Bone/surgery , Humans , Scaphoid Bone/diagnostic imaging , Scaphoid Bone/surgery , Tomography, X-Ray Computed
18.
Acta Orthop ; 92(2): 240-243, 2021 04.
Article in English | MEDLINE | ID: mdl-33263445

ABSTRACT

Background and purpose - There is ongoing debate as to whether commercial funding influences reporting of medical studies. We asked: Is there a difference in reported tones between abstracts, introductions, and discussions of orthopedic journal studies that were commercially funded and those that were not commercially funded? Methods - We conducted a systematic PubMed search to identify commercially funded studies published in 20 orthopedic journals between January 1, 2000 and December 1, 2019. We identified commercial funding of studies by including in our search the names of the 10 medical device companies with the largest revenue in 2019. Commercial funding was designated when either the study or 1 or more of the authors received funding from a medical device company directly related to the content of the study. We matched 138 commercially funded articles 1 to 1 with 138 non-commercially funded articles with the same study design, published in the same journal, within a time range of 5 years. The IBM Watson Tone Analyzer was used to determine emotional tones (anger, fear, joy, and sadness) and language style (analytical, confident, and tentative). Results - For abstract and introduction sections, we found no differences in reported tones between commercially funded and non-commercially funded studies. Fear tones (non-commercially funded studies 5.1%, commercially funded studies 0.7%, p = 0.04) and analytical tones (non-commercially funded studies 95%, commercially funded studies 88%, p = 0.03) were more common in discussions of studies that were not commercially funded. Interpretation - Commercially funded studies have comparable tones to non-commercially funded studies in the abstract and introduction. In contrast, the discussions of non-commercially funded studies demonstrated more fear and analytical tones, suggesting that they are more tentative, accepting of uncertainty, and dispassionate.
As text analysis tools become more sophisticated and mainstream, they might help to discern commercial bias in scientific reports.


Subject(s)
Authorship , Emotions , Orthopedics , Periodicals as Topic/economics , Research Design , Research Support as Topic , Humans
19.
Acta Orthop ; 92(5): 513-525, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33988081

ABSTRACT

Background and purpose - Artificial intelligence (AI), deep learning (DL), and machine learning (ML) have become common research fields in orthopedics and medicine in general. Engineers perform much of the work. While they gear the results towards healthcare professionals, the difference in competencies and goals creates challenges for collaboration and knowledge exchange. We aim to provide clinicians with a context and understanding of AI research by facilitating communication between creators, researchers, clinicians, and readers of medical AI and ML research. Methods and results - We present the common tasks, considerations, and pitfalls (both methodological and ethical) that clinicians will encounter in AI research. We discuss the following topics: labeling, missing data, training, testing, and overfitting. Common performance and outcome measures for various AI and ML tasks are presented, including accuracy, precision, recall, F1 score, Dice score, the area under the curve, and ROC curves. We also discuss ethical considerations in terms of privacy, fairness, autonomy, safety, responsibility, and liability regarding data collection or sharing. Interpretation - We have developed guidelines for reporting medical AI research to clinicians in the run-up to a broader consensus process. The proposed guidelines consist of a Clinical Artificial Intelligence Research (CAIR) checklist and specific performance metrics guidelines to present and evaluate research using AI components. Researchers, engineers, clinicians, and other stakeholders can use these proposed guidelines and the CAIR checklist to read, present, and evaluate AI research geared towards a healthcare setting.
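
As an illustration of several of the performance measures named in this abstract, the following is a minimal Python sketch that derives accuracy, precision, recall, F1 score, and Dice score from the four cells of a binary confusion matrix. It uses the standard textbook definitions and is not taken from the article or its CAIR checklist; the names `ConfusionCounts` and `classification_metrics`, and the example counts, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute common binary classification metrics from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)  # share of predicted positives that are correct
    recall = _ratio(c.tp, c.tp + c.fn)     # share of actual positives that are found
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
        # Dice is defined on overlap counts; for binary labels it equals F1.
        "dice": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    metrics = classification_metrics(ConfusionCounts(tp=40, fp=10, fn=20, tn=30))
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

For binary labels the Dice score and the F1 score are algebraically identical, which is why the two values coincide here; Dice is usually reported for segmentation tasks, where the counts are taken over pixels or voxels.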


Subject(s)
Artificial Intelligence/standards , Biomedical Research , Checklist , Guidelines as Topic , Research Design , Humans
20.
Eur J Orthop Surg Traumatol ; 31(1): 43-50, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32656669

ABSTRACT

INTRODUCTION: The reported rate of subsequent surgery after intramedullary nailing (IMN) of tibial shaft fractures (TSFs) is as high as 21%. However, most studies have not included the removal of symptomatic implants in these rates. The purpose of this study was to evaluate the subsequent surgery rate after IMN of TSFs, including the removal of symptomatic implants. Secondly, this study aimed to assess which factors are associated with subsequent surgery (1) to promote fracture and wound healing and (2) for the removal of symptomatic implants. METHODS: One hundred ninety-one patients treated with IMN for TSFs were retrospectively included. The rate of subsequent surgery was determined. Bivariable and multivariable analyses were used to identify variables associated with subsequent surgery. RESULTS: Approximately half of the patients (46%) underwent at least one subsequent surgical procedure. Forty-eight (25%) underwent a subsequent surgical procedure to promote fracture or wound healing. Age (P < 0.01), multi-trauma (P < 0.01), open fracture (P < 0.001), and index surgery during weekdays (P < 0.05) were associated with these procedures. Thirty-nine patients (20%) underwent a subsequent surgical procedure for removal of symptomatic implants. There was a significantly lower rate of implant removal in ASA II (11%) and ASA III-IV (14%) patients compared to ASA I patients (29%) (P < 0.05). CONCLUSIONS: Patients treated with IMN for TSFs should be informed during the consent process that about one in two patients will undergo an additional surgical procedure. Half of these procedures are required to promote wound or fracture healing; the other half are for symptomatic implant removal. LEVEL OF EVIDENCE: Therapeutic level-IV.


Subject(s)
Fracture Fixation, Intramedullary , Fractures, Open , Tibial Fractures , Adolescent , Adult , Aged , Aged, 80 and over , Bone Nails , Device Removal , Female , Fracture Fixation, Intramedullary/adverse effects , Fracture Healing , Humans , Male , Middle Aged , Reoperation , Retrospective Studies , Risk Factors , Tibial Fractures/surgery , Treatment Outcome , Young Adult