1.
Eur Spine J ; 33(11): 4182-4203, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38489044

ABSTRACT

BACKGROUND CONTEXT: Clinical guidelines, developed in concordance with the literature, are often used to guide surgeons' clinical decision making. Recent advancements of large language models and artificial intelligence (AI) in the medical field come with exciting potential. OpenAI's generative AI model, known as ChatGPT, can quickly synthesize information and generate responses grounded in medical literature, which may prove to be a useful tool in clinical decision-making for spine care. The current literature has yet to investigate the ability of ChatGPT to assist clinical decision making with regard to degenerative spondylolisthesis. PURPOSE: The study aimed to compare ChatGPT's concordance with the recommendations set forth by The North American Spine Society (NASS) Clinical Guideline for the Diagnosis and Treatment of Degenerative Spondylolisthesis and assess ChatGPT's accuracy within the context of the most recent literature. METHODS: ChatGPT-3.5 and 4.0 were prompted with questions from the NASS Clinical Guideline for the Diagnosis and Treatment of Degenerative Spondylolisthesis, and their recommendations were graded as "concordant" or "nonconcordant" relative to those put forth by NASS. A response was considered "concordant" when ChatGPT generated a recommendation that accurately reproduced all major points made in the NASS recommendation. Any responses with a grading of "nonconcordant" were further stratified into two subcategories: "Insufficient" or "Over-conclusive," to provide further insight into grading rationale. Responses between GPT-3.5 and 4.0 were compared using Chi-squared tests. RESULTS: ChatGPT-3.5 answered 13 of NASS's 28 total clinical questions in concordance with NASS's guidelines (46.4%).
Categorical breakdown is as follows: Definitions and Natural History (1/1, 100%), Diagnosis and Imaging (1/4, 25%), Outcome Measures for Medical Intervention and Surgical Treatment (0/1, 0%), Medical and Interventional Treatment (4/6, 66.7%), Surgical Treatment (7/14, 50%), and Value of Spine Care (0/2, 0%). When NASS indicated there was sufficient evidence to offer a clear recommendation, ChatGPT-3.5 generated a concordant response 66.7% of the time (6/9). However, ChatGPT-3.5's concordance dropped to 36.8% when asked clinical questions that NASS did not provide a clear recommendation on (7/19). A further breakdown of ChatGPT-3.5's nonconcordance with the guidelines revealed that a vast majority of its inaccurate recommendations were due to them being "over-conclusive" (12/15, 80%), rather than "insufficient" (3/15, 20%). ChatGPT-4.0 answered 19 (67.9%) of the 28 total questions in concordance with NASS guidelines (P = 0.177). When NASS indicated there was sufficient evidence to offer a clear recommendation, ChatGPT-4.0 generated a concordant response 66.7% of the time (6/9). ChatGPT-4.0's concordance held up at 68.4% when asked clinical questions that NASS did not provide a clear recommendation on (13/19, P = 0.104). CONCLUSIONS: This study sheds light on the duality of LLM applications within clinical settings: one of accuracy and utility in some contexts versus inaccuracy and risk in others. ChatGPT was concordant for most clinical questions NASS offered recommendations for. However, for questions NASS did not offer best practices, ChatGPT generated answers that were either too general or inconsistent with the literature, and even fabricated data/citations. Thus, clinicians should exercise extreme caution when attempting to consult ChatGPT for clinical recommendations, taking care to ensure its reliability within the context of recent literature.
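
The GPT-3.5 versus GPT-4.0 comparison can be sketched as a 2x2 chi-squared test. The abstract says only that Chi-squared tests were used; the Yates continuity correction below is an assumption, as are the function and variable names. Under that assumption, the counts reported above (13/28 vs. 19/28 overall; 7/19 vs. 13/19 for questions without a clear NASS recommendation) give P values close to the reported 0.177 and 0.104.

```python
import math


def chi2_2x2_yates(a, b, c, d):
    """Chi-squared test (1 degree of freedom) with Yates continuity
    correction for the 2x2 table [[a, b], [c, d]].

    Assumes every row and column total is nonzero.
    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates correction: shrink |ad - bc| by n/2, never below zero.
    numerator = max(abs(a * d - b * c) - n / 2, 0.0)
    stat = n * numerator ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: GPT-3.5, GPT-4.0. Columns: concordant, nonconcordant.
overall = chi2_2x2_yates(13, 15, 19, 9)         # 13/28 vs. 19/28
no_clear_rec = chi2_2x2_yates(7, 12, 13, 6)     # 7/19 vs. 13/19
print(f"overall: chi2={overall[0]:.3f}, P={overall[1]:.3f}")
print(f"no clear recommendation: chi2={no_clear_rec[0]:.3f}, P={no_clear_rec[1]:.3f}")
```

With samples this small, an exact test would be a reasonable alternative; the sketch only shows that the reported P values are consistent with a continuity-corrected test.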


Subjects
Practice Guidelines as Topic , Spondylolisthesis , Spondylolisthesis/diagnosis , Spondylolisthesis/therapy , Humans , Practice Guidelines as Topic/standards , Artificial Intelligence/standards , Clinical Decision-Making/methods
2.
J Shoulder Elbow Surg ; 33(11): 2392-2399, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38688420

ABSTRACT

BACKGROUND: Walch B2 glenoids can present a challenge to shoulder arthroplasty surgeons. Short-term studies have demonstrated that corrective reaming to 10° of retroversion in anatomic total shoulder arthroplasty (aTSA) can result in good outcomes; however, there is little data reporting the long-term outcomes in this cohort. B2 glenoids treated with high-side reaming present a theoretical risk of early glenoid component failure as one may ream into the subchondral bone. This study aimed to demonstrate that (1) B2 glenoids treated with corrective reaming have durable results and (2) offer similar results to Walch A1/2 in long-term follow-up. METHODS: Patients who underwent aTSA by a single surgeon (E.L.F.) were identified from a shoulder arthroplasty registry. Inclusion criteria included Walch A1, A2, or B2 glenoid; a diagnosis of primary shoulder osteoarthritis; and a minimum radiographic and clinical follow-up of 5 years. Forty-three patients with B2 glenoids were compared to a cohort of 42 patients with A1 or A2 glenoids. Preoperative computed tomography (CT) and radiographs were used to assess deformity, glenoid version, and posterior subluxation of the humeral head. Postoperatively, patients were assessed with radiographs and patient-reported outcome measures including American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form (ASES) score, Simple Shoulder Test (SST) score, and visual analog scale (VAS). RESULTS: Eighty-five shoulders (82 patients, 42 B2 and 43 A1/A2 glenoids) with an average follow-up of 9.4 years were included. In the B2 cohort, the average retroversion was 21.1° and posterior subluxation was 69.4% compared with 10.6° (P < .001) and 57.5% (P < .001), respectively, in the A1 or A2 cohort. The cohort demographics were similar except for male sex (B2 69.8% vs. A1 or A2 37.2%, P = .008). There was no difference between the cohorts in their improvement in ASES (P = .807), SST (P = .586), and VAS (P = .930) scores. 
There was no difference in lateral humeral offset (P = .889) or acromial humeral interval (P = .468) between initial postoperative and final follow-up visits. Survivorship for B2 glenoids was 97.6%, 94.1%, and 73.3% at 5, 10, and 15 years, respectively, compared with 97.6%, 91.9%, and 83.5% in type A glenoids. The revision rate was similar between the 2 groups (P = .432). Lazarus score (P = .682) and rates of humeral radiolucency (P = .366) and humeral osteolysis (P = .194) were similar between the 2 cohorts at final follow-up. CONCLUSION: Asymmetric reaming of patients with B2 glenoids is a reliable method of glenoid preparation with excellent mid- to long-term clinical results, patient-reported outcomes, and low revision rates similar to their A1 and A2 counterparts.


Subjects
Arthroplasty, Replacement, Shoulder , Shoulder Joint , Humans , Arthroplasty, Replacement, Shoulder/methods , Male , Female , Aged , Follow-Up Studies , Middle Aged , Shoulder Joint/surgery , Shoulder Joint/diagnostic imaging , Osteoarthritis/surgery , Osteoarthritis/diagnostic imaging , Treatment Outcome , Shoulder Prosthesis , Retrospective Studies , Range of Motion, Articular , Glenoid Cavity/surgery , Glenoid Cavity/diagnostic imaging
3.
J Shoulder Elbow Surg ; 33(9): 1962-1971, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38430980

ABSTRACT

BACKGROUND: Proximal humerus fracture (PHF) is a risk factor for 1-year mortality. This study aimed to determine if surgery is associated with lower mortality compared to nonoperative treatment following PHF in older patients. METHODS: This retrospective cohort study used the Medicare Limited Data Set. Patients aged 65 years and older with a PHF diagnosis in 2017-2020 were included. Treatment was classified as nonoperative, open reduction internal fixation (ORIF), total shoulder arthroplasty (TSA), or hemiarthroplasty. Multivariable logistic regression models examined (a) predictors of treatment type and (b) the association of treatment type with 1-year mortality, adjusting for patient demographics, comorbidities, frailty, and fracture severity among other variables. A subgroup analysis examined how the relationship between treatment type and 1-year mortality varied based on fracture severity. Adjusted odds ratios (aORs) and 95% confidence intervals (CIs) are reported. RESULTS: In total, 49,072 patients were included (mean age = 76.6 years, 82.3% female). Most were treated nonoperatively (77.5%), 10.9% underwent ORIF, 10.6% underwent TSA, and 1.0% underwent hemiarthroplasty. Examples of factors associated with receipt of operative (versus nonoperative) treatment included worse fracture severity and lower frailty. The 1-year mortality rate after the initial PHF diagnosis was 11.0% for the nonoperative group, 4.0% for ORIF, 5.2% for TSA, and 6.0% for hemiarthroplasty. Compared to nonoperative treatment, ORIF (aOR 0.55; 95% CI [0.47, 0.64]; P < .001) and TSA (aOR 0.59; 95% CI [0.50, 0.68]; P < .001) were associated with decreased odds of 1-year mortality. In the subgroup analysis, ORIF and TSA were associated with a lower 1-year mortality risk for 2-part and 3-/4-part fractures. CONCLUSIONS: Compared to nonoperative treatment, surgery (particularly TSA and ORIF) was associated with decreased odds of 1-year mortality.
This relationship remained significant for 2-part and 3-/4-part fractures after stratifying by fracture severity.


Subjects
Arthroplasty, Replacement, Shoulder , Hemiarthroplasty , Medicare , Shoulder Fractures , Humans , Shoulder Fractures/surgery , Shoulder Fractures/mortality , Aged , Female , Male , United States/epidemiology , Retrospective Studies , Aged, 80 and over , Hemiarthroplasty/mortality , Fracture Fixation, Internal/methods , Open Fracture Reduction
4.
J Shoulder Elbow Surg ; 33(8): 1755-1761, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38242528

ABSTRACT

BACKGROUND: Although cementation of humeral stems has long been considered the gold standard for anatomic shoulder arthroplasty (aTSA), cementless, or press-fit, fixation offers a relatively cheaper and less demanding alternative, particularly in the setting of a revision procedure. However, this approach has been accompanied by concerns of implant loosening and high rates of radiolucency. In the present study, we performed a propensity-matched comparison of clinical and patient-reported outcomes between cemented and cementless fixation techniques for aTSA. We hypothesized that cemented fixation of the humeral component would have significantly better implant survival while providing comparable functional outcomes at final follow-up. METHODS: This study was a retrospective comparison of 50 shoulders undergoing aTSA: 25 using cemented humeral fixation vs. 25 using press-fit humeral fixation. Patients in the 2 groups were propensity matched according to age, sex, and preoperative American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form (ASES) score. Primary outcome measures included range of motion (ROM) (forward elevation, external rotation, internal rotation), patient-reported outcomes (ASES, Simple Shoulder Test [SST], visual analog scale [VAS]), and implant survival. RESULTS: At baseline, the 2 fixation groups were similar in regard to age, sex, body mass index, preoperative ASES score, and surgical indication. Mean follow-up was 11.7 ± 4.95 years in the cemented cohort and 9.13 ± 3.77 years in the press-fit cohort (P = .045). Both groups demonstrated significant improvements postoperatively in all included ROM and patient-reported outcomes. However, press-fit patients reported significantly better VAS, ASES, and SST scores. Mean VAS pain score was 1.1 ± 1.8 in press-fit patients and 3.2 ± 3.0 in cemented patients (P = .005). The mean ASES score was 87.7 ± 12.4 in press-fit patients and 69.5 ± 22.7 in cemented patients (P = .002). 
Lastly, the mean SST score was 9.8 ± 3.1 in press-fit patients and 7.7 ± 3.7 in cemented patients (P = .040). Both fixation techniques provided lasting implant survivorship with only a single revision operation in each of the cohorts. CONCLUSION: Herein, we provide a propensity-matched, long-term comparison of patients receiving anatomic shoulder arthroplasty stratified according to humeral stem fixation technique. The results of this analysis illustrate that both types of humeral fixation techniques yield durable and significant improvements in shoulder function with similar rates of survival at 10 years of follow-up.


Subjects
Arthroplasty, Replacement, Shoulder , Prosthesis Design , Range of Motion, Articular , Humans , Male , Female , Retrospective Studies , Aged , Follow-Up Studies , Middle Aged , Arthroplasty, Replacement, Shoulder/methods , Shoulder Prosthesis , Cementation , Bone Cements , Humerus/surgery , Propensity Score , Prosthesis Failure , Shoulder Joint/surgery , Treatment Outcome
5.
Eur J Orthop Surg Traumatol ; 34(2): 799-807, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37707634

ABSTRACT

PURPOSE: The utilization of reverse total shoulder arthroplasty now exceeds the incidence of anatomic shoulder arthroplasty. Previous mid-to-long-term studies on rTSA have reported a decrease in shoulder function as follow-up increased. The purpose of this study was to provide data on mid-term outcomes and implant survival in a series focusing on reverse total shoulder arthroplasty. MATERIALS AND METHODS: Demographic information such as age at surgery, revision surgery status, BMI, and smoking status were recorded. The clinical endpoints measured in this study were range of motion scores (forward elevation, external rotation, internal rotation) and patient reported outcomes (VAS, ASES, SST). Radiographic variables captured included preoperative glenoid morphology, humeral lucency, and glenoid loosening. RESULTS: Fifty-six shoulders were included in this study. The overall mean age at surgery was 72.5 ± 7.2 years with an average follow-up time of 6.8 ± 3.5 years. The mean BMI was 28.1 ± 5.5. All measurements of range of motion saw significant and sustained improvements. Overall, forward elevation improved from 82° preoperatively to 133° postoperatively (p < 0.01). External rotation improved from 23° preoperatively to 36° (p < 0.01), while internal rotation improved from L3 to L1 (p = 0.05). ASES scores improved from 31 preoperatively to 70 postoperatively (p < 0.01). SST scores improved from 2 preoperatively to 7 (p < 0.01). VAS pain index scores improved from 6 to 2 following surgery (p < 0.01). Postoperative scapular notching was seen in 18 patients at final follow-up. Glenoid loosening was seen in 3 shoulders. Humeral loosening was seen in 18 shoulders. Tuberosity resorption was seen in 8 shoulders. The 5 year survival estimate was 98%, and the 10 year survival estimate was 83%. CONCLUSION: In this series, we found that rTSA provides mid-term improvements in range of motion in patients while reducing pain levels. 
When considered together, this demonstrates that most patients undergoing rTSA can have excellent use of their shoulder from age at surgery to end-of-life.


Subjects
Arthroplasty, Replacement, Shoulder , Shoulder Joint , Shoulder Prosthesis , Humans , Aged , Arthroplasty, Replacement, Shoulder/adverse effects , Shoulder Joint/diagnostic imaging , Shoulder Joint/surgery , Treatment Outcome , Retrospective Studies , Pain , Range of Motion, Articular , Shoulder Prosthesis/adverse effects
6.
Eur Spine J ; 32(6): 2149-2156, 2023 06.
Article in English | MEDLINE | ID: mdl-36854862

ABSTRACT

PURPOSE: To predict nonhome discharge (NHD) following elective anterior cervical discectomy and fusion (ACDF) using an explainable machine learning model. METHODS: A total of 2227 patients undergoing elective ACDF from 2008 to 2019 were identified from a single institutional database. A machine learning model was trained on preoperative variables, including demographics, comorbidity indices, and levels fused. The validation technique was repeated stratified k-fold cross-validation with the area under the receiver operating characteristic curve (AUROC) as the performance metric. Shapley Additive Explanation (SHAP) values were calculated to provide further explainability regarding the model's decision making. RESULTS: The preoperative model performed with an AUROC of 0.83 ± 0.05. SHAP scores revealed the most pertinent risk factors to be age, Medicare insurance, and American Society of Anesthesiologists (ASA) score. Interaction analysis demonstrated that female patients over 65 with greater fusion levels were more likely to undergo NHD. Likewise, ASA demonstrated positive interaction effects with female sex, levels fused and BMI. CONCLUSION: We validated an explainable machine learning model for the prediction of NHD using common preoperative variables. Adding transparency is a key step towards clinical application because it demonstrates that our model's "thinking" aligns with clinical reasoning. Interaction analysis demonstrated that those of age over 65, female sex, higher ASA score, and greater fusion levels were more predisposed to NHD. Age and ASA score were similar in their predictive ability. Machine learning may be used to predict NHD, and can assist surgeons with patient counseling or early discharge planning.
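
The AUROC used as the performance metric can be read as the probability that a randomly chosen NHD patient receives a higher predicted score than a randomly chosen home-discharge patient. The following is a minimal sketch of that computation for a single validation fold; the function name and the toy labels and scores are hypothetical and are not the study's data or pipeline.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = nonhome discharge, by assumption).
    scores: predicted risk for each patient, in the same order.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy fold: two home discharges (0) and two nonhome discharges (1).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 3 of 4 pairs ranked correctly
```

In repeated stratified k-fold cross-validation, a value like this would be computed once per held-out fold and then summarized as a mean and spread, which is how a figure such as 0.83 ± 0.05 is typically obtained.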


Subjects
Patient Discharge , Spinal Fusion , Humans , Female , Aged , United States , Spinal Fusion/methods , Medicare , Diskectomy/methods , Machine Learning , Retrospective Studies
7.
J Shoulder Elbow Surg ; 32(12): 2493-2500, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37276920

ABSTRACT

BACKGROUND: Superior migration of the humeral head has been linked with rotator cuff dysfunction and glenoid loosening after total shoulder arthroplasty (TSA). We aimed to determine if superior migration was associated with poor shoulder function following anatomic TSA at long-term follow-up. METHODS: In this retrospective cohort study, we reviewed patients undergoing TSA by a single surgeon at an urban, academic institution. To study the effect of superior migration on TSA outcomes, we stratified the cohort by ≥ and <7 mm of acromiohumeral interval (AHI) and compared range of motion and patient-reported outcomes (PROs). Clinical variables included preoperative and postoperative forward elevation (FE), internal rotation, external rotation, visual analog scale, American Shoulder and Elbow Surgeons shoulder score, and Simple Shoulder Test score. Radiographic variables included immediate postoperative and long-term follow-up AHI, lateral humeral offset, and glenoid loosening scores. RESULTS: After applying exclusion criteria, 121 TSAs were included. The mean age was 63.9 ± 9.5 years, and 66 surgeries (55%) were in male patients. The mean follow-up for our cohort was 11.2 years (range, 5-26 years). Nine shoulders underwent revision surgery. All range of motion and PROs improved significantly from preoperative to the most recent postoperative follow-up. The mean AHI immediately following surgery was 10.9 ± 4.1 mm, while the mean AHI at most recent follow-up was 8.4 ± 3.5 mm. Glenoid loosening was observed in 29 (23.8%) shoulders at the most recent follow-up appointment. Although AHI correlated weakly with FE (r = 0.252; P = .006), we did not observe a clear threshold of migration which led to degraded function. Importantly, glenoid loosening was not related to AHI at long-term follow-up (P = .631).
None of FE, internal rotation, external rotation, visual analog scale, American Shoulder and Elbow Surgeons shoulder score, Simple Shoulder Test, or revisions were significantly different between patients with ≥ and <7 mm of AHI. CONCLUSION: Our results suggest that anatomic TSA provides durable improvements to pain, function, and PROs despite changes to the AHI.


Subjects
Arthroplasty, Replacement, Shoulder , Shoulder Joint , Aged , Humans , Male , Middle Aged , Follow-Up Studies , Humeral Head/surgery , Range of Motion, Articular , Retrospective Studies , Shoulder Joint/diagnostic imaging , Shoulder Joint/surgery , Treatment Outcome , Female
8.
J Arthroplasty ; 38(12): 2634-2637, 2023 12.
Article in English | MEDLINE | ID: mdl-37315633

ABSTRACT

BACKGROUND: Osteonecrosis of the femoral head is a common indication for total hip arthroplasty (THA). It is unclear to what extent the COVID-19 pandemic has impacted its incidence. Theoretically, the combination of microvascular thromboses and corticosteroid use in patients who have COVID-19 may increase the risk of osteonecrosis. We aimed to (1) assess recent osteonecrosis trends and (2) investigate if a history of COVID-19 diagnosis is associated with osteonecrosis. METHODS: This retrospective cohort study utilized a large national database between 2016 and 2021. Osteonecrosis incidence in 2016 to 2019 was compared to 2020 to 2021. Secondly, utilizing a cohort from April 2020 through December 2021, we investigated whether a prior COVID-19 diagnosis was associated with osteonecrosis. For both comparisons, Chi-square tests were applied. RESULTS: Among 1,127,796 THAs performed between 2016 and 2021, we found an osteonecrosis incidence of 1.6% (n = 5,812) in 2020 to 2021 compared to 1.4% (n = 10,974) in 2016 to 2019; P < .0001. Furthermore, using April 2020 to December 2021 data from 248,183 THAs, we found that osteonecrosis was more common among those who had a history of COVID-19 (3.9%; 130 of 3,313) compared to patients who had no COVID-19 history (3.0%; 7,266 of 244,870); P = .001. CONCLUSION: Osteonecrosis incidence was higher in 2020 to 2021 compared to previous years and a previous COVID-19 diagnosis was associated with a greater likelihood of osteonecrosis. These findings suggest that the COVID-19 pandemic played a role in an increased osteonecrosis incidence. Continued monitoring is necessary to fully understand the impact of the COVID-19 pandemic on THA care and outcomes.


Subjects
Arthroplasty, Replacement, Hip , COVID-19 , Femur Head Necrosis , Osteonecrosis , Humans , Aged , Arthroplasty, Replacement, Hip/adverse effects , Retrospective Studies , COVID-19 Testing , Pandemics , COVID-19/epidemiology , Osteonecrosis/epidemiology , Osteonecrosis/etiology , Treatment Outcome , Femur Head Necrosis/epidemiology , Femur Head Necrosis/etiology , Femur Head Necrosis/surgery
9.
Radiology ; 301(3): 664-671, 2021 12.
Article in English | MEDLINE | ID: mdl-34546126

ABSTRACT

Background Patients who undergo surgery for cervical radiculopathy are at risk for developing adjacent segment disease (ASD). Identifying patients who will develop ASD remains challenging for clinicians. Purpose To develop and validate a deep learning algorithm capable of predicting ASD by using only preoperative cervical MRI in patients undergoing single-level anterior cervical diskectomy and fusion (ACDF). Materials and Methods In this Health Insurance Portability and Accountability Act-compliant study, retrospective chart review was performed for 1244 patients undergoing single-level ACDF in two tertiary care centers. After application of inclusion and exclusion criteria, 344 patients were included, of whom 60% (n = 208) were used for training and 40% for validation (n = 43) and testing (n = 93). A deep learning-based prediction model with 48 convolutional layers was designed and trained by using preoperative T2-sagittal cervical MRI. To validate model performance, a neuroradiologist and neurosurgeon independently provided ASD predictions for the test set. Validation metrics included accuracy, areas under the curve, and F1 scores. The difference in proportion of wrongful predictions between the model and clinician was statistically tested by using the McNemar test. Results A total of 344 patients (median age, 48 years; interquartile range, 41-58 years; 182 women) were evaluated. The model predicted ASD on the 93 test images with an accuracy of 88 of 93 (95%; 95% CI: 90, 99), sensitivity of 12 of 15 (80%; 95% CI: 60, 100), and specificity of 76 of 78 (97%; 95% CI: 94, 100). The neuroradiologist and neurosurgeon provided predictions with lower accuracy (54 of 93; 58%; 95% CI: 48, 68), sensitivity (nine of 15; 60%; 95% CI: 35, 85), and specificity (45 of 78; 58%; 95% CI: 56, 77) compared with the algorithm. 
The McNemar test on the contingency table demonstrated that the proportion of wrongful predictions was significantly lower by the model (test statistic, 2.000; P < .001). Conclusion A deep learning algorithm that used only preoperative cervical T2-weighted MRI outperformed clinical experts at predicting adjacent segment disease in patients undergoing surgery for cervical radiculopathy. © RSNA, 2021 An earlier incorrect version appeared online. This article was corrected on September 22, 2021.
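
The model's test-set metrics follow directly from the counts given in the abstract (12 of 15 ASD cases detected, 76 of 78 non-ASD cases correctly ruled out). A minimal sketch of that arithmetic is below; the function name is hypothetical, and the false-negative and false-positive counts (3 and 2) are derived from the reported fractions rather than stated explicitly in the abstract.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }


# Counts derived from the abstract: 12/15 sensitivity, 76/78 specificity.
model = diagnostic_metrics(tp=12, fn=3, tn=76, fp=2)
for name, value in model.items():
    print(f"{name}: {value:.1%}")
```

The accuracy here is (12 + 76) / 93, which matches the 88 of 93 reported for the model.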


Subjects
Deep Learning , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Postoperative Complications/diagnosis , Radiculopathy/surgery , Spinal Cord Diseases/diagnosis , Spinal Fusion/methods , Adult , Cervical Vertebrae/diagnostic imaging , Diskectomy/methods , Female , Humans , Male , Middle Aged , Predictive Value of Tests , Preoperative Care/methods , Radiculopathy/diagnostic imaging , Reproducibility of Results , Sensitivity and Specificity
11.
J Neurosurg Spine ; 41(3): 385-395, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38941643

ABSTRACT

OBJECTIVE: The objective of this study was to assess the safety and accuracy of ChatGPT recommendations in comparison to the evidence-based guidelines from the North American Spine Society (NASS) for the diagnosis and treatment of cervical radiculopathy. METHODS: ChatGPT was prompted with questions from the 2011 NASS clinical guidelines for cervical radiculopathy and evaluated for concordance. Selected key phrases within the NASS guidelines were identified. Completeness was measured as the number of overlapping key phrases between ChatGPT responses and NASS guidelines divided by the total number of key phrases. A senior spine surgeon evaluated the ChatGPT responses for safety and accuracy. ChatGPT responses were further evaluated on their readability, similarity, and consistency. Flesch Reading Ease scores and Flesch-Kincaid reading levels were measured to assess readability. The Jaccard Similarity Index was used to assess agreement between ChatGPT responses and NASS clinical guidelines. RESULTS: A total of 100 key phrases were identified across 14 NASS clinical guidelines. The mean completeness of ChatGPT-4 was 46%. ChatGPT-3.5 yielded a completeness of 34%. ChatGPT-4 outperformed ChatGPT-3.5 by a margin of 12%. ChatGPT-4.0 outputs had a mean Flesch reading score of 15.24, which is very difficult to read, requiring a college graduate education to understand. ChatGPT-3.5 outputs had a lower mean Flesch reading score of 8.73, indicating that they are even more difficult to read and require a professional education level to do so. However, both versions of ChatGPT were more accessible than NASS guidelines, which had a mean Flesch reading score of 4.58. Furthermore, with NASS guidelines as a reference, ChatGPT-3.5 registered a mean ± SD Jaccard Similarity Index score of 0.20 ± 0.078 while ChatGPT-4 had a mean of 0.18 ± 0.068. Based on physician evaluation, outputs from ChatGPT-3.5 and ChatGPT-4.0 were safe 100% of the time. 
Thirteen of 14 (92.8%) ChatGPT-3.5 responses and 14 of 14 (100%) ChatGPT-4.0 responses were in agreement with current best clinical practices for cervical radiculopathy according to a senior spine surgeon. CONCLUSIONS: ChatGPT models were able to provide safe and accurate but incomplete responses to NASS clinical guideline questions about cervical radiculopathy. Although the authors' results suggest that improvements are required before ChatGPT can be reliably deployed in a clinical setting, future versions of the LLM hold promise as an updated reference for guidelines on cervical radiculopathy. Future versions must prioritize accessibility and comprehensibility for a diverse audience.
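
The completeness and Jaccard measures described in the methods can be sketched as follows. This is an illustration only: the abstract does not say how text was tokenized or how key phrases were matched, so the lowercase word tokenization and case-insensitive substring matching below are assumptions, and the function names and sample strings are hypothetical.

```python
import re


def completeness(response, key_phrases):
    """Fraction of guideline key phrases that appear in the response.

    Matching is case-insensitive substring search (an assumption).
    """
    text = response.lower()
    hits = sum(1 for phrase in key_phrases if phrase.lower() in text)
    return hits / len(key_phrases)


def jaccard(text_a, text_b):
    """Jaccard Similarity Index on word sets: |A ∩ B| / |A ∪ B|."""
    words_a = set(re.findall(r"[a-z0-9]+", text_a.lower()))
    words_b = set(re.findall(r"[a-z0-9]+", text_b.lower()))
    union = words_a | words_b
    if not union:
        return 0.0
    return len(words_a & words_b) / len(union)


# Hypothetical strings, not taken from ChatGPT output or the NASS guideline.
phrases = ["physical therapy", "nsaids", "surgery", "traction"]
reply = "Physical therapy and NSAIDs are reasonable first options."
print(completeness(reply, phrases))   # 2 of 4 phrases found
print(jaccard("a b c", "b c d"))      # 2 shared words out of 4 distinct
```

A low Jaccard score alongside a high safety rating, as reported above, is plausible under this definition: two texts can agree clinically while sharing few exact words.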


Subjects
Radiculopathy , Humans , Radiculopathy/diagnosis , Practice Guidelines as Topic/standards , Cervical Vertebrae/surgery , Societies, Medical
12.
Clin Spine Surg ; 2024 Oct 25.
Article in English | MEDLINE | ID: mdl-39450873

ABSTRACT

STUDY DESIGN: Retrospective cohort. OBJECTIVE: This study aims to evaluate the relationship between the cervical levels fused and the degree of subsidence following anterior cervical discectomy and fusion (ACDF) procedures. BACKGROUND: Subsidence following ACDF may worsen clinical outcomes. Previous studies have linked lower cervical levels with higher rates of subsidence, but none have quantified the relative degree of subsidence between levels. MATERIALS AND METHODS: Patients who underwent ACDF from 2016 to 2021 at a tertiary medical center were included in this study. Lateral cervical radiographs from the immediate postoperative period and the final follow-ups were used to calculate subsidence. Analysis of variance was used to examine the association between cervical levels fused and subsidence. Multivariable linear regression analysis controlled for age, sex, smoking status, osteopenia/osteoporosis, number of fused levels, cage-to-body ratio, and cage type while examining the relationship between the cervical level fused and subsidence. RESULTS: This study includes 122 patients who underwent 227 levels fused. There were 16 (7.0%) C3-C4 fusions, 55 (24.2%) C4-C5 fusions, 97 (42.7%) C5-C6 fusions, and 59 (26.0%) C6-C7 fusions. There was a significant difference in the degree of anterior subsidence between cervical levels fused (P = 0.013) with a mean subsidence of 1.0 mm (SD: 1.6) for C3-C4, 1.1 mm (SD: 1.4) for C4-C5, 1.8 mm (SD: 1.5) for C5-C6, and 1.8 mm (SD: 1.6) for C6-C7 fusions. Relative to C6-C7 fusions, C4-C5 (P = 0.016), and C3-C4 (P = 0.014) fusions were associated with decreased anterior subsidence, whereas C5-C6 (P = 0.756) fusions were found to have similar degrees of anterior subsidence in the multivariable analysis. CONCLUSION: We found upper cervical levels experienced a smaller degree of anterior subsidence than lower levels, after controlling for demographic and implant characteristics. 
Surgeons can consider using larger cages at lower cervical levels to minimize these risks.

13.
Article in English | MEDLINE | ID: mdl-39137403

ABSTRACT

BACKGROUND: Acute hip fractures are a public health problem affecting primarily older adults. Chat Generative Pretrained Transformer may be useful in providing appropriate clinical recommendations for beneficial treatment. OBJECTIVE: To evaluate the accuracy of Chat Generative Pretrained Transformer (ChatGPT)-4.0 by comparing its appropriateness scores for acute hip fractures with the American Academy of Orthopaedic Surgeons (AAOS) Appropriate Use Criteria given 30 patient scenarios. "Appropriateness" indicates the expected health benefits of treatment exceed the expected negative consequences by a wide margin. METHODS: Using the AAOS Appropriate Use Criteria as the benchmark, numerical scores from 1 to 9 assessed appropriateness. For each patient scenario, ChatGPT-4.0 was asked to assign an appropriateness score for six treatments to manage acute hip fractures. RESULTS: Thirty patient scenarios were evaluated for 180 paired scores. Comparing ChatGPT-4.0 with AAOS scores, there was a positive correlation for multiple cannulated screw fixation, total hip arthroplasty, hemiarthroplasty, and long cephalomedullary nails. Statistically significant differences were observed only between scores for long cephalomedullary nails. CONCLUSION: ChatGPT-4.0 scores were not concordant with AAOS scores, overestimating the appropriateness of total hip arthroplasty, hemiarthroplasty, and long cephalomedullary nails, and underestimating the other three. ChatGPT-4.0 was inadequate in selecting an appropriate treatment deemed acceptable, most reasonable, and most likely to improve patient outcomes.


Subjects
Hip Fractures , Humans , Hip Fractures/surgery , Aged , Female , Male , Aged, 80 and over , Arthroplasty, Replacement, Hip , Hemiarthroplasty , Practice Guidelines as Topic , Acute Disease , Language
14.
Clin Spine Surg ; 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38828954

ABSTRACT

STUDY DESIGN: Retrospective cohort. OBJECTIVE: The purpose of this study was to evaluate the effect of overdistraction on interbody cage subsidence. BACKGROUND: Vertebral overdistraction due to the use of large intervertebral cage sizes may increase the risk of postoperative subsidence. METHODS: Patients who underwent anterior cervical discectomy and fusion between 2016 and 2021 were included. All measurements were performed using lateral cervical radiographs at 3 time points: preoperative, immediate postoperative, and final follow-up >6 months postoperatively. Anterior and posterior distraction were calculated by subtracting the preoperative disc height from the immediate postoperative disc height. Cage subsidence was calculated by subtracting the final follow-up postoperative disc height from the immediate postoperative disc height. Associations between anterior and posterior subsidence and distraction were determined using multivariable linear regression models. The analyses controlled for cage type, cervical level, sex, age, smoking status, and osteopenia. RESULTS: Sixty-eight patients and 125 fused levels were included in the study. Of the 68 procedures, 22 were single-level fusions, 35 were 2-level, and 11 were 3-level. The median final follow-up interval was 368 days (range: 181-1257 d). Anterior disc space subsidence was positively associated with anterior distraction (beta = 0.23; 95% CI: 0.08, 0.38; P = 0.004), and posterior disc space subsidence was positively associated with posterior distraction (beta = 0.29; 95% CI: 0.13, 0.45; P < 0.001). No significant associations between anterior distraction and posterior subsidence (beta = 0.07; 95% CI: -0.06, 0.20; P = 0.270) or posterior distraction and anterior subsidence (beta = 0.06; 95% CI: -0.14, 0.27; P = 0.541) were observed. CONCLUSIONS: We found that overdistraction of the disc space was associated with increased postoperative subsidence after anterior cervical discectomy and fusion. 
Surgeons should consider choosing a smaller cage size to avoid overdistraction and minimize postoperative subsidence.
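The two measurements defined in the methods of this abstract reduce to simple differences of disc heights. A minimal sketch in Python, assuming heights are recorded in millimeters; the function and parameter names are hypothetical and do not come from the study:

```python
def distraction(preop_height_mm: float, immediate_postop_height_mm: float) -> float:
    """Distraction: immediate postoperative disc height minus preoperative disc height."""
    return immediate_postop_height_mm - preop_height_mm


def subsidence(immediate_postop_height_mm: float, final_followup_height_mm: float) -> float:
    """Subsidence: immediate postoperative disc height minus final follow-up disc height."""
    return immediate_postop_height_mm - final_followup_height_mm
```

Under these definitions a positive subsidence value means the disc space lost height between the immediate postoperative radiograph and the final follow-up. The same two functions would apply separately to anterior and posterior disc height measurements.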

15.
Neurospine ; 21(1): 128-146, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38569639

ABSTRACT

OBJECTIVE: Large language models, such as Chat Generative Pre-trained Transformer (ChatGPT), have great potential for streamlining medical processes and assisting physicians in clinical decision-making. This study aimed to assess the potential of ChatGPT's 2 models (GPT-3.5 and GPT-4.0) to support clinical decision-making by comparing their responses for antibiotic prophylaxis in spine surgery to accepted clinical guidelines. METHODS: ChatGPT models were prompted with questions from the North American Spine Society (NASS) Evidence-based Clinical Guidelines for Multidisciplinary Spine Care for Antibiotic Prophylaxis in Spine Surgery (2013). Their responses were then compared and assessed for accuracy. RESULTS: Of the 16 NASS guideline questions concerning antibiotic prophylaxis, 10 responses (62.5%) were accurate in ChatGPT's GPT-3.5 model and 13 (81%) were accurate in GPT-4.0. Twenty-five percent of GPT-3.5 answers were deemed overly confident, while 62.5% of GPT-4.0 answers directly used the NASS guideline as evidence for the response. CONCLUSION: ChatGPT demonstrated an impressive ability to accurately answer clinical questions. The GPT-3.5 model's performance was limited by its tendency to give overly confident responses and its inability to identify the most significant elements in its responses. The GPT-4.0 model's responses had higher accuracy and cited the NASS guideline as direct evidence many times. While GPT-4.0 is still far from perfect, it has shown an exceptional ability to extract the most relevant research available compared to GPT-3.5. Thus, while ChatGPT has shown far-reaching potential, scrutiny should still be exercised regarding its clinical use at this time.

16.
Spine (Phila Pa 1976) ; 49(9): 640-651, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38213186

ABSTRACT

STUDY DESIGN: Comparative analysis. OBJECTIVE: To evaluate the ability of Chat Generative Pre-trained Transformer (ChatGPT) to predict appropriate clinical recommendations based on the most recent clinical guidelines for the diagnosis and treatment of low back pain. BACKGROUND: Low back pain is a very common and often debilitating condition that affects many people globally. ChatGPT is an artificial intelligence model that may be able to generate recommendations for low back pain. MATERIALS AND METHODS: Using the North American Spine Society Evidence-Based Clinical Guidelines as the gold standard, 82 clinical questions relating to low back pain were entered into ChatGPT (GPT-3.5) independently. For each question, we recorded ChatGPT's answer, then used a point-answer system (the point being the guideline recommendation and the answer being ChatGPT's response) and asked ChatGPT if the point was mentioned in the answer to assess for accuracy. This accuracy assessment was repeated for each question, by guideline category, with one caveat: ChatGPT was first prompted to answer as an experienced orthopedic surgeon. A two-sample proportion z test was used to assess any differences between the preprompt and postprompt scenarios with alpha = 0.05. RESULTS: ChatGPT's responses were accurate 65% of the time (72% postprompt, P = 0.41) for guidelines with clinical recommendations, 46% (58% postprompt, P = 0.11) for guidelines with insufficient or conflicting data, and 49% (16% postprompt, P = 0.003*) for guidelines with no adequate study to address the clinical question. For guidelines with insufficient or conflicting data, 44% (25% postprompt, P = 0.01*) of ChatGPT responses wrongly suggested that sufficient evidence existed. CONCLUSION: ChatGPT was able to produce a sufficient clinical guideline recommendation for low back pain, with overall improvements if initially prompted. 
However, it tended to wrongly suggest evidence and often failed to mention, especially postprompt, when there is not enough evidence to adequately give an accurate recommendation.
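The two-sample proportion z test named in the methods of this abstract can be sketched as follows. This is a minimal illustration in plain Python, assuming the pooled-variance, two-sided form of the test; the abstract does not state which variant was used, and it reports only percentages, so any counts passed to the function are the caller's own. The function name is hypothetical.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: the two proportions are equal.

    x1, x2 are the numbers of "accurate" responses; n1, n2 are the group sizes.
    Uses the pooled estimate of the common proportion (an assumption here).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

With alpha = 0.05, as in the abstract, a difference between the preprompt and postprompt proportions would be called significant when the returned p-value falls below 0.05.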


Subjects
Low Back Pain , Orthopedic Surgeons , Humans , Low Back Pain/diagnosis , Low Back Pain/therapy , Artificial Intelligence , Spine
17.
Neurosurgery ; 93(3): 670-677, 2023 Sep 01.
Article in English | MEDLINE | ID: mdl-36995101

ABSTRACT

BACKGROUND: Pain evaluation remains largely subjective in neurosurgical practice, but machine learning provides the potential for objective pain assessment tools. OBJECTIVE: To predict daily pain levels using speech recordings from personal smartphones of a cohort of patients with diagnosed neurological spine disease. METHODS: Patients with spine disease were enrolled through a general neurosurgical clinic with approval from the institutional ethics committee. At-home pain surveys and speech recordings were administered at regular intervals through the Beiwe smartphone application. Praat audio features were extracted from the speech recordings to be used as input to a K-nearest neighbors (KNN) machine learning model. The pain scores were transformed from a 0 to 10 scale to low and high pain for better discriminative capacity. RESULTS: A total of 60 patients were enrolled, and 384 observations were used to train and test the prediction model. Using the KNN prediction model, an accuracy of 71% with a positive predictive value of 0.71 was achieved in classifying pain intensity into high and low. The model showed 0.71 precision for high pain and 0.70 precision for low pain. Recall of high pain was 0.74, and recall of low pain was 0.67. The overall F1 score was 0.73. CONCLUSION: Our study uses a KNN to model the relationship between speech features and pain levels collected from personal smartphones of patients with spine disease. The proposed model is a stepping stone for the development of objective pain assessment in neurosurgery clinical practice.
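The precision, recall, and F1 figures reported in this abstract are related by a standard formula: F1 is the harmonic mean of precision and recall. A minimal sketch in Python that applies the formula to the rounded per-class values quoted above; the function name is hypothetical, and the abstract does not say how the overall F1 score was aggregated across the two classes:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Per-class values as reported (already rounded to two decimals in the abstract).
f1_high_pain = f1_score(precision=0.71, recall=0.74)
f1_low_pain = f1_score(precision=0.70, recall=0.67)
```

Because the inputs are rounded, the per-class results are only approximate and need not reproduce the reported overall score exactly.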


Subjects
Smartphone , Spinal Diseases , Humans , Speech , Spinal Diseases/complications , Spinal Diseases/diagnosis , Spinal Diseases/surgery , Spine , Pain/diagnosis , Pain/etiology
18.
J Orthop ; 35: 150-154, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36506264

ABSTRACT

Introduction: The purpose of this study is to report a systematic review and meta-analysis of solid organ transplant (SOT) patients undergoing shoulder arthroplasty to compare functional and radiographic outcomes, demographics, and complications with non-transplant patients. Methods: Studies were included if they examined patients undergoing shoulder arthroplasty in the setting of prior solid organ transplantation and included postoperative range of motion, patient-reported outcomes, complications, or revisions. Studies were excluded if they were national database analyses or lacked clinical data. PubMed, MEDLINE, Scopus, and Web of Science were queried using relevant search terms in July 2022. Data were pooled and weighted, and a paired t-test and chi-square analysis were performed. Results: There were 71 SOT and 159 non-SOT shoulders included in the study. The most common indication for surgery was avascular necrosis (n = 26) in the SOT group and osteoarthritis (n = 60) in the non-SOT group. Forward elevation, external rotation, ASES, and VAS pain scores improved significantly in both cohorts following surgery. There was no significant difference in age at surgery (p-value = 0.20), postoperative forward elevation (p-value = 0.08), postoperative external rotation (p-value = 0.84), or postoperative ASES scores (p-value = 0.11) between the two cohorts. VAS pain scores were significantly lower in the SOT cohort (p-value < 0.01). The risk of death was significantly higher in the SOT group (p-value < 0.01), but the rates of overall complications (p-value = 0.47), surgical complications (p-value = 0.79), and revision surgery (p-value = 1.00) were not significantly different between the two cohorts. Conclusion: Shoulder arthroplasty is a safe, effective option in patients following solid organ transplant. There is not an increased risk of adverse outcomes, and SOT patients had comparable range of motion and patient-reported outcomes when compared to their non-SOT peers. 
Level of evidence: III.

19.
J Orthop ; 36: 120-124, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36710938

ABSTRACT

Background: The two main glenoid types used in total shoulder arthroplasty (TSA) are the pegged and keeled glenoid designs. We aimed to determine if a pegged glenoid is superior to a keeled glenoid at long-term follow-up as measured by range of motion (ROM), patient-reported outcomes (PROs), and radiographic glenoid loosening. Methods: We retrospectively reviewed all patients undergoing TSA by a single surgeon at an urban, academic hospital. The cohort was stratified into two groups based on glenoid type: one group with keeled implants and a second group with pegged implants. For each patient, forward elevation (FE), internal rotation (IR), external rotation (ER), visual analog scale (VAS), American Shoulder and Elbow Surgeons (ASES) shoulder score, and simple shoulder test (SST) scores were collected preoperatively and at the most recent follow-up visit. Radiographic variables included acromiohumeral interval (AHI) and glenoid loosening. Results: After applying exclusion criteria, 144 TSAs were included in our study. Of these, 42 (29.2%) had keeled glenoids and 102 (70.8%) had pegged glenoids. Patients with a pegged glenoid implant were older (67.4 vs. 60.7 years; p < 0.001) and had a shorter follow-up time (9.3 vs. 14.4 years; p < 0.001) than patients with a keeled glenoid implant. At the most recent follow-up visit, there were no significant differences in postoperative FE, ER, AHI, or PROs. However, pegged glenoid implants provided significantly more internal rotation (T11 vs. L1; p = 0.010) and were less likely to show evidence of radiographic glenoid loosening (16.7% vs. 42.9%; p < 0.001). Revision rates were not significantly different between the pegged and keeled groups (6.9% vs. 14.3%; p = 0.158). Conclusion: Although a pegged design correlated with superior internal rotation and less radiographic glenoid loosening, both pegged and keeled glenoid designs offered favorable long-term clinical outcomes following TSA.

20.
J Orthop ; 35: 13-17, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36338316

ABSTRACT

Background: Alcohol use disorder has been associated with broad health consequences that may interfere with healing after total shoulder arthroplasty. The aim of this study was to explore the impact of alcohol use disorder on readmissions and complications following total shoulder arthroplasty. Methods: We used data from the Healthcare Cost and Utilization Project National Readmissions Database (NRD) from 2016 to 2018. Patients were included based on International Classification of Diseases, 10th Revision (ICD-10) procedure codes for anatomic total shoulder arthroplasty (aTSA) and reverse total shoulder arthroplasty (rTSA). Patients with an alcohol use disorder (AUD) were identified using the ICD-10 diagnosis code F10.20. Demographics, complications, and 30-day and 90-day readmissions were collected for all patients. A univariate logistic regression was performed to investigate AUD as a factor affecting readmission and complication rates. A multivariate logistic regression model was created to assess the impact of AUD on complications and readmission while controlling for demographic factors. Results: In total, 164,527 patients were included, and 503 (0.3%) patients had a prior diagnosis of AUD. Revision surgery was more common in patients with an AUD (8.8% vs. 6.2%; p = 0.022). Postoperative infection (p = 0.026), dislocation (p = 0.025), liver complications (p < 0.01), and 90-day readmission (p < 0.01) were more common in patients with a diagnosed AUD. On multivariate analysis, patients with an AUD were found to be at increased odds for liver complications (OR: 46.8; 95% CI: [32.8, 66.8]; p < 0.01). Comparatively, mean age, length of stay, and overall healthcare costs were also higher for patients with an AUD. Conclusion: Patients with a diagnosis of AUD were more likely to suffer from shoulder dislocation, liver complications, and 90-day readmission, while also being younger and having longer hospital stays. 
Therefore, surgeons should take caution to anticipate and prevent complications and readmissions following total shoulder arthroplasty in patients with an AUD.
