Search | Secretaria de Estado da Saúde

1.

ChatGPT versus NASS clinical guidelines for degenerative spondylolisthesis: a comparative analysis.

Ahmed, Wasil; Saturno, Michael; Rajjoub, Rami; Duey, Akiro H; Zaidat, Bashar; Hoang, Timothy; Restrepo Mejia, Mateo; Gallate, Zachary S; Shrestha, Nancy; Tang, Justin; Zapolsky, Ivan; Kim, Jun S; Cho, Samuel K.

Eur Spine J ; 2024 Mar 15.

Article in English | MEDLINE | ID: mdl-38489044

ABSTRACT

BACKGROUND CONTEXT: Clinical guidelines, developed in concordance with the literature, are often used to guide surgeons' clinical decision making. Recent advances in large language models and artificial intelligence (AI) in the medical field come with exciting potential. OpenAI's generative AI model, known as ChatGPT, can quickly synthesize information and generate responses grounded in medical literature, which may prove to be a useful tool in clinical decision-making for spine care. The current literature has yet to investigate the ability of ChatGPT to assist clinical decision making with regard to degenerative spondylolisthesis. PURPOSE: The study aimed to compare ChatGPT's concordance with the recommendations set forth by The North American Spine Society (NASS) Clinical Guideline for the Diagnosis and Treatment of Degenerative Spondylolisthesis and assess ChatGPT's accuracy within the context of the most recent literature. METHODS: ChatGPT-3.5 and 4.0 were prompted with questions from the NASS Clinical Guideline for the Diagnosis and Treatment of Degenerative Spondylolisthesis, and their recommendations were graded as "concordant" or "nonconcordant" relative to those put forth by NASS. A response was considered "concordant" when ChatGPT generated a recommendation that accurately reproduced all major points made in the NASS recommendation. Any responses with a grading of "nonconcordant" were further stratified into two subcategories: "Insufficient" or "Over-conclusive," to provide further insight into grading rationale. Responses between GPT-3.5 and 4.0 were compared using Chi-squared tests. RESULTS: ChatGPT-3.5 answered 13 of NASS's 28 total clinical questions in concordance with NASS's guidelines (46.4%). 
Categorical breakdown is as follows: Definitions and Natural History (1/1, 100%), Diagnosis and Imaging (1/4, 25%), Outcome Measures for Medical Intervention and Surgical Treatment (0/1, 0%), Medical and Interventional Treatment (4/6, 66.7%), Surgical Treatment (7/14, 50%), and Value of Spine Care (0/2, 0%). When NASS indicated there was sufficient evidence to offer a clear recommendation, ChatGPT-3.5 generated a concordant response 66.7% of the time (6/9). However, ChatGPT-3.5's concordance dropped to 36.8% when asked clinical questions that NASS did not provide a clear recommendation on (7/19). A further breakdown of ChatGPT-3.5's nonconcordance with the guidelines revealed that a vast majority of its inaccurate recommendations were due to them being "over-conclusive" (12/15, 80%), rather than "insufficient" (3/15, 20%). ChatGPT-4.0 answered 19 (67.9%) of the 28 total questions in concordance with NASS guidelines (P = 0.177). When NASS indicated there was sufficient evidence to offer a clear recommendation, ChatGPT-4.0 generated a concordant response 66.7% of the time (6/9). ChatGPT-4.0's concordance held up at 68.4% when asked clinical questions that NASS did not provide a clear recommendation on (13/19, P = 0.104). CONCLUSIONS: This study sheds light on the duality of LLM applications within clinical settings: one of accuracy and utility in some contexts versus inaccuracy and risk in others. ChatGPT was concordant for most clinical questions NASS offered recommendations for. However, for questions NASS did not offer best practices, ChatGPT generated answers that were either too general or inconsistent with the literature, and even fabricated data/citations. Thus, clinicians should exercise extreme caution when attempting to consult ChatGPT for clinical recommendations, taking care to ensure its reliability within the context of recent literature.
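The P values reported for the GPT-3.5 versus GPT-4.0 comparison can be checked from the counts given in the abstract. The sketch below assumes a 2×2 chi-squared test with Yates' continuity correction; the abstract does not state whether a correction was applied, so this is an assumption, although it gives values close to those reported. The function name `chi2_yates_2x2` is illustrative only.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-squared test (1 df) with Yates' continuity correction
    for the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator, floored at zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Overall concordance: GPT-3.5 13/28 versus GPT-4.0 19/28.
stat_all, p_all = chi2_yates_2x2(13, 15, 19, 9)
# Questions without a clear NASS recommendation: 7/19 versus 13/19.
stat_unclear, p_unclear = chi2_yates_2x2(7, 12, 13, 6)

print(f"overall: chi2 = {stat_all:.3f}, P = {p_all:.3f}")
print(f"no clear recommendation: chi2 = {stat_unclear:.3f}, P = {p_unclear:.3f}")
```

Under this assumption the two comparisons come out near P = 0.177 and P = 0.104, in line with the abstract.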

2.

Robust prediction of nonhome discharge following elective anterior cervical discectomy and fusion using explainable machine learning.

Geng, Eric A; Gal, Jonathan S; Kim, Jun S; Martini, Michael L; Markowitz, Jonathan; Neifert, Sean N; Tang, Justin E; Shah, Kush C; White, Christopher A; Dominy, Calista L; Valliani, Aly A; Duey, Akiro H; Li, Gavin; Zaidat, Bashar; Bueno, Brian; Caridi, John M; Cho, Samuel K.

Eur Spine J ; 32(6): 2149-2156, 2023 06.

Article in English | MEDLINE | ID: mdl-36854862

ABSTRACT

PURPOSE: Predict nonhome discharge (NHD) following elective anterior cervical discectomy and fusion (ACDF) using an explainable machine learning model. METHODS: 2227 patients undergoing elective ACDF from 2008 to 2019 were identified from a single institutional database. A machine learning model was trained on preoperative variables, including demographics, comorbidity indices, and levels fused. The validation technique was repeated stratified K-Fold cross validation with the area under the receiver operating characteristic curve (AUROC) statistic as the performance metric. Shapley Additive Explanation (SHAP) values were calculated to provide further explainability regarding the model's decision making. RESULTS: The preoperative model performed with an AUROC of 0.83 ± 0.05. SHAP scores revealed the most pertinent risk factors to be age, Medicare insurance, and American Society of Anesthesiologists (ASA) score. Interaction analysis demonstrated that female patients over 65 with greater fusion levels were more likely to undergo NHD. Likewise, ASA demonstrated positive interaction effects with female sex, levels fused, and BMI. CONCLUSION: We validated an explainable machine learning model for the prediction of NHD using common preoperative variables. Adding transparency is a key step towards clinical application because it demonstrates that our model's "thinking" aligns with clinical reasoning. Interaction analysis demonstrated that those of age over 65, female sex, higher ASA score, and greater fusion levels were more predisposed to NHD. Age and ASA score were similar in their predictive ability. Machine learning may be used to predict NHD, and can assist surgeons with patient counseling or early discharge planning.
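As an illustration of the performance metric only (not of the study's model or its cross-validation setup), the AUROC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name, labels, and scores below are hypothetical.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.
    labels: iterable of 0/1 (1 = nonhome discharge in this illustration).
    scores: model scores, higher meaning more likely positive."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
print(toy_auc)
```

In a repeated stratified K-fold scheme such as the one the abstract describes, a value like this would be computed on each held-out fold and then summarized as a mean and standard deviation.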

Subjects

Patient Discharge, Spinal Fusion, Humans, Female, Aged, United States, Spinal Fusion/methods, Medicare, Diskectomy/methods, Machine Learning, Retrospective Studies

3.

A national analysis on complications and readmissions for adult cerebral palsy patients undergoing primary spinal fusion surgery.

Fields, Michael; Lee, Nathan J; McCormick, Kyle; Park, Paul J; Boddapati, Venkat; Cerpa, Meghan; Kim, Jun S; Sardar, Zeeshan M; Lenke, Lawrence G.

Eur Spine J ; 31(3): 718-725, 2022 03.

Article in English | MEDLINE | ID: mdl-35067761

ABSTRACT

STUDY DESIGN: Retrospective National Database Study. OBJECTIVE: Surgical intervention with spinal fusion is often indicated in cerebral palsy (CP) patients with progressive scoliosis. The purpose of this study was to utilize the National Readmission Database (NRD) to determine the national estimates of complication rates, 90-day readmission rates, and costs associated with spinal fusion in adult patients with CP. METHODS: The 2012-2015 NRD databases were queried for all adult (age ≥ 19 years) patients diagnosed with CP (ICD-9: 333.71, 343.0-4, and 343.8-9) undergoing spinal fusion (ICD-9: 81.00-08). RESULTS: 1166 adult patients with CP (42.7% female) underwent spinal fusion surgery between 2012 and 2015. 153 (13.1%) were readmitted within 90 days following the primary surgery, at a mean of 33.8 ± 26.5 days. Mean hospital charge of the primary admission was $141,416 ± $157,359 and $167,081 ± $145,416 for the non-readmitted and readmitted patients, respectively (p = 0.06). The mean 90-day readmission charge was $72,479 ± $104,100. Most common complications with the primary admission included UTIs (no readmission vs. readmission: 7.6% vs. 4.8%; p = 0.18), respiratory (6.9% vs. 5.6%; p = 0.62), implant (3.8% vs. 6.0%; p = 0.21), and paralytic ileus (3.6% vs. 3.2%; p = 0.858). Multivariate analyses demonstrated the following as independent predictors for 90-day readmission: comorbid anemia (OR: 2.8; 95% CI: 1.6-4.9; p < 0.001), coagulopathy (2.9, 1.1-8.0, 0.037), perioperative blood transfusion (2.0, 1.1-3.8, 0.026), wound complication (6.4, 1.3-31.6, 0.023), and transfer to short-term hospital versus routine disposition (4.9, 1.0-23.3, 0.045). CONCLUSION: Quality improvement efforts should be aimed at reducing rates of infection-related complications, as this was the most common reason for short-term complications and unplanned readmission following surgery.

Subjects

Cerebral Palsy, Spinal Fusion, Adult, Cerebral Palsy/complications, Cerebral Palsy/epidemiology, Cerebral Palsy/surgery, Female, Humans, Male, Patient Readmission, Postoperative Complications/epidemiology, Postoperative Complications/etiology, Retrospective Studies, Risk Factors, Spinal Fusion/adverse effects, Young Adult

4.

Economic Impact of Unused Surgical Instruments in an Orthopaedic Surgery Department at an Academic Medical Center: A Prospective Cross-sectional Study.

Shim, Stephanie S; Danford, Nicholas C; Wright, Margaret L; Abernathie, Laura A Vogel; Kim, Jun S; Kadiyala, R Kumar; Vosseller, J Turner.

J Surg Orthop Adv ; 30(3): 131-135, 2021.

Article in English | MEDLINE | ID: mdl-34590999

ABSTRACT

Orthopaedic surgical trays contain unused instruments, but we do not know which specific instruments go unused nor do we know the savings from eliminating them from a given tray. This was a single-site, observational study conducted at an academic medical center. The primary outcome was type of unused instruments and percentage of instruments used in two commonly used surgical trays. The secondary outcome was cost savings in United States dollars (USD) that could be attained by eliminating these instruments. In the first tray, five instruments (10.6%) were unused in any of 37 observed cases. In the second tray, nineteen instruments (19.6%) were unused in 37 observed cases. The total annual savings from replacement cost analysis and reprocessing cost analysis was $6,597.00 USD. Unused instruments are common in surgical trays. Eliminating unused instruments can result in immediate cost savings. (Journal of Surgical Orthopaedic Advances 30(3):131-135, 2021).

Subjects

Operating Rooms, Orthopedic Procedures, Academic Medical Centers, Cost Savings, Cross-Sectional Studies, Humans, Prospective Studies, Surgical Instruments

5.

ChatGPT and its Role in the Decision-Making for the Diagnosis and Treatment of Lumbar Spinal Stenosis: A Comparative Analysis and Narrative Review.

Rajjoub, Rami; Arroyave, Juan Sebastian; Zaidat, Bashar; Ahmed, Wasil; Mejia, Mateo Restrepo; Tang, Justin; Kim, Jun S; Cho, Samuel K.

Global Spine J ; 14(3): 998-1017, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37560946

ABSTRACT

STUDY DESIGN: Comparative Analysis and Narrative Review. OBJECTIVE: To assess and compare ChatGPT's responses to the clinical questions and recommendations proposed by The 2011 North American Spine Society (NASS) Clinical Guideline for the Diagnosis and Treatment of Degenerative Lumbar Spinal Stenosis (LSS). We explore the advantages and disadvantages of ChatGPT's responses through an updated literature review on spinal stenosis. METHODS: We prompted ChatGPT with questions from the NASS Evidence-based Clinical Guidelines for LSS and compared its generated responses with the recommendations provided by the guidelines. A review of the literature on the diagnosis and treatment of lumbar spinal stenosis between January 2012 and April 2023 was performed via PubMed, OVID, and Cochrane. RESULTS: 14 questions proposed by the NASS guidelines for LSS were entered into ChatGPT, and its answers were directly compared to the responses offered by NASS. Three questions were on the definition and history of LSS, one on diagnostic tests, seven on non-surgical interventions, and three on surgical interventions. The review process selected 40 articles for inclusion that helped corroborate or contradict the responses generated by ChatGPT. CONCLUSIONS: ChatGPT's responses were similar to findings in the current literature on LSS. These results demonstrate the potential for implementing ChatGPT into the spine surgeon's workplace as a means of supporting the decision-making process for LSS diagnosis and treatment. However, our narrative summary only provides a limited literature review, and additional research is needed to standardize our findings as a means of validating ChatGPT's use in the clinical space.

6.

An analysis of ChatGPT recommendations for the diagnosis and treatment of cervical radiculopathy.

Hoang, Timothy; Liou, Lathan; Rosenberg, Ashley M; Zaidat, Bashar; Duey, Akiro H; Shrestha, Nancy; Ahmed, Wasil; Tang, Justin; Kim, Jun S; Cho, Samuel K.

J Neurosurg Spine ; : 1-11, 2024 Jun 28.

Article in English | MEDLINE | ID: mdl-38941643

ABSTRACT

OBJECTIVE: The objective of this study was to assess the safety and accuracy of ChatGPT recommendations in comparison to the evidence-based guidelines from the North American Spine Society (NASS) for the diagnosis and treatment of cervical radiculopathy. METHODS: ChatGPT was prompted with questions from the 2011 NASS clinical guidelines for cervical radiculopathy and evaluated for concordance. Selected key phrases within the NASS guidelines were identified. Completeness was measured as the number of overlapping key phrases between ChatGPT responses and NASS guidelines divided by the total number of key phrases. A senior spine surgeon evaluated the ChatGPT responses for safety and accuracy. ChatGPT responses were further evaluated on their readability, similarity, and consistency. Flesch Reading Ease scores and Flesch-Kincaid reading levels were measured to assess readability. The Jaccard Similarity Index was used to assess agreement between ChatGPT responses and NASS clinical guidelines. RESULTS: A total of 100 key phrases were identified across 14 NASS clinical guidelines. The mean completeness of ChatGPT-4 was 46%. ChatGPT-3.5 yielded a completeness of 34%. ChatGPT-4 outperformed ChatGPT-3.5 by a margin of 12 percentage points. ChatGPT-4.0 outputs had a mean Flesch reading score of 15.24, which is very difficult to read, requiring a college graduate education to understand. ChatGPT-3.5 outputs had a lower mean Flesch reading score of 8.73, indicating that they are even more difficult to read and require a professional-level education to understand. However, both versions of ChatGPT were more accessible than NASS guidelines, which had a mean Flesch reading score of 4.58. Furthermore, with NASS guidelines as a reference, ChatGPT-3.5 registered a mean ± SD Jaccard Similarity Index score of 0.20 ± 0.078 while ChatGPT-4 had a mean of 0.18 ± 0.068. Based on physician evaluation, outputs from ChatGPT-3.5 and ChatGPT-4.0 were safe 100% of the time. 
Thirteen of 14 (92.8%) ChatGPT-3.5 responses and 14 of 14 (100%) ChatGPT-4.0 responses were in agreement with current best clinical practices for cervical radiculopathy according to a senior spine surgeon. CONCLUSIONS: ChatGPT models were able to provide safe and accurate but incomplete responses to NASS clinical guideline questions about cervical radiculopathy. Although the authors' results suggest that improvements are required before ChatGPT can be reliably deployed in a clinical setting, future versions of the LLM hold promise as an updated reference for guidelines on cervical radiculopathy. Future versions must prioritize accessibility and comprehensibility for a diverse audience.
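The three text metrics named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not describe how text was tokenized or how key phrases were matched, so the word-level tokenization and case-insensitive substring matching below are assumptions, and all function names and example strings are hypothetical. The Flesch Reading Ease formula is the standard one and takes pre-counted words, sentences, and syllables.

```python
import re


def jaccard_similarity(text_a, text_b):
    """Jaccard index on sets of lowercase word tokens (assumed tokenization)."""
    tokens_a = set(re.findall(r"[a-z0-9]+", text_a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", text_b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def completeness(response, key_phrases):
    """Share of guideline key phrases found in the response
    (case-insensitive substring match, an assumed matching rule)."""
    response_lower = response.lower()
    hits = sum(1 for phrase in key_phrases if phrase.lower() in response_lower)
    return hits / len(key_phrases)


def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Standard Flesch Reading Ease score from pre-computed counts."""
    return 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)


# Hypothetical example strings, not taken from the guideline or from ChatGPT.
guideline = "MRI is suggested to confirm nerve root compression"
response = "MRI can confirm compression of the nerve root"

similarity = jaccard_similarity(guideline, response)
coverage = completeness(response, ["MRI", "nerve root", "surgery"])
ease = flesch_reading_ease(n_words=100, n_sentences=5, n_syllables=150)
print(similarity, coverage, ease)
```

Lower Flesch Reading Ease scores indicate harder text, which is why the single-digit and low double-digit scores in the abstract correspond to very difficult reading levels.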

7.

Can generative artificial intelligence pass the orthopaedic board examination?

Isleem, Ula N; Zaidat, Bashar; Ren, Renee; Geng, Eric A; Burapachaisri, Aonnicha; Tang, Justin E; Kim, Jun S; Cho, Samuel K.

J Orthop ; 53: 27-33, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38450060

ABSTRACT

Background: Resident training programs in the US use the Orthopaedic In-Training Examination (OITE) developed by the American Academy of Orthopaedic Surgeons (AAOS) to assess the current knowledge of their residents and to identify the residents at risk of failing the American Board of Orthopaedic Surgery (ABOS) examination. Optimal strategies for OITE preparation are constantly being explored. There may be a role for Large Language Models (LLMs) in orthopaedic resident education. ChatGPT, an LLM launched in late 2022, has demonstrated the ability to produce accurate, detailed answers, potentially enabling it to aid in medical education and clinical decision-making. The purpose of this study is to evaluate the performance of ChatGPT on Orthopaedic In-Training Examinations using Self-Assessment Exams (SAE) from the AAOS database and approved literature as a proxy for the Orthopaedic Board Examination. Methods: 301 SAE questions from the AAOS database and associated AAOS literature were input into ChatGPT's interface in a question and multiple-choice format, and the answers were then analyzed to determine which answer choice was selected. A new chat was used for every question. All answers were recorded, categorized, and compared to the answer given by the OITE and SAE exams, noting whether the answer was right or wrong. Results: Of the 301 questions asked, ChatGPT was able to correctly answer 183 (60.8%) of them. The subjects with the highest percentage of correct questions were basic science (81%), oncology (72.7%), shoulder and elbow (71.9%), and sports (71.4%). The questions were further subdivided into 3 groups: those about management, diagnosis, or knowledge recall. There were 86 management questions and 47 were correct (54.7%), 45 diagnosis questions with 32 correct (71.7%), and 168 knowledge recall questions with 102 correct (60.7%). 
Conclusions: ChatGPT has the potential to provide orthopedic educators and trainees with accurate clinical conclusions for the majority of board-style questions, although its reasoning should be carefully analyzed for accuracy and clinical validity. As such, its usefulness in a clinical educational context is currently limited but rapidly evolving. Clinical relevance: ChatGPT can access a multitude of medical data and may help provide accurate answers to clinical questions.

8.

Association Between Age-stratified Cohorts and Perioperative Complications and 30-day and 90-day Readmission in Patients Undergoing Single-level Anterior Cervical Discectomy and Fusion.

Yeshoua, Brandon J; Singh, Sirjanhar; Liu, Helen; Assad, Nima; Dominy, Calista L; Pasik, Sara D; Tang, Justin E; Patel, Akshar; Shah, Kush C; Ranson, William; Kim, Jun S; Cho, Samuel K.

Clin Spine Surg ; 37(1): E9-E17, 2024 02 01.

Article in English | MEDLINE | ID: mdl-37559220

ABSTRACT

STUDY DESIGN: Retrospective analysis. OBJECTIVE: To assess perioperative complication rates and readmission rates after ACDF in a patient population of advanced age. SUMMARY OF BACKGROUND DATA: Readmission rates after ACDF are important markers of surgical quality and, with recent shifts in reimbursement schedules, they are rapidly gaining weight in the determination of surgeon and hospital reimbursement. METHODS: Patients 18 years of age and older who underwent elective single-level ACDF were identified in the National Readmissions Database (NRD) and stratified into 4 cohorts: 18-39 ("young"), 40-64 ("middle"), 65-74 ("senior"), and 75+ ("elderly") years of age. For each cohort, the perioperative complications, frequency of those complications, and number of patients with at least 1 readmission within 30 and 90 days of discharge were analyzed. χ² tests were used to calculate likelihood of complications and readmissions. RESULTS: There were 1174 "elderly" patients in 2016, 1072 in 2017, and 1010 in 2018 who underwent ACDF. Their rate of any complication was 8.95%, 11.00%, and 13.47%, respectively (P < 0.0001), with dysphagia and acute posthemorrhagic anemia being the most common across all 3 years. They experienced complications at a greater frequency than their younger counterparts (15.80%, P < 0.0001; 16.98%, P < 0.0001; 21.68%, P < 0.0001). They also required 30-day and 90-day readmission more frequently (P < 0.0001). CONCLUSION: It has been well-established that advanced patient age brings greater risk of perioperative complications in ACDF surgery. What remains unsettled is the characterization of this age-complication relationship within specific age cohorts and how these complications inform patient hospital course. Our study provides an updated analysis of age-specific complications and readmission rates in ACDF patients. 
Orthopedic surgeons may account for the rise in complication and readmission rates in this population with the corresponding reduction in length of stay and consider this relationship before discharging elderly ACDF patients.

Subjects

Patient Readmission, Spinal Fusion, Humans, Adolescent, Adult, Aged, Retrospective Studies, Cervical Vertebrae/surgery, Spinal Fusion/adverse effects, Diskectomy/adverse effects, Postoperative Complications/epidemiology

9.

Reliable Prediction of Discharge Disposition Following Cervical Spine Surgery With Ensemble Machine Learning and Validation on a National Cohort.

Feng, Rui; Valliani, Aly A; Martini, Michael L; Gal, Jonathan S; Neifert, Sean N; Kim, Nora C; Geng, Eric A; Kim, Jun S; Cho, Samuel K; Oermann, Eric K; Caridi, John M.

Clin Spine Surg ; 37(1): E30-E36, 2024 02 01.

Article in English | MEDLINE | ID: mdl-38285429

ABSTRACT

STUDY DESIGN: A retrospective cohort study. OBJECTIVE: The purpose of this study is to develop a machine learning algorithm to predict nonhome discharge after cervical spine surgery that is validated and usable on a national scale to ensure generalizability and elucidate candidate drivers for prediction. SUMMARY OF BACKGROUND DATA: Excessive length of hospital stay can be attributed to delays in postoperative referrals to intermediate care rehabilitation centers or skilled nursing facilities. Accurate preoperative prediction of patients who may require access to these resources can facilitate a more efficient referral and discharge process, thereby reducing hospital and patient costs in addition to minimizing the risk of hospital-acquired complications. METHODS: Electronic medical records were retrospectively reviewed from a single-center data warehouse (SCDW) to identify patients undergoing cervical spine surgeries between 2008 and 2019 for machine learning algorithm development and internal validation. The National Inpatient Sample (NIS) database was queried to identify cervical spine fusion surgeries between 2009 and 2017 for external validation of algorithm performance. Gradient-boosted trees were constructed to predict nonhome discharge across patient cohorts. The area under the receiver operating characteristic curve (AUROC) was used to measure model performance. SHAP values were used to identify nonlinear risk factors for nonhome discharge and to interpret algorithm predictions. RESULTS: A total of 3523 cases of cervical spine fusion surgeries were included from the SCDW data set, and 311,582 cases were isolated from NIS. The model demonstrated robust prediction of nonhome discharge across all cohorts, achieving an area under the receiver operating characteristic curve of 0.87 (SD=0.01) on both the SCDW and nationwide NIS test sets. 
Anterior approach only, age, elective admission status, Medicare insurance status, and total Elixhauser Comorbidity Index score were the most important predictors of discharge destination. CONCLUSIONS: Machine learning algorithms reliably predict nonhome discharge across single-center and national cohorts and identify preoperative features of importance following cervical spine fusion surgery.

Subjects

Medicare, Patient Discharge, United States, Humans, Aged, Retrospective Studies, Machine Learning, Cervical Vertebrae/surgery

10.

Performance of ChatGPT on NASS Clinical Guidelines for the Diagnosis and Treatment of Low Back Pain: A Comparison Study.

Shrestha, Nancy; Shen, Zekun; Zaidat, Bashar; Duey, Akiro H; Tang, Justin E; Ahmed, Wasil; Hoang, Timothy; Restrepo Mejia, Mateo; Rajjoub, Rami; Markowitz, Jonathan S; Kim, Jun S; Cho, Samuel K.

Spine (Phila Pa 1976) ; 49(9): 640-651, 2024 May 01.

Article in English | MEDLINE | ID: mdl-38213186

ABSTRACT

STUDY DESIGN: Comparative analysis. OBJECTIVE: To evaluate Chat Generative Pre-trained Transformer's (ChatGPT's) ability to predict appropriate clinical recommendations based on the most recent clinical guidelines for the diagnosis and treatment of low back pain. BACKGROUND: Low back pain is a very common and often debilitating condition that affects many people globally. ChatGPT is an artificial intelligence model that may be able to generate recommendations for low back pain. MATERIALS AND METHODS: Using the North American Spine Society Evidence-Based Clinical Guidelines as the gold standard, 82 clinical questions relating to low back pain were entered into ChatGPT (GPT-3.5) independently. For each question, we recorded ChatGPT's answer, then used a point-answer system (the point being the guideline recommendation and the answer being ChatGPT's response) and asked ChatGPT whether the point was mentioned in the answer to assess accuracy. This accuracy assessment was repeated for each question by guideline category with one caveat: ChatGPT was first prompted to answer as an experienced orthopedic surgeon. A two-sample proportion z test was used to assess any differences between the preprompt and postprompt scenarios with alpha = 0.05. RESULTS: ChatGPT's responses were accurate 65% of the time (72% postprompt, P = 0.41) for guidelines with clinical recommendations, 46% (58% postprompt, P = 0.11) for guidelines with insufficient or conflicting data, and 49% (16% postprompt, P = 0.003*) for guidelines with no adequate study to address the clinical question. For guidelines with insufficient or conflicting data, 44% (25% postprompt, P = 0.01*) of ChatGPT responses wrongly suggested that sufficient evidence existed. CONCLUSION: ChatGPT was able to produce a sufficient clinical guideline recommendation for low back pain, with overall improvements if initially prompted. 
However, it tended to wrongly suggest evidence and often failed to mention, especially postprompt, when there is not enough evidence to adequately give an accurate recommendation.
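The two-sample proportion z test mentioned in the methods can be sketched as below. The abstract reports only percentages, not the number of questions in each guideline category, so the counts in the example are hypothetical and are not meant to reproduce the reported P values. The sketch assumes the common pooled-variance, two-sided form of the test; the function name is illustrative.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test for the difference of two proportions,
    using the pooled estimate of the standard error.
    Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    variance = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if variance == 0:
        raise ValueError("pooled proportion is 0 or 1; z is undefined")
    z = (p1 - p2) / math.sqrt(variance)
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 20 of 40 accurate before the prompt, 30 of 40 after.
z_stat, p_val = two_proportion_z_test(20, 40, 30, 40)
significant = p_val < 0.05  # alpha = 0.05, as stated in the abstract
print(f"z = {z_stat:.3f}, P = {p_val:.3f}, significant = {significant}")
```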

Subjects

Low Back Pain, Orthopedic Surgeons, Humans, Low Back Pain/diagnosis, Low Back Pain/therapy, Artificial Intelligence, Spine

11.

Performance of a Large Language Model in the Generation of Clinical Guidelines for Antibiotic Prophylaxis in Spine Surgery.

Zaidat, Bashar; Shrestha, Nancy; Rosenberg, Ashley M; Ahmed, Wasil; Rajjoub, Rami; Hoang, Timothy; Mejia, Mateo Restrepo; Duey, Akiro H; Tang, Justin E; Kim, Jun S; Cho, Samuel K.

Neurospine ; 21(1): 128-146, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38569639

ABSTRACT

OBJECTIVE: Large language models, such as chat generative pre-trained transformer (ChatGPT), have great potential for streamlining medical processes and assisting physicians in clinical decision-making. This study aimed to assess the potential of ChatGPT's 2 models (GPT-3.5 and GPT-4.0) to support clinical decision-making by comparing its responses for antibiotic prophylaxis in spine surgery to accepted clinical guidelines. METHODS: ChatGPT models were prompted with questions from the North American Spine Society (NASS) Evidence-based Clinical Guidelines for Multidisciplinary Spine Care for Antibiotic Prophylaxis in Spine Surgery (2013). Its responses were then compared and assessed for accuracy. RESULTS: Of the 16 NASS guideline questions concerning antibiotic prophylaxis, 10 responses (62.5%) were accurate in ChatGPT's GPT-3.5 model and 13 (81%) were accurate in GPT-4.0. Twenty-five percent of GPT-3.5 answers were deemed as overly confident while 62.5% of GPT-4.0 answers directly used the NASS guideline as evidence for its response. CONCLUSION: ChatGPT demonstrated an impressive ability to accurately answer clinical questions. GPT-3.5 model's performance was limited by its tendency to give overly confident responses and its inability to identify the most significant elements in its responses. GPT-4.0 model's responses had higher accuracy and cited the NASS guideline as direct evidence many times. While GPT-4.0 is still far from perfect, it has shown an exceptional ability to extract the most relevant research available compared to GPT-3.5. Thus, while ChatGPT has shown far-reaching potential, scrutiny should still be exercised regarding its clinical use at this time.

12.

The Effect of Intraoperative Overdistraction on Subsidence Following Anterior Cervical Discectomy and Fusion.

Duey, Akiro H; Gonzalez, Christopher; Hoang, Timothy; Geng, Eric A; Ferriter, Pierce J; Rosenberg, Ashley M; Zaidat, Bashar; Zapolsky, Ivan J; Kim, Jun S; Cho, Samuel K.

Clin Spine Surg ; 2024 Jun 03.

Article in English | MEDLINE | ID: mdl-38828954

ABSTRACT

STUDY DESIGN: Retrospective cohort. OBJECTIVE: The purpose of this study was to evaluate the effect of overdistraction on interbody cage subsidence. BACKGROUND: Vertebral overdistraction due to the use of large intervertebral cage sizes may increase the risk of postoperative subsidence. METHODS: Patients who underwent anterior cervical discectomy and fusion between 2016 and 2021 were included. All measurements were performed using lateral cervical radiographs at 3 time points - preoperative, immediate postoperative, and final follow-up >6 months postoperatively. Anterior and posterior distraction were calculated by subtracting the preoperative disc height from the immediate postoperative disc height. Cage subsidence was calculated by subtracting the final follow-up postoperative disc height from the immediate postoperative disc height. Associations between anterior and posterior subsidence and distraction were determined using multivariable linear regression models. The analyses controlled for cage type, cervical level, sex, age, smoking status, and osteopenia. RESULTS: Sixty-eight patients and 125 fused levels were included in the study. Of the 68 fusions, 22 were single-level fusions, 35 were 2-level, and 11 were 3-level. The median final follow-up interval was 368 days (range: 181-1257 d). Anterior disc space subsidence was positively associated with anterior distraction (beta = 0.23; 95% CI: 0.08, 0.38; P = 0.004), and posterior disc space subsidence was positively associated with posterior distraction (beta = 0.29; 95% CI: 0.13, 0.45; P < 0.001). No significant associations between anterior distraction and posterior subsidence (beta = 0.07; 95% CI: -0.06, 0.20; P = 0.270) or posterior distraction and anterior subsidence (beta = 0.06; 95% CI: -0.14, 0.27; P = 0.541) were observed. CONCLUSIONS: We found that overdistraction of the disc space was associated with increased postoperative subsidence after anterior cervical discectomy and fusion. 
Surgeons should consider choosing a smaller cage size to avoid overdistraction and minimize postoperative subsidence.
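The distraction and subsidence definitions in the methods reduce to two subtractions per fused level. A minimal sketch follows; the class name, field names, and the measurements in millimetres are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class DiscHeights:
    """Disc height at one fused level, measured at the three time points
    (same units throughout, e.g. mm)."""
    preop: float
    immediate_postop: float
    final_followup: float

    def distraction(self):
        # Immediate postoperative height minus preoperative height.
        return self.immediate_postop - self.preop

    def subsidence(self):
        # Immediate postoperative height minus final follow-up height.
        return self.immediate_postop - self.final_followup


# Hypothetical anterior disc heights for a single level.
anterior = DiscHeights(preop=4.0, immediate_postop=7.5, final_followup=6.0)
print(anterior.distraction(), anterior.subsidence())
```

In the study, pairs of values like these (computed separately for anterior and posterior heights) were the inputs to the multivariable linear regression models.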

13.

How Are Patients Reviewing Spine Surgeons Online? A Sentiment Analysis of Physician Review Website Written Comments.

Tang, Justin E; Arvind, Varun; Dominy, Calista; White, Christopher A; Cho, Samuel K; Kim, Jun S.

Global Spine J ; 13(8): 2107-2114, 2023 Oct.

Article in English | MEDLINE | ID: mdl-35085039

ABSTRACT

STUDY DESIGN: A Sentiment Analysis of online reviews of spine surgeons. OBJECTIVES: Physician review websites have significant impact on a patient's provider selection. Written reviews are subjective, but sentiment analysis through machine learning can quantitatively analyze these reviews. This study analyzes online written reviews of spine surgeons and reports biases associated with demographic factors and trends in words utilized. METHODS: Online written and star-reviews of spine surgeons were obtained from healthgrades.com. A sentiment analysis package was used to analyze the written reviews. The relationship of demographic variables to these scores was analyzed with t-tests and word and bigram frequency analyses were performed. Additionally, a multiple regression analysis was performed on key terms. RESULTS: 8357 reviews of 480 surgeons were analyzed. There was a significant difference between the means of sentiment analysis scores and star scores for both gender and age. Younger, male surgeons were rated more highly on average (P < .01). Word frequency analysis indicated that behavioral factors and pain were the main contributing factors to both the best and worst reviewed surgeons. Additionally, several clinically relevant words, when included in a review, affected the odds of a positive review. CONCLUSIONS: The best reviews laud surgeons for their ability to manage pain and for exhibiting positive bedside manner. However, the worst reviews primarily focus on pain and its management, as exhibited by the frequency and multivariate analysis. Pain is a clear contributing factor to reviews, thus emphasizing the importance of establishing proper pain expectations prior to any intervention.

14.

How are Patients Describing You Online? A Natural Language Processing Driven Sentiment Analysis of Online Reviews on CSRS Surgeons.

Tang, Justin; Arvind, Varun; White, Christopher A; Dominy, Calista; Cho, Samuel; Kim, Jun S.

Clin Spine Surg ; 36(2): E107-E113, 2023 03 01.

Article in English | MEDLINE | ID: mdl-35945670

ABSTRACT

STUDY DESIGN: A quantitative analysis of written, online reviews of Cervical Spine Research Society (CSRS) surgeons. OBJECTIVE: This study quantitatively analyzes the written reviews of members of the CSRS to report biases associated with demographic factors and frequently used words in reviews to aid physician practices. SUMMARY OF BACKGROUND DATA: Physician review websites have influence on a patient's selection of a provider, but written reviews are subjective. Sentiment analysis of writing through artificial intelligence can quantify surgeon reviews to provide actionable feedback. METHODS: Online written and star-rating reviews of CSRS surgeons were obtained from healthgrades.com. A sentiment analysis package was used to obtain compound scores of each physician's reviews. The relationship between demographic variables and average sentiment score of written reviews was evaluated through t-tests. Positive and negative word and bigram frequency analysis was performed to indicate trends in the reviews' language. RESULTS: In all, 2239 CSRS surgeons' reviews were analyzed. Analysis showed a positive correlation between the sentiment scores and overall average star-rated reviews (r² = 0.60, P < 0.01). There was no difference in review sentiment by provider sex. However, the age of surgeons showed a significant difference, as those <55 had more positive reviews (mean = +0.50) than surgeons ≥55 (mean = +0.37) (P < 0.01). The most positive reviews focused both on pain and behavioral factors, whereas the most negative focused mainly on pain. Behavioral attributes increased the odds of receiving positive reviews while pain decreased them. CONCLUSION: The top-rated surgeons were described as considerate providers and effective at managing pain in their most frequently used words and bigrams. However, the worst-rated ones were mainly described as unable to relieve pain. Through quantitative analysis of physician reviews, pain is a clear factor contributing to both positive and negative reviews of surgeons, reinforcing the need for proper pain expectation management. LEVEL OF EVIDENCE: Level 4-retrospective case-control study.
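
The r² statistic reported in the abstract above relates sentiment scores to star ratings. The sketch below shows how such a value could be computed as a squared Pearson correlation. It assumes Pearson's r was the statistic used, which the abstract does not state, and the paired values are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-surgeon averages: sentiment score and star rating.
sentiment = [0.1, 0.4, 0.5, 0.8]
stars = [2, 3, 4, 5]

r = pearson_r(sentiment, stars)
r_squared = r ** 2  # share of variance in one variable explained by the other
print(round(r_squared, 3))
```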

Subjects

Natural Language Processing, Surgeons, Humans, Retrospective Studies, Sentiment Analysis, Case-Control Studies, Artificial Intelligence, Patient Satisfaction, Pain, Cervical Vertebrae, Internet

15.

Artificially Intelligent Billing in Spine Surgery: An Analysis of a Large Language Model.

Zaidat, Bashar; Lahoti, Yash S; Yu, Alexander; Mohamed, Kareem S; Cho, Samuel K; Kim, Jun S.

Global Spine J ; : 21925682231224753, 2023 Dec 26.

Article in English | MEDLINE | ID: mdl-38147047

ABSTRACT

STUDY DESIGN: Retrospective cohort study. OBJECTIVES: This study assessed the effectiveness of a popular large language model, ChatGPT-4, in predicting Current Procedural Terminology (CPT) codes from surgical operative notes. By employing a combination of prompt engineering, natural language processing (NLP), and machine learning techniques on standard operative notes, the study sought to enhance billing efficiency, optimize revenue collection, and reduce coding errors. METHODS: The model was given 3 different types of prompts for 50 surgical operative notes from 2 spine surgeons. The first trial simply asked the model to generate CPT codes for a given operative (OP) note. The second trial included 3 OP notes and their associated CPT codes, and the third trial included a list of every possible CPT code in the dataset, to prime the model. CPT codes generated by the model were compared to those generated by the billing department. Model evaluation was performed by calculating the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). RESULTS: The trial that involved priming ChatGPT with a list of every possible CPT code performed the best, with an AUROC of .87 and an AUPRC of .67, and an AUROC of .81 and AUPRC of .76 when examining only the most common CPT codes. CONCLUSIONS: ChatGPT-4 can aid in automating CPT billing from orthopedic surgery operative notes, driving down healthcare expenditures and enhancing billing code precision as the model evolves and fine-tuning becomes available.
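
The two evaluation metrics named in the abstract above, AUROC and AUPRC, can be sketched in plain Python. The abstract does not say how the authors computed them, so this is only one common formulation: AUROC as the probability that a positive outranks a negative, and AUPRC estimated as average precision. The labels and scores are hypothetical, and both functions assume at least one positive and one negative example.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values at each true positive's rank."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical data: 1 = the code was truly billed, score = model confidence for that code.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

print(auroc(labels, scores), average_precision(labels, scores))
```

AUPRC is usually the more informative of the two when positives are rare, as is likely when each note carries only a few of many possible codes.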

16.

Using Sentiment Analysis to Understand What Patients Are Saying About Hand Surgeons Online.

Tang, Justin E; Arvind, Varun; White, Christopher A; Dominy, Calista; Kim, Jun S; Cho, Samuel K; Walsh, Amanda.

Hand (N Y) ; 18(5): 854-860, 2023 07.

Article in English | MEDLINE | ID: mdl-34969297

ABSTRACT

BACKGROUND: Physician review websites have influence on a patient's selection of a provider. Written reviews are subjective and difficult to quantitatively analyze. Sentiment analysis of writing can quantitatively assess surgeon reviews to provide actionable feedback for surgeons to improve practice. The objective of this study is to quantitatively analyze a large subset of written reviews of hand surgeons using sentiment analysis and report unbiased trends in words used to describe the reviewed surgeons and biases associated with surgeon demographic factors. METHODS: Online written and star-rating reviews of hand surgeons were obtained from healthgrades.com and webmd.com. A sentiment analysis package was used to calculate compound scores of all reviews. Mann-Whitney U tests were performed to determine the relationship between demographic variables and average sentiment score of written reviews. Positive and negative word and word-pair frequency analysis was also performed. RESULTS: A total of 786 hand surgeons' reviews were analyzed. Analysis showed a significant relationship between the sentiment scores and overall average star-rated reviews (r² = 0.604, P ≤ .01). There was no significant difference in review sentiment by provider sex; however, surgeons aged 50 years and younger had more positive reviews than older surgeons (P < .01). The bigrams most frequently used to describe top-rated surgeons were associated with good bedside manner and efficient pain management, whereas surgeons with the worst reviews were often characterized as rude and unable to relieve pain. CONCLUSIONS: This study provides insight into both demographic and behavioral factors contributing to positive reviews and reinforces the importance of pain expectation management.

Subjects

Clinical Competence, Surgeons, Humans, Sentiment Analysis, Patient Satisfaction

17.

Readmission and Associated Factors in Surgical Versus Non-Surgical Management of Spinal Epidural Abscess: A Nationwide Readmissions Database Analysis.

Pitaro, Nicholas L; Tang, Justin E; Arvind, Varun; Cho, Brian H; Geng, Eric A; Amakiri, Uchechukwu O; Cho, Samuel K; Kim, Jun S.

Global Spine J ; 13(6): 1533-1540, 2023 Jul.

Article in English | MEDLINE | ID: mdl-34866455

ABSTRACT

STUDY DESIGN: Retrospective cohort study. OBJECTIVES: Spinal epidural abscess (SEA) is a rare but potentially life-threatening infection treated with antimicrobials and, in most cases, immediate surgical decompression. Previous studies comparing medical and surgical management of SEA are low powered and limited to a single institution. As such, the present study compares readmission in surgical and non-surgical management using a large national dataset. METHODS: We identified all hospital admissions for SEA using the Nationwide Readmissions Database (NRD), which is the largest collection of hospital admissions data. Patients were grouped into surgically and non-surgically managed cohorts using ICD-10 coding and compared using information retrieved from the NRD such as demographics, comorbidities, length of stay and cost of admission. RESULTS: We identified 350 surgically managed and 350 non-surgically managed patients. The 90-day readmission rates for surgical and non-surgical management were 26.0% and 35.1%, respectively (P < .05). Expectedly, surgical management was associated with a significantly higher charge and length of stay at index hospital admission. Surgically managed patients had a significantly lower risk of readmission for osteomyelitis (P < .05). Finally, in patients with a low comorbidity burden, we observed a significantly lower 90-day readmission rate for surgically managed patients (surgical: 23.0%, non-surgical: 33.8%, P < .05). CONCLUSION: In patients with a low comorbidity burden, we observed a significantly lower readmission rate for surgically managed patients than non-surgically managed patients. The results of this study suggest a lower readmission rate as an advantage to surgical management of SEA and emphasize the importance of SEA as a not-to-miss diagnosis.

18.

Previous Emergency Department Admission Is Associated With Increased 90-Day Readmission Following Cervical Spine Surgery: Evidenced Using Propensity Score Matching.

Amakiri, Uchechukwu O; Dominy, Calista; Kumar, Anish; Arvind, Varun; Pitaro, Nicholas L; Kim, Jun S; Cho, Samuel K.

Clin Spine Surg ; 36(5): E198-E205, 2023 06 01.

Article in English | MEDLINE | ID: mdl-36727862

ABSTRACT

STUDY DESIGN: This was a retrospective case-control study. OBJECTIVE: The objective of this study was to evaluate whether prior emergency department admission was associated with an increased risk for 90-day readmission following elective cervical spinal fusion. SUMMARY OF BACKGROUND DATA: The incidence of cervical spine fusion reoperations has increased, necessitating the improvement of patient outcomes following surgery. Currently, there are no studies assessing the impact of emergency department visits before surgery on the risk of 90-day readmission following elective cervical spine surgery. This study aimed to fill this gap and identify a novel risk factor for readmission following elective cervical fusion. METHODS: The 2016-2018 Nationwide Readmissions Database was queried for patients aged 18 years and older who underwent an elective cervical fusion. Prior emergency admissions were defined using the variable HCUP_ED in the Nationwide Readmissions Database. Univariate analysis of patient demographic details, comorbidities, discharge disposition, and perioperative complications was performed using a χ² test, followed by multivariate logistic regression. RESULTS: In all, 2766 patients fit the inclusion criteria, and 18.62% of patients were readmitted within 90 days. Intraoperative complications, gastrointestinal complications, valvular, uncomplicated hypertension, peripheral vascular disorders, chronic obstructive pulmonary disease, cancer, and experiencing fewer than 3 Charlson comorbidities were identified as independent predictors of 90-day readmission. Patients with more than 3 Charlson comorbidities (OR = 0.04, 95% CI 0.01-0.12, P < 0.001) and neurological complications (OR = 0.29, 95% CI 0.10-0.86, P = 0.026) had decreased odds for 90-day readmission. Importantly, previous emergency department visits within the calendar year before surgery were a new independent predictor of 90-day readmission (OR = 9.74, 95% CI 6.86-13.83, P < 0.001). CONCLUSIONS: A positive association exists between emergency department admission history and 90-day readmission following elective cervical fusion. Screening cervical fusion patients for this history and optimizing outcomes in those patients may reduce 90-day readmission rates.
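
As a reading aid for the odds ratio reported above (OR 9.74, 95% CI 6.86-13.83), the sketch below checks that the interval is symmetric on the log-odds scale and recovers the implied standard error. It assumes a Wald-type interval with a critical value of 1.96, which is typical for logistic regression output but is not stated in the abstract. The function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(ci_low, ci_high, z_crit=1.96):
    """Return (midpoint OR, standard error of log OR) implied by a Wald-type CI."""
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    midpoint_or = math.exp((log_low + log_high) / 2)  # geometric mean of the bounds
    std_error = (log_high - log_low) / (2 * z_crit)
    return midpoint_or, std_error


# Values taken from the abstract: OR = 9.74, 95% CI 6.86-13.83.
midpoint_or, std_error = log_scale_summary(6.86, 13.83)
z_score = math.log(9.74) / std_error

print(round(midpoint_or, 2), round(std_error, 3), round(z_score, 1))
```

The geometric mean of the bounds lands on the reported point estimate, which is what one expects when the interval was built on the log scale and then exponentiated.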

Subjects

Spinal Diseases, Spinal Fusion, Humans, Patient Readmission, Retrospective Studies, Postoperative Complications/epidemiology, Case-Control Studies, Propensity Score, Spinal Diseases/surgery, Spinal Fusion/adverse effects, Risk Factors, Cervical Vertebrae/surgery, Emergency Service, Hospital

19.

Development of a machine learning algorithm to identify total and reverse shoulder arthroplasty implants from X-ray images.

Geng, Eric A; Cho, Brian H; Valliani, Aly A; Arvind, Varun; Patel, Akshar V; Cho, Samuel K; Kim, Jun S; Cagle, Paul J.

J Orthop ; 35: 74-78, 2023 Jan.

Article in English | MEDLINE | ID: mdl-36411845

ABSTRACT

Introduction: Demand for total shoulder arthroplasty (TSA) has risen significantly and is projected to continue growing. From 2012 to 2017, the incidence of reverse total shoulder arthroplasty (rTSA) rose from 7.3 cases per 100,000 to 19.3 per 100,000. Anatomical TSA saw a growth from 9.5 cases per 100,000 to 12.5 per 100,000. Failure to identify implants in a timely manner can increase operative time, cost and risk of complications. Several machine learning models have been developed to perform medical image analysis. However, they have not been widely applied in shoulder surgery. The authors developed a machine learning model to identify shoulder implant manufacturer and type from anterior-posterior X-ray images. Methods: The model deployed was a convolutional neural network (CNN), which has been widely used in computer vision tasks. A total of 696 radiographs were obtained from a single institution. 70% were used to train the model, while evaluation was done on 30%. Results: On the evaluation set, the model performed with an overall accuracy of 93.9%, with positive predictive value, sensitivity and F1 scores of 94% across 10 different implant types (4 reverse, 6 anatomical). Average identification time was 0.110 s per implant. Conclusion: This proof-of-concept study demonstrates that machine learning can assist with preoperative planning and improve cost-efficiency in shoulder surgery.
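
The evaluation protocol described above (a 70/30 split, then positive predictive value, sensitivity and F1 per implant type) can be illustrated with a short sketch. The split function, the random seed, the class names and the toy predictions are all hypothetical. The abstract does not say how the split was drawn or how the per-class metrics were averaged.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and cut them into training and evaluation sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(n * train_fraction)
    return indices[:cut], indices[cut:]


def per_class_metrics(y_true, y_pred):
    """Return {class: (PPV, sensitivity, F1)} from true and predicted labels."""
    results = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        ppv = tp / (tp + fp) if tp + fp else 0.0
        sens = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * ppv * sens / (ppv + sens) if ppv + sens else 0.0
        results[c] = (ppv, sens, f1)
    return results


# 696 is the number of radiographs in the abstract; the split itself is an assumption.
train_idx, eval_idx = split_indices(696)

# Hypothetical implant classes and predictions on a tiny evaluation set.
y_true = ["A", "A", "B", "B", "B", "C"]
y_pred = ["A", "B", "B", "B", "B", "C"]
metrics = per_class_metrics(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(len(train_idx), len(eval_idx), accuracy, metrics)
```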

20.

A national analysis of the effect alcohol use disorder has on short-term complications and readmissions following total shoulder arthroplasty.

White, Christopher A; Quinones, Addison; Tang, Justin E; Butler, Liam R; Duey, Akiro H; Kim, Jun S; Cho, Samuel K; Cagle, Paul J.

J Orthop ; 35: 13-17, 2023 Jan.

Article in English | MEDLINE | ID: mdl-36338316

ABSTRACT

Background: Alcohol use disorder has been associated with broad health consequences that may interfere with healing after total shoulder arthroplasty. The aim of this study was to explore the impact of alcohol use disorder on readmissions and complications following total shoulder arthroplasty. Methods: We used data from the Healthcare Cost and Utilization Project National Readmissions Database (NRD) from 2016 to 2018. Patients were included based on International Classification of Diseases, 10th Revision (ICD-10) procedure codes for anatomic total shoulder arthroplasty (aTSA) and reverse total shoulder arthroplasty (rTSA). Patients with an alcohol use disorder (AUD) were identified using the ICD-10 diagnosis code F10.20. Demographics, complications, and 30-day and 90-day readmission were collected for all patients. A univariate logistic regression was performed to investigate AUD as a factor affecting readmission and complication rates. A multivariate logistic regression model was created to assess the impact of alcohol use disorder on complications and readmission while controlling for demographic factors. Results: In total, 164,527 patients were included, and 503 (0.3%) patients had a prior diagnosis of AUD. Revision surgery was more common in patients with an alcohol use disorder (8.8% vs. 6.2%; p = 0.022). Postoperative infection (p = 0.026), dislocation (p = 0.025), liver complications (p < 0.01), and 90-day readmission (p < 0.01) were more common in patients with a diagnosed AUD. On multivariate analysis, patients with an AUD were found to be at increased odds for liver complications (OR: 46.8; 95% CI: [32.8, 66.8]; p < 0.01). Comparatively, mean age, length of stay, and overall healthcare costs were also higher for patients with an AUD. Conclusion: Patients with a diagnosis of AUD were more likely to suffer from shoulder dislocation, liver complications, and 90-day readmission, while also being younger and having longer hospital stays. Therefore, surgeons should take caution to anticipate and prevent complications and readmissions following total shoulder arthroplasty in patients with an AUD.
