ABSTRACT
INTRODUCTION: Generative artificial intelligence (AI) chatbots have recently been posited as potential sources of online medical information for patients making medical decisions. Existing online patient-oriented medical information has repeatedly been shown to be of variable quality and difficult to read. Therefore, we sought to evaluate the content and quality of AI-generated medical information on acute appendicitis. METHODS: A modified DISCERN assessment tool, comprising 16 distinct criteria each scored on a 5-point Likert scale (score range 16-80), was used to assess AI-generated content. Readability was determined using the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) scores. Four popular chatbots (ChatGPT-3.5, ChatGPT-4, Bard, and Claude-2) were prompted to generate medical information about appendicitis. Three investigators, blinded to the identity of the AI platforms, independently scored the generated texts. RESULTS: ChatGPT-3.5, ChatGPT-4, Bard, and Claude-2 had overall mean (SD) quality scores of 60.7 (1.2), 62.0 (1.0), 62.3 (1.2), and 51.3 (2.3), respectively, on a scale of 16-80. Inter-rater reliability was 0.81, 0.75, 0.81, and 0.72, respectively, indicating substantial agreement. Claude-2 demonstrated a significantly lower mean quality score compared to ChatGPT-4 (p = 0.001), ChatGPT-3.5 (p = 0.005), and Bard (p = 0.001). Bard was the only AI platform that listed verifiable sources, while Claude-2 provided fabricated sources. All chatbots except for Claude-2 advised readers to consult a physician if experiencing symptoms. Regarding readability, the FKGL and FRE scores of ChatGPT-3.5, ChatGPT-4, Bard, and Claude-2 were 14.6 and 23.8, 11.9 and 33.9, 8.6 and 52.8, and 11.0 and 36.6, respectively, indicating difficult readability at a college reading level. CONCLUSION: AI-generated medical information on appendicitis scored favorably upon quality assessment, but most platforms either fabricated sources or provided none. Additionally, overall readability far exceeded recommended levels for the public. Generative AI platforms demonstrate measured potential for patient education and engagement about appendicitis.
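The two readability scores reported above follow fixed published formulas. A minimal Python sketch is given here for orientation. It assumes that word, sentence, and syllable counts are already available; the function and argument names are illustrative and do not come from the study, which does not state what software computed its scores.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (not data from the study).
fre = flesch_reading_ease(words=100, sentences=5, syllables=150)
fkgl = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"FRE = {fre:.1f}, FKGL = {fkgl:.1f}")
```

Longer sentences and more syllables per word lower FRE and raise FKGL, which is consistent with the chatbot that had the highest FKGL (14.6) also having the lowest FRE (23.8).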
Subject(s)
Appendicitis , Artificial Intelligence , Humans , Comprehension , Internet , Consumer Health Information/standards , Patient Education as Topic/methods
ABSTRACT
BACKGROUND: Recurrence after paraesophageal hernia repair (PEHR) is common, with reported rates between 25% and 42%. We present a novel approach to PEHR that involves the visualization of a critical view to decrease the recurrence rate. Our study aims to investigate the outcomes of PEHR following the implementation of a critical view. METHODS: This is a single-center retrospective study that compares operative outcomes in patients who underwent PEHR with a critical view against those in patients who underwent standard repair. The critical view is defined as full dissection of the posterior mediastinum with complete mobilization of the esophagus to the level of the inferior pulmonary vein, and visualization of the left crus of the diaphragm and the left gastric artery while the distal esophagus is retracted to expose the spleen in the background. Bivariate chi-squared analysis and multivariable logistic and linear regressions were used for statistical analysis. RESULTS: A total of 297 patients underwent PEHR between 2015 and 2023, including 207 with critical view and 90 with standard repair, which served as the historical control. Type III hernias were most common (48%) followed by type I (36%), type IV (13%), and type II (2.0%). Robotic-assisted repair was most common (65%), followed by laparoscopic (22%) and open repair (14%). Fundoplications performed included Dor (59%), Nissen (14%), Belsey (5%), and Toupet (2%). Patients who underwent PEHR with critical view had lower hernia recurrence rates compared to standard (9.7% vs 20%, P < .01) and lower reoperation rates (0.5% vs 10%, P < .001). There were no differences in postoperative complications on unadjusted bivariate analysis; however, adjusted outcomes revealed lower odds of postoperative complications in patients with critical view (AOR .13, 95% CI .05-.31, P < .001). CONCLUSION: We present dissection of a novel critical view during repair of all types of paraesophageal hernia that results in reproducible, consistent, and durable postoperative outcomes, including a significant reduction in recurrence and reoperation.
Subject(s)
Hernia, Hiatal , Herniorrhaphy , Recurrence , Hernia, Hiatal/surgery , Humans , Female , Retrospective Studies , Male , Herniorrhaphy/methods , Middle Aged , Aged , Treatment Outcome , Laparoscopy/methods , Robotic Surgical Procedures/methods , Postoperative Complications/epidemiology , Postoperative Complications/etiology , Postoperative Complications/prevention & control , Esophagus/surgery
ABSTRACT
BACKGROUND AND OBJECTIVES: We aim to assess the quality and readability of online information available to patients considering cytoreductive surgery with hyperthermic intraperitoneal chemotherapy (CRS-HIPEC). METHODS: The top three search engines (Google, Bing, and Yahoo) were searched in March 2022. Websites were classified as academic, hospital-affiliated, foundation/advocacy, commercial, or unspecified. Quality of information was assessed using the JAMA benchmark criteria (0-4) and DISCERN tool (16-80), and the presence of a Health On the Net code (HONcode) seal. Readability was evaluated using the Flesch Reading Ease score. RESULTS: Fifty unique websites were included. The average JAMA and DISCERN scores of all websites were 0.72 ± 1.14 and 39.58 ± 13.71, respectively. Foundation/advocacy websites had significantly higher JAMA mean score than commercial (p = 0.044), academic (p < 0.001), and hospital-affiliated websites (p = 0.001). Foundation/advocacy sites had a significantly higher DISCERN mean score than hospital-affiliated (p = 0.035) and academic websites (p = 0.030). The HONcode seal was present in 4 (8%) websites analyzed. Readability was difficult and at the level of college students. CONCLUSIONS: The overall quality of patient-oriented online information on CRS-HIPEC is poor and available resources may not be comprehensible to the general public. Patients seeking information on CRS-HIPEC should be directed to sites affiliated with foundation/advocacy organizations.
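Both quality instruments reduce to simple sums, which explains the quoted ranges: DISCERN has 16 items each rated 1 to 5 (total 16-80), and the JAMA benchmark awards one point for each of four criteria (authorship, attribution, disclosure, currency; total 0-4). The Python sketch below is illustrative only. The function names and sample ratings are hypothetical and are not the authors' scoring code or data.

```python
JAMA_CRITERIA = ("authorship", "attribution", "disclosure", "currency")


def discern_total(item_scores):
    """Sum 16 DISCERN item ratings (each 1-5) into a 16-80 total."""
    if len(item_scores) != 16:
        raise ValueError("DISCERN has exactly 16 items")
    if any(not 1 <= s <= 5 for s in item_scores):
        raise ValueError("each DISCERN item is rated from 1 to 5")
    return sum(item_scores)


def jama_total(criteria_met):
    """Count how many of the four JAMA benchmark criteria a site meets (0-4)."""
    return sum(1 for c in JAMA_CRITERIA if criteria_met.get(c, False))


# Hypothetical website ratings (not data from the study).
print(discern_total([3] * 8 + [2] * 8))
print(jama_total({"authorship": True, "currency": False}))
```

Read against these ranges, the reported means of 0.72 for JAMA and 39.58 for DISCERN sit in the lower part of each scale.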
Subject(s)
Comprehension , Hyperthermic Intraperitoneal Chemotherapy , Humans , Cytoreduction Surgical Procedures , Search Engine , Internet
ABSTRACT
BACKGROUND: As patients seek online health information to supplement their medical decision-making, the aim of this study is to assess the quality and readability of internet information on the left ventricular assist device (LVAD). METHODS: Three online search engines (Google, Bing, and Yahoo) were searched for "LVAD" and "Left ventricular assist device." Included websites were classified as academic, foundation/advocacy, hospital-affiliated, commercial, or unspecified. The quality of information was assessed using the JAMA benchmark criteria (0-4), DISCERN tool (16-80), and the presence of Health On the Net code (HONcode) accreditation. Readability was assessed using the Flesch Reading Ease score. RESULTS: A total of 38 unique websites were included. The average JAMA and DISCERN scores of all websites were 0.82 ± 1.11 and 52.45 ± 13.51, respectively. Academic sites had a significantly lower JAMA mean score than commercial (p < 0.001) and unspecified (p < 0.001) websites, as well as a significantly lower DISCERN mean score than commercial sites (p = 0.002). HONcode certification was present in 6 (15%) of the websites analyzed, which had significantly higher JAMA (p < 0.001) and DISCERN (p < 0.016) mean scores than sites without HONcode certification. Readability was fairly difficult and at the level of high school students. CONCLUSIONS: The quality of online information on the LVAD is variable, and overall readability exceeds the recommended level for the public. Patients accessing online information on the LVAD should be referred to sites with HONcode accreditation. Academic institutions must provide higher quality online patient literature on LVADs.
Subject(s)
Comprehension , Heart-Assist Devices , Humans , Benchmarking
ABSTRACT
PURPOSE: Patients increasingly access online materials for health-related information. Using validated assessment tools, we aim to assess the quality and readability of online information for patients considering incisional hernia (IH) repair. METHODS: The top three online search engines (Google, Bing, Yahoo) were searched in July 2022 for "Incisional hernia repair" and "Surgical hernia repair". Included websites were classified as academic, hospital-affiliated, commercial, or unspecified. The quality of information was assessed using the Journal of the American Medical Association (JAMA) benchmark criteria (0-4), DISCERN instrument (16-80), and the presence of Health On the Net code (HONcode) certification. Readability was assessed using the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) tests. RESULTS: Twenty-five unique websites were included. The average JAMA and DISCERN scores of all websites were 0.68 ± 1.02 and 36.50 ± 10.91, respectively. Commercial sites showed a significantly higher DISCERN mean score than academic sites (p = 0.034), while no significant difference was demonstrated between other website categories. Three (12%) websites reported HONcode certification and had significantly higher JAMA (p = 0.016) and DISCERN (p = 0.045) mean scores than sites without certification. Average FRE and FKGL scores were 39.84 ± 13.11 and 10.62 ± 1.76, respectively, corresponding to college- and high school-level comprehensibility. CONCLUSIONS: Our findings suggest online patient resources on IH repair are of poor overall quality and may not be comprehensible to the public. Patients accessing internet resources for additional information on IH repair should be made aware of these inadequacies and directed to sites bearing HONcode certification.
Subject(s)
Incisional Hernia , Reading , United States , Humans , Incisional Hernia/surgery , Benchmarking , Comprehension , Search Engine , Internet
ABSTRACT
OBJECTIVE: Immersive virtual reality (IVR) can be utilized to provide low-cost and easily accessible simulation on all aspects of surgical education. In addition to technical skills training in surgery, IVR simulation has been utilized for nontechnical skills training in domains such as clinical decision-making and pre-operative planning. This systematic review examines the current literature on the effectiveness of IVR for nontechnical skill acquisition in surgical education. DESIGN: A literature search was performed using MEDLINE, EMBASE, and Web of Science for primary studies published between January 1, 1995 and February 9, 2022. Four reviewers screened titles, abstracts, and full texts, extracted data, and analyzed included studies to answer 5 key questions: How is IVR being utilized in nontechnical skills surgical education? What is the methodological quality of studies? What technologies are being utilized? What metrics are reported? What are the findings of these studies? RESULTS: The literature search yielded 2340 citations, with 12 articles included for qualitative synthesis. Of included articles, 33% focused on clinical decision-making and 67% on anatomy/pre-operative planning. Motion sickness was a recorded metric in 25% of studies, with an aggregate incidence of 13% (11/87). An application score was reported in 33% and time to completion in 16.7%. A commercially developed application was utilized in 25%, while 75% employed a noncommercial application. The Oculus Rift was used in 41.7% of studies, HTC Vive in 25%, Samsung Gear in 16.7%, and Google Daydream in 8%; 1 study did not report the device used. The mean Medical Education Research Quality Instrument (MERSQI) score was 10.3 ± 2.3 (out of 18). In all studies researching clinical decision-making, participants preferred IVR to conventional teaching methods, and in a nonrandomized controlled study it was found to be more effective. Averaged across all studies, mean scores were 4.33 for enjoyment, 4.16 for utility, 4.11 for usability, and 3.73 for immersion on a 5-point Likert scale. CONCLUSIONS: The IVR nontechnical skills applications for surgical education are designed for clinical decision-making or anatomy/pre-operative planning. These applications are primarily noncommercially produced and rely upon a diverse array of head-mounted displays (HMDs) for content delivery, suggesting that development is primarily coming from within academia and still without clarity on optimal utilization of the technology. Excitingly, users find these applications to be immersive, enjoyable, usable, and of utility in learning. Although a few studies suggest that IVR is additive or superior to conventional teaching or imaging methods, the data are mixed and derived from studies with weak design. Motion sickness remains a complication of IVR use that needs further study to determine its cause and means of mitigation.
Subject(s)
Motion Sickness , Simulation Training , Virtual Reality , Humans , Clinical Competence , Computer Simulation , Simulation Training/methods
ABSTRACT
Background: There is a paucity of recent literature investigating the sole effect of income level on the treatment and survival of patients with rectal cancer. Methods: We analyzed all cases of rectal cancer in the Rectal Cancer Participant User File (PUF) of the National Cancer Database (NCDB) from 2010 to 2020. We utilized the Median Income Quartiles 2016-2020 to define our income levels. The two lower quartiles were combined to create a lower income group, and the upper two quartiles were combined to create the higher income group. The total cohort included 201,329 patients, with 116,843 and 84,486 in the higher and lower income groups, respectively. Results: Lower income patients were more often Black (17% vs 6%), lived farther from the nearest hospital (33.5 miles vs 25.7 miles) despite being more likely to live in urban areas (25% vs 7%), and had lower levels of private insurance (36% vs 49%). They underwent more APRs (17% vs 14%) and had 13% higher odds of undergoing an open operation (OR 1.13, CI 1.09-1.17). Higher income patients had a 12% reduction in the odds of 90-day (OR 0.88, 95% CI 0.82-0.96) and overall mortality (OR 0.88, 95% CI 0.86-0.89). Conclusions: Clinicians should be aware that lower income patients are often faced with unique challenges that may impact care delivery.
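The percentages quoted alongside the odds ratios follow directly from (OR - 1) x 100. The short Python sketch below makes that arithmetic explicit. The helper name is illustrative, and the result is a change in odds, which approximates a change in probability only when the outcome is uncommon.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio into a percent change in odds."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios as reported in the abstract above.
reported = [
    ("open operation (lower income)", 1.13),
    ("mortality (higher income)", 0.88),
]
for label, odds_ratio in reported:
    print(f"{label}: {percent_change_in_odds(odds_ratio):+.0f}% odds")
```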
ABSTRACT
INTRODUCTION: The advent of generative artificial intelligence (AI) dialogue platforms and large language models (LLMs) may help facilitate ongoing efforts to improve health literacy. Additionally, recent studies have highlighted inadequate health literacy among patients with cardiac disease. The aim of the present study was to ascertain whether two freely available generative AI dialogue platforms could rewrite online aortic stenosis (AS) patient education materials (PEMs) to meet recommended reading skill levels for the public. METHODS: Online PEMs were gathered from a professional cardiothoracic surgical society and academic institutions in the USA. PEMs were then inputted into two AI-powered LLMs, ChatGPT-3.5 and Bard, with the prompt "translate to 5th-grade reading level". Readability of PEMs before and after AI conversion was measured using the validated Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook Index (SMOGI), and Gunning-Fog Index (GFI) scores. RESULTS: Overall, 21 PEMs on AS were gathered. Original readability measures indicated difficult readability at the 10th-12th grade reading level. ChatGPT-3.5 successfully improved readability across all four measures (p < 0.001) to the approximately 6th-7th grade reading level. Bard successfully improved readability across all measures (p < 0.001) except for SMOGI (p = 0.729) to the approximately 8th-9th grade level. Neither platform generated PEMs written below the recommended 6th-grade reading level. ChatGPT-3.5 demonstrated significantly more favorable post-conversion readability scores, percentage change in readability scores, and conversion time compared to Bard (all p < 0.001). CONCLUSION: AI dialogue platforms can enhance the readability of PEMs for patients with AS but may not fully meet recommended reading skill levels, highlighting potential tools to help strengthen cardiac health literacy in the future.
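For orientation, the SMOG and Gunning-Fog grades and the percentage change in a readability score can be computed as in the Python sketch below. The formulas are the standard published ones; the function names and input counts are hypothetical, and the study does not state what software it used.

```python
import math


def smog_index(polysyllabic_words: int, sentences: int) -> float:
    """SMOG grade from the count of words with three or more syllables."""
    return 1.0430 * math.sqrt(polysyllabic_words * (30 / sentences)) + 3.1291


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning-Fog grade from sentence length and share of complex words."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def percent_change(before: float, after: float) -> float:
    """Relative change in a score after rewriting, in percent."""
    return (after - before) / before * 100.0


# Hypothetical counts and scores (not data from the study).
print(round(smog_index(polysyllabic_words=30, sentences=30), 2))
print(round(gunning_fog(words=100, sentences=5, complex_words=10), 2))
print(round(percent_change(before=10.0, after=7.0), 1))
```

For the grade-level measures a negative percentage change indicates easier text after conversion, whereas for FRE an increase indicates easier text.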
ABSTRACT
PURPOSE: For patients with obesity and congestive heart failure (CHF) who require heart transplantation (HT), aggressive weight loss has been associated with ventricular remodeling, or subclinical alterations in left and right ventricular structure that affect systolic function. Many have suggested offering metabolic and bariatric surgery (MBS) for these patients. As such, we evaluated the role of MBS in HT for patients with obesity and CHF using predictive modelling techniques. MATERIALS AND METHODS: Markov decision analysis was performed to simulate the life expectancy of 30,000 patients with concomitant obesity, CHF, and 30% ejection fraction (EF) who were deemed ineligible to be waitlisted for HT unless they achieved a BMI < 35 kg/m2. Life expectancy following diet and exercise (DE), Roux-en-Y gastric bypass (RYGB), and sleeve gastrectomy (SG) was estimated. Base case patients were defined as having a pre-intervention BMI of 45 kg/m2. Sensitivity analysis of initial BMI was performed. RESULTS: RYGB patients had lower rates of HT and received HT more quickly when needed. Base case patients who underwent RYGB gained 2.2 additional mean years of survival compared with patients who underwent SG and 10.3 additional mean years of survival compared with DE. SG patients gained 6.2 mean years of life compared with DE. CONCLUSION: In this simulation of 30,000 patients with obesity, CHF, and reduced EF, MBS was associated with improved survival by not only decreasing the need for transplantation due to improvements in EF, but also increasing access to HT when needed due to lower average BMI.
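The general idea of a Markov cohort analysis can be sketched in a few lines of Python. This is a toy model under loudly stated assumptions: the three states, the one-year cycle, the 60-cycle horizon, and every transition probability below are hypothetical placeholders chosen for illustration, and none of them is the study's model structure or input.

```python
def life_expectancy(transition, start, dead, cycles=60):
    """Expected years alive in a cohort Markov model with 1-year cycles.

    transition[i][j] is the yearly probability of moving from state i to j.
    Survival is counted at the end of each cycle (no half-cycle correction).
    """
    n = len(start)
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each transition row must sum to 1")
    dist = list(start)
    years = 0.0
    for _ in range(cycles):
        # Advance the cohort distribution by one cycle.
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
        years += 1.0 - dist[dead]  # fraction of the cohort still alive
    return years


# States: 0 = BMI too high for listing, 1 = eligible or transplanted, 2 = dead.
# All probabilities are HYPOTHETICAL placeholders, not the study's inputs.
diet_exercise = [
    [0.85, 0.03, 0.12],
    [0.00, 0.94, 0.06],
    [0.00, 0.00, 1.00],
]
surgery = [
    [0.45, 0.45, 0.10],
    [0.00, 0.94, 0.06],
    [0.00, 0.00, 1.00],
]
start = [1.0, 0.0, 0.0]  # the whole cohort starts ineligible for listing
for name, matrix in [("diet and exercise", diet_exercise), ("surgery", surgery)]:
    print(name, round(life_expectancy(matrix, start, dead=2), 1))
```

In a sketch like this, a strategy that moves patients out of the ineligible state faster accumulates more life-years, which is the kind of mechanism the abstract describes. A sensitivity analysis would repeat the calculation over a range of input values.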
Subject(s)
Bariatric Surgery , Gastric Bypass , Heart Failure , Heart Transplantation , Obesity, Morbid , Humans , Obesity, Morbid/surgery , Ventricular Remodeling , Gastric Bypass/methods , Obesity/surgery , Gastrectomy/methods , Heart Failure/surgery , Retrospective Studies , Treatment Outcome
ABSTRACT
BACKGROUND: Laparoscopic cholecystectomy is the most common laparoscopic procedure performed in the US and a key component of general surgery training. Surgical trainees frequently access YouTube for educational walkthroughs of surgical procedures. This study aims to evaluate the educational quality of YouTube video walkthroughs on laparoscopic cholecystectomy by using the LAParoscopic surgery Video Educational GuidelineS (LAP-VEGaS) video assessment tool. METHODS: A YouTube search was conducted using "laparoscopic cholecystectomy." Results were sorted by relevance, and the top 100 videos were gathered. Videos with patient education or concomitant procedures were excluded. Included videos were categorized as Physician (produced by an individual physician), Academic (produced by a university or medical school), Commercial (produced by a surgical company), and Society (produced by a professional surgical society) and were rated by 3 investigators using the LAP-VEGaS video assessment tool (0-18). RESULTS: In all, 33 videos met the selection criteria. The average LAP-VEGaS score was 7.96 ± 3.95, and inter-rater reliability was .86. Academic videos demonstrated a significantly higher mean LAP-VEGaS score than Commercial (10.69 ± 3.54 vs 5.25 ± 2.38, P = .033). Most academic videos failed to provide formal case presentations (63%), patient positioning (50%), intraoperative findings (50%), graphic aids (63%), and operative time (75%). CONCLUSION: This is the first study to evaluate the quality of YouTube video walkthroughs on LC using the LAP-VEGaS tool. Despite demonstrating higher LAP-VEGaS scores than other categories, video walkthroughs provided by academic institutions still lack several essential educational criteria for this procedure, highlighting areas of improvement for educators.