Search | Brasil - Virtual Health Library

1.

Generative Large Language Models in Electronic Health Records for Patient Care Since 2023: A Systematic Review.

Du, Xinsong; Wang, Yifei; Zhou, Zhengyang; Chuang, Ya-Wen; Yang, Richard; Zhang, Wenyu; Wang, Xinyi; Zhang, Rui; Hong, Pengyu; Bates, David W; Zhou, Li.

medRxiv ; 2024 Aug 19.

Article in English | MEDLINE | ID: mdl-39228726

ABSTRACT

BACKGROUND: Generative large language models (LLMs) represent a significant advancement in natural language processing, achieving state-of-the-art performance across various tasks. However, their application in clinical settings using real electronic health records (EHRs) is still rare and presents numerous challenges. OBJECTIVE: This study aims to systematically review the use of generative LLMs and the effectiveness of relevant techniques in patient care-related topics involving EHRs, summarize the challenges faced, and suggest future directions. METHODS: A Boolean search for peer-reviewed articles was conducted on May 19th, 2024 using PubMed and Web of Science to include research articles published since 2023, which was one month after the release of ChatGPT. The search results were deduplicated. Multiple reviewers, including biomedical informaticians, computer scientists, and a physician, screened the publications for eligibility and conducted data extraction. Only studies utilizing generative LLMs to analyze real EHR data were included. We summarized the use of prompt engineering, fine-tuning, multimodal EHR data, and evaluation metrics. Additionally, we identified current challenges in applying LLMs in clinical settings as reported by the included studies and proposed future directions. RESULTS: The initial search identified 6,328 unique studies, with 76 studies included after eligibility screening. Of these, 67 studies (88.2%) employed zero-shot prompting, five of which reported 100% accuracy on five specific clinical tasks. Nine studies used advanced prompting strategies; four tested these strategies experimentally, finding that prompt engineering improved performance, with one study noting a non-linear relationship between the number of examples in a prompt and performance improvement. 
Eight studies explored fine-tuning generative LLMs; all reported performance improvements on specific tasks, but three of them noted potential performance degradation after fine-tuning on certain tasks. Only two studies utilized multimodal data, which improved LLM-based decision-making and enabled accurate rare disease diagnosis and prognosis. The studies employed 55 different evaluation metrics for 22 purposes, such as correctness, completeness, and conciseness. Two studies investigated LLM bias, with one detecting no bias and the other finding that male patients received more appropriate clinical decision-making suggestions. Six studies identified hallucinations, such as fabricating patient names in structured thyroid ultrasound reports. Additional challenges included but were not limited to the impersonal tone of LLM consultations, which made patients uncomfortable, and the difficulty patients had in understanding LLM responses. CONCLUSION: Our review indicates that few studies have employed advanced computational techniques to enhance LLM performance. The diverse evaluation metrics used highlight the need for standardization. LLMs currently cannot replace physicians due to challenges such as bias, hallucinations, and impersonal responses.

2.

Factors Associated with Completeness of Sex and Gender Fields in Electronic Health Records.

McDowell, Alex; Fung, Vicki; Bates, David W; Foer, Dinah.

LGBT Health ; 2024 Aug 16.

Article in English | MEDLINE | ID: mdl-39149787

ABSTRACT

Purpose: Our purpose was to understand the completeness of sex and gender fields in electronic health record (EHR) data and patient-level factors associated with completeness of those fields. In doing so, we aimed to inform approaches to EHR sex and gender data collection. Methods: This was a retrospective observational study using 2016-2021 deidentified EHR data from a large health care system. Our sample included adults who had an encounter at any of three hospitals within the health care system or were enrolled in the health care system's Accountable Care Organization. The sex and gender fields of interest were gender identity, sex assigned at birth (SAB), and legal sex. Patient characteristics included demographics, clinical features, and health care utilization. Results: In the final study sample (N = 3,473,123), gender identity, SAB, and legal sex (required for system registration) were missing for 75.4%, 75.8%, and 0.1% of individuals, respectively. Several demographic and clinical factors were associated with having complete gender identity and SAB. Notably, the odds of having complete gender identity and SAB were greater among individuals with an activated patient portal (odds ratio [OR] = 2.68; 95% confidence interval [CI] = 2.66-2.70) and with more outpatient visits (OR = 4.34; 95% CI = 4.29-4.38 for 5+ visits); odds of completeness were lower among those with any urgent care visits (OR = 0.80; 95% CI = 0.78-0.82). Conclusions: Missingness of sex and gender data in the EHR was high and associated with a range of patient factors. Key features associated with completeness highlight multiple opportunities for intervention with a focus on patient portal use, primary care provider reporting, and urgent care settings.

3.

Racial and ethnic associations with interstitial lung disease and healthcare utilization in patients with systemic sclerosis.

Tukpah, Ann-Marcia C; Rose, Jonathan A; Seger, Diane L; Dellaripa, Paul F; Hunninghake, Gary Matthew; Bates, David W.

Rheumatology (Oxford) ; 2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39120917

ABSTRACT

OBJECTIVE: Racial and ethnic differences in presentation and outcomes have been reported in systemic sclerosis (SSc) and SSc-interstitial lung disease (ILD). However, prior studies have limited diversity. We aim to evaluate if there are racial/ethnic differences associated with ILD, with time intervals between SSc and ILD, and with emergency department (ED) visit or hospitalization rates. METHODS: Clinical and sociodemographic variables were extracted for 756 patients with SSc from longitudinal health records in an integrated health system. Logistic regression models analyzed the association of covariates with ILD and age at SSc-ILD. Healthcare outcomes were analyzed with complementary log-log regression models. RESULTS: Overall, 33.7% of patients in the cohort had an ILD code, with increased odds for Asian (odds ratio [OR], 2.60; 95% confidence interval [CI], 1.29 to 5.28; p = 0.008) compared with White patients. The predicted age in years of SSc-ILD was younger for Hispanic (estimate, -6.5; 95% CI, -13 to -0.21; p = 0.04) and Black/African American patients (estimate, -10; 95% CI, -16 to -4.9; p < 0.001) compared with White patients. Black/African American patients were more likely to have an ILD code before an SSc code (59% compared with 20.6% of White patients), and had the shortest interval from SSc to ILD (3 months). Black/African American (HR, 2.59; 95% CI, 1.47 to 4.49; p = 0.001) and Hispanic patients (HR, 2.29; 95% CI, 1.37 to 3.82; p = 0.002) had higher rates of an ED visit. CONCLUSION: We found that odds of SSc-ILD differed by racial/ethnic group, and that minoritized patients had an earlier age of presentation and greater rates of an ED visit.

4.

Opportunities to improve the diagnosis and treatment of primary hyperparathyroidism: retrospective cohort study.

Vivero, Matthew P; Chen, Yu-Jen; Antunez, Alexis G; Cho, Nancy L; Nehs, Matthew A; Doherty, Gerard M; Bates, David W; Liu, Jason B.

Gland Surg ; 13(7): 1201-1213, 2024 Jul 30.

Article in English | MEDLINE | ID: mdl-39175695

ABSTRACT

Background: Although primary hyperparathyroidism (PHPT) is readily diagnosed biochemically and can be cured with low-risk surgery, it is often underrecognized and undertreated. Our objectives were to characterize, within our health system, how often patients with hypercalcemia were evaluated for PHPT and how often patients with PHPT underwent definitive treatment with parathyroidectomy. Methods: Ambulatory patients aged 18 years or older seen at our health system between January 2018 and June 2023 with chronic hypercalcemia were identified from the medical record. After excluding causes of secondary hyperparathyroidism, the proportion of patients with parathyroid hormone (PTH) tests was calculated. Among patients with biochemical evidence of PHPT, the proportion of patients who underwent parathyroidectomy was calculated. Multivariable logistic regression was used to identify factors associated with an evaluation for PHPT and, separately, with parathyroidectomy. Results: Of 7,675 patients with chronic hypercalcemia, 3,323 (43.3%) had a PTH test obtained within 6 months. An age between 40-49 vs. <30 years [odds ratio (OR) =3.2; 95% confidence interval (CI): 1.8-5.6; P<0.001], a serum calcium level between 11.6-12.0 vs. <11.0 mg/dL (OR =3.9; 95% CI: 3.2-4.7; P<0.001), and osteoporosis (OR =3.1; 95% CI: 2.7-3.5; P<0.001) were associated with an evaluation for PHPT. Among those with PTH levels, 1,327 (39.9%) had PHPT but only 916 (69.0%) were recognized. Three hundred and forty-five (26.0%) patients with PHPT underwent parathyroidectomy. An increasing number of surgical indications was associated with parathyroidectomy (P<0.001), though overall rates remained less than 40%. Among indications for surgery, including age and serum total calcium level, only osteoporosis was associated with parathyroidectomy (OR =2.0; 95% CI: 1.4-2.8; P<0.001). Conclusions: In this study, more than half of patients with chronic hypercalcemia were not evaluated for PHPT. 
Among patients with biochemical evidence of PHPT, one-third were unrecognized and only one-in-four received curative treatment. Opportunities to improve the management of PHPT exist within our large integrated health system.
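The percentages in this abstract form a simple care cascade, and a short sketch can make the denominators explicit. The counts below are the ones reported in the abstract; the helper name `pct` and the variable names are illustrative assumptions, not part of the study.

```python
# Sketch of the PHPT care cascade using counts reported in the abstract.
# The helper and variable names are hypothetical, chosen for illustration.

def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

chronic_hypercalcemia = 7675   # patients with chronic hypercalcemia
pth_tested = 3323              # had a PTH test within 6 months
phpt = 1327                    # biochemical evidence of PHPT
recognized = 916               # PHPT recognized
parathyroidectomy = 345        # underwent parathyroidectomy

cascade = {
    "PTH tested / hypercalcemia": pct(pth_tested, chronic_hypercalcemia),
    "PHPT / PTH tested": pct(phpt, pth_tested),
    "recognized / PHPT": pct(recognized, phpt),
    "parathyroidectomy / PHPT": pct(parathyroidectomy, phpt),
}

for step, value in cascade.items():
    print(f"{step}: {value}%")
```

Each step uses the previous stage as its denominator, which is why "only one-in-four" refers to patients with biochemical PHPT rather than to all patients with hypercalcemia.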

5.

Frequency and preventability of adverse drug events in the outpatient setting.

Wasserman, Rachel L; Edrees, Heba H; Amato, Mary G; Seger, Diane L; Frits, Michelle L; Hwang, Andrew Y; Iannaccone, Christine; Bates, David W.

BMJ Qual Saf ; 2024 Jul 09.

Article in English | MEDLINE | ID: mdl-38981627

ABSTRACT

BACKGROUND: Limited data exist regarding adverse drug events (ADEs) in the outpatient setting. The objective of this study was to determine the incidence, severity, and preventability of ADEs in the outpatient setting and identify potential prevention strategies. METHODS: We conducted an analysis of ADEs identified in a retrospective electronic health records review of outpatient encounters in 2018 at 13 outpatient sites in Massachusetts that included 13 416 outpatient encounters in 3323 patients. Triggers were identified in the medical record including medications, consultations, laboratory results, and others. If a trigger was detected, a further in-depth review was conducted by nurses and adjudicated by physicians to examine the relevant information in the medical record. Patients were included in the study if they were at least 18 years of age with at least one outpatient encounter with a physician, nurse practitioner or physician's assistant in that calendar year. Patients were excluded from the study if the outpatient encounter occurred in outpatient surgery, psychiatry, rehabilitation, and paediatrics. RESULTS: In all, 5% of patients experienced an ADE over the 1-year period. We identified 198 ADEs among 170 patients, who had a mean age of 60. Most patients experienced one ADE (87%), 10% experienced two ADEs and 3% experienced three or more ADEs. The most frequent drug classes resulting in ADEs were cardiovascular (25%), central nervous system (14%), and anti-infective agents (14%). Severity was ranked as significant in 85%, 14% were serious, 1% were life-threatening, and there were no fatal ADEs. Of the ADEs, 22% were classified as preventable and 78% were not preventable. We identified 246 potential prevention strategies, and 23% of ADEs had more than one prevention strategy possibility. CONCLUSIONS: Despite efforts to prioritise patient safety, medication-related harms are still frequent. 
These results underscore the need for further patient safety improvement in the outpatient setting.

6.

A Calculated Risk: Evaluation of QTc Drug-Drug Interaction (DDI) Clinical Decision Support (CDS) Alerts and Performance of the Tisdale Risk Score Calculator.

Wasserman, Rachel L; Seger, Diane L; Amato, Mary G; Hwang, Andrew Y; Fiskio, Julie; Bates, David W.

Drug Saf ; 2024 Jul 09.

Article in English | MEDLINE | ID: mdl-38982033

ABSTRACT

INTRODUCTION: A risk factor for a potentially fatal ventricular arrhythmia Torsade de Pointes is a prolongation in the heart rate-corrected QT interval (QTc) ≥ 500 milliseconds (ms) or an increase of ≥ 60 ms from a patient's baseline value, which can cause sudden cardiac death. The Tisdale risk score calculator uses clinical variables to predict which hospitalized patients are at the highest risk for QTc prolongation. OBJECTIVE: To determine the rate of overridden QTc drug-drug interaction (DDI)-related clinical decision support (CDS) alerts per patient admission and the prevalence by Tisdale risk score category of these overridden alerts. Secondary outcome was to determine the rate of drug-induced QTc prolongation (diQTP) associated with overrides. METHODS: Our organization's enterprise data warehouse was used to retrospectively access QTc DDI alerts presented for patients aged ≥ 18 years who were admitted to Brigham and Women's Hospital during 2022. The QTc DDI CDS alerts were included if shown to a physician, fellow, resident, physician assistant, or nurse practitioner when entering the order in inpatient areas for patients with a length of stay of at least 2 days. Variables collected for the Tisdale calculator included age, sex, whether patient was on a loop diuretic, potassium level, admission QTc value, admitting diagnosis of acute myocardial infarction, sepsis, or heart failure, and number of QTc-prolonging drugs given to the patient. RESULTS: A total of 2649 patients with 3033 patient admissions had 18,432 QTc DDI alerts presented that were overridden. An average of 3 unique QTc DDI alerts were presented per patient admission and the alerts were overridden an average of 6 times per patient admission. Overall, 6% of patient admissions were low risk (score ≤ 6), 64% moderate risk (score 7-10), and 30% high risk (score ≥ 11) of QTc prolongation. 
The most common overridden QTc DDI alerts resulting in a diQTP were quetiapine and propofol (11%) and amiodarone and haloperidol (7%). The diQTP occurred in 883 patient admissions (29%) and was more frequent in those with a higher risk score, with 46% of patient admissions with diQTP in high risk, 23% in moderate risk, and 8% in low risk. CONCLUSION: Use of the Tisdale calculator to assess patient-specific risk of QT prolongation combined with CDS may improve overall alert quality and acceptance rate, which may decrease the diQTP rate.
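The risk categories quoted in this abstract (low: score ≤ 6, moderate: 7-10, high: ≥ 11) can be expressed as a small lookup. This is a minimal sketch of the banding only. It does not compute the Tisdale score itself, because the abstract does not give the component weights, and the function name `tisdale_category` is a hypothetical label.

```python
# Sketch: map an already-computed Tisdale risk score to the category bands
# quoted in the abstract. The function name is hypothetical; the score
# calculation itself is not reproduced here.

def tisdale_category(score: int) -> str:
    """Return 'low', 'moderate', or 'high' for a Tisdale risk score."""
    if score <= 6:
        return "low"
    if score <= 10:
        return "moderate"
    return "high"

# Reported diQTP frequency: 883 of 3033 patient admissions.
diqtp_rate = round(100.0 * 883 / 3033)

for example_score in (4, 7, 10, 11):
    print(example_score, tisdale_category(example_score))
print(f"diQTP in {diqtp_rate}% of patient admissions")
```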

7.

Deep survival analysis for interpretable time-varying prediction of preeclampsia risk.

Eberhard, Braden W; Gray, Kathryn J; Bates, David W; Kovacheva, Vesela P.

J Biomed Inform ; 156: 104688, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39002866

ABSTRACT

OBJECTIVE: Survival analysis is widely utilized in healthcare to predict the timing of disease onset. Traditional methods of survival analysis are usually based on Cox Proportional Hazards model and assume proportional risk for all subjects. However, this assumption is rarely true for most diseases, as the underlying factors have complex, non-linear, and time-varying relationships. This concern is especially relevant for pregnancy, where the risk for pregnancy-related complications, such as preeclampsia, varies across gestation. Recently, deep learning survival models have shown promise in addressing the limitations of classical models, as the novel models allow for non-proportional risk handling, capturing nonlinear relationships, and navigating complex temporal dynamics. METHODS: We present a methodology to model the temporal risk of preeclampsia during pregnancy and investigate the associated clinical risk factors. We utilized a retrospective dataset including 66,425 pregnant individuals who delivered in two tertiary care centers from 2015 to 2023. We modeled the preeclampsia risk by modifying DeepHit, a deep survival model, which leverages neural network architecture to capture time-varying relationships between covariates in pregnancy. We applied time series k-means clustering to DeepHit's normalized output and investigated interpretability using Shapley values. RESULTS: We demonstrate that DeepHit can effectively handle high-dimensional data and evolving risk hazards over time with performance similar to the Cox Proportional Hazards model, achieving an area under the curve (AUC) of 0.78 for both models. The deep survival model outperformed traditional methodology by identifying time-varied risk trajectories for preeclampsia, providing insights for early and individualized intervention. K-means clustering resulted in patients delineating into low-risk, early-onset, and late-onset preeclampsia groups-notably, each of those has distinct risk factors. 
CONCLUSION: This work demonstrates a novel application of deep survival analysis in time-varying prediction of preeclampsia risk. Our results highlight the advantage of deep survival models compared to Cox Proportional Hazards models in providing personalized risk trajectory and demonstrating the potential of deep survival models to generate interpretable and meaningful clinical applications in medicine.

Subject(s)

Pre-Eclampsia , Humans , Pre-Eclampsia/mortality , Pregnancy , Female , Survival Analysis , Risk Factors , Deep Learning , Adult , Retrospective Studies , Proportional Hazards Models , Neural Networks, Computer , Risk Assessment/methods

8.

Proactively Designing Generative Artificial Intelligence for Primary Care-Reply.

Sarkar, Urmimala; Bates, David W.

JAMA Intern Med ; 184(8): 992, 2024 Aug 01.

Article in English | MEDLINE | ID: mdl-38913345

Subject(s)

Artificial Intelligence , Primary Health Care , Humans

9.

Straight to the point: evaluation of a Point of Care Information (POCI) resource in answering disease-related questions.

Wasserman, Rachel L; Seger, Diane L; Amato, Mary G; Co, Zoe; Mugal, Aqsa; Rui, Angela; Garabedian, Pamela M; Marceau, Marlika; Syrowatka, Ania; Volk, Lynn A; Bates, David W.

J Med Libr Assoc ; 112(1): 13-21, 2024 Jan 16.

Article in English | MEDLINE | ID: mdl-38911524

ABSTRACT

Objective: To evaluate the ability of DynaMedex, an evidence-based drug and disease Point of Care Information (POCI) resource, in answering clinical queries using keyword searches. Methods: Real-world disease-related questions compiled from clinicians at an academic medical center, DynaMedex search query data, and medical board review resources were categorized into five clinical categories (complications & prognosis, diagnosis & clinical presentation, epidemiology, prevention & screening/monitoring, and treatment) and six specialties (cardiology, endocrinology, hematology-oncology, infectious disease, internal medicine, and neurology). A total of 265 disease-related questions were evaluated by pharmacist reviewers based on if an answer was found (yes, no), whether the answer was relevant (yes, no), difficulty in finding the answer (easy, not easy), cited best evidence available (yes, no), clinical practice guidelines included (yes, no), and level of detail provided (detailed, limited details). Results: An answer was found for 259/265 questions (98%). Both reviewers found an answer for 241 questions (91%), neither found the answer for 6 questions (2%), and only one reviewer found an answer for 18 questions (7%). Both reviewers found a relevant answer 97% of the time when an answer was found. Of all relevant answers found, 68% were easy to find, 97% cited best quality of evidence available, 72% included clinical guidelines, and 95% were detailed. Recommendations for areas of resource improvement were identified. Conclusions: The resource enabled reviewers to answer most questions easily with the best quality of evidence available, providing detailed answers and clinical guidelines, with a high level of replication of results across users.

Subject(s)

Point-of-Care Systems , Humans , Evidence-Based Medicine

10.

A Machine Learning Application to Classify Patients at Differing Levels of Risk of Opioid Use Disorder: Clinician-Based Validation Study.

Eguale, Tewodros; Bastardot, François; Song, Wenyu; Motta-Calderon, Daniel; Elsobky, Yasmin; Rui, Angela; Marceau, Marlika; Davis, Clark; Ganesan, Sandya; Alsubai, Ava; Matthews, Michele; Volk, Lynn A; Bates, David W; Rozenblum, Ronen.

JMIR Med Inform ; 12: e53625, 2024 Jun 04.

Article in English | MEDLINE | ID: mdl-38842167

ABSTRACT

Background: Despite restrictive opioid management guidelines, opioid use disorder (OUD) remains a major public health concern. Machine learning (ML) offers a promising avenue for identifying and alerting clinicians about OUD, thus supporting better clinical decision-making regarding treatment. Objective: This study aimed to assess the clinical validity of an ML application designed to identify and alert clinicians of different levels of OUD risk by comparing it to a structured review of medical records by clinicians. Methods: The ML application generated OUD risk alerts on outpatient data for 649,504 patients from 2 medical centers between 2010 and 2013. A random sample of 60 patients was selected from 3 OUD risk level categories (n=180). An OUD risk classification scheme and standardized data extraction tool were developed to evaluate the validity of the alerts. Clinicians independently conducted a systematic and structured review of medical records and reached a consensus on a patient's OUD risk level, which was then compared to the ML application's risk assignments. Results: A total of 78,587 patients without cancer with at least 1 opioid prescription were identified as follows: not high risk (n=50,405, 64.1%), high risk (n=16,636, 21.2%), and suspected OUD or OUD (n=11,546, 14.7%). The sample of 180 patients was representative of the total population in terms of age, sex, and race. The interrater reliability between the ML application and clinicians had a weighted kappa coefficient of 0.62 (95% CI 0.53-0.71), indicating good agreement. Combining the high risk and suspected OUD or OUD categories and using the review of medical records as a gold standard, the ML application had a corrected sensitivity of 56.6% (95% CI 48.7%-64.5%) and a corrected specificity of 94.2% (95% CI 90.3%-98.1%). The positive and negative predictive values were 93.3% (95% CI 88.2%-96.3%) and 60.0% (95% CI 50.4%-68.9%), respectively. 
Key themes for disagreements between the ML application and clinician reviews were identified. Conclusions: A systematic comparison was conducted between an ML application and clinicians for identifying OUD risk. The ML application generated clinically valid and useful alerts about patients' different OUD risk levels. ML applications hold promise for identifying patients at differing levels of OUD risk and will likely complement traditional rule-based approaches to generating alerts about opioid safety issues.
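The abstract does not say how the "corrected" sensitivity and specificity were obtained. One plausible reading, offered here only as an assumption, is that the predictive values from the stratified sample of 180 were re-weighted by the share of the 78,587 patients that the ML application flagged (high risk plus suspected OUD or OUD). The sketch below applies Bayes' rule under that assumption; the function and variable names are illustrative.

```python
# Sketch under a stated assumption: "corrected" sensitivity/specificity are
# recovered from the reported PPV and NPV by re-weighting with the fraction
# of the full population flagged by the ML application. The abstract does
# not confirm this method.

def corrected_sens_spec(ppv: float, npv: float, frac_flagged: float):
    """Return (sensitivity, specificity) implied by PPV, NPV, and flag rate."""
    frac_not_flagged = 1.0 - frac_flagged
    true_pos = ppv * frac_flagged               # flagged and truly at risk
    false_pos = (1.0 - ppv) * frac_flagged      # flagged but not at risk
    true_neg = npv * frac_not_flagged           # not flagged, not at risk
    false_neg = (1.0 - npv) * frac_not_flagged  # not flagged but at risk
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

total_patients = 78587
flagged = 16636 + 11546  # high risk + suspected OUD or OUD

sens, spec = corrected_sens_spec(ppv=0.933, npv=0.600,
                                 frac_flagged=flagged / total_patients)
print(f"sensitivity ~ {sens:.3f}, specificity ~ {spec:.3f}")
```

Under this assumption the result lands close to the reported 56.6% and 94.2%, with small differences attributable to rounding of the published PPV and NPV.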

11.

Multisite Pragmatic Cluster-Randomized Controlled Trial of the CONCERN Early Warning System.

Rossetti, Sarah C; Dykes, Patricia C; Knaplund, Chris; Cho, Sandy; Withall, Jennifer; Lowenthal, Graham; Albers, David; Lee, Rachel; Jia, Haomiao; Bakken, Suzanne; Kang, Min-Jeoung; Chang, Frank Y; Zhou, Li; Bates, David W; Daramola, Temiloluwa; Liu, Fang; Schwartz-Dillard, Jessica; Tran, Mai; Abbas Bokhari, Syed Mohtashim; Thate, Jennifer; Cato, Kenrick D.

medRxiv ; 2024 Jun 04.

Article in English | MEDLINE | ID: mdl-38883706

ABSTRACT

Importance: Late predictions of hospitalized patient deterioration, resulting from early warning systems (EWS) with limited data sources and/or a care team's lack of shared situational awareness, contribute to delays in clinical interventions. The COmmunicating Narrative Concerns Entered by RNs (CONCERN) Early Warning System (EWS) uses real-time nursing surveillance documentation patterns in its machine learning algorithm to identify patients' deterioration risk up to 42 hours earlier than other EWSs. Objective: To test our a priori hypothesis that patients with care teams informed by the CONCERN EWS intervention have a lower mortality rate and shorter length of stay (LOS) than the patients with teams not informed by CONCERN EWS. Design: One-year multisite, pragmatic controlled clinical trial with cluster-randomization of acute and intensive care units to intervention or usual-care groups. Setting: Two large U.S. health systems. Participants: Adult patients admitted to acute and intensive care units, excluding those on hospice/palliative/comfort care, or with Do Not Resuscitate/Do Not Intubate orders. Intervention: The CONCERN EWS intervention calculates patient deterioration risk based on nurses' concern levels measured by surveillance documentation patterns, and it displays the categorical risk score (low, increased, high) in the electronic health record (EHR) for care team members. Main Outcomes and Measures: Primary outcomes: in-hospital mortality, LOS; survival analysis was used. Secondary outcomes: cardiopulmonary arrest, sepsis, unanticipated ICU transfers, 30-day hospital readmission. Results: A total of 60 893 hospital encounters (33 024 intervention and 27 869 usual-care) were included. Both groups had similar patient age, race, ethnicity, and illness severity distributions. 
Patients in the intervention group had a 35.6% decreased risk of death (adjusted hazard ratio [HR], 0.644; 95% confidence interval [CI], 0.532-0.778; P<.0001), 11.2% decreased LOS (adjusted incidence rate ratio, 0.914; 95% CI, 0.902-0.926; P<.0001), 7.5% decreased risk of sepsis (adjusted HR, 0.925; 95% CI, 0.861-0.993; P=.0317), and 24.9% increased risk of unanticipated ICU transfer (adjusted HR, 1.249; 95% CI, 1.093-1.426; P=.0011) compared with patients in the usual-care group. Conclusions and Relevance: A hospital-wide EWS based on nursing surveillance patterns decreased in-hospital mortality, sepsis, and LOS when integrated into the care team's EHR workflow. Trial Registration: ClinicalTrials.gov Identifier: NCT03911687.
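The percentage changes quoted alongside the hazard ratios follow from the usual conversion, percent change = (ratio - 1) x 100. The sketch below applies it to the three adjusted hazard ratios reported in the abstract; the function name is a hypothetical helper, and the conversion describes a relative change in hazard rather than an absolute risk difference.

```python
# Sketch: convert an adjusted hazard ratio into a relative percent change.
# Hazard ratios are taken from the abstract; the helper name is hypothetical.

def percent_change_from_ratio(ratio: float) -> float:
    """Relative change implied by a ratio, in percent (negative = decrease)."""
    return (ratio - 1.0) * 100.0

reported_hazard_ratios = {
    "in-hospital death": 0.644,
    "sepsis": 0.925,
    "unanticipated ICU transfer": 1.249,
}

for outcome, hr in reported_hazard_ratios.items():
    change = percent_change_from_ratio(hr)
    print(f"{outcome}: HR {hr:.3f} -> {change:+.1f}%")
```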

12.

Improving medication safety in both adults and children: what will it take?

Bates, David W; Sakuma, Mio.

BMJ Qual Saf ; 2024 Jun 20.

Article in English | MEDLINE | ID: mdl-38902019

13.

Virtual Scribes and Physician Time Spent on Electronic Health Records.

Rotenstein, Lisa; Melnick, Edward R; Iannaccone, Christine; Zhang, Jianyi; Mugal, Aqsa; Lipsitz, Stuart R; Healey, Michael J; Holland, Christopher; Snyder, Richard; Sinsky, Christine A; Ting, David; Bates, David W.

JAMA Netw Open ; 7(5): e2413140, 2024 May 01.

Article in English | MEDLINE | ID: mdl-38787556

ABSTRACT

Importance: Time on the electronic health record (EHR) is associated with burnout among physicians. Newer virtual scribe models, which enable support from either a real-time or asynchronous scribe, have the potential to reduce the burden of the EHR and EHR-related documentation. Objective: To characterize the association of use of virtual scribes with changes in physicians' EHR time and note and order composition and to identify the physician, scribe, and scribe response factors associated with changes in EHR time upon virtual scribe use. Design, Setting, and Participants: Retrospective, pre-post quality improvement study of 144 physicians across specialties who had used a scribe for at least 3 months from January 2020 to September 2022, were affiliated with Brigham and Women's Hospital and Massachusetts General Hospital, and cared for patients in the outpatient setting. Data were analyzed from November 2022 to January 2024. Exposure: Use of either a real-time or asynchronous virtual scribe. Main Outcomes: Total EHR time, time on notes, and pajama time (5:30 pm to 7:00 am on weekdays and nonscheduled weekends and holidays), all per appointment; proportion of the note written by the physician and team contribution to orders. Results: The main study sample included 144 unique physicians who had used a virtual scribe for at least 3 months in 152 unique scribe participation episodes (134 [88.2%] had used an asynchronous scribe service). Nearly two-thirds of the physicians (91 physicians [63.2%]) were female and more than half (86 physicians [59.7%]) were in primary care specialties. Use of a virtual scribe was associated with significant decreases in total EHR time per appointment (mean [SD] of 5.6 [16.4] minutes; P < .001) in the 3 months after vs the 3 months prior to scribe use. 
Scribe use was also associated with significant decreases in note time per appointment and pajama time per appointment (mean [SD] of 1.3 [3.3] minutes; P < .001 and 1.1 [4.0] minutes; P = .004). In a multivariable linear regression model, the following factors were associated with significant decreases in total EHR time per appointment with scribe use at 3 months: practicing in a medical specialty (-7.8; 95% CI, -13.4 to -2.2 minutes), greater baseline EHR time per appointment (-0.3; 95% CI, -0.4 to -0.2 minutes per additional minute of baseline EHR time), and decrease in the percentage of the note contributed by the physician (-9.1; 95% CI, -17.3 to -0.8 minutes for every percentage point decrease). Conclusions and Relevance: In 2 academic medical centers, use of virtual scribes was associated with significant decreases in total EHR time, time spent on notes, and pajama time, all per appointment. Virtual scribes may be particularly effective among medical specialists and those physicians with greater baseline EHR time.

Subject(s)

Documentation , Electronic Health Records , Physicians , Humans , Retrospective Studies , Female , Male , Physicians/psychology , Documentation/methods , Time Factors , Quality Improvement , Adult , Middle Aged

14.

Telehealth Experience Among Patients With Limited English Proficiency.

Rodriguez, Jorge A; Khoong, Elaine C; Lipsitz, Stuart R; Lyles, Courtney R; Bates, David W; Samal, Lipika.

JAMA Netw Open ; 7(5): e2410691, 2024 May 01.

Article in English | MEDLINE | ID: mdl-38722633

ABSTRACT

This cross-sectional study assesses the implication of patients' English language skills for telehealth use and visit experience.

Subject(s)

Limited English Proficiency , Telemedicine , Humans , Telemedicine/methods , Male , Female , Middle Aged , Adult , Aged , Cross-Sectional Studies , Communication Barriers

15.

Patterns of Social Needs Predict Quality-of-Life and Healthcare Utilization Outcomes in Patients from a Large Hospital System.

Zeng, Chengbo; Kaur, Manraj N; Malapati, Sri Harshini; Liu, Jason B; Bryant, Allison S; Meyers, Peter M; Bates, David W; McCleary, Nadine J; Pusic, Andrea L; Edelen, Maria O.

J Gen Intern Med ; 39(11): 2060-2068, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38710869

ABSTRACT

BACKGROUND: Unmet social needs (SNs) often coexist in distinct patterns within specific population subgroups, yet these patterns are understudied. OBJECTIVE: To identify patterns of social needs (PSNs) and characterize their associations with health-related quality-of-life (HRQoL) and healthcare utilization (HCU). DESIGN: Observational study using data on SNs screening, HRQoL (i.e., low mental and physical health), and 90-day HCU (i.e., emergency visits and hospital admission). Among patients with any SNs, latent class analysis was conducted to identify unique PSNs. For all patients and by race and age subgroups, compared with no SNs, we calculated the risks of poor HRQoL and time to first HCU following SNs screening for each PSN. PATIENTS: Adult patients undergoing SNs screening at the Mass General Brigham healthcare system in Massachusetts, United States, between March 2018 and January 2023. MAIN MEASURES: SNs included: education, employment, family care, food, housing, medication, transportation, and ability to pay for household utilities. HRQoL was assessed using the Patient-Reported Outcomes Measurement Information System Global-10. KEY RESULTS: Six unique PSNs were identified: "high number of social needs," "food and utility access," "employment needs," "interested in education," "housing instability," and "transportation barriers." In 14,230 patients with HRQoL data, PSNs increased the risks of poor mental health, with risk ratios ranging from 1.07 (95% CI: 1.01-1.13) to 1.80 (95% CI: 1.74-1.86). Analysis of poor physical health yielded similar findings, except that the "interested in education" pattern showed a mild protective effect (0.97 [95% CI: 0.94-1.00]). In 105,110 patients, PSNs increased the risk of 90-day HCU, with hazard ratios ranging from 1.09 (95% CI: 0.99-1.21) to 1.70 (95% CI: 1.52-1.90). Findings were generally consistent in subgroup analyses by race and age. CONCLUSIONS: Certain SNs coexist in distinct patterns and result in poorer HRQoL and more HCU. 
Understanding PSNs allows policymakers, public health practitioners, and social workers to identify at-risk patients and implement integrated, system-wide, and community-based interventions.

Subject(s)

Patient Acceptance of Health Care , Quality of Life , Humans , Male , Female , Middle Aged , Adult , Patient Acceptance of Health Care/statistics & numerical data , Aged , Massachusetts , Health Services Needs and Demand

16.

The Safety of Outpatient Health Care : Review of Electronic Health Records.

Levine, David M; Syrowatka, Ania; Salmasian, Hojjat; Shahian, David M; Lipsitz, Stuart; Zebrowski, Jonathan P; Myers, Laura C; Logan, Merranda S; Roy, Christopher G; Iannaccone, Christine; Frits, Michelle L; Volk, Lynn A; Dulgarian, Sevan; Amato, Mary G; Edrees, Heba H; Sato, Luke; Folcarelli, Patricia; Einbinder, Jonathan S; Reynolds, Mark E; Mort, Elizabeth; Bates, David W.

Ann Intern Med ; 177(6): 738-748, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38710086

ABSTRACT

BACKGROUND: Despite considerable emphasis on delivering safe care, substantial patient harm occurs. Although most care occurs in the outpatient setting, knowledge of outpatient adverse events (AEs) remains limited. OBJECTIVE: To measure AEs in the outpatient setting. DESIGN: Retrospective review of the electronic health record (EHR). SETTING: 11 outpatient sites in Massachusetts in 2018. PATIENTS: 3103 patients who received outpatient care. MEASUREMENTS: Using a trigger method, nurse reviewers identified possible AEs and physicians adjudicated them, ranked severity, and assessed preventability. Generalized estimating equations were used to assess the association of having at least 1 AE with age, sex, race, and primary insurance. Variation in AE rates was analyzed across sites. RESULTS: The 3103 patients (mean age, 52 years) were more often female (59.8%), White (75.1%), English speakers (90.8%), and privately insured (70.4%) and had a mean of 4 outpatient encounters in 2018. Overall, 7.0% (95% CI, 4.6% to 9.3%) of patients had at least 1 AE (8.6 events per 100 patients annually). Adverse drug events were the most common AE (63.8%), followed by health care-associated infections (14.8%) and surgical or procedural events (14.2%). Severity was serious in 17.4% of AEs, life-threatening in 2.1%, and never fatal. Overall, 23.2% of AEs were preventable. Having at least 1 AE was less often associated with ages 18 to 44 years than with ages 65 to 84 years (standardized risk difference, -0.05 [CI, -0.09 to -0.02]) and more often associated with Black race than with Asian race (standardized risk difference, 0.09 [CI, 0.01 to 0.17]). Across study sites, 1.8% to 23.6% of patients had at least 1 AE and clinical category of AEs varied substantially. LIMITATION: Retrospective EHR review may miss AEs. CONCLUSION: Outpatient harm was relatively common and often serious. Adverse drug events were most frequent. Rates were higher among older adults. 
Interventions to curtail outpatient harm are urgently needed. PRIMARY FUNDING SOURCE: Controlled Risk Insurance Company and the Risk Management Foundation of the Harvard Medical Institutions.

Subject(s)

Ambulatory Care , Electronic Health Records , Patient Safety , Humans , Female , Middle Aged , Male , Retrospective Studies , Adult , Aged , Massachusetts , Adolescent , Young Adult

17.

Enhancing Early Detection of Cognitive Decline in the Elderly: A Comparative Study Utilizing Large Language Models in Clinical Notes.

Du, Xinsong; Novoa-Laurentiev, John; Plasaek, Joseph M; Chuang, Ya-Wen; Wang, Liqin; Marshall, Gad; Mueller, Stephanie K; Chang, Frank; Datta, Surabhi; Paek, Hunki; Lin, Bin; Wei, Qiang; Wang, Xiaoyan; Wang, Jingqi; Ding, Hao; Manion, Frank J; Du, Jingcheng; Bates, David W; Zhou, Li.

medRxiv ; 2024 May 06.

Article in English | MEDLINE | ID: mdl-38633810

ABSTRACT

Background: Large language models (LLMs) have shown promising performance in various healthcare domains, but their effectiveness in identifying specific clinical conditions in real medical records is less explored. This study evaluates LLMs for detecting signs of cognitive decline in real electronic health record (EHR) clinical notes, comparing their error profiles with traditional models. The insights gained will inform strategies for performance enhancement. Methods: This study, conducted at Mass General Brigham in Boston, MA, analyzed clinical notes from the four years prior to a 2019 diagnosis of mild cognitive impairment in patients aged 50 and older. We used a randomly annotated sample of 4,949 note sections, filtered with keywords related to cognitive functions, for model development. For testing, a random annotated sample of 1,996 note sections without keyword filtering was utilized. We developed prompts for two LLMs, Llama 2 and GPT-4, on HIPAA-compliant cloud-computing platforms using multiple approaches (e.g., both hard and soft prompting and error analysis-based instructions) to select the optimal LLM-based method. Baseline models included a hierarchical attention-based neural network and XGBoost. Subsequently, we constructed an ensemble of the three models using a majority vote approach. Results: GPT-4 demonstrated superior accuracy and efficiency compared to Llama 2, but did not outperform traditional models. The ensemble model outperformed the individual models, achieving a precision of 90.3%, a recall of 94.2%, and an F1-score of 92.2%. Notably, the ensemble model showed a significant improvement in precision, increasing from a range of 70%-79% to above 90%, compared to the best-performing single model. Error analysis revealed that 63 samples were incorrectly predicted by at least one model; however, only 2 cases (3.2%) were mutual errors across all models, indicating diverse error profiles among them. 
Conclusions: LLMs and traditional machine learning models trained using local EHR data exhibited diverse error profiles. The ensemble of these models was found to be complementary, enhancing diagnostic performance. Future research should investigate integrating LLMs with smaller, localized models and incorporating medical data and domain knowledge to enhance performance on specific tasks.
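
Illustrative note (not part of the abstract): the majority-vote ensemble and the reported F1-score can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the labels and the three model outputs (model_a, model_b, model_c) are hypothetical toy data, not the study's data or code. It shows how three classifiers that each make different mistakes can be combined so that a sample is misclassified only when at least two of them err together, and it checks that the reported precision (90.3%) and recall (94.2%) are consistent with the reported F1-score (92.2%).

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most voters for a single sample.

    With three voters and binary labels a tie cannot occur.
    """
    return Counter(votes).most_common(1)[0][0]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical toy data: 1 = note section shows cognitive decline, 0 = it does not.
y_true = [1, 0, 1, 1, 0]
model_a = [1, 0, 0, 1, 1]  # wrong on samples 2 and 4
model_b = [1, 1, 1, 1, 0]  # wrong on sample 1
model_c = [0, 0, 1, 1, 0]  # wrong on sample 0

# Each sample's three predictions are combined by majority vote.
ensemble = [majority_vote(votes) for votes in zip(model_a, model_b, model_c)]

print(ensemble)
print(precision_recall_f1(y_true, ensemble))

# Consistency check on the figures reported in the abstract.
print(round(f1_from(0.903, 0.942), 3))
```

In this toy case no sample is misclassified by more than one model, so the vote corrects every individual error; this is the sense in which diverse error profiles make an ensemble complementary.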

18.

Barriers and enablers for externally and internally driven implementation processes in healthcare: a qualitative cross-case study.

Lyng, Hilda Bø; Ree, Eline; Strømme, Torunn; Johannessen, Terese; Aase, Ingunn; Ullebust, Berit; Thomsen, Line Hurup; Holen-Rabbersvik, Elisabeth; Schibevaag, Lene; Bates, David W; Wiig, Siri.

BMC Health Serv Res ; 24(1): 528, 2024 Apr 25.

Article in English | MEDLINE | ID: mdl-38664668

ABSTRACT

BACKGROUND: Quality in healthcare is a subject in need of continuous attention. Quality improvement (QI) programmes with the purpose of increasing service quality are therefore of priority for healthcare leaders and governments. This study explores the implementation process of two different QI programmes, one externally driven implementation and one internally driven, in Norwegian nursing homes and home care services. The aim of the study was to identify enablers and barriers for externally and internally driven implementation processes in nursing homes and homecare services, and furthermore to explore whether identified enablers and barriers are different or similar across the different implementation processes. METHODS: This study is based on an exploratory qualitative methodology. The empirical data was collected through the 'Improving Quality and Safety in Primary Care - Implementing a Leadership Intervention in Nursing Homes and Homecare' (SAFE-LEAD) project. The SAFE-LEAD project is a multiple case study of two different QI programmes in primary care in Norway. A large externally driven implementation process was supplemented with a tracer project involving an internally driven implementation process to identify differences and similarities. The empirical data was inductively analysed in accordance with grounded theory. RESULTS: Enablers for both external and internal implementation processes were found to be technology and tools, dedication, and ownership. Other more implementation process specific enablers entailed continuous learning, simulation training, knowledge sharing, perceived relevance, dedication, ownership, technology and tools, a systematic approach and coordination. Only workload was identified as a coincident barrier across both externally and internally driven implementation processes.
Implementation process specific barriers included turnover, coping with given responsibilities, staff variety, challenges in coordination, technology and tools, standardizations not aligned with work, extensive documentation, and lack of knowledge sharing. CONCLUSION: This study provides understanding that some enablers and barriers are present in both externally and internally driven implementation processes, while others are more implementation process specific. Dedication, engagement, technology and tools are coinciding enablers which can be drawn upon in different implementation processes, while workload acted as the main barrier in both externally and internally driven implementation processes. This means that some enablers and barriers can be expected in implementation of QI programmes in nursing homes and home care services, while others require contextual understanding of their setting and work.

Subject(s)

Home Care Services , Nursing Homes , Qualitative Research , Quality Improvement , Norway , Humans , Quality Improvement/organization & administration , Nursing Homes/organization & administration , Nursing Homes/standards , Home Care Services/organization & administration , Leadership , Primary Health Care/organization & administration

19.

Looking Beyond Mortality Prediction: Primary Care Physician Views of Patients' Palliative Care Needs Predicted by a Machine Learning Tool.

Rotenstein, Lisa; Wang, Liqin; Zupanc, Sophia N; Penumarthy, Akhila; Laurentiev, John; Lamey, Jan; Farah, Subrina; Lipsitz, Stuart; Jain, Nina; Bates, David W; Zhou, Li; Lakin, Joshua R.

Appl Clin Inform ; 15(3): 460-468, 2024 May.

Article in English | MEDLINE | ID: mdl-38636542

ABSTRACT

OBJECTIVES: To assess primary care physicians' (PCPs) perception of the need for serious illness conversations (SIC) or other palliative care interventions in patients flagged by a machine learning tool for high 1-year mortality risk. METHODS: We surveyed PCPs from four Brigham and Women's Hospital primary care practice sites. Multiple mortality prediction algorithms were ensembled to assess adult patients of these PCPs who were either enrolled in the hospital's integrated care management program or had one of several chronic conditions. The patients were classified as high or low risk of 1-year mortality. A blinded survey had PCPs evaluate these patients for palliative care needs. We measured PCP and machine learning tool agreement regarding patients' need for an SIC/elevated risk of mortality. RESULTS: Of 66 PCPs, 20 (30.3%) participated in the survey. Out of 312 patients evaluated, 60.6% were female, with a mean (standard deviation [SD]) age of 69.3 (17.5) years, and a mean (SD) Charlson Comorbidity Index of 2.80 (2.89). The machine learning tool identified 162 (51.9%) patients as high risk. Excluding deceased or unfamiliar patients, PCPs felt that an SIC was appropriate for 179 patients; the machine learning tool flagged 123 of these patients as high risk (68.7% concordance). For 105 patients whom PCPs deemed SIC unnecessary, the tool classified 83 as low risk (79.1% concordance). There was substantial agreement between PCPs and the tool (Gwet's agreement coefficient of 0.640). CONCLUSIONS: A machine learning mortality prediction tool offers promise as a clinical decision aid, helping clinicians pinpoint patients needing palliative care interventions.

Subject(s)

Machine Learning , Palliative Care , Physicians, Primary Care , Humans , Female , Male , Aged , Middle Aged , Surveys and Questionnaires , Mortality

20.

A qualitative study of leaders' experiences of handling challenges and changes induced by the COVID-19 pandemic in rural nursing homes and homecare services.

Glette, Malin Knutsen; Kringeland, Tone; Samal, Lipika; Bates, David W; Wiig, Siri.

BMC Health Serv Res ; 24(1): 442, 2024 Apr 09.

Article in English | MEDLINE | ID: mdl-38594669

ABSTRACT

BACKGROUND: The COVID-19 pandemic had a major impact on healthcare services globally. In care settings such as small rural nursing homes and homecare services, leaders were forced to confront, and adapt to, both new and ongoing challenges to protect their employees and patients and maintain their organization's operation. The aim of this study was to assess how healthcare leaders, working in rural primary healthcare services, led nursing homes and homecare services during the COVID-19 pandemic. Moreover, the study sought to explore how adaptations to changes and challenges induced by the pandemic were handled by leaders in rural nursing homes and homecare services. METHODS: The study employed a qualitative explorative design with individual interviews. Nine leaders at different levels, working in small, rural nursing homes and homecare services in western Norway, were included. RESULTS: Three main themes emerged from the thematic analysis: "Navigating the role of a leader during the pandemic," "The aftermath - management of COVID-19 in rural primary healthcare services", and "The benefits and drawbacks of being small and rural during the pandemic." CONCLUSIONS: Leaders in rural nursing homes and homecare services handled a multitude of immediate challenges and used a variety of adaptive strategies during the COVID-19 pandemic. While handling their own uncertainty and rapidly changing roles, they also coped with organizational challenges and adopted strategies to maintain good working conditions for their employees, as well as maintain sound healthcare management. The study results establish the intricate nature of resilient leadership, encompassing individual resilience, personality, governance, resource availability, and the capability to adjust to organizational and employee requirements, and how the rural context may affect these aspects.

Subject(s)

COVID-19 , Pandemics , Humans , COVID-19/epidemiology , Nursing Homes , Qualitative Research , Delivery of Health Care
