1.
J Gen Intern Med ; 39(1): 27-35, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37528252

ABSTRACT

BACKGROUND: Early detection of clinical deterioration among hospitalized patients is a clinical priority for patient safety and quality of care. Current automated approaches for identifying these patients perform poorly at identifying imminent events. OBJECTIVE: Develop a machine learning algorithm using pager messages sent between clinical team members to predict imminent clinical deterioration. DESIGN: We conducted a large observational study using long short-term memory machine learning models on the content and frequency of clinical pages. PARTICIPANTS: We included all hospitalizations between January 1, 2018 and December 31, 2020 at Vanderbilt University Medical Center that included at least one page message to physicians. Exclusion criteria included patients receiving palliative care, hospitalizations with a planned intensive care stay, and hospitalizations in the top 2% longest length of stay. MAIN MEASURES: Model classification performance to identify in-hospital cardiac arrest, transfer to intensive care, or Rapid Response activation in the next 3-, 6-, and 12-hours. We compared model performance against three common early warning scores: Modified Early Warning Score, National Early Warning Score, and the Epic Deterioration Index. KEY RESULTS: There were 87,783 patients (mean [SD] age 54.0 [18.8] years; 45,835 [52.2%] women) who experienced 136,778 hospitalizations. 6214 hospitalized patients experienced a deterioration event. The machine learning model accurately identified 62% of deterioration events within 3-hours prior to the event and 47% of events within 12-hours. Across each time horizon, the model surpassed performance of the best early warning score including area under the receiver operating characteristic curve at 6-hours (0.856 vs. 0.781), sensitivity at 6-hours (0.590 vs. 0.505), specificity at 6-hours (0.900 vs. 0.878), and F-score at 6-hours (0.291 vs. 0.220). 
CONCLUSIONS: Machine learning applied to the content and frequency of clinical pages improves prediction of imminent deterioration. Using clinical pages to monitor patient acuity supports improved detection of imminent deterioration without requiring changes to clinical workflow or nursing documentation.


Subjects
Clinical Deterioration , Humans , Female , Middle Aged , Male , Hospitalization , Critical Care , ROC Curve , Algorithms , Machine Learning , Retrospective Studies
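The sensitivity, specificity, and F-score figures in the abstract above are all derived from a confusion matrix. The following is a minimal Python sketch of those standard definitions; the function name `classification_metrics` and the counts are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity, precision, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true events that were flagged
    specificity = tn / (tn + fp)   # share of non-events correctly left unflagged
    precision = tp / (tp + fp)     # share of flags that were true events
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts (not study data): 100 events among 2,100 observation windows.
metrics = classification_metrics(tp=59, fp=200, tn=1800, fn=41)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When events are rare, precision (and therefore the F-score) can remain low even if sensitivity and specificity both look strong, which is one reason all three are usually reported together.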
2.
J Gen Intern Med ; 38(11): 2546-2552, 2023 08.
Article in English | MEDLINE | ID: mdl-37254011

ABSTRACT

BACKGROUND: Clinical trials indicate continuous glucose monitor (CGM) use may benefit adults with type 2 diabetes, but CGM rates and correlates in real-world care settings are unknown. OBJECTIVE: We sought to ascertain prevalence and correlates of CGM use and to examine rates of new CGM prescriptions across clinic types and medication regimens. DESIGN: Retrospective cohort using electronic health records in a large academic medical center in the Southeastern US. PARTICIPANTS: Adults with type 2 diabetes and a primary care or endocrinology visit during 2021. MAIN MEASURES: Age, gender, race, ethnicity, insurance, clinic type, insulin regimen, hemoglobin A1c values, CGM prescriptions, and prescribing clinic type. KEY RESULTS: Among 30,585 adults with type 2 diabetes, 13% had used a CGM. CGM users were younger and more had private health insurance (p < .05) as compared to non-users; 72% of CGM users had an intensive insulin regimen, but 12% were not taking insulin. CGM users had higher hemoglobin A1c values (both most recent and most proximal to the first CGM prescription) than non-users. CGM users were more likely to receive endocrinology care than non-users, but 23% had only primary care visits in 2021. For each month in 2021, a mean of 90.5 (SD 12.5) people started using CGM. From 2020 to 2021, monthly rates of CGM prescriptions to new users grew 36% overall, but 125% in primary care. Most starting CGM in endocrinology had an intensive insulin regimen (82% vs. 49% starting in primary care), whereas 28% starting CGM in primary care were not using insulin (vs. 5% in endocrinology). CONCLUSION: CGM uptake for type 2 diabetes is increasing rapidly, with most growth in primary care. These trends present opportunities for healthcare system adaptations to support CGM use and related workflows in primary care to support growth in uptake.


Subjects
Type 1 Diabetes Mellitus , Type 2 Diabetes Mellitus , Hypoglycemia , Adult , Humans , Type 2 Diabetes Mellitus/drug therapy , Type 2 Diabetes Mellitus/epidemiology , Glycated Hemoglobin , Type 1 Diabetes Mellitus/drug therapy , Hypoglycemia/epidemiology , Retrospective Studies , Blood Glucose Self-Monitoring , Blood Glucose , Insulin/therapeutic use , Primary Health Care , Hypoglycemic Agents/therapeutic use
3.
Am J Emerg Med ; 67: 70-78, 2023 05.
Article in English | MEDLINE | ID: mdl-36806978

ABSTRACT

BACKGROUND: Chest pain (CP) is the hallmark symptom for acute coronary syndrome (ACS) but is not reported in 20-30% of patients, especially women, elderly, non-white patients, presenting to the emergency department (ED) with an ST-segment elevation myocardial infarction (STEMI). METHODS: We used a retrospective 5-year adult ED sample of 279,132 patients to explore using CP alone to predict ACS, then we incrementally added other ACS chief complaints, age, and sex in a series of multivariable logistic regression models. We evaluated each model's identification of ACS and STEMI. RESULTS: Using CP alone would recommend ECGs for 8% of patients (sensitivity, 61%; specificity, 92%) but missed 28.4% of STEMIs. The model with all variables identified ECGs for 22% of patients (sensitivity, 82%; specificity, 78%) but missed 14.7% of STEMIs. The model with CP and other ACS chief complaints had the highest sensitivity (93%) and specificity (55%), identified 45.1% of patients for ECG, and only missed 4.4% of STEMIs. CONCLUSION: CP alone had highest specificity but lacked sensitivity. Adding other ACS chief complaints increased sensitivity but identified 2.2-fold more patients for ECGs. Achieving an ECG in 10 min for patients with ACS to identify all STEMIs will be challenging without introducing more complex risk calculation into clinical care.


Subjects
Acute Coronary Syndrome , ST Elevation Myocardial Infarction , Adult , Humans , Female , Aged , ST Elevation Myocardial Infarction/diagnosis , Retrospective Studies , Electrocardiography , Chest Pain/diagnosis , Chest Pain/etiology , Acute Coronary Syndrome/complications , Acute Coronary Syndrome/diagnosis , Hospital Emergency Service
4.
Anesth Analg ; 135(1): 26-34, 2022 07 01.
Article in English | MEDLINE | ID: mdl-35343932

ABSTRACT

BACKGROUND: Patients taking high doses of opioids, or taking opioids in combination with other central nervous system depressants, are at increased risk of opioid overdose. Coprescribing the opioid-reversal agent naloxone is an essential safety measure, recommended by the surgeon general, but the rate of naloxone coprescribing is low. Therefore, we set out to determine whether a targeted clinical decision support alert could increase the rate of naloxone coprescribing. METHODS: We conducted a before-after study from January 2019 to April 2021 at a large academic health system in the Southeast. We developed a targeted point of care decision support notification in the electronic health record to suggest ordering naloxone for patients who have a high risk of opioid overdose based on a high morphine equivalent daily dose (MEDD) ≥90 mg, concomitant benzodiazepine prescription, or a history of opioid use disorder or opioid overdose. We measured the rate of outpatient naloxone prescribing as our primary measure. A multivariable logistic regression model with robust variance to adjust for prescriptions within the same prescriber was implemented to estimate the association between alerts and naloxone coprescribing. RESULTS: The baseline naloxone coprescribing rate in 2019 was 0.28 (95% confidence interval [CI], 0.24-0.31) naloxone prescriptions per 100 opioid prescriptions. After alert implementation, the naloxone coprescribing rate increased to 4.51 (95% CI, 4.33-4.68) naloxone prescriptions per 100 opioid prescriptions (P < .001). The adjusted odds of naloxone coprescribing after alert implementation were approximately 28 times those during the baseline period (95% CI, 15-52). CONCLUSIONS: A targeted decision support alert for patients at risk for opioid overdose significantly increased the rate of naloxone coprescribing and was relatively easy to build.


Subjects
Drug Overdose , Opiate Overdose , Opioid-Related Disorders , Opioid Analgesics/adverse effects , Drug Overdose/diagnosis , Humans , Naloxone/adverse effects , Narcotic Antagonists/adverse effects , Opioid-Related Disorders/complications , Opioid-Related Disorders/diagnosis , Opioid-Related Disorders/epidemiology , Quality Improvement
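As a rough arithmetic check on the coprescribing rates reported in the abstract above, the unadjusted ratio of the two rates can be computed directly. This is a sketch only: the variable names are illustrative, treating a rate per 100 prescriptions as a proportion is an assumption, and the crude figures are not the study's adjusted odds ratio, which came from a multivariable model.

```python
# Rates from the abstract: naloxone prescriptions per 100 opioid prescriptions.
baseline_rate = 0.28     # 2019 baseline period
post_alert_rate = 4.51   # after alert implementation

def to_odds(rate_per_100: float) -> float:
    """Convert a rate per 100 into odds, assuming it can be read as a proportion."""
    proportion = rate_per_100 / 100
    return proportion / (1 - proportion)

crude_rate_ratio = post_alert_rate / baseline_rate
crude_odds_ratio = to_odds(post_alert_rate) / to_odds(baseline_rate)

print(f"crude rate ratio: {crude_rate_ratio:.1f}")
print(f"crude odds ratio: {crude_odds_ratio:.1f}")
```

Both crude figures come out at roughly 16 to 17. The adjusted estimate of about 28 need not match them, because the regression model accounts for additional covariates.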
5.
Ann Intern Med ; 172(11 Suppl): S110-S115, 2020 06 02.
Article in English | MEDLINE | ID: mdl-32479179

ABSTRACT

In the past 2 decades, the United States has seen widespread adoption of electronic health records (EHRs) and a transition from mostly locally developed EHRs to commercial systems. However, most research on quality improvement and safety interventions in EHRs is still conducted at a single site, in a single EHR. Although single-site studies are important early in the innovation lifecycle, multisite studies of EHR interventions are critical for generalizability. Because EHR software, configuration, and local context differ considerably across health care organizations, it can be difficult to implement a single, standardized intervention across multiple sites in a study. This article outlines key strengths, weaknesses, challenges, and opportunities for standardization of EHR interventions in multisite studies and describes flexible trial designs suitable for studying complex interventions, including EHR interventions. It also outlines key considerations for reporting on flexible trials of EHR interventions, including sharing details of the process for designing interventions and their content, details of outcomes being studied and approaches for pooling, and the importance of sharing code and configuration whenever possible.


Subjects
Biomedical Research/standards , Electronic Health Records/organization & administration , Guidelines as Topic , Quality Improvement , Humans
6.
J Med Internet Res ; 23(3): e22806, 2021 03 04.
Article in English | MEDLINE | ID: mdl-33661128

ABSTRACT

BACKGROUND: Documentation burden is a common problem with modern electronic health record (EHR) systems. To reduce this burden, various recording methods (eg, voice recorders or motion sensors) have been proposed. However, these solutions are in an early prototype phase and are unlikely to transition into practice in the near future. A more pragmatic alternative is to directly modify the implementation of the existing functionalities of an EHR system. OBJECTIVE: This study aims to assess the nature of free-text comments entered into EHR flowsheets that supplement quantitative vital sign values and examine opportunities to simplify functionality and reduce documentation burden. METHODS: We evaluated 209,055 vital sign comments in flowsheets that were generated in the Epic EHR system at the Vanderbilt University Medical Center in 2018. We applied topic modeling, as well as the natural language processing Clinical Language Annotation, Modeling, and Processing software system, to extract generally discussed topics and detailed medical terms (expressed as probability distribution) to investigate the stories communicated in these comments. RESULTS: Our analysis showed that 63.33% (6053/9557) of the users who entered vital signs made at least one free-text comment in vital sign flowsheet entries. The user roles that were most likely to compose comments were registered nurse, technician, and licensed nurse. The most frequently identified topics were the notification of a result to health care providers (0.347), the context of a measurement (0.307), and an inability to obtain a vital sign (0.224). There were 4187 unique medical terms that were extracted from 46,029 (0.220) comments, including many symptom-related terms such as "pain," "upset," "dizziness," "coughing," "anxiety," "distress," and "fever" and drug-related terms such as "tylenol," "anesthesia," "cannula," "oxygen," "motrin," "rituxan," and "labetalol." 
CONCLUSIONS: Considering that flowsheet comments are generally not displayed or automatically pulled into any clinical notes, our findings suggest that the flowsheet comment functionality can be simplified (eg, via structured response fields instead of a text input dialog) to reduce health care provider effort. Moreover, rich and clinically important medical terms such as medications and symptoms should be explicitly recorded in clinical notes for better visibility.


Subjects
Documentation , Electronic Health Records , Academic Medical Centers , Humans , Natural Language Processing , Vital Signs
7.
J Pediatr ; 222: 186-192.e1, 2020 07.
Article in English | MEDLINE | ID: mdl-32417078

ABSTRACT

OBJECTIVE: Within the National Pediatric Cardiology Quality Improvement Collaborative (NPC-QIC), a learning health network developed to improve outcomes for patients with hypoplastic left heart syndrome and variants, we assessed which centers contributed to reductions in mortality and growth failure. STUDY DESIGN: Centers within the NPC-QIC were divided into tertiles based on early performance for mortality and separately for growth failure. These groups were evaluated for improvement from the early to late time period and compared with the other groups in the late time period. RESULTS: Mortality was 3.8% for the high-performing, 7.6% for the medium-performing, and 14.4% for the low-performing groups in the early time period. Only the low-performing group had a significant change (P < .001) from the early to late period. In the late period, there was no difference in mortality between the high- (5.7%), medium- (7%), and low- (4.6%) performing centers (P = .5). Growth failure occurred in 13.9% for the high-performing, 21.9% for the medium-performing, and 32.8% for the low-performing groups in the early time period. Only the low-performing group had a significant change (P < .001) over time. In the late period, there was no significant difference in growth failure between the high- (19.8%), medium- (21.5%), and low- (13.5%) performing groups (P = .054). CONCLUSIONS: Improvements in the NPC-QIC mortality and growth measures are primarily driven by improvement in those performing the worst in these areas initially without compromising the success of high-performing centers. Focus for improvement may vary by center based on performance.


Subjects
Health Education , Hypoplastic Left Heart Syndrome/surgery , Norwood Procedures/methods , Palliative Care/standards , Quality Improvement , Registries , Female , Humans , Hypoplastic Left Heart Syndrome/mortality , Infant , Male , Retrospective Studies
8.
J Biomed Inform ; 53: 73-80, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25236952

ABSTRACT

BACKGROUND: Therapy for certain medical conditions occurs in a stepwise fashion, where one medication is recommended as initial therapy and other medications follow. Sequential pattern mining is a data mining technique used to identify patterns of ordered events. OBJECTIVE: To determine whether sequential pattern mining is effective for identifying temporal relationships between medications and accurately predicting the next medication likely to be prescribed for a patient. DESIGN: We obtained claims data from Blue Cross Blue Shield of Texas for patients prescribed at least one diabetes medication between 2008 and 2011, and divided these into a training set (90% of patients) and test set (10% of patients). We applied the CSPADE algorithm to mine sequential patterns of diabetes medication prescriptions both at the drug class and generic drug level and ranked them by the support statistic. We then evaluated the accuracy of predictions made for which diabetes medication a patient was likely to be prescribed next. RESULTS: We identified 161,497 patients who had been prescribed at least one diabetes medication. We were able to mine stepwise patterns of pharmacological therapy that were consistent with guidelines. Within three attempts, we were able to predict the medication prescribed for 90.0% of patients when making predictions by drug class, and for 64.1% when making predictions at the generic drug level. These results were stable under 10-fold cross validation, ranging from 89.1%-90.5% at the drug class level and 63.5-64.9% at the generic drug level. Using 1 or 2 items in the patient's medication history led to more accurate predictions than not using any history, but using the entire history was sometimes worse. CONCLUSION: Sequential pattern mining is an effective technique to identify temporal relationships between medications and can be used to predict next steps in a patient's medication regimen. 
Accurate predictions can be made without using the patient's entire medication history.


Subjects
Drug Prescriptions/statistics & numerical data , Drug Therapy/methods , Health Insurance/statistics & numerical data , Automated Pattern Recognition , Algorithms , Data Mining , Clinical Decision Support Systems , Diabetes Mellitus/drug therapy , Disease Progression , Humans , Programming Languages , Reproducibility of Results , Sulfonylurea Compounds/therapeutic use , Texas
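The idea of ranking ordered medication patterns by support and using them to predict the next prescription can be illustrated with a deliberately simplified sketch. It is not the CSPADE algorithm used in the study: it only counts adjacent medication pairs, and the patient histories and the function names `mine_pair_support` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict

def mine_pair_support(histories):
    """For each medication, count how many patients received each other medication right after it."""
    support = defaultdict(Counter)
    for history in histories:
        seen = set()
        for prev, nxt in zip(history, history[1:]):
            if (prev, nxt) not in seen:   # count each ordered pair once per patient
                seen.add((prev, nxt))
                support[prev][nxt] += 1
    return support

def predict_next(support, last_medication, k=3):
    """Return up to k candidate next medications, ranked by support."""
    return [med for med, _ in support[last_medication].most_common(k)]

# Hypothetical medication histories (not study data), one ordered list per patient.
training = [
    ["metformin", "sulfonylurea", "insulin"],
    ["metformin", "sulfonylurea"],
    ["metformin", "dpp4_inhibitor"],
    ["metformin", "sulfonylurea", "dpp4_inhibitor"],
]
support = mine_pair_support(training)
predictions = predict_next(support, "metformin", k=2)
print(predictions)
```

Here only the most recent medication is used as context, which loosely mirrors the abstract's observation that one or two items of history can be enough; a prediction would count as correct "within k attempts" if the medication actually prescribed appears in the returned list.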
9.
J Biomed Inform ; 48: 66-72, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24321170

ABSTRACT

BACKGROUND: Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are not robust and depend on problems and medications being encoded in particular terminologies. Crowdsourcing represents one approach to generating robust knowledge bases across a variety of terminologies, but more sophisticated approaches are necessary to improve accuracy and reduce manual data review requirements. OBJECTIVE: We sought to develop and evaluate a clinician reputation metric to facilitate the identification of appropriate problem-medication pairs through crowdsourcing without requiring extensive manual review. APPROACH: We retrieved medications from our clinical data warehouse that had been prescribed and manually linked to one or more problems by clinicians during e-prescribing between June 1, 2010 and May 31, 2011. We identified measures likely to be associated with the percentage of accurate problem-medication links made by clinicians. Using logistic regression, we created a metric for identifying clinicians who had made greater than or equal to 95% appropriate links. We evaluated the accuracy of the approach by comparing links made by those physicians identified as having appropriate links to a previously manually validated subset of problem-medication pairs. RESULTS: Of 867 clinicians who asserted a total of 237,748 problem-medication links during the study period, 125 had a reputation metric that predicted the percentage of appropriate links greater than or equal to 95%. These clinicians asserted a total of 2464 linked problem-medication pairs (983 distinct pairs). 
Compared to a previously validated set of problem-medication pairs, the reputation metric achieved a specificity of 99.5% and marginally improved the sensitivity of previously described knowledge bases. CONCLUSION: A reputation metric may be a valuable measure for identifying high quality clinician-entered, crowdsourced data.


Subjects
Electronic Health Records , Knowledge Bases , Medical Informatics/methods , Computerized Medical Records Systems , Crowdsourcing , Humans , Internet , Logistic Models , Pharmaceutical Preparations , Physicians , Reproducibility of Results , Software , User-Computer Interface
10.
J Gastrointest Surg ; 2024 May 28.
Article in English | MEDLINE | ID: mdl-38815800

ABSTRACT

BACKGROUND: Despite growing interest in patient-reported outcome measures to track the progression of Crohn's disease, frameworks to apply these questionnaires in the preoperative setting are lacking. Using the Short Inflammatory Bowel Disease Questionnaire (sIBDQ), this study aimed to describe the interpretable quality of life thresholds and examine potential associations with future bowel resection in Crohn's disease. METHODS: Adult patients with Crohn's disease completing an sIBDQ at a clinic visit between 2020 and 2022 were eligible. A stoplight framework was adopted for sIBDQ scores, including a "Resection Red" zone suggesting poor quality of life that may benefit from discussions about surgery as well as a "Nonoperative Green" zone. Thresholds were identified with both anchor- and distribution-based methods using receiver operating characteristic curve analysis and subgroup percentile scores, respectively. To quantify associations between sIBDQ scores and subsequent bowel resection, multivariable logistic regression models were fit with covariates of age, sex assigned at birth, body mass index, medications, disease pattern and location, resection history, and the Harvey Bradshaw Index. The incremental discriminatory value of the sIBDQ beyond clinical factors was assessed through the area under the receiver operating characteristics curve (AUC) with an internal validation through bootstrap resampling. RESULTS: Of the 2003 included patients, 102 underwent Crohn's-related bowel resection. The sIBDQ Nonoperative Green zone threshold ranged from 61 to 64 and the Resection Red zone from 36 to 38. When adjusting for clinical covariates, a worse sIBDQ score was associated with greater odds of subsequent 90-day bowel resection when considered as a 1-point (odds ratio [OR] [95% CI], 1.05 [1.03-1.07]) or 5-point change (OR [95% CI], 1.27 [1.14-1.41]). 
Inclusion of the sIBDQ modestly improved discriminative performance (AUC [95% CI], 0.85 [0.85-0.86]) relative to models that included only demographics (0.57 [0.57-0.58]) or demographics with clinical covariates (0.83 [0.83-0.84]). CONCLUSION: In the decision-making process for bowel resection, disease-specific patient-reported outcome measures may be useful to identify patients with Crohn's disease with poor quality of life and promote a shared understanding of personalized burden.

11.
Obstet Gynecol ; 144(1): 109-117, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38723260

ABSTRACT

OBJECTIVE: To develop and validate a predictive model for postpartum hemorrhage that can be deployed in clinical care using automated, real-time electronic health record (EHR) data and to compare performance of the model with a nationally published risk prediction tool. METHODS: A multivariable logistic regression model was developed from retrospective EHR data from 21,108 patients delivering at a quaternary medical center between January 1, 2018, and April 30, 2022. Deliveries were divided into derivation and validation sets based on an 80/20 split by date of delivery. Postpartum hemorrhage was defined as blood loss of 1,000 mL or more in addition to postpartum transfusion of 1 or more units of packed red blood cells. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC) and was compared with a postpartum hemorrhage risk assessment tool published by the CMQCC (California Maternal Quality Care Collaborative). The model was then programmed into the EHR and again validated with prospectively collected data from 928 patients between November 7, 2023, and January 31, 2024. RESULTS: Postpartum hemorrhage occurred in 235 of 16,862 patients (1.4%) in the derivation cohort. The predictive model included 21 risk factors and demonstrated an AUC of 0.81 (95% CI, 0.79-0.84) and calibration slope of 1.0 (Brier score 0.013). During external temporal validation, the model maintained discrimination (AUC 0.80, 95% CI, 0.72-0.84) and calibration (calibration slope 0.95, Brier score 0.014). This was superior to the CMQCC tool (AUC 0.69 [95% CI, 0.67-0.70], P <.001). The model maintained performance in prospective, automated data collected with the predictive model in real time (AUC 0.82 [95% CI, 0.73-0.91]). 
CONCLUSION: We created and temporally validated a postpartum hemorrhage prediction model, demonstrated its superior performance over a commonly used risk prediction tool, successfully coded the model into the EHR, and prospectively validated the model using risk factor data collected in real time. Future work should evaluate external generalizability and effects on patient outcomes; to facilitate this work, we have included the model coefficients and examples of EHR integration in the article.


Subjects
Electronic Health Records , Postpartum Hemorrhage , Humans , Female , Postpartum Hemorrhage/therapy , Pregnancy , Adult , Retrospective Studies , Risk Assessment/methods , Risk Factors , Logistic Models , ROC Curve
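Two ideas in the abstract above, splitting deliveries 80/20 by date and summarizing prediction error with the Brier score, can be sketched briefly. The records and predicted probabilities below are hypothetical, and the helper names `temporal_split` and `brier_score` are illustrative rather than taken from the article.

```python
from datetime import date

def temporal_split(records, train_fraction=0.8):
    """Order records by delivery date; the earliest fraction forms the derivation set."""
    ordered = sorted(records, key=lambda r: r["delivery_date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and observed 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

# Hypothetical deliveries (not study data), deliberately listed newest first.
records = [
    {"delivery_date": date(2021, 12 - i, 1), "hemorrhage": 1 if i == 0 else 0}
    for i in range(10)
]
derivation, validation = temporal_split(records)

# Hypothetical model outputs for the two validation deliveries.
predicted = [0.1, 0.6]
observed = [r["hemorrhage"] for r in validation]
score = brier_score(predicted, observed)
print(len(derivation), len(validation), round(score, 3))
```

Splitting by date rather than at random means the validation set is always later than the derivation set, which is what makes the validation "temporal". Lower Brier scores indicate predictions closer to the observed outcomes.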
12.
J Am Med Inform Assoc ; 31(6): 1367-1379, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38497958

ABSTRACT

OBJECTIVE: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. MATERIALS AND METHODS: Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to update physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. By combining with this dataset, we further fine-tuned our model (CLAIR-Long). To evaluate fine-tuned models, we used 10 representative patient portal questions in primary care to generate responses. We asked primary care physicians to review generated responses from our models and ChatGPT and rate them for empathy, responsiveness, accuracy, and usefulness. RESULTS: The dataset consisted of 499 794 pairs of patient messages and corresponding responses from the patient portal, with 5000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to providers' responses. CLAIR-Long responses provided increased patient educational content compared to CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, while receiving a neutral rating for usefulness. CONCLUSION: This subjective analysis suggests that leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and healthcare providers.


Subjects
Patient Portals , Humans , Electronic Health Records , Physician-Patient Relations , Natural Language Processing , Empathy , Datasets as Topic
13.
Article in English | MEDLINE | ID: mdl-38917441

ABSTRACT

OBJECTIVE: This study aims to investigate the feasibility of using Large Language Models (LLMs) to engage with patients at the time they are drafting a question to their healthcare providers, and generate pertinent follow-up questions that the patient can answer before sending their message, with the goal of ensuring that their healthcare provider receives all the information they need to safely and accurately answer the patient's question, eliminating back-and-forth messaging, and the associated delays and frustrations. METHODS: We collected a dataset of patient messages sent between January 1, 2022 and March 7, 2023 at Vanderbilt University Medical Center. Two internal medicine physicians identified 7 common scenarios. We used 3 LLMs to generate follow-up questions: (1) Comprehensive LLM Artificial Intelligence Responder (CLAIR): a locally fine-tuned LLM, (2) GPT4 with a simple prompt, and (3) GPT4 with a complex prompt. Five physicians rated them, along with the actual follow-ups written by healthcare providers, on clarity, completeness, conciseness, and utility. RESULTS: For five scenarios, our CLAIR model had the best performance. The GPT4 model received higher scores for utility and completeness but lower scores for clarity and conciseness. CLAIR generated follow-up questions with similar clarity and conciseness as the actual follow-ups written by healthcare providers, with higher utility than healthcare providers and GPT4, and lower completeness than GPT4, but better than healthcare providers. CONCLUSION: LLMs can generate follow-up patient messages designed to clarify a medical question that compare favorably to those generated by healthcare providers.

14.
JMIR Med Inform ; 12: e51842, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38722209

ABSTRACT

Background: Numerous pressure injury prediction models have been developed using electronic health record data, yet hospital-acquired pressure injuries (HAPIs) are increasing, which demonstrates the critical challenge of implementing these models in routine care. Objective: To help bridge the gap between development and implementation, we sought to create a model that was feasible, broadly applicable, dynamic, actionable, and rigorously validated and then compare its performance to usual care (ie, the Braden scale). Methods: We extracted electronic health record data from 197,991 adult hospital admissions with 51 candidate features. For risk prediction and feature selection, we used logistic regression with a least absolute shrinkage and selection operator (LASSO) approach. To compare the model with usual care, we used the area under the receiver operating curve (AUC), Brier score, slope, intercept, and integrated calibration index. The model was validated using a temporally staggered cohort. Results: A total of 5458 HAPIs were identified between January 2018 and July 2022. We determined 22 features were necessary to achieve a parsimonious and highly accurate model. The top 5 features included tracheostomy, edema, central line, first albumin measure, and age. Our model achieved higher discrimination than the Braden scale (AUC 0.897, 95% CI 0.893-0.901 vs AUC 0.798, 95% CI 0.791-0.803). Conclusions: We developed and validated an accurate prediction model for HAPIs that surpassed the standard-of-care risk assessment and fulfilled necessary elements for implementation. Future work includes a pragmatic randomized trial to assess whether our model improves patient outcomes.

15.
Front Immunol ; 15: 1384229, 2024.
Article in English | MEDLINE | ID: mdl-38571954

ABSTRACT

Objective: Positive antinuclear antibodies (ANAs) cause diagnostic dilemmas for clinicians. Currently, no tools exist to help clinicians interpret the significance of a positive ANA in individuals without diagnosed autoimmune diseases. We developed and validated a risk model to predict risk of developing autoimmune disease in positive ANA individuals. Methods: Using a de-identified electronic health record (EHR), we randomly chart reviewed 2,000 positive ANA individuals to determine if a systemic autoimmune disease was diagnosed by a rheumatologist. A priori, we considered demographics, billing codes for autoimmune disease-related symptoms, and laboratory values as variables for the risk model. We performed logistic regression and machine learning models using training and validation samples. Results: We assembled training (n = 1030) and validation (n = 449) sets. Positive ANA individuals who were younger, female, had a higher titer ANA, higher platelet count, disease-specific autoantibodies, and more billing codes related to symptoms of autoimmune diseases were all more likely to develop autoimmune diseases. The most important variables included having a disease-specific autoantibody, number of billing codes for autoimmune disease-related symptoms, and platelet count. In the logistic regression model, AUC was 0.83 (95% CI 0.79-0.86) in the training set and 0.75 (95% CI 0.68-0.81) in the validation set. Conclusion: We developed and validated a risk model that predicts risk for developing systemic autoimmune diseases and can be deployed easily within the EHR. The model can risk stratify positive ANA individuals to ensure high-risk individuals receive urgent rheumatology referrals while reassuring low-risk individuals and reducing unnecessary referrals.


Subjects
Autoimmune Diseases , Rheumatology , Female , Humans , Antinuclear Antibodies , Autoantibodies , Autoimmune Diseases/diagnosis , Electronic Health Records , Male
16.
J Am Med Inform Assoc ; 31(4): 968-974, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38383050

ABSTRACT

OBJECTIVE: To develop and evaluate a data-driven process to generate suggestions for improving alert criteria using explainable artificial intelligence (XAI) approaches. METHODS: We extracted data on alerts generated from January 1, 2019 to December 31, 2020, at Vanderbilt University Medical Center. We developed machine learning models to predict user responses to alerts. We applied XAI techniques to generate global explanations and local explanations. We evaluated the generated suggestions by comparing them with the alerts' historical change logs and with stakeholder interviews. Suggestions that either matched (or partially matched) changes already made to the alert or were considered clinically correct were classified as helpful. RESULTS: The final dataset included 2 991 823 firings with 2689 features. Among the 5 machine learning models, the LightGBM model achieved the highest area under the ROC curve: 0.919 [0.918, 0.920]. We identified 96 helpful suggestions. A total of 278 807 firings (9.3%) could have been eliminated. Some of the suggestions also revealed workflow and education issues. CONCLUSION: We developed a data-driven process to generate suggestions for improving alert criteria using XAI techniques. Our approach could identify improvements regarding clinical decision support (CDS) that might be overlooked or delayed in manual reviews. It also unveils a secondary purpose for XAI: to improve quality by discovering scenarios where CDS alerts are not accepted due to workflow, education, or staffing issues.
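
As a quick arithmetic check on the reported reduction, the sketch below recomputes the share of firings that could have been eliminated from the two counts given in the abstract. The variable and function names are hypothetical; only the two counts come from the source.

```python
# Counts as reported in the abstract.
total_firings = 2_991_823
eliminable_firings = 278_807


def percent_share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


eliminable_pct = percent_share(eliminable_firings, total_firings)
remaining_firings = total_firings - eliminable_firings

print(f"Eliminable share: {eliminable_pct}%")
print(f"Firings remaining after the suggested changes: {remaining_firings}")
```

The recomputed share agrees with the 9.3% stated in the abstract.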


Subjects
Artificial Intelligence , Clinical Decision Support Systems , Humans , Machine Learning , Academic Medical Centers , Educational Status
17.
J Am Med Inform Assoc ; 31(6): 1388-1396, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38452289

ABSTRACT

OBJECTIVES: To evaluate the capability of generative artificial intelligence (AI) to summarize alert comments and to determine whether the AI-generated summaries could be used to improve clinical decision support (CDS) alerts. MATERIALS AND METHODS: We extracted user comments on alerts generated from September 1, 2022 to September 1, 2023 at Vanderbilt University Medical Center. For a subset of 8 alerts, comment summaries were generated independently by 2 physicians and then separately by GPT-4. We surveyed 5 CDS experts to rate the human-generated and AI-generated summaries on a scale from 1 (strongly disagree) to 5 (strongly agree) for 4 metrics: clarity, completeness, accuracy, and usefulness. RESULTS: Five CDS experts participated in the survey. A total of 16 human-generated summaries and 8 AI-generated summaries were assessed. Among the top 8 rated summaries, five were generated by GPT-4. AI-generated summaries demonstrated high levels of clarity, accuracy, and usefulness, similar to the human-generated summaries. Moreover, AI-generated summaries exhibited significantly higher completeness and usefulness compared to the human-generated summaries (AI: 3.4 ± 1.2, human: 2.7 ± 1.2, P = .001). CONCLUSION: End-user comments provide clinicians' immediate feedback on CDS alerts and can serve as a direct and valuable data resource for improving CDS delivery. Traditionally, these comments may not be considered in the CDS review process due to their unstructured nature, large volume, and the presence of redundant or irrelevant content. Our study demonstrates that GPT-4 is capable of distilling these comments into summaries characterized by high clarity, accuracy, and completeness. AI-generated summaries are equivalent to, and potentially better than, human-generated summaries. These AI-generated summaries could provide CDS experts with a novel means of reviewing user comments to rapidly optimize CDS alerts both online and offline.


Subjects
Artificial Intelligence , Clinical Decision Support Systems , Medical Order Entry Systems , Humans , Electronic Health Records , Natural Language Processing
18.
Clin Colon Rectal Surg ; 26(1): 23-30, 2013 Mar.
Article in English | MEDLINE | ID: mdl-24436644

ABSTRACT

Clinical decision support (CDS) has been shown to improve clinical processes, promote patient safety, and reduce costs in healthcare settings, and it is now a requirement for clinicians as part of the Meaningful Use Regulation. However, most evidence for CDS has been evaluated primarily in internal medicine care settings, and colon and rectal surgery (CRS) has unique needs with CDS that are not frequently described in the literature. The authors reviewed published literature in informatics and medical journals, combined with expert opinion to define CDS, describe the evidence for CDS, outline the implementation process for CDS, and present applications of CDS in CRS. CDS functionalities such as order sets, documentation templates, and order facilitation aids are most often described in the literature and most likely to be beneficial in CRS. Further research is necessary to identify and better evaluate additional CDS systems in the setting of CRS.

19.
Yearb Med Inform ; 32(1): 169-178, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37414030

ABSTRACT

OBJECTIVES: This literature review summarizes relevant studies from the last three years (2020-2022) related to clinical decision support (CDS) and CDS impact on health disparities and the digital divide. This survey identifies current trends and synthesizes evidence-based recommendations and considerations for future development and implementation of CDS tools. METHODS: We conducted a search in PubMed for literature published between 2020 and 2022. Our search strategy was constructed as a combination of the MEDLINE®/PubMed® Health Disparities and Minority Health Search Strategy and relevant CDS MeSH terms and phrases. We then extracted relevant data from the studies, including priority population when applicable, domain of influence on the disparity being addressed, and the type of CDS being used. We also made note of when a study discussed the digital divide in some capacity and organized the comments into general themes through group discussion. RESULTS: Our search yielded 520 studies, with 45 included at the conclusion of screening. The most frequent CDS type in this review was point-of-care alerts/reminders (33.3%). Health Care System was the most frequent domain of influence (71.1%), and Blacks/African Americans were the most frequently included priority population (42.2%). Throughout the literature, we found four general themes related to the technology divide: inaccessibility of technology, access to care, trust of technology, and technology literacy. CONCLUSIONS: This survey revealed the diversity of CDS being used to address health disparities and several barriers that may make CDS less effective or potentially harmful to certain populations. Regular examinations of literature that feature CDS and address health disparities can help to reveal new strategies and patterns for improving healthcare.


Subjects
Clinical Decision Support Systems , Digital Divide , Humans , Delivery of Health Care , Surveys and Questionnaires , Health Inequities
20.
medRxiv ; 2023 Jul 16.
Article in English | MEDLINE | ID: mdl-37503263

ABSTRACT

Objective: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. Methods: Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to update physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. By combining our data with this dataset, we further fine-tuned our model (CLAIR-Long). To evaluate the fine-tuned models, we used ten representative patient portal questions in primary care to generate responses. We asked primary care physicians to review the generated responses from our models and ChatGPT and to rate them for empathy, responsiveness, accuracy, and usefulness. Results: The dataset consisted of a total of 499,794 pairs of patient messages and corresponding responses from the patient portal, with 5,000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to providers' responses. CLAIR-Long responses provided increased patient educational content compared to CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, while receiving a neutral rating for usefulness. Conclusion: Leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and primary care providers.
