ABSTRACT
BACKGROUND: Early detection of clinical deterioration among hospitalized patients is a clinical priority for patient safety and quality of care. Current automated approaches for identifying these patients perform poorly at identifying imminent events. OBJECTIVE: Develop a machine learning algorithm using pager messages sent between clinical team members to predict imminent clinical deterioration. DESIGN: We conducted a large observational study using long short-term memory machine learning models on the content and frequency of clinical pages. PARTICIPANTS: We included all hospitalizations between January 1, 2018 and December 31, 2020 at Vanderbilt University Medical Center that included at least one page message to physicians. Exclusion criteria included patients receiving palliative care, hospitalizations with a planned intensive care stay, and hospitalizations in the top 2% longest length of stay. MAIN MEASURES: Model classification performance to identify in-hospital cardiac arrest, transfer to intensive care, or Rapid Response activation in the next 3, 6, and 12 hours. We compared model performance against three common early warning scores: Modified Early Warning Score, National Early Warning Score, and the Epic Deterioration Index. KEY RESULTS: There were 87,783 patients (mean [SD] age 54.0 [18.8] years; 45,835 [52.2%] women) who experienced 136,778 hospitalizations. A total of 6214 hospitalized patients experienced a deterioration event. The machine learning model accurately identified 62% of deterioration events within 3 hours prior to the event and 47% of events within 12 hours. Across each time horizon, the model surpassed performance of the best early warning score including area under the receiver operating characteristic curve at 6 hours (0.856 vs. 0.781), sensitivity at 6 hours (0.590 vs. 0.505), specificity at 6 hours (0.900 vs. 0.878), and F-score at 6 hours (0.291 vs. 0.220).
CONCLUSIONS: Machine learning applied to the content and frequency of clinical pages improves prediction of imminent deterioration. Using clinical pages to monitor patient acuity supports improved detection of imminent deterioration without requiring changes to clinical workflow or nursing documentation.
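The reported 6-hour sensitivity and F-score can be related to each other with a short calculation. The sketch below assumes the reported F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state; the function name is illustrative and not taken from the study.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for precision P.

    Assumption: the F-score is the balanced F1; the abstract does not say so.
    """
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("F1 cannot exceed twice the recall")
    return f1 * recall / denominator


# 6-hour values quoted in the abstract (sensitivity is the same quantity as recall).
model_precision = implied_precision(f1=0.291, recall=0.590)
ews_precision = implied_precision(f1=0.220, recall=0.505)
print(round(model_precision, 3), round(ews_precision, 3))
```

Under that assumption, the implied precision is roughly 0.19 for the model and 0.14 for the best early warning score, so most alerts at this threshold would still be false positives for either approach.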
Subject(s)
Clinical Deterioration , Humans , Female , Middle Aged , Male , Hospitalization , Critical Care , ROC Curve , Algorithms , Machine Learning , Retrospective Studies
ABSTRACT
Clinical prediction models have been widely acknowledged as informative tools providing evidence-based support for clinical decision making. However, prediction models are often underused in clinical practice for many reasons, including information that is missing when risk is calculated in real time in electronic health record (EHR) systems. Existing literature addressing this challenge focuses on statistical comparison of various approaches while overlooking the feasibility of their implementation in the EHR. In this article, we propose a novel and feasible submodel approach to address this challenge for prediction models developed using the model approximation (also termed "preconditioning") method. The proposed submodel coefficients are equivalent to the corresponding original prediction model coefficients plus a correction factor. Comprehensive simulations were conducted to assess the performance of the proposed method and to compare it with the existing "one-step-sweep" approach as well as the imputation approach. In general, the simulation results show the preconditioning-based submodel approach is robust to various heterogeneity scenarios and is comparable to the imputation-based approach, while the "one-step-sweep" approach is less robust under certain heterogeneity scenarios. The proposed method was applied to facilitate real-time implementation of a prediction model to identify emergency department patients with acute heart failure who can be safely discharged home.
Subject(s)
Computer Simulation , Electronic Health Records , Models, Statistical , Humans , Risk Factors , Risk Assessment/methods
ABSTRACT
BACKGROUND: Clinical trials indicate continuous glucose monitor (CGM) use may benefit adults with type 2 diabetes, but rates and correlates of CGM use in real-world care settings are unknown. OBJECTIVE: We sought to ascertain prevalence and correlates of CGM use and to examine rates of new CGM prescriptions across clinic types and medication regimens. DESIGN: Retrospective cohort using electronic health records in a large academic medical center in the Southeastern US. PARTICIPANTS: Adults with type 2 diabetes and a primary care or endocrinology visit during 2021. MAIN MEASURES: Age, gender, race, ethnicity, insurance, clinic type, insulin regimen, hemoglobin A1c values, CGM prescriptions, and prescribing clinic type. KEY RESULTS: Among 30,585 adults with type 2 diabetes, 13% had used a CGM. CGM users were younger and more likely to have private health insurance (p < .05) than non-users; 72% of CGM users had an intensive insulin regimen, but 12% were not taking insulin. CGM users had higher hemoglobin A1c values (both most recent and most proximal to the first CGM prescription) than non-users. CGM users were more likely to receive endocrinology care than non-users, but 23% had only primary care visits in 2021. For each month in 2021, a mean of 90.5 (SD 12.5) people started using CGM. From 2020 to 2021, monthly rates of CGM prescriptions to new users grew 36% overall, but 125% in primary care. Most starting CGM in endocrinology had an intensive insulin regimen (82% vs. 49% starting in primary care), whereas 28% starting CGM in primary care were not using insulin (vs. 5% in endocrinology). CONCLUSION: CGM uptake for type 2 diabetes is increasing rapidly, with most growth in primary care. These trends present opportunities for healthcare system adaptations that support CGM use and related workflows in primary care as uptake grows.
Subject(s)
Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Hypoglycemia , Adult , Humans , Diabetes Mellitus, Type 2/drug therapy , Diabetes Mellitus, Type 2/epidemiology , Glycated Hemoglobin , Diabetes Mellitus, Type 1/drug therapy , Hypoglycemia/epidemiology , Retrospective Studies , Blood Glucose Self-Monitoring , Blood Glucose , Insulin/therapeutic use , Primary Health Care , Hypoglycemic Agents/therapeutic use
ABSTRACT
BACKGROUND: Chest pain (CP) is the hallmark symptom for acute coronary syndrome (ACS) but is not reported in 20-30% of patients, especially women, elderly patients, and non-white patients, presenting to the emergency department (ED) with an ST-segment elevation myocardial infarction (STEMI). METHODS: We used a retrospective 5-year adult ED sample of 279,132 patients to explore using CP alone to predict ACS, then we incrementally added other ACS chief complaints, age, and sex in a series of multivariable logistic regression models. We evaluated each model's identification of ACS and STEMI. RESULTS: Using CP alone would recommend ECGs for 8% of patients (sensitivity, 61%; specificity, 92%) but missed 28.4% of STEMIs. The model with all variables identified 22% of patients for ECG (sensitivity, 82%; specificity, 78%) but missed 14.7% of STEMIs. The model with CP and other ACS chief complaints had the highest sensitivity (93%), with a specificity of 55%; it identified 45.1% of patients for ECG and missed only 4.4% of STEMIs. CONCLUSION: CP alone had highest specificity but lacked sensitivity. Adding other ACS chief complaints increased sensitivity but identified 2.2-fold more patients for ECGs. Achieving an ECG in 10 min for patients with ACS to identify all STEMIs will be challenging without introducing more complex risk calculation into clinical care.
Subject(s)
Acute Coronary Syndrome , ST Elevation Myocardial Infarction , Adult , Humans , Female , Aged , ST Elevation Myocardial Infarction/diagnosis , Retrospective Studies , Electrocardiography , Chest Pain/diagnosis , Chest Pain/etiology , Acute Coronary Syndrome/complications , Acute Coronary Syndrome/diagnosis , Emergency Service, Hospital
ABSTRACT
BACKGROUND: Patients taking high doses of opioids, or taking opioids in combination with other central nervous system depressants, are at increased risk of opioid overdose. Coprescribing the opioid-reversal agent naloxone is an essential safety measure, recommended by the surgeon general, but the rate of naloxone coprescribing is low. Therefore, we set out to determine whether a targeted clinical decision support alert could increase the rate of naloxone coprescribing. METHODS: We conducted a before-after study from January 2019 to April 2021 at a large academic health system in the Southeast. We developed a targeted point of care decision support notification in the electronic health record to suggest ordering naloxone for patients who have a high risk of opioid overdose based on a high morphine equivalent daily dose (MEDD) ≥90 mg, concomitant benzodiazepine prescription, or a history of opioid use disorder or opioid overdose. We measured the rate of outpatient naloxone prescribing as our primary measure. A multivariable logistic regression model with robust variance to adjust for prescriptions within the same prescriber was implemented to estimate the association between alerts and naloxone coprescribing. RESULTS: The baseline naloxone coprescribing rate in 2019 was 0.28 (95% confidence interval [CI], 0.24-0.31) naloxone prescriptions per 100 opioid prescriptions. After alert implementation, the naloxone coprescribing rate increased to 4.51 (95% CI, 4.33-4.68) naloxone prescriptions per 100 opioid prescriptions (P < .001). The adjusted odds of naloxone coprescribing after alert implementation were approximately 28 times those during the baseline period (95% CI, 15-52). CONCLUSIONS: A targeted decision support alert for patients at risk for opioid overdose significantly increased the rate of naloxone coprescribing and was relatively easy to build.
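The alert criteria described in the methods can be written as a simple rule. The following is a minimal sketch, not the authors' EHR build: the function and parameter names are hypothetical, and the only elements taken from the abstract are the three triggering conditions (MEDD of 90 mg or more, a concomitant benzodiazepine prescription, or a history of opioid use disorder or opioid overdose).

```python
MEDD_THRESHOLD_MG = 90  # threshold stated in the abstract (MEDD >= 90 mg)


def is_high_overdose_risk(
    medd_mg: float,
    has_benzodiazepine: bool,
    has_oud_or_overdose_history: bool,
) -> bool:
    """Return True when any one of the abstract's three criteria is met.

    Parameter names are hypothetical; a real EHR rule would derive these
    values from medication orders and the problem list.
    """
    return (
        medd_mg >= MEDD_THRESHOLD_MG
        or has_benzodiazepine
        or has_oud_or_overdose_history
    )


# Illustrative inputs only, not study data.
print(is_high_overdose_risk(120, False, False))
print(is_high_overdose_risk(30, True, False))
print(is_high_overdose_risk(30, False, False))
```

Because the criteria are joined by "or", a patient on a low opioid dose still triggers the suggestion when a benzodiazepine or a relevant history is present.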
Subject(s)
Drug Overdose , Opiate Overdose , Opioid-Related Disorders , Analgesics, Opioid/adverse effects , Drug Overdose/diagnosis , Humans , Naloxone/adverse effects , Narcotic Antagonists/adverse effects , Opioid-Related Disorders/complications , Opioid-Related Disorders/diagnosis , Opioid-Related Disorders/epidemiology , Quality Improvement
ABSTRACT
In the past 2 decades, the United States has seen widespread adoption of electronic health records (EHRs) and a transition from mostly locally developed EHRs to commercial systems. However, most research on quality improvement and safety interventions in EHRs is still conducted at a single site, in a single EHR. Although single-site studies are important early in the innovation lifecycle, multisite studies of EHR interventions are critical for generalizability. Because EHR software, configuration, and local context differ considerably across health care organizations, it can be difficult to implement a single, standardized intervention across multiple sites in a study. This article outlines key strengths, weaknesses, challenges, and opportunities for standardization of EHR interventions in multisite studies and describes flexible trial designs suitable for studying complex interventions, including EHR interventions. It also outlines key considerations for reporting on flexible trials of EHR interventions, including sharing details of the process for designing interventions and their content, details of outcomes being studied and approaches for pooling, and the importance of sharing code and configuration whenever possible.
Subject(s)
Biomedical Research/standards , Electronic Health Records/organization & administration , Guidelines as Topic , Quality Improvement , Humans
ABSTRACT
BACKGROUND: Documentation burden is a common problem with modern electronic health record (EHR) systems. To reduce this burden, various recording methods (eg, voice recorders or motion sensors) have been proposed. However, these solutions are in an early prototype phase and are unlikely to transition into practice in the near future. A more pragmatic alternative is to directly modify the implementation of the existing functionalities of an EHR system. OBJECTIVE: This study aims to assess the nature of free-text comments entered into EHR flowsheets that supplement quantitative vital sign values and examine opportunities to simplify functionality and reduce documentation burden. METHODS: We evaluated 209,055 vital sign comments in flowsheets that were generated in the Epic EHR system at the Vanderbilt University Medical Center in 2018. We applied topic modeling, as well as the natural language processing Clinical Language Annotation, Modeling, and Processing software system, to extract generally discussed topics and detailed medical terms (expressed as probability distribution) to investigate the stories communicated in these comments. RESULTS: Our analysis showed that 63.33% (6053/9557) of the users who entered vital signs made at least one free-text comment in vital sign flowsheet entries. The user roles that were most likely to compose comments were registered nurse, technician, and licensed nurse. The most frequently identified topics were the notification of a result to health care providers (0.347), the context of a measurement (0.307), and an inability to obtain a vital sign (0.224). There were 4187 unique medical terms that were extracted from 46,029 (0.220) comments, including many symptom-related terms such as "pain," "upset," "dizziness," "coughing," "anxiety," "distress," and "fever" and drug-related terms such as "tylenol," "anesthesia," "cannula," "oxygen," "motrin," "rituxan," and "labetalol." 
CONCLUSIONS: Considering that flowsheet comments are generally not displayed or automatically pulled into any clinical notes, our findings suggest that the flowsheet comment functionality can be simplified (eg, via structured response fields instead of a text input dialog) to reduce health care provider effort. Moreover, rich and clinically important medical terms such as medications and symptoms should be explicitly recorded in clinical notes for better visibility.
Subject(s)
Documentation , Electronic Health Records , Academic Medical Centers , Humans , Natural Language Processing , Vital Signs
ABSTRACT
OBJECTIVE: Within the National Pediatric Cardiology Quality Improvement Collaborative (NPC-QIC), a learning health network developed to improve outcomes for patients with hypoplastic left heart syndrome and variants, we assessed which centers contributed to reductions in mortality and growth failure. STUDY DESIGN: Centers within the NPC-QIC were divided into tertiles based on early performance for mortality and separately for growth failure. These groups were evaluated for improvement from the early to late time period and compared with the other groups in the late time period. RESULTS: Mortality was 3.8% for the high-performing, 7.6% for the medium-performing, and 14.4% for the low-performing groups in the early time period. Only the low-performing group had a significant change (P < .001) from the early to late period. In the late period, there was no difference in mortality between the high- (5.7%), medium- (7%), and low- (4.6%) performing centers (P = .5). Growth failure occurred in 13.9% for the high-performing, 21.9% for the medium-performing, and 32.8% for the low-performing groups in the early time period. Only the low-performing group had a significant change (P < .001) over time. In the late period, there was no significant difference in growth failure between the high- (19.8%), medium- (21.5%), and low- (13.5%) performing groups (P = .054). CONCLUSIONS: Improvements in the NPC-QIC mortality and growth measures are primarily driven by improvement in those performing the worst in these areas initially without compromising the success of high-performing centers. Focus for improvement may vary by center based on performance.
Subject(s)
Health Education , Hypoplastic Left Heart Syndrome/surgery , Norwood Procedures/methods , Palliative Care/standards , Quality Improvement , Registries , Female , Humans , Hypoplastic Left Heart Syndrome/mortality , Infant , Male , Retrospective Studies
ABSTRACT
BACKGROUND: Therapy for certain medical conditions occurs in a stepwise fashion, where one medication is recommended as initial therapy and other medications follow. Sequential pattern mining is a data mining technique used to identify patterns of ordered events. OBJECTIVE: To determine whether sequential pattern mining is effective for identifying temporal relationships between medications and accurately predicting the next medication likely to be prescribed for a patient. DESIGN: We obtained claims data from Blue Cross Blue Shield of Texas for patients prescribed at least one diabetes medication between 2008 and 2011, and divided these into a training set (90% of patients) and test set (10% of patients). We applied the CSPADE algorithm to mine sequential patterns of diabetes medication prescriptions both at the drug class and generic drug level and ranked them by the support statistic. We then evaluated the accuracy of predictions made for which diabetes medication a patient was likely to be prescribed next. RESULTS: We identified 161,497 patients who had been prescribed at least one diabetes medication. We were able to mine stepwise patterns of pharmacological therapy that were consistent with guidelines. Within three attempts, we were able to predict the medication prescribed for 90.0% of patients when making predictions by drug class, and for 64.1% when making predictions at the generic drug level. These results were stable under 10-fold cross validation, ranging from 89.1%-90.5% at the drug class level and 63.5-64.9% at the generic drug level. Using 1 or 2 items in the patient's medication history led to more accurate predictions than not using any history, but using the entire history was sometimes worse. CONCLUSION: Sequential pattern mining is an effective technique to identify temporal relationships between medications and can be used to predict next steps in a patient's medication regimen. 
Accurate predictions can be made without using the patient's entire medication history.
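The idea of ranking candidate next medications by how often they follow a short history can be illustrated with a small sketch. This is a deliberately simplified stand-in, not the CSPADE algorithm used in the study: it counts only contiguous transitions, and the function names and the toy sequences are hypothetical rather than claims data.

```python
from collections import Counter, defaultdict


def mine_transitions(sequences, history_len=1):
    """Count how often each item follows the preceding `history_len` items.

    A simplified stand-in for pattern support; CSPADE itself is not used here.
    """
    support = defaultdict(Counter)
    for seq in sequences:
        for i in range(history_len, len(seq)):
            context = tuple(seq[i - history_len:i])
            support[context][seq[i]] += 1
    return support


def predict_next(support, history, history_len=1, attempts=3):
    """Return up to `attempts` candidates, most frequently observed first."""
    context = tuple(history[-history_len:])
    ranked = support.get(context, Counter()).most_common(attempts)
    return [item for item, _ in ranked]


# Hypothetical drug-class sequences for illustration only.
train = [
    ["metformin", "sulfonylurea", "insulin"],
    ["metformin", "sulfonylurea"],
    ["metformin", "dpp4_inhibitor"],
    ["metformin", "sulfonylurea", "insulin"],
]
support = mine_transitions(train)
print(predict_next(support, ["metformin"]))
```

The `history_len` parameter mirrors the abstract's observation that the last 1 or 2 items of a medication history can suffice, and `attempts=3` mirrors its "within three attempts" evaluation.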
Subject(s)
Drug Prescriptions/statistics & numerical data , Drug Therapy/methods , Insurance, Health/statistics & numerical data , Pattern Recognition, Automated , Algorithms , Data Mining , Decision Support Systems, Clinical , Diabetes Mellitus/drug therapy , Disease Progression , Humans , Programming Languages , Reproducibility of Results , Sulfonylurea Compounds/therapeutic use , Texas
ABSTRACT
BACKGROUND: Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are not robust and depend on problems and medications being encoded in particular terminologies. Crowdsourcing represents one approach to generating robust knowledge bases across a variety of terminologies, but more sophisticated approaches are necessary to improve accuracy and reduce manual data review requirements. OBJECTIVE: We sought to develop and evaluate a clinician reputation metric to facilitate the identification of appropriate problem-medication pairs through crowdsourcing without requiring extensive manual review. APPROACH: We retrieved medications from our clinical data warehouse that had been prescribed and manually linked to one or more problems by clinicians during e-prescribing between June 1, 2010 and May 31, 2011. We identified measures likely to be associated with the percentage of accurate problem-medication links made by clinicians. Using logistic regression, we created a metric for identifying clinicians who had made greater than or equal to 95% appropriate links. We evaluated the accuracy of the approach by comparing links made by those physicians identified as having appropriate links to a previously manually validated subset of problem-medication pairs. RESULTS: Of 867 clinicians who asserted a total of 237,748 problem-medication links during the study period, 125 had a reputation metric that predicted the percentage of appropriate links greater than or equal to 95%. These clinicians asserted a total of 2464 linked problem-medication pairs (983 distinct pairs). 
Compared to a previously validated set of problem-medication pairs, the reputation metric achieved a specificity of 99.5% and marginally improved the sensitivity of previously described knowledge bases. CONCLUSION: A reputation metric may be a valuable measure for identifying high quality clinician-entered, crowdsourced data.
Subject(s)
Electronic Health Records , Knowledge Bases , Medical Informatics/methods , Medical Records Systems, Computerized , Crowdsourcing , Humans , Internet , Logistic Models , Pharmaceutical Preparations , Physicians , Reproducibility of Results , Software , User-Computer Interface
ABSTRACT
BACKGROUND: The Vanderbilt Clinical Informatics Center (VCLIC) is based in the Department of Biomedical Informatics (DBMI) and operates across Vanderbilt University Medical Center (VUMC) and Vanderbilt University (VU) with a goal of enabling and supporting clinical informatics research and practice. VCLIC supports several types of applied clinical informatics teaching, including teaching of students in courses, professional education for staff and faculty throughout VUMC, and workshops and conferences that are open to the public. OBJECTIVES: In this paper, we provide a detailed accounting of our center and institution's methods of educating and training faculty, staff, students, and trainees from across the academic institution and health system on clinical informatics topics, including formal training programs and informal applied learning sessions. METHODS: Through a host of informal learning events, such as workshops, seminars, conference-style events, bite-size instructive videos, and hackathons, as well as several formal education programs, such as the Clinical Informatics Graduate Course, Master's in Applied Clinical Informatics, Medical Student Integrated Science Course, Graduate Medical Education Elective, and Fellowship in Clinical Informatics, VCLIC and VUMC provide opportunities for faculty, students, trainees, and even staff to engage with Clinical Informatics topics and learn related skills. RESULTS: The described programs have trained hundreds of participants from across the academic and clinical enterprises. Of the VCLIC-held events, the majority of attendees indicated through surveys that they were satisfied, with the average satisfaction score being 4.63/5, and all events averaging a satisfaction score of greater than 4. Across the 20 events VCLIC has held, our largest audiences are DBMI, HealthIT operational staff, and students from the medical and nursing schools. 
CONCLUSIONS: VCLIC has created and delivered a successful suite of formal and informal educational events and programs to disseminate clinical informatics knowledge and skills to learners across the academic institution and healthcare system.
ABSTRACT
OBJECTIVE: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. MATERIALS AND METHODS: Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to update physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. By combining our data with this dataset, we further fine-tuned our model (CLAIR-Long). To evaluate the fine-tuned models, we used 10 representative patient portal questions in primary care to generate responses. We asked primary care physicians to review generated responses from our models and ChatGPT and rate them for empathy, responsiveness, accuracy, and usefulness. RESULTS: The dataset consisted of 499 794 pairs of patient messages and corresponding responses from the patient portal, with 5000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to providers' responses. CLAIR-Long responses provided increased patient educational content compared to CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, while receiving a neutral rating for usefulness. CONCLUSION: This subjective analysis suggests that leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and healthcare providers.
Subject(s)
Patient Portals , Humans , Electronic Health Records , Physician-Patient Relations , Natural Language Processing , Empathy , Datasets as Topic
ABSTRACT
OBJECTIVE: To develop and validate a predictive model for postpartum hemorrhage that can be deployed in clinical care using automated, real-time electronic health record (EHR) data and to compare performance of the model with a nationally published risk prediction tool. METHODS: A multivariable logistic regression model was developed from retrospective EHR data from 21,108 patients delivering at a quaternary medical center between January 1, 2018, and April 30, 2022. Deliveries were divided into derivation and validation sets based on an 80/20 split by date of delivery. Postpartum hemorrhage was defined as blood loss of 1,000 mL or more in addition to postpartum transfusion of 1 or more units of packed red blood cells. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC) and was compared with a postpartum hemorrhage risk assessment tool published by the CMQCC (California Maternal Quality Care Collaborative). The model was then programmed into the EHR and again validated with prospectively collected data from 928 patients between November 7, 2023, and January 31, 2024. RESULTS: Postpartum hemorrhage occurred in 235 of 16,862 patients (1.4%) in the derivation cohort. The predictive model included 21 risk factors and demonstrated an AUC of 0.81 (95% CI, 0.79-0.84) and calibration slope of 1.0 (Brier score 0.013). During external temporal validation, the model maintained discrimination (AUC 0.80, 95% CI, 0.72-0.84) and calibration (calibration slope 0.95, Brier score 0.014). This was superior to the CMQCC tool (AUC 0.69 [95% CI, 0.67-0.70], P <.001). The model maintained performance in prospective, automated data collected with the predictive model in real time (AUC 0.82 [95% CI, 0.73-0.91]). 
CONCLUSION: We created and temporally validated a postpartum hemorrhage prediction model, demonstrated its superior performance over a commonly used risk prediction tool, successfully coded the model into the EHR, and prospectively validated the model using risk factor data collected in real time. Future work should evaluate external generalizability and effects on patient outcomes; to facilitate this work, we have included the model coefficients and examples of EHR integration in the article.
Subject(s)
Electronic Health Records , Postpartum Hemorrhage , Humans , Female , Postpartum Hemorrhage/therapy , Pregnancy , Adult , Retrospective Studies , Risk Assessment/methods , Risk Factors , Logistic Models , ROC Curve
ABSTRACT
BACKGROUND: Despite growing interest in patient-reported outcome measures to track the progression of Crohn's disease, frameworks to apply these questionnaires in the preoperative setting are lacking. Using the Short Inflammatory Bowel Disease Questionnaire (sIBDQ), this study aimed to describe the interpretable quality of life thresholds and examine potential associations with future bowel resection in Crohn's disease. METHODS: Adult patients with Crohn's disease completing an sIBDQ at a clinic visit between 2020 and 2022 were eligible. A stoplight framework was adopted for sIBDQ scores, including a "Resection Red" zone suggesting poor quality of life that may benefit from discussions about surgery as well as a "Nonoperative Green" zone. Thresholds were identified with both anchor- and distribution-based methods using receiver operating characteristic curve analysis and subgroup percentile scores, respectively. To quantify associations between sIBDQ scores and subsequent bowel resection, multivariable logistic regression models were fit with covariates of age, sex assigned at birth, body mass index, medications, disease pattern and location, resection history, and the Harvey Bradshaw Index. The incremental discriminatory value of the sIBDQ beyond clinical factors was assessed through the area under the receiver operating characteristics curve (AUC) with an internal validation through bootstrap resampling. RESULTS: Of the 2003 included patients, 102 underwent Crohn's-related bowel resection. The sIBDQ Nonoperative Green zone threshold ranged from 61 to 64 and the Resection Red zone from 36 to 38. When adjusting for clinical covariates, a worse sIBDQ score was associated with greater odds of subsequent 90-day bowel resection when considered as a 1-point (odds ratio [OR] [95% CI], 1.05 [1.03-1.07]) or 5-point change (OR [95% CI], 1.27 [1.14-1.41]). 
Inclusion of the sIBDQ modestly improved discriminative performance (AUC [95% CI], 0.85 [0.85-0.86]) relative to models that included only demographics (0.57 [0.57-0.58]) or demographics with clinical covariates (0.83 [0.83-0.84]). CONCLUSION: In the decision-making process for bowel resection, disease-specific patient-reported outcome measures may be useful to identify patients with Crohn's disease with poor quality of life and promote a shared understanding of personalized burden.
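The 1-point and 5-point odds ratios reported above are two views of the same coefficient. As a worked check, assuming the sIBDQ score enters the logistic model as a single linear term with coefficient \(\beta\) per 1-point worsening (an assumption; the abstract does not give the model specification):

```latex
\[
\mathrm{OR}_{k} = e^{k\beta} = \left(\mathrm{OR}_{1}\right)^{k},
\qquad
\mathrm{OR}_{5} \approx (1.05)^{5} \approx 1.28 .
\]
```

This agrees with the reported 5-point odds ratio of 1.27 to within rounding of the 1-point estimate (an unrounded value near 1.049 gives \(1.049^{5} \approx 1.27\)).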
Subject(s)
Crohn Disease , Patient Reported Outcome Measures , Quality of Life , Humans , Crohn Disease/surgery , Crohn Disease/psychology , Male , Female , Adult , Surveys and Questionnaires , Middle Aged , ROC Curve , Young Adult
ABSTRACT
BACKGROUND: Over the past 30 years, the American Medical Informatics Association (AMIA) has played a pivotal role in fostering a collaborative community for professionals in biomedical and health informatics. As an interdisciplinary association, AMIA brings together individuals with clinical, research, and computer expertise and emphasizes the use of data to enhance biomedical research and clinical work. The need for a recognition program within AMIA, acknowledging applied informatics skills by members, led to the establishment of the Fellows of AMIA (FAMIA) Recognition Program in 2018. OBJECTIVES: To outline the evolution of the FAMIA program and shed light on its origins, development, and impact. This report explores factors that led to the establishment of FAMIA, considerations affecting its development, and the objectives FAMIA seeks to achieve within the broader context of AMIA. METHODS: The development of FAMIA is examined through a historical lens, encompassing key milestones, discussions, and decisions that shaped the program. Insights into the formation of FAMIA were gathered through discussions within AMIA membership and leadership, including proposals, board-level discussions, and the involvement of key stakeholders. Additionally, the report outlines criteria for FAMIA eligibility and the pathways available for recognition, namely the Certification Pathway and the Long-Term Experience Pathway. RESULTS: The FAMIA program has inducted five classes, totaling 602 fellows. An overview of disciplines, roles, and application pathways for FAMIA members is provided. A comparative analysis with other fellow recognition programs in related fields showcases the unique features and contributions of FAMIA in acknowledging applied informatics. 
CONCLUSION: Now in its sixth year, FAMIA acknowledges the growing influence of applied informatics among health information professionals, recognizing individuals with experience, training, and a commitment to the highest level of applied informatics and the science associated with it.
Subject(s)
Medical Informatics, United States, Fellowships and Scholarships, Medical Societies, Humans, 21st Century History
ABSTRACT
Objective: Positive antinuclear antibodies (ANAs) cause diagnostic dilemmas for clinicians. Currently, no tools exist to help clinicians interpret the significance of a positive ANA in individuals without diagnosed autoimmune diseases. We developed and validated a risk model to predict the risk of developing autoimmune disease in ANA-positive individuals. Methods: Using a de-identified electronic health record (EHR), we reviewed the charts of 2,000 randomly selected ANA-positive individuals to determine whether a systemic autoimmune disease was diagnosed by a rheumatologist. A priori, we considered demographics, billing codes for autoimmune disease-related symptoms, and laboratory values as variables for the risk model. We fit logistic regression and machine learning models using training and validation samples. Results: We assembled training (n = 1030) and validation (n = 449) sets. ANA-positive individuals who were younger, were female, or had a higher ANA titer, a higher platelet count, disease-specific autoantibodies, or more billing codes related to symptoms of autoimmune diseases were more likely to develop autoimmune diseases. The most important variables included having a disease-specific autoantibody, number of billing codes for autoimmune disease-related symptoms, and platelet count. In the logistic regression model, AUC was 0.83 (95% CI 0.79-0.86) in the training set and 0.75 (95% CI 0.68-0.81) in the validation set. Conclusion: We developed and validated a risk model that predicts risk for developing systemic autoimmune diseases and can be deployed easily within the EHR. The model can risk stratify ANA-positive individuals to ensure high-risk individuals receive urgent rheumatology referrals while reassuring low-risk individuals and reducing unnecessary referrals.
Subject(s)
Autoimmune Diseases, Rheumatology, Female, Humans, Antinuclear Antibodies, Autoantibodies, Autoimmune Diseases/diagnosis, Electronic Health Records, Male
ABSTRACT
Background: Numerous pressure injury prediction models have been developed using electronic health record data, yet hospital-acquired pressure injuries (HAPIs) are increasing, which demonstrates the critical challenge of implementing these models in routine care. Objective: To help bridge the gap between development and implementation, we sought to create a model that was feasible, broadly applicable, dynamic, actionable, and rigorously validated and then compare its performance to usual care (i.e., the Braden scale). Methods: We extracted electronic health record data from 197,991 adult hospital admissions with 51 candidate features. For risk prediction and feature selection, we used logistic regression with a least absolute shrinkage and selection operator (LASSO) approach. To compare the model with usual care, we used the area under the receiver operating characteristic curve (AUC), Brier score, slope, intercept, and integrated calibration index. The model was validated using a temporally staggered cohort. Results: A total of 5458 HAPIs were identified between January 2018 and July 2022. We determined that 22 features were necessary to achieve a parsimonious and highly accurate model. The top 5 features included tracheostomy, edema, central line, first albumin measure, and age. Our model achieved higher discrimination than the Braden scale (AUC 0.897, 95% CI 0.893-0.901 vs AUC 0.798, 95% CI 0.791-0.803). Conclusions: We developed and validated an accurate prediction model for HAPIs that surpassed the standard-of-care risk assessment and fulfilled necessary elements for implementation. Future work includes a pragmatic randomized trial to assess whether our model improves patient outcomes.
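Illustrative note: two of the comparison metrics named in this abstract, AUC and the Brier score, can be computed directly from outcome labels and predicted risks. The sketch below is a minimal pure-Python illustration under stated assumptions, not the authors' code; the function names and the six-patient toy data are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted risk and observed outcome
    (lower is better; 0 is a perfect forecast)."""
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data: 1 = pressure injury occurred, 0 = it did not.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_probs = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]

toy_auc = auc(toy_labels, toy_probs)
toy_brier = brier_score(toy_labels, toy_probs)
print(f"AUC = {toy_auc:.3f}, Brier score = {toy_brier:.3f}")
```

AUC measures discrimination only (how well cases are ranked), while the Brier score also penalizes poorly calibrated probabilities, which is presumably why the abstract reports both alongside calibration slope and intercept.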
ABSTRACT
OBJECTIVE: This study aims to investigate the feasibility of using large language models (LLMs) to engage with patients while they are drafting a question to their healthcare providers and to generate pertinent follow-up questions that the patient can answer before sending the message. The goal is to ensure that the healthcare provider receives all the information needed to answer the patient's question safely and accurately, eliminating back-and-forth messaging and the associated delays and frustrations. METHODS: We collected a dataset of patient messages sent between January 1, 2022 and March 7, 2023 at Vanderbilt University Medical Center. Two internal medicine physicians identified 7 common scenarios. We used 3 LLMs to generate follow-up questions: (1) Comprehensive LLM Artificial Intelligence Responder (CLAIR), a locally fine-tuned LLM; (2) GPT4 with a simple prompt; and (3) GPT4 with a complex prompt. Five physicians rated the generated questions, alongside the actual follow-ups written by healthcare providers, on clarity, completeness, conciseness, and utility. RESULTS: For five scenarios, our CLAIR model had the best performance. The GPT4 model received higher scores for utility and completeness but lower scores for clarity and conciseness. CLAIR generated follow-up questions with clarity and conciseness similar to those of the actual follow-ups written by healthcare providers, with higher utility than both healthcare providers and GPT4, and with completeness lower than GPT4 but higher than healthcare providers. CONCLUSION: LLMs can generate follow-up patient messages designed to clarify a medical question that compare favorably to those generated by healthcare providers.
Subject(s)
Artificial Intelligence, Humans, Physician-Patient Relations, Feasibility Studies, Text Messaging
ABSTRACT
OBJECTIVES: To evaluate the capability of generative artificial intelligence (AI) to summarize alert comments and to determine whether the AI-generated summaries could be used to improve clinical decision support (CDS) alerts. MATERIALS AND METHODS: We extracted user comments to alerts generated from September 1, 2022 to September 1, 2023 at Vanderbilt University Medical Center. For a subset of 8 alerts, comment summaries were generated independently by 2 physicians and then separately by GPT-4. We surveyed 5 CDS experts to rate the human-generated and AI-generated summaries on a scale from 1 (strongly disagree) to 5 (strongly agree) on 4 metrics: clarity, completeness, accuracy, and usefulness. RESULTS: Five CDS experts participated in the survey. A total of 16 human-generated summaries and 8 AI-generated summaries were assessed. Among the top 8 rated summaries, five were generated by GPT-4. AI-generated summaries demonstrated high levels of clarity, accuracy, and usefulness, similar to the human-generated summaries. Moreover, AI-generated summaries exhibited significantly higher completeness and usefulness compared to the human-generated summaries (AI: 3.4 ± 1.2, human: 2.7 ± 1.2, P = .001). CONCLUSION: End-user comments provide clinicians' immediate feedback on CDS alerts and can serve as a direct and valuable data resource for improving CDS delivery. Traditionally, these comments may not be considered in the CDS review process due to their unstructured nature, large volume, and the presence of redundant or irrelevant content. Our study demonstrates that GPT-4 is capable of distilling these comments into summaries characterized by high clarity, accuracy, and completeness. AI-generated summaries are equivalent to, and potentially better than, human-generated summaries. These AI-generated summaries could provide CDS experts with a novel means of reviewing user comments to rapidly optimize CDS alerts both online and offline.
Subject(s)
Artificial Intelligence, Clinical Decision Support Systems, Medical Order Entry Systems, Humans, Electronic Health Records, Natural Language Processing
ABSTRACT
OBJECTIVE: To develop and evaluate a data-driven process to generate suggestions for improving alert criteria using explainable artificial intelligence (XAI) approaches. METHODS: We extracted data on alerts generated from January 1, 2019 to December 31, 2020, at Vanderbilt University Medical Center. We developed machine learning models to predict user responses to alerts. We applied XAI techniques to generate global explanations and local explanations. We evaluated the generated suggestions by comparing them with the alerts' historical change logs and stakeholder interviews. Suggestions that either matched (or partially matched) changes already made to the alert or were considered clinically correct were classified as helpful. RESULTS: The final dataset included 2,991,823 firings with 2689 features. Among the 5 machine learning models, the LightGBM model achieved the highest area under the ROC curve: 0.919 [0.918, 0.920]. We identified 96 helpful suggestions. A total of 278,807 firings (9.3%) could have been eliminated. Some of the suggestions also revealed workflow and education issues. CONCLUSION: We developed a data-driven process to generate suggestions for improving alert criteria using XAI techniques. Our approach could identify improvements regarding clinical decision support (CDS) that might be overlooked or delayed in manual reviews. It also unveils a secondary purpose for the XAI: to improve quality by discovering scenarios where CDS alerts are not accepted due to workflow, education, or staffing issues.
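Illustrative note: the reported share of avoidable firings follows from simple counting, and the same counting shows how a suggested exclusion criterion might be scored against historical firings. The sketch below is a hedged illustration only; the `Firing` record, its `on_comfort_care` field, and the `share_eliminated` helper are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Firing:
    """One historical alert firing (hypothetical, simplified record)."""
    alert_id: str
    on_comfort_care: bool  # hypothetical feature, for illustration only
    accepted: bool         # whether the user accepted the alert


def share_eliminated(firings: Sequence[Firing],
                     exclusion: Callable[[Firing], bool]) -> float:
    """Fraction of firings that a proposed exclusion rule would suppress."""
    if not firings:
        raise ValueError("no firings to evaluate")
    removed = sum(1 for f in firings if exclusion(f))
    return removed / len(firings)


# Hypothetical toy history: one of four firings matches the exclusion rule.
history = [
    Firing("A1", on_comfort_care=True, accepted=False),
    Firing("A1", on_comfort_care=False, accepted=True),
    Firing("A1", on_comfort_care=False, accepted=False),
    Firing("A1", on_comfort_care=False, accepted=True),
]
toy_share = share_eliminated(history, lambda f: f.on_comfort_care)

# Arithmetic check using the two counts reported in the abstract.
reported_pct = round(100 * 278_807 / 2_991_823, 1)
print(f"toy share = {toy_share:.2f}, reported share = {reported_pct}%")
```

A fuller evaluation would presumably also check that the suppressed firings were rarely accepted, so that an exclusion rule removes noise rather than useful alerts.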