Search | VHL Search Portal

1.

Development and Validation of a Machine Learning Algorithm Using Clinical Pages to Predict Imminent Clinical Deterioration.

Steitz, Bryan D; McCoy, Allison B; Reese, Thomas J; Liu, Siru; Weavind, Liza; Shipley, Kipp; Russo, Elise; Wright, Adam.

J Gen Intern Med ; 39(1): 27-35, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37528252

ABSTRACT

BACKGROUND: Early detection of clinical deterioration among hospitalized patients is a clinical priority for patient safety and quality of care. Current automated approaches for identifying these patients perform poorly at identifying imminent events. OBJECTIVE: Develop a machine learning algorithm using pager messages sent between clinical team members to predict imminent clinical deterioration. DESIGN: We conducted a large observational study using long short-term memory machine learning models on the content and frequency of clinical pages. PARTICIPANTS: We included all hospitalizations between January 1, 2018 and December 31, 2020 at Vanderbilt University Medical Center that included at least one page message to physicians. Exclusion criteria included patients receiving palliative care, hospitalizations with a planned intensive care stay, and hospitalizations in the top 2% by length of stay. MAIN MEASURES: Model classification performance to identify in-hospital cardiac arrest, transfer to intensive care, or Rapid Response activation in the next 3, 6, and 12 hours. We compared model performance against three common early warning scores: Modified Early Warning Score, National Early Warning Score, and the Epic Deterioration Index. KEY RESULTS: There were 87,783 patients (mean [SD] age 54.0 [18.8] years; 45,835 [52.2%] women) who experienced 136,778 hospitalizations. A total of 6,214 hospitalized patients experienced a deterioration event. The machine learning model accurately identified 62% of deterioration events within the 3 hours prior to the event and 47% of events within 12 hours. Across each time horizon, the model surpassed the performance of the best early warning score, including area under the receiver operating characteristic curve at 6 hours (0.856 vs. 0.781), sensitivity at 6 hours (0.590 vs. 0.505), specificity at 6 hours (0.900 vs. 0.878), and F-score at 6 hours (0.291 vs. 0.220).
CONCLUSIONS: Machine learning applied to the content and frequency of clinical pages improves prediction of imminent deterioration. Using clinical pages to monitor patient acuity supports improved detection of imminent deterioration without requiring changes to clinical workflow or nursing documentation.
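
The sensitivity and F-score figures reported in this abstract can be related to each other with a short calculation. The sketch below assumes the reported F-score is the F1 score (the harmonic mean of precision and sensitivity); the abstract does not state which F-score variant was used. The function names are hypothetical, and the back-calculated precision values are illustrative arithmetic, not results reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (recall = sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("No valid precision exists for these inputs")
    return f1 * recall / denominator


# 6-hour figures quoted in the abstract (assumption: F-score means F1).
model_ppv = implied_precision(f1=0.291, recall=0.590)
ews_ppv = implied_precision(f1=0.220, recall=0.505)

print(f"Implied precision, paging model:       {model_ppv:.3f}")
print(f"Implied precision, best warning score: {ews_ppv:.3f}")
```

Under that assumption, both implied precision values are low relative to sensitivity, which is typical when the predicted event is rare.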

Subject(s)

Clinical Deterioration; Humans; Female; Middle Aged; Male; Hospitalization; Critical Care; ROC Curve; Algorithms; Machine Learning; Retrospective Studies

2.

Rates and Correlates of Uptake of Continuous Glucose Monitors Among Adults with Type 2 Diabetes in Primary Care and Endocrinology Settings.

Mayberry, Lindsay S; Guy, Charmin; Hendrickson, Chase D; McCoy, Allison B; Elasy, Tom.

J Gen Intern Med ; 38(11): 2546-2552, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37254011

ABSTRACT

BACKGROUND: Clinical trials indicate continuous glucose monitor (CGM) use may benefit adults with type 2 diabetes, but CGM rates and correlates in real-world care settings are unknown. OBJECTIVE: We sought to ascertain prevalence and correlates of CGM use and to examine rates of new CGM prescriptions across clinic types and medication regimens. DESIGN: Retrospective cohort using electronic health records in a large academic medical center in the Southeastern US. PARTICIPANTS: Adults with type 2 diabetes and a primary care or endocrinology visit during 2021. MAIN MEASURES: Age, gender, race, ethnicity, insurance, clinic type, insulin regimen, hemoglobin A1c values, CGM prescriptions, and prescribing clinic type. KEY RESULTS: Among 30,585 adults with type 2 diabetes, 13% had used a CGM. CGM users were younger and more had private health insurance (p < .05) as compared to non-users; 72% of CGM users had an intensive insulin regimen, but 12% were not taking insulin. CGM users had higher hemoglobin A1c values (both most recent and most proximal to the first CGM prescription) than non-users. CGM users were more likely to receive endocrinology care than non-users, but 23% had only primary care visits in 2021. For each month in 2021, a mean of 90.5 (SD 12.5) people started using CGM. From 2020 to 2021, monthly rates of CGM prescriptions to new users grew 36% overall, but 125% in primary care. Most starting CGM in endocrinology had an intensive insulin regimen (82% vs. 49% starting in primary care), whereas 28% starting CGM in primary care were not using insulin (vs. 5% in endocrinology). CONCLUSION: CGM uptake for type 2 diabetes is increasing rapidly, with most growth in primary care. These trends present opportunities for healthcare system adaptations to support CGM use and related workflows in primary care to support growth in uptake.

Subject(s)

Diabetes Mellitus, Type 1; Diabetes Mellitus, Type 2; Hypoglycemia; Adult; Humans; Diabetes Mellitus, Type 2/drug therapy; Diabetes Mellitus, Type 2/epidemiology; Glycated Hemoglobin; Diabetes Mellitus, Type 1/drug therapy; Hypoglycemia/epidemiology; Retrospective Studies; Blood Glucose Self-Monitoring; Blood Glucose; Insulin/therapeutic use; Primary Health Care; Hypoglycemic Agents/therapeutic use

3.

Beyond chest pain: Incremental value of other variables to identify patients for an early ECG.

Bunney, Gabrielle; Sundaram, Vandana; Graber-Naidich, Anna; Miller, Katharine; Brown, Ian; McCoy, Allison B; Freeze, Brian; Berger, David; Wright, Adam; Yiadom, Maame Yaa A B.

Am J Emerg Med ; 67: 70-78, 2023 May.

Article in English | MEDLINE | ID: mdl-36806978

ABSTRACT

BACKGROUND: Chest pain (CP) is the hallmark symptom for acute coronary syndrome (ACS) but is not reported in 20-30% of patients, especially women, elderly patients, and non-white patients, presenting to the emergency department (ED) with an ST-segment elevation myocardial infarction (STEMI). METHODS: We used a retrospective 5-year adult ED sample of 279,132 patients to explore using CP alone to predict ACS, then we incrementally added other ACS chief complaints, age, and sex in a series of multivariable logistic regression models. We evaluated each model's identification of ACS and STEMI. RESULTS: Using CP alone would recommend ECGs for 8% of patients (sensitivity, 61%; specificity, 92%) but would miss 28.4% of STEMIs. The model with all variables identified 22% of patients for ECGs (sensitivity, 82%; specificity, 78%) but missed 14.7% of STEMIs. The model with CP and other ACS chief complaints had the highest sensitivity (93%), with a specificity of 55%; it identified 45.1% of patients for ECG and missed only 4.4% of STEMIs. CONCLUSION: CP alone had the highest specificity but lacked sensitivity. Adding other ACS chief complaints increased sensitivity but identified 2.2-fold more patients for ECGs. Achieving an ECG in 10 min for patients with ACS to identify all STEMIs will be challenging without introducing more complex risk calculation into clinical care.

Subject(s)

Acute Coronary Syndrome; ST Elevation Myocardial Infarction; Adult; Humans; Female; Aged; ST Elevation Myocardial Infarction/diagnosis; Retrospective Studies; Electrocardiography; Chest Pain/diagnosis; Chest Pain/etiology; Acute Coronary Syndrome/complications; Acute Coronary Syndrome/diagnosis; Emergency Service, Hospital

4.

Machine Learning to Predict Interstage Mortality Following Single Ventricle Palliation: A NPC-QIC Database Analysis.

Sunthankar, Sudeep D; Zhao, Juan; Wei, Wei-Qi; Hill, Garick D; Parra, David A; Kohl, Karen; McCoy, Allison; Jayaram, Natalie M; Godown, Justin.

Pediatr Cardiol ; 44(6): 1242-1250, 2023 Aug.

Article in English | MEDLINE | ID: mdl-36820914

ABSTRACT

There is a high risk of mortality between stage I and stage II palliation of single ventricle heart disease. This study aimed to leverage advanced machine learning algorithms to optimize risk-prediction models and identify features most predictive of interstage mortality. This study utilized retrospective data from the National Pediatric Cardiology Quality Improvement Collaborative and included all patients who underwent stage I palliation and survived to hospital discharge (2008-2019). Multiple machine learning models were evaluated, including logistic regression, random forest, gradient boosting trees, extreme gradient boost trees, and light gradient boosting machines. A total of 3267 patients were included with 208 (6.4%) interstage deaths. Machine learning models were trained on 180 clinical features. Digoxin use at discharge was the most influential factor resulting in a lower risk of interstage mortality (p < 0.0001). Stage I surgery with Blalock-Taussig-Thomas shunt portended higher risk than Sano conduit (7.8% vs 4.4%, p = 0.0002). Non-modifiable risk factors identified with increased risk of interstage mortality included female sex, lower gestational age, and lower birth weight. Post-operative risk factors included the requirement of unplanned catheterization and more severe atrioventricular valve insufficiency at discharge. Light gradient boosting machines demonstrated the best performance with an area under the receiver operating characteristic curve of 0.642. Advanced machine learning algorithms highlight a number of modifiable and non-modifiable risk factors for interstage mortality following stage I palliation. However, model performance remains modest, suggesting the presence of unmeasured confounders that contribute to interstage risk.

Subject(s)

Hypoplastic Left Heart Syndrome; Norwood Procedures; Univentricular Heart; Child; Humans; Infant; Retrospective Studies; Heart Ventricles/surgery; Treatment Outcome; Risk Factors; Palliative Care/methods; Hypoplastic Left Heart Syndrome/surgery; Norwood Procedures/adverse effects

5.

Assessment of a Naloxone Coprescribing Alert for Patients at Risk of Opioid Overdose: A Quality Improvement Project.

Nelson, Scott D; McCoy, Allison B; Rector, Hayley; Teare, Andrew J; Barrett, Tyler W; Sigworth, Elizabeth A; Chen, Qingxia; Edwards, David A; Marcovitz, David E; Wright, Adam.

Anesth Analg ; 135(1): 26-34, 2022 Jul 01.

Article in English | MEDLINE | ID: mdl-35343932

ABSTRACT

BACKGROUND: Patients taking high doses of opioids, or taking opioids in combination with other central nervous system depressants, are at increased risk of opioid overdose. Coprescribing the opioid-reversal agent naloxone is an essential safety measure, recommended by the surgeon general, but the rate of naloxone coprescribing is low. Therefore, we set out to determine whether a targeted clinical decision support alert could increase the rate of naloxone coprescribing. METHODS: We conducted a before-after study from January 2019 to April 2021 at a large academic health system in the Southeast. We developed a targeted point of care decision support notification in the electronic health record to suggest ordering naloxone for patients who have a high risk of opioid overdose based on a high morphine equivalent daily dose (MEDD) ≥90 mg, concomitant benzodiazepine prescription, or a history of opioid use disorder or opioid overdose. We measured the rate of outpatient naloxone prescribing as our primary measure. A multivariable logistic regression model with robust variance to adjust for prescriptions within the same prescriber was implemented to estimate the association between alerts and naloxone coprescribing. RESULTS: The baseline naloxone coprescribing rate in 2019 was 0.28 (95% confidence interval [CI], 0.24-0.31) naloxone prescriptions per 100 opioid prescriptions. After alert implementation, the naloxone coprescribing rate increased to 4.51 (95% CI, 4.33-4.68) naloxone prescriptions per 100 opioid prescriptions (P < .001). The adjusted odds of naloxone coprescribing after alert implementation were approximately 28 times those during the baseline period (95% CI, 15-52). CONCLUSIONS: A targeted decision support alert for patients at risk for opioid overdose significantly increased the rate of naloxone coprescribing and was relatively easy to build.

Subject(s)

Drug Overdose; Opiate Overdose; Opioid-Related Disorders; Analgesics, Opioid/adverse effects; Drug Overdose/diagnosis; Humans; Naloxone/adverse effects; Narcotic Antagonists/adverse effects; Opioid-Related Disorders/complications; Opioid-Related Disorders/diagnosis; Opioid-Related Disorders/epidemiology; Quality Improvement

6.

Recommendations for the Conduct and Reporting of Research Involving Flexible Electronic Health Record-Based Interventions.

Wright, Adam; McCoy, Allison B; Choudhry, Niteesh K.

Ann Intern Med ; 172(11 Suppl): S110-S115, 2020 Jun 02.

Article in English | MEDLINE | ID: mdl-32479179

ABSTRACT

In the past 2 decades, the United States has seen widespread adoption of electronic health records (EHRs) and a transition from mostly locally developed EHRs to commercial systems. However, most research on quality improvement and safety interventions in EHRs is still conducted at a single site, in a single EHR. Although single-site studies are important early in the innovation lifecycle, multisite studies of EHR interventions are critical for generalizability. Because EHR software, configuration, and local context differ considerably across health care organizations, it can be difficult to implement a single, standardized intervention across multiple sites in a study. This article outlines key strengths, weaknesses, challenges, and opportunities for standardization of EHR interventions in multisite studies and describes flexible trial designs suitable for studying complex interventions, including EHR interventions. It also outlines key considerations for reporting on flexible trials of EHR interventions, including sharing details of the process for designing interventions and their content, details of outcomes being studied and approaches for pooling, and the importance of sharing code and configuration whenever possible.

Subject(s)

Biomedical Research/standards; Electronic Health Records/organization & administration; Guidelines as Topic; Quality Improvement; Humans

7.

Contribution of Free-Text Comments to the Burden of Documentation: Assessment and Analysis of Vital Sign Comments in Flowsheets.

Yin, Zhijun; Liu, Yongtai; McCoy, Allison B; Malin, Bradley A; Sengstack, Patricia R.

J Med Internet Res ; 23(3): e22806, 2021 Mar 04.

Article in English | MEDLINE | ID: mdl-33661128

ABSTRACT

BACKGROUND: Documentation burden is a common problem with modern electronic health record (EHR) systems. To reduce this burden, various recording methods (eg, voice recorders or motion sensors) have been proposed. However, these solutions are in an early prototype phase and are unlikely to transition into practice in the near future. A more pragmatic alternative is to directly modify the implementation of the existing functionalities of an EHR system. OBJECTIVE: This study aims to assess the nature of free-text comments entered into EHR flowsheets that supplement quantitative vital sign values and examine opportunities to simplify functionality and reduce documentation burden. METHODS: We evaluated 209,055 vital sign comments in flowsheets that were generated in the Epic EHR system at the Vanderbilt University Medical Center in 2018. We applied topic modeling, as well as the natural language processing Clinical Language Annotation, Modeling, and Processing software system, to extract generally discussed topics and detailed medical terms (expressed as probability distribution) to investigate the stories communicated in these comments. RESULTS: Our analysis showed that 63.33% (6053/9557) of the users who entered vital signs made at least one free-text comment in vital sign flowsheet entries. The user roles that were most likely to compose comments were registered nurse, technician, and licensed nurse. The most frequently identified topics were the notification of a result to health care providers (0.347), the context of a measurement (0.307), and an inability to obtain a vital sign (0.224). There were 4187 unique medical terms that were extracted from 46,029 (0.220) comments, including many symptom-related terms such as "pain," "upset," "dizziness," "coughing," "anxiety," "distress," and "fever" and drug-related terms such as "tylenol," "anesthesia," "cannula," "oxygen," "motrin," "rituxan," and "labetalol." 
CONCLUSIONS: Considering that flowsheet comments are generally not displayed or automatically pulled into any clinical notes, our findings suggest that the flowsheet comment functionality can be simplified (eg, via structured response fields instead of a text input dialog) to reduce health care provider effort. Moreover, rich and clinically important medical terms such as medications and symptoms should be explicitly recorded in clinical notes for better visibility.

Subject(s)

Documentation; Electronic Health Records; Academic Medical Centers; Humans; Natural Language Processing; Vital Signs

8.

Improved National Outcomes Achieved in a Cardiac Learning Health Collaborative Based on Early Performance Level.

Hill, Garick D; Bingler, Michael; McCoy, Allison B; Oster, Matthew E; Uzark, Karen; Bates, Katherine E.

J Pediatr ; 222: 186-192.e1, 2020 Jul.

Article in English | MEDLINE | ID: mdl-32417078

ABSTRACT

OBJECTIVE: Within the National Pediatric Cardiology Quality Improvement Collaborative (NPC-QIC), a learning health network developed to improve outcomes for patients with hypoplastic left heart syndrome and variants, we assessed which centers contributed to reductions in mortality and growth failure. STUDY DESIGN: Centers within the NPC-QIC were divided into tertiles based on early performance for mortality and separately for growth failure. These groups were evaluated for improvement from the early to late time period and compared with the other groups in the late time period. RESULTS: Mortality was 3.8% for the high-performing, 7.6% for the medium-performing, and 14.4% for the low-performing groups in the early time period. Only the low-performing group had a significant change (P < .001) from the early to late period. In the late period, there was no difference in mortality between the high- (5.7%), medium- (7%), and low- (4.6%) performing centers (P = .5). Growth failure occurred in 13.9% for the high-performing, 21.9% for the medium-performing, and 32.8% for the low-performing groups in the early time period. Only the low-performing group had a significant change (P < .001) over time. In the late period, there was no significant difference in growth failure between the high- (19.8%), medium- (21.5%), and low- (13.5%) performing groups (P = .054). CONCLUSIONS: Improvements in the NPC-QIC mortality and growth measures are primarily driven by improvement in those performing the worst in these areas initially without compromising the success of high-performing centers. Focus for improvement may vary by center based on performance.

Subject(s)

Health Education; Hypoplastic Left Heart Syndrome/surgery; Norwood Procedures/methods; Palliative Care/standards; Quality Improvement; Registries; Female; Humans; Hypoplastic Left Heart Syndrome/mortality; Infant; Male; Retrospective Studies

9.

The use of sequential pattern mining to predict next prescribed medications.

Wright, Aileen P; Wright, Adam T; McCoy, Allison B; Sittig, Dean F.

J Biomed Inform ; 53: 73-80, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25236952

ABSTRACT

BACKGROUND: Therapy for certain medical conditions occurs in a stepwise fashion, where one medication is recommended as initial therapy and other medications follow. Sequential pattern mining is a data mining technique used to identify patterns of ordered events. OBJECTIVE: To determine whether sequential pattern mining is effective for identifying temporal relationships between medications and accurately predicting the next medication likely to be prescribed for a patient. DESIGN: We obtained claims data from Blue Cross Blue Shield of Texas for patients prescribed at least one diabetes medication between 2008 and 2011, and divided these into a training set (90% of patients) and test set (10% of patients). We applied the CSPADE algorithm to mine sequential patterns of diabetes medication prescriptions both at the drug class and generic drug level and ranked them by the support statistic. We then evaluated the accuracy of predictions made for which diabetes medication a patient was likely to be prescribed next. RESULTS: We identified 161,497 patients who had been prescribed at least one diabetes medication. We were able to mine stepwise patterns of pharmacological therapy that were consistent with guidelines. Within three attempts, we were able to predict the medication prescribed for 90.0% of patients when making predictions by drug class, and for 64.1% when making predictions at the generic drug level. These results were stable under 10-fold cross-validation, ranging from 89.1% to 90.5% at the drug class level and from 63.5% to 64.9% at the generic drug level. Using 1 or 2 items in the patient's medication history led to more accurate predictions than not using any history, but using the entire history was sometimes worse. CONCLUSION: Sequential pattern mining is an effective technique to identify temporal relationships between medications and can be used to predict next steps in a patient's medication regimen. Accurate predictions can be made without using the patient's entire medication history.
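
The idea of predicting the next medication "within three attempts" from a short history can be illustrated with a deliberately simplified sketch. This is not the CSPADE algorithm used in the study: it only counts how often one drug class directly follows another and ranks candidates by that count, a rough stand-in for ranking by support. The function names and the toy prescription histories are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict


def mine_transitions(histories):
    """Count how often each item is immediately followed by another item."""
    counts = defaultdict(Counter)
    for sequence in histories:
        for previous, following in zip(sequence, sequence[1:]):
            counts[previous][following] += 1
    return counts


def predict_next(counts, history, k=3):
    """Return up to k candidates, using only the last item of the history."""
    if not history:
        return []
    ranked = counts.get(history[-1], Counter()).most_common(k)
    return [item for item, _ in ranked]


def top_k_accuracy(counts, test_cases, k=3):
    """Fraction of (history, actual_next) pairs where actual_next is in the top k."""
    hits = sum(actual in predict_next(counts, history, k)
               for history, actual in test_cases)
    return hits / len(test_cases)


# Toy training histories (hypothetical, drug-class level).
training = [
    ["metformin", "sulfonylurea", "insulin"],
    ["metformin", "sulfonylurea"],
    ["metformin", "insulin"],
    ["metformin", "sulfonylurea", "insulin"],
]
transitions = mine_transitions(training)

print(predict_next(transitions, ["metformin"], k=1))
print(predict_next(transitions, ["metformin"], k=3))
```

Using only the last item of the history mirrors the abstract's observation that one or two history items can be enough; a fuller implementation would mine longer ordered patterns rather than adjacent pairs.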

Subject(s)

Drug Prescriptions/statistics & numerical data; Drug Therapy/methods; Insurance, Health/statistics & numerical data; Pattern Recognition, Automated; Algorithms; Data Mining; Decision Support Systems, Clinical; Diabetes Mellitus/drug therapy; Disease Progression; Humans; Programming Languages; Reproducibility of Results; Sulfonylurea Compounds/therapeutic use; Texas

10.

Development of a clinician reputation metric to identify appropriate problem-medication pairs in a crowdsourced knowledge base.

McCoy, Allison B; Wright, Adam; Rogith, Deevakar; Fathiamini, Safa; Ottenbacher, Allison J; Sittig, Dean F.

J Biomed Inform ; 48: 66-72, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24321170

ABSTRACT

BACKGROUND: Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are not robust and depend on problems and medications being encoded in particular terminologies. Crowdsourcing represents one approach to generating robust knowledge bases across a variety of terminologies, but more sophisticated approaches are necessary to improve accuracy and reduce manual data review requirements. OBJECTIVE: We sought to develop and evaluate a clinician reputation metric to facilitate the identification of appropriate problem-medication pairs through crowdsourcing without requiring extensive manual review. APPROACH: We retrieved medications from our clinical data warehouse that had been prescribed and manually linked to one or more problems by clinicians during e-prescribing between June 1, 2010 and May 31, 2011. We identified measures likely to be associated with the percentage of accurate problem-medication links made by clinicians. Using logistic regression, we created a metric for identifying clinicians who had made greater than or equal to 95% appropriate links. We evaluated the accuracy of the approach by comparing links made by those physicians identified as having appropriate links to a previously manually validated subset of problem-medication pairs. RESULTS: Of 867 clinicians who asserted a total of 237,748 problem-medication links during the study period, 125 had a reputation metric that predicted the percentage of appropriate links greater than or equal to 95%. These clinicians asserted a total of 2464 linked problem-medication pairs (983 distinct pairs). Compared to a previously validated set of problem-medication pairs, the reputation metric achieved a specificity of 99.5% and marginally improved the sensitivity of previously described knowledge bases. CONCLUSION: A reputation metric may be a valuable measure for identifying high quality clinician-entered, crowdsourced data.

Subject(s)

Electronic Health Records; Knowledge Bases; Medical Informatics/methods; Medical Records Systems, Computerized; Crowdsourcing; Humans; Internet; Logistic Models; Pharmaceutical Preparations; Physicians; Reproducibility of Results; Software; User-Computer Interface

11.

Accuracy and Bias in Artificial Intelligence Chatbot Recommendations for Oculoplastic Surgeons.

Parikh, Alomi O; Oca, Michael C; Conger, Jordan R; McCoy, Allison; Chang, Jessica; Zhang-Nunes, Sandy.

Cureus ; 16(4): e57611, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38707042

ABSTRACT

PURPOSE: The purpose of this study is to assess the accuracy of and bias in recommendations for oculoplastic surgeons from three artificial intelligence (AI) chatbot systems. METHODS: ChatGPT, Microsoft Bing Balanced, and Google Bard were asked for recommendations for oculoplastic surgeons practicing in the 20 cities with the highest population in the United States. Three prompts were used: "can you help me find (an oculoplastic surgeon)/(a doctor who does eyelid lifts)/(an oculofacial plastic surgeon) in (city)." RESULTS: A total of 672 suggestions were made across the three prompts (oculoplastic surgeon; doctor who does eyelid lifts; oculofacial plastic surgeon); 19.8% of suggestions were excluded, leaving 539 suggested physicians. Of these, 64.1% were oculoplastics specialists (of which 70.1% were American Society of Ophthalmic Plastic and Reconstructive Surgery (ASOPRS) members); 16.1% were general plastic surgery trained, 9.0% were ENT trained, 8.8% were ophthalmology but not oculoplastics trained, and 1.9% were trained in another specialty. 27.7% of recommendations across all AI systems were female. CONCLUSIONS: Among the chatbot systems tested, there were high rates of inaccuracy: up to 38% of recommended surgeons were nonexistent or not practicing in the city requested, and 35.9% of those recommended as oculoplastic/oculofacial plastic surgeons were not oculoplastics specialists. Choice of prompt affected the result, with requests for "a doctor who does eyelid lifts" resulting in more plastic surgeons and ENTs and fewer oculoplastic surgeons. It is important to identify inaccuracies and biases in recommendations provided by AI systems as more patients may start using them to choose a surgeon.

12.

Development and Validation of an Automated, Real-Time Predictive Model for Postpartum Hemorrhage.

Ende, Holly B; Domenico, Henry J; Polic, Aleksandra; Wesoloski, Amber; Zuckerwise, Lisa C; McCoy, Allison B; Woytash, Annastacia R; Moore, Ryan P; Byrne, Daniel W.

Obstet Gynecol ; 144(1): 109-117, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38723260

ABSTRACT

OBJECTIVE: To develop and validate a predictive model for postpartum hemorrhage that can be deployed in clinical care using automated, real-time electronic health record (EHR) data and to compare performance of the model with a nationally published risk prediction tool. METHODS: A multivariable logistic regression model was developed from retrospective EHR data from 21,108 patients delivering at a quaternary medical center between January 1, 2018, and April 30, 2022. Deliveries were divided into derivation and validation sets based on an 80/20 split by date of delivery. Postpartum hemorrhage was defined as blood loss of 1,000 mL or more in addition to postpartum transfusion of 1 or more units of packed red blood cells. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC) and was compared with a postpartum hemorrhage risk assessment tool published by the CMQCC (California Maternal Quality Care Collaborative). The model was then programmed into the EHR and again validated with prospectively collected data from 928 patients between November 7, 2023, and January 31, 2024. RESULTS: Postpartum hemorrhage occurred in 235 of 16,862 patients (1.4%) in the derivation cohort. The predictive model included 21 risk factors and demonstrated an AUC of 0.81 (95% CI, 0.79-0.84) and calibration slope of 1.0 (Brier score 0.013). During external temporal validation, the model maintained discrimination (AUC 0.80, 95% CI, 0.72-0.84) and calibration (calibration slope 0.95, Brier score 0.014). This was superior to the CMQCC tool (AUC 0.69 [95% CI, 0.67-0.70], P <.001). The model maintained performance in prospective, automated data collected with the predictive model in real time (AUC 0.82 [95% CI, 0.73-0.91]). 
CONCLUSION: We created and temporally validated a postpartum hemorrhage prediction model, demonstrated its superior performance over a commonly used risk prediction tool, successfully coded the model into the EHR, and prospectively validated the model using risk factor data collected in real time. Future work should evaluate external generalizability and effects on patient outcomes; to facilitate this work, we have included the model coefficients and examples of EHR integration in the article.

Subject(s)

Electronic Health Records; Postpartum Hemorrhage; Humans; Female; Postpartum Hemorrhage/therapy; Pregnancy; Adult; Retrospective Studies; Risk Assessment/methods; Risk Factors; Logistic Models; ROC Curve

13.

Leveraging large language models for generating responses to patient messages-a subjective analysis.

Liu, Siru; McCoy, Allison B; Wright, Aileen P; Carew, Babatunde; Genkins, Julian Z; Huang, Sean S; Peterson, Josh F; Steitz, Bryan; Wright, Adam.

J Am Med Inform Assoc ; 31(6): 1367-1379, 2024 May 20.

Article in English | MEDLINE | ID: mdl-38497958

ABSTRACT

OBJECTIVE: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. MATERIALS AND METHODS: Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to update physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. By combining with this dataset, we further fine-tuned our model (CLAIR-Long). To evaluate fine-tuned models, we used 10 representative patient portal questions in primary care to generate responses. We asked primary care physicians to review generated responses from our models and ChatGPT and rated them for empathy, responsiveness, accuracy, and usefulness. RESULTS: The dataset consisted of 499 794 pairs of patient messages and corresponding responses from the patient portal, with 5000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to provider's responses. CLAIR-Long responses provided increased patient educational content compared to CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, while receiving a neutral rating for usefulness. CONCLUSION: This subjective analysis suggests that leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and healthcare providers.

Subject(s)

Patient Portals , Humans , Electronic Health Records , Physician-Patient Relations , Natural Language Processing , Empathy , Datasets as Topic

14.

Surgically-relevant quality of life thresholds for the Short Inflammatory Bowel Disease Questionnaire in Crohn's disease.

Ueland, Thomas E; Horst, Sara N; Shroder, Megan M; Ye, Fei; Bai, Kun; McCoy, Allison B; Bachmann, Justin M; Hawkins, Alexander T.

J Gastrointest Surg ; 28(8): 1265-1272, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38815800

ABSTRACT

BACKGROUND: Despite growing interest in patient-reported outcome measures to track the progression of Crohn's disease, frameworks to apply these questionnaires in the preoperative setting are lacking. Using the Short Inflammatory Bowel Disease Questionnaire (sIBDQ), this study aimed to describe the interpretable quality of life thresholds and examine potential associations with future bowel resection in Crohn's disease. METHODS: Adult patients with Crohn's disease completing an sIBDQ at a clinic visit between 2020 and 2022 were eligible. A stoplight framework was adopted for sIBDQ scores, including a "Resection Red" zone suggesting poor quality of life that may benefit from discussions about surgery as well as a "Nonoperative Green" zone. Thresholds were identified with both anchor- and distribution-based methods using receiver operating characteristic curve analysis and subgroup percentile scores, respectively. To quantify associations between sIBDQ scores and subsequent bowel resection, multivariable logistic regression models were fit with covariates of age, sex assigned at birth, body mass index, medications, disease pattern and location, resection history, and the Harvey Bradshaw Index. The incremental discriminatory value of the sIBDQ beyond clinical factors was assessed through the area under the receiver operating characteristics curve (AUC) with an internal validation through bootstrap resampling. RESULTS: Of the 2003 included patients, 102 underwent Crohn's-related bowel resection. The sIBDQ Nonoperative Green zone threshold ranged from 61 to 64 and the Resection Red zone from 36 to 38. When adjusting for clinical covariates, a worse sIBDQ score was associated with greater odds of subsequent 90-day bowel resection when considered as a 1-point (odds ratio [OR] [95% CI], 1.05 [1.03-1.07]) or 5-point change (OR [95% CI], 1.27 [1.14-1.41]). Inclusion of the sIBDQ modestly improved discriminative performance (AUC [95% CI], 0.85 [0.85-0.86]) relative to models that included only demographics (0.57 [0.57-0.58]) or demographics with clinical covariates (0.83 [0.83-0.84]). CONCLUSION: In the decision-making process for bowel resection, disease-specific patient-reported outcome measures may be useful to identify patients with Crohn's disease with poor quality of life and promote a shared understanding of personalized burden.

Subject(s)

Crohn Disease , Patient Reported Outcome Measures , Quality of Life , Humans , Crohn Disease/surgery , Crohn Disease/psychology , Male , Female , Adult , Surveys and Questionnaires , Middle Aged , ROC Curve , Young Adult

15.

Fellows of the American Medical Informatics Association (FAMIA): Looking Back and Looking Ahead.

Heermann Langford, Laura; Fultz Hollis, Kate; Edmunds, Margo; McCoy, Allison B; Hall, Eric S; Nielson, Jeffrey A; Rosetti, Sarah Collins.

Appl Clin Inform ; 15(4): 650-659, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39111297

ABSTRACT

BACKGROUND: Over the past 30 years, the American Medical Informatics Association (AMIA) has played a pivotal role in fostering a collaborative community for professionals in biomedical and health informatics. As an interdisciplinary association, AMIA brings together individuals with clinical, research, and computer expertise and emphasizes the use of data to enhance biomedical research and clinical work. The need for a recognition program within AMIA, acknowledging applied informatics skills by members, led to the establishment of the Fellows of AMIA (FAMIA) Recognition Program in 2018. OBJECTIVES: To outline the evolution of the FAMIA program and shed light on its origins, development, and impact. This report explores factors that led to the establishment of FAMIA, considerations affecting its development, and the objectives FAMIA seeks to achieve within the broader context of AMIA. METHODS: The development of FAMIA is examined through a historical lens, encompassing key milestones, discussions, and decisions that shaped the program. Insights into the formation of FAMIA were gathered through discussions within AMIA membership and leadership, including proposals, board-level discussions, and the involvement of key stakeholders. Additionally, the report outlines criteria for FAMIA eligibility and the pathways available for recognition, namely the Certification Pathway and the Long-Term Experience Pathway. RESULTS: The FAMIA program has inducted five classes, totaling 602 fellows. An overview of disciplines, roles, and application pathways for FAMIA members is provided. A comparative analysis with other fellow recognition programs in related fields showcases the unique features and contributions of FAMIA in acknowledging applied informatics. CONCLUSION: Now in its sixth year, FAMIA acknowledges the growing influence of applied informatics within health information professionals, recognizing individuals with experience, training, and a commitment to the highest level of applied informatics and the science associated with it.

Subject(s)

Medical Informatics , United States , Fellowships and Scholarships , Societies, Medical , Humans , History, 21st Century

16.

Identifying antinuclear antibody positive individuals at risk for developing systemic autoimmune disease: development and validation of a real-time risk model.

Barnado, April; Moore, Ryan P; Domenico, Henry J; Green, Sarah; Camai, Alex; Suh, Ashley; Han, Bryan; Walker, Katherine; Anderson, Audrey; Caruth, Lannawill; Katta, Anish; McCoy, Allison B; Byrne, Daniel W.

Front Immunol ; 15: 1384229, 2024.

Article in English | MEDLINE | ID: mdl-38571954

ABSTRACT

Objective: Positive antinuclear antibodies (ANAs) cause diagnostic dilemmas for clinicians. Currently, no tools exist to help clinicians interpret the significance of a positive ANA in individuals without diagnosed autoimmune diseases. We developed and validated a risk model to predict risk of developing autoimmune disease in positive ANA individuals. Methods: Using a de-identified electronic health record (EHR), we randomly chart reviewed 2,000 positive ANA individuals to determine if a systemic autoimmune disease was diagnosed by a rheumatologist. A priori, we considered demographics, billing codes for autoimmune disease-related symptoms, and laboratory values as variables for the risk model. We performed logistic regression and machine learning models using training and validation samples. Results: We assembled training (n = 1030) and validation (n = 449) sets. Positive ANA individuals who were younger, female, had a higher titer ANA, higher platelet count, disease-specific autoantibodies, and more billing codes related to symptoms of autoimmune diseases were all more likely to develop autoimmune diseases. The most important variables included having a disease-specific autoantibody, number of billing codes for autoimmune disease-related symptoms, and platelet count. In the logistic regression model, AUC was 0.83 (95% CI 0.79-0.86) in the training set and 0.75 (95% CI 0.68-0.81) in the validation set. Conclusion: We developed and validated a risk model that predicts risk for developing systemic autoimmune diseases and can be deployed easily within the EHR. The model can risk stratify positive ANA individuals to ensure high-risk individuals receive urgent rheumatology referrals while reassuring low-risk individuals and reducing unnecessary referrals.

Subject(s)

Autoimmune Diseases , Rheumatology , Female , Humans , Antibodies, Antinuclear , Autoantibodies , Autoimmune Diseases/diagnosis , Electronic Health Records , Male

17.

Implementable Prediction of Pressure Injuries in Hospitalized Adults: Model Development and Validation.

Reese, Thomas J; Domenico, Henry J; Hernandez, Antonio; Byrne, Daniel W; Moore, Ryan P; Williams, Jessica B; Douthit, Brian J; Russo, Elise; McCoy, Allison B; Ivory, Catherine H; Steitz, Bryan D; Wright, Adam.

JMIR Med Inform ; 12: e51842, 2024 May 08.

Article in English | MEDLINE | ID: mdl-38722209

ABSTRACT

Background: Numerous pressure injury prediction models have been developed using electronic health record data, yet hospital-acquired pressure injuries (HAPIs) are increasing, which demonstrates the critical challenge of implementing these models in routine care. Objective: To help bridge the gap between development and implementation, we sought to create a model that was feasible, broadly applicable, dynamic, actionable, and rigorously validated and then compare its performance to usual care (ie, the Braden scale). Methods: We extracted electronic health record data from 197,991 adult hospital admissions with 51 candidate features. For risk prediction and feature selection, we used logistic regression with a least absolute shrinkage and selection operator (LASSO) approach. To compare the model with usual care, we used the area under the receiver operating curve (AUC), Brier score, slope, intercept, and integrated calibration index. The model was validated using a temporally staggered cohort. Results: A total of 5458 HAPIs were identified between January 2018 and July 2022. We determined 22 features were necessary to achieve a parsimonious and highly accurate model. The top 5 features included tracheostomy, edema, central line, first albumin measure, and age. Our model achieved higher discrimination than the Braden scale (AUC 0.897, 95% CI 0.893-0.901 vs AUC 0.798, 95% CI 0.791-0.803). Conclusions: We developed and validated an accurate prediction model for HAPIs that surpassed the standard-of-care risk assessment and fulfilled necessary elements for implementation. Future work includes a pragmatic randomized trial to assess whether our model improves patient outcomes.

18.

Using large language model to guide patients to create efficient and comprehensive clinical care message.

Liu, Siru; Wright, Aileen P; McCoy, Allison B; Huang, Sean S; Genkins, Julian Z; Peterson, Josh F; Kumah-Crystal, Yaa A; Martinez, William; Carew, Babatunde; Mize, Dara; Steitz, Bryan; Wright, Adam.

J Am Med Inform Assoc ; 31(8): 1665-1670, 2024 Aug 01.

Article in English | MEDLINE | ID: mdl-38917441

ABSTRACT

OBJECTIVE: This study aims to investigate the feasibility of using Large Language Models (LLMs) to engage with patients at the time they are drafting a question to their healthcare providers, and generate pertinent follow-up questions that the patient can answer before sending their message, with the goal of ensuring that their healthcare provider receives all the information they need to safely and accurately answer the patient's question, eliminating back-and-forth messaging, and the associated delays and frustrations. METHODS: We collected a dataset of patient messages sent between January 1, 2022 to March 7, 2023 at Vanderbilt University Medical Center. Two internal medicine physicians identified 7 common scenarios. We used 3 LLMs to generate follow-up questions: (1) Comprehensive LLM Artificial Intelligence Responder (CLAIR): a locally fine-tuned LLM, (2) GPT4 with a simple prompt, and (3) GPT4 with a complex prompt. Five physicians rated them with the actual follow-ups written by healthcare providers on clarity, completeness, conciseness, and utility. RESULTS: For five scenarios, our CLAIR model had the best performance. The GPT4 model received higher scores for utility and completeness but lower scores for clarity and conciseness. CLAIR generated follow-up questions with similar clarity and conciseness as the actual follow-ups written by healthcare providers, with higher utility than healthcare providers and GPT4, and lower completeness than GPT4, but better than healthcare providers. CONCLUSION: LLMs can generate follow-up patient messages designed to clarify a medical question that compares favorably to those generated by healthcare providers.

Subject(s)

Artificial Intelligence , Humans , Physician-Patient Relations , Feasibility Studies , Text Messaging

19.

Why do users override alerts? Utilizing large language model to summarize comments and optimize clinical decision support.

Liu, Siru; McCoy, Allison B; Wright, Aileen P; Nelson, Scott D; Huang, Sean S; Ahmad, Hasan B; Carro, Sabrina E; Franklin, Jacob; Brogan, James; Wright, Adam.

J Am Med Inform Assoc ; 31(6): 1388-1396, 2024 May 20.

Article in English | MEDLINE | ID: mdl-38452289

ABSTRACT

OBJECTIVES: To evaluate the capability of using generative artificial intelligence (AI) in summarizing alert comments and to determine if the AI-generated summary could be used to improve clinical decision support (CDS) alerts. MATERIALS AND METHODS: We extracted user comments to alerts generated from September 1, 2022 to September 1, 2023 at Vanderbilt University Medical Center. For a subset of 8 alerts, comment summaries were generated independently by 2 physicians and then separately by GPT-4. We surveyed 5 CDS experts to rate the human-generated and AI-generated summaries on a scale from 1 (strongly disagree) to 5 (strongly agree) for the 4 metrics: clarity, completeness, accuracy, and usefulness. RESULTS: Five CDS experts participated in the survey. A total of 16 human-generated summaries and 8 AI-generated summaries were assessed. Among the top 8 rated summaries, five were generated by GPT-4. AI-generated summaries demonstrated high levels of clarity, accuracy, and usefulness, similar to the human-generated summaries. Moreover, AI-generated summaries exhibited significantly higher completeness and usefulness compared to the human-generated summaries (AI: 3.4 ± 1.2, human: 2.7 ± 1.2, P = .001). CONCLUSION: End-user comments provide clinicians' immediate feedback to CDS alerts and can serve as a direct and valuable data resource for improving CDS delivery. Traditionally, these comments may not be considered in the CDS review process due to their unstructured nature, large volume, and the presence of redundant or irrelevant content. Our study demonstrates that GPT-4 is capable of distilling these comments into summaries characterized by high clarity, accuracy, and completeness. AI-generated summaries are equivalent and potentially better than human-generated summaries. These AI-generated summaries could provide CDS experts with a novel means of reviewing user comments to rapidly optimize CDS alerts both online and offline.

Subject(s)

Artificial Intelligence , Decision Support Systems, Clinical , Medical Order Entry Systems , Humans , Electronic Health Records , Natural Language Processing

20.

Leveraging explainable artificial intelligence to optimize clinical decision support.

Liu, Siru; McCoy, Allison B; Peterson, Josh F; Lasko, Thomas A; Sittig, Dean F; Nelson, Scott D; Andrews, Jennifer; Patterson, Lorraine; Cobb, Cheryl M; Mulherin, David; Morton, Colleen T; Wright, Adam.

J Am Med Inform Assoc ; 31(4): 968-974, 2024 Apr 03.

Article in English | MEDLINE | ID: mdl-38383050

ABSTRACT

OBJECTIVE: To develop and evaluate a data-driven process to generate suggestions for improving alert criteria using explainable artificial intelligence (XAI) approaches. METHODS: We extracted data on alerts generated from January 1, 2019 to December 31, 2020, at Vanderbilt University Medical Center. We developed machine learning models to predict user responses to alerts. We applied XAI techniques to generate global explanations and local explanations. We evaluated the generated suggestions by comparing with alert's historical change logs and stakeholder interviews. Suggestions that either matched (or partially matched) changes already made to the alert or were considered clinically correct were classified as helpful. RESULTS: The final dataset included 2 991 823 firings with 2689 features. Among the 5 machine learning models, the LightGBM model achieved the highest Area under the ROC Curve: 0.919 [0.918, 0.920]. We identified 96 helpful suggestions. A total of 278 807 firings (9.3%) could have been eliminated. Some of the suggestions also revealed workflow and education issues. CONCLUSION: We developed a data-driven process to generate suggestions for improving alert criteria using XAI techniques. Our approach could identify improvements regarding clinical decision support (CDS) that might be overlooked or delayed in manual reviews. It also unveils a secondary purpose for the XAI: to improve quality by discovering scenarios where CDS alerts are not accepted due to workflow, education, or staffing issues.

Subject(s)

Artificial Intelligence , Decision Support Systems, Clinical , Humans , Machine Learning , Academic Medical Centers , Educational Status
