Results 1 - 20 of 141
1.
medRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38562803

ABSTRACT

Rationale: Early detection of clinical deterioration using early warning scores may improve outcomes. However, most implemented scores were developed using logistic regression, only underwent retrospective internal validation, and were not tested in important patient subgroups. Objectives: To develop a gradient boosted machine model (eCARTv5) for identifying clinical deterioration and then validate it externally, test it prospectively, and evaluate it across patient subgroups. Methods: All adult patients hospitalized on the wards in seven hospitals from 2008-2022 were used to develop eCARTv5, with demographics, vital signs, clinician documentation, and laboratory values utilized to predict intensive care unit transfer or death in the next 24 hours. The model was externally validated retrospectively in 21 hospitals from 2009-2023 and prospectively in 10 hospitals from February to May 2023. eCARTv5 was compared to the Modified Early Warning Score (MEWS) and the National Early Warning Score (NEWS) using the area under the receiver operating characteristic curve (AUROC). Measurements and Main Results: The development cohort included 901,491 admissions, the retrospective validation cohort included 1,769,461 admissions, and the prospective validation cohort included 46,330 admissions. In retrospective validation, eCART had the highest AUROC (0.835; 95% CI 0.834, 0.835), followed by NEWS (0.766; 95% CI 0.766, 0.767) and MEWS (0.704; 95% CI 0.703, 0.704). eCART's performance remained high (AUROC ≥0.80) across a range of patient demographics, clinical conditions, and during prospective validation. Conclusions: We developed eCARTv5, which accurately identifies early clinical deterioration in hospitalized ward patients. Our model performed better than the NEWS and MEWS retrospectively, prospectively, and across a range of subgroups.
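
The AUROC used above to compare scores can be read as the probability that a randomly chosen patient who deteriorated receives a higher score than a randomly chosen patient who did not. The following is a minimal Python sketch of that rank-based (Mann-Whitney) calculation. The function name `auroc` and all toy labels and scores are hypothetical illustrations; they are not data or code from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs where the positive case scores higher;
    # tied pairs count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ICU transfer or death within 24 hours.
labels = [0, 0, 1, 0, 1, 1, 0, 0]
model_scores = [0.05, 0.10, 0.80, 0.20, 0.60, 0.15, 0.30, 0.02]  # risk model output
simple_scores = [1, 2, 5, 3, 3, 2, 4, 0]  # integer point-based score

print(round(auroc(labels, model_scores), 3))
print(round(auroc(labels, simple_scores), 3))
```

The pairwise loop is quadratic in cohort size, so for cohorts as large as those reported here a sort-based implementation from a statistics library would likely be preferable.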

2.
Article in English | MEDLINE | ID: mdl-38679906

ABSTRACT

OBJECTIVES: To compare and externally validate popular deep learning model architectures and data transformation methods for variable-length time series data in 3 clinical tasks (clinical deterioration, severe acute kidney injury [AKI], and suspected infection). MATERIALS AND METHODS: This multicenter retrospective study included admissions at 2 medical centers that spanned 2007-2022. Distinct datasets were created for each clinical task, with 1 site used for training and the other for testing. Three feature engineering methods (normalization, standardization, and piece-wise linear encoding with decision trees [PLE-DTs]) and 3 architectures (long short-term memory/gated recurrent unit [LSTM/GRU], temporal convolutional network, and time-distributed wrapper with convolutional neural network [TDW-CNN]) were compared in each clinical task. Model discrimination was evaluated using the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC). RESULTS: The study comprised 373 825 admissions for training and 256 128 admissions for testing. LSTM/GRU models tied with TDW-CNN models, with both obtaining the highest mean AUPRC in 2 tasks, and LSTM/GRU had the highest mean AUROC across all tasks (deterioration: 0.81, AKI: 0.92, infection: 0.87). PLE-DT with LSTM/GRU achieved the highest AUPRC in all tasks. DISCUSSION: When externally validated in 3 clinical tasks, the LSTM/GRU model architecture with PLE-DT-transformed data demonstrated the highest AUPRC in all tasks. Multiple models achieved similar performance when evaluated using AUROC. CONCLUSION: The LSTM architecture performs as well as or better than some newer architectures, and PLE-DT may enhance the AUPRC in variable-length time series data for predicting clinical outcomes during external validation.

3.
Article in English | MEDLINE | ID: mdl-38587875

ABSTRACT

OBJECTIVE: The timely stratification of trauma injury severity can enhance the quality of trauma care but it requires intense manual annotation from certified trauma coders. The objective of this study is to develop machine learning models for the stratification of trauma injury severity across various body regions using clinical text and structured electronic health records (EHRs) data. MATERIALS AND METHODS: Our study utilized clinical documents and structured EHR variables linked with the trauma registry data to create 2 machine learning models with different approaches to representing text. The first one fuses concept unique identifiers (CUIs) extracted from free text with structured EHR variables, while the second one integrates free text with structured EHR variables. Temporal validation was undertaken to ensure the models' temporal generalizability. Additionally, analyses to assess the variable importance were conducted. RESULTS: Both models demonstrated impressive performance in categorizing leg injuries, achieving high accuracy with macro-F1 scores of over 0.8. Additionally, they showed considerable accuracy, with macro-F1 scores exceeding or near 0.7, in assessing injuries in the areas of the chest and head. We showed in our variable importance analysis that the most important features in the model have strong face validity in determining clinically relevant trauma injuries. DISCUSSION: The CUI-based model achieves comparable performance, if not higher, compared to the free-text-based model, with reduced complexity. Furthermore, integrating structured EHR data improves performance, particularly when the text modalities are insufficiently indicative. CONCLUSIONS: Our multi-modal, multiclass models can provide accurate stratification of trauma injury severity and clinically relevant interpretations.

4.
medRxiv ; 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38562730

ABSTRACT

In the evolving landscape of clinical Natural Language Generation (NLG), assessing abstractive text quality remains challenging, as existing methods often overlook generative task complexities. This work aimed to examine the current state of automated evaluation metrics in NLG in healthcare. To have a robust and well-validated baseline with which to examine the alignment of these metrics, we created a comprehensive human evaluation framework. Employing ChatGPT-3.5-turbo generative output, we correlated human judgments with each metric. None of the metrics demonstrated high alignment; however, the SapBERT score, which draws on the Unified Medical Language System (UMLS), showed the best results. This underscores the importance of incorporating domain-specific knowledge into evaluation efforts. Our work reveals the deficiency in quality evaluations for generated text and introduces our comprehensive human evaluation framework as a baseline. Future efforts should prioritize integrating medical knowledge databases to enhance the alignment of automated metrics, particularly focusing on refining the SapBERT score for improved assessments.

5.
JAMA ; 331(14): 1195-1204, 2024 04 09.
Article in English | MEDLINE | ID: mdl-38501205

ABSTRACT

Importance: Among critically ill adults, randomized trials have not found oxygenation targets to affect outcomes overall. Whether the effects of oxygenation targets differ based on an individual's characteristics is unknown. Objective: To determine whether an individual's characteristics modify the effect of lower vs higher peripheral oxygenation-saturation (Spo2) targets on mortality. Design, Setting, and Participants: A machine learning model to predict the effect of treatment with a lower vs higher Spo2 target on mortality for individual patients was derived in the Pragmatic Investigation of Optimal Oxygen Targets (PILOT) trial and externally validated in the Intensive Care Unit Randomized Trial Comparing Two Approaches to Oxygen Therapy (ICU-ROX) trial. Critically ill adults received invasive mechanical ventilation in an intensive care unit (ICU) in the United States between July 2018 and August 2021 for PILOT (n = 1682) and in 21 ICUs in Australia and New Zealand between September 2015 and May 2018 for ICU-ROX (n = 965). Exposures: Randomization to a lower vs higher Spo2 target group. Main Outcome and Measure: 28-Day mortality. Results: In the ICU-ROX validation cohort, the predicted effect of treatment with a lower vs higher Spo2 target for individual patients ranged from a 27.2% absolute reduction to a 34.4% absolute increase in 28-day mortality. For example, patients predicted to benefit from a lower Spo2 target had a higher prevalence of acute brain injury, whereas patients predicted to benefit from a higher Spo2 target had a higher prevalence of sepsis and abnormally elevated vital signs. Patients predicted to benefit from a lower Spo2 target experienced lower mortality when randomized to the lower Spo2 group, whereas patients predicted to benefit from a higher Spo2 target experienced lower mortality when randomized to the higher Spo2 group (likelihood ratio test for effect modification P = .02). 
The use of a Spo2 target predicted to be best for each patient, instead of the randomized Spo2 target, would have reduced the absolute overall mortality by 6.4% (95% CI, 1.9%-10.9%). Conclusion and relevance: Oxygenation targets that are individualized using machine learning analyses of randomized trials may reduce mortality for critically ill adults. A prospective trial evaluating the use of individualized oxygenation targets is needed.


Subject(s)
Critical Illness, Oxygen, Adult, Humans, Oxygen/therapeutic use, Critical Illness/therapy, Artificial Respiration, Prospective Studies, Oxygen Inhalation Therapy, Intensive Care Units
6.
Crit Care Explor ; 6(3): e1066, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38505174

ABSTRACT

OBJECTIVES: Alcohol withdrawal syndrome (AWS) may progress to require high-intensity care. Approaches to identify hospitalized patients with AWS who received a higher level of care have not been previously examined. This study aimed to examine the utility of Clinical Institute Withdrawal Assessment for Alcohol, Revised (CIWA-Ar) scale scores and medication doses for alcohol withdrawal management in identifying patients who received high-intensity care. DESIGN: A multicenter observational cohort study of hospitalized adults with alcohol withdrawal. SETTING: University of Chicago Medical Center and University of Wisconsin Hospital. PATIENTS: Inpatient encounters between November 2008 and February 2022 with a CIWA-Ar score greater than 0 and benzodiazepine or barbiturate administered within the first 24 hours. The primary composite outcome was progression to high-intensity care (intermediate care or ICU). INTERVENTIONS: None. MAIN RESULTS: Among the 8742 patients included in the study, 37.5% (n = 3280) progressed to high-intensity care. The odds ratio (OR) for the composite outcome increased above 1.0 when the CIWA-Ar score was 24. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) at this threshold were 0.12 (95% CI, 0.11-0.13), 0.95 (95% CI, 0.94-0.95), 0.58 (95% CI, 0.54-0.61), and 0.64 (95% CI, 0.63-0.65), respectively. The OR increased above 1.0 at a 24-hour lorazepam milligram equivalent dose cutoff of 15 mg. The sensitivity, specificity, PPV, and NPV at this threshold were 0.16 (95% CI, 0.14-0.17), 0.96 (95% CI, 0.95-0.96), 0.68 (95% CI, 0.65-0.72), and 0.65 (95% CI, 0.64-0.66), respectively. CONCLUSIONS: Neither CIWA-Ar scores nor medication dose cutoff points were effective measures for identifying patients with alcohol withdrawal who received high-intensity care. Research studies for examining outcomes in patients who deteriorate with AWS will require better methods for cohort identification.
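
The four accuracy measures reported above are linked through the outcome prevalence: given sensitivity, specificity, and prevalence, PPV and NPV follow from Bayes' rule. The short Python sketch below shows that relationship using the rounded values from the abstract (sensitivity 0.12, specificity 0.95, prevalence 37.5%). The function name `predictive_values` is a hypothetical helper, and because the inputs are rounded, the results only approximate the published PPV and NPV.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    # Expected fraction of the cohort in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded inputs taken from the abstract (CIWA-Ar threshold of 24).
ppv, npv = predictive_values(sensitivity=0.12, specificity=0.95, prevalence=0.375)
print(f"PPV ~ {ppv:.2f}, NPV ~ {npv:.2f}")
```

With a sensitivity this low, most patients who progressed to high-intensity care fall below the threshold, which is consistent with the authors' conclusion that the cutoff is not an effective identifier.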

7.
JAMA ; 331(6): 500-509, 2024 02 13.
Article in English | MEDLINE | ID: mdl-38349372

ABSTRACT

Importance: The US heart allocation system prioritizes medically urgent candidates with a high risk of dying without transplant. The current therapy-based 6-status system is susceptible to manipulation and has limited rank ordering ability. Objective: To develop and validate a candidate risk score that incorporates current clinical, laboratory, and hemodynamic data. Design, Setting, and Participants: A registry-based observational study of adult heart transplant candidates (aged ≥18 years) from the US heart allocation system listed between January 1, 2019, and December 31, 2022, split by center into training (70%) and test (30%) datasets. Main Outcomes and Measures: A US candidate risk score (US-CRS) model was developed by adding a predefined set of predictors to the current French Candidate Risk Score (French-CRS) model. Sensitivity analyses were performed, which included intra-aortic balloon pumps (IABP) and percutaneous ventricular assist devices (VAD) in the definition of short-term mechanical circulatory support (MCS) for the US-CRS. Performance of the US-CRS model, French-CRS model, and 6-status model in the test dataset was evaluated by time-dependent area under the receiver operating characteristic curve (AUC) for death without transplant within 6 weeks and overall survival concordance (c-index) with integrated AUC. Results: A total of 16 905 adult heart transplant candidates were listed (mean [SD] age, 53 [13] years; 73% male; 58% White); 796 patients (4.7%) died without a transplant. The final US-CRS contained time-varying short-term MCS (ventricular assist-extracorporeal membrane oxygenation or temporary surgical VAD), the log of bilirubin, estimated glomerular filtration rate, the log of B-type natriuretic peptide, albumin, sodium, and durable left ventricular assist device. In the test dataset, the AUC for death within 6 weeks of listing for the US-CRS model was 0.79 (95% CI, 0.75-0.83), for the French-CRS model was 0.72 (95% CI, 0.67-0.76), and for the 6-status model was 0.68 (95% CI, 0.62-0.73). Overall c-index for the US-CRS model was 0.76 (95% CI, 0.73-0.80), for the French-CRS model was 0.69 (95% CI, 0.65-0.73), and for the 6-status model was 0.67 (95% CI, 0.63-0.71). Classifying IABP and percutaneous VAD as short-term MCS reduced the effect size by 54%. Conclusions and Relevance: In this registry-based study of US heart transplant candidates, a continuous multivariable allocation score outperformed the 6-status system in rank ordering heart transplant candidates by medical urgency and may be useful for the medical urgency component of heart allocation.


Subject(s)
Heart Failure, Heart Transplantation, Tissue and Organ Procurement, Adult, Female, Humans, Male, Middle Aged, Bilirubin, Clinical Laboratory Services, Heart, Risk Factors, Risk Assessment, Heart Failure/mortality, Heart Failure/surgery, United States, Health Care Rationing/methods, Predictive Value of Tests, Tissue and Organ Procurement/methods, Tissue and Organ Procurement/organization & administration
8.
medRxiv ; 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38370788

ABSTRACT

OBJECTIVE: Timely intervention for clinically deteriorating ward patients requires that care teams accurately diagnose and treat their underlying medical conditions. However, the most common diagnoses leading to deterioration and the relevant therapies provided are poorly characterized. Therefore, we aimed to determine the diagnoses responsible for clinical deterioration, the relevant diagnostic tests ordered, and the treatments administered among high-risk ward patients using manual chart review. DESIGN: Multicenter retrospective observational study. SETTING: Inpatient medical-surgical wards at four health systems from 2006-2020. PATIENTS: Randomly selected patients (1,000 from each health system) with clinical deterioration, defined by reaching the 95th percentile of a validated early warning score, electronic Cardiac Arrest Risk Triage (eCART), were included. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: For each patient, clinical deterioration was confirmed by a trained reviewer or marked as a false alarm if no deterioration occurred. For true deterioration events, the condition causing deterioration, relevant diagnostic tests ordered, and treatments provided were collected. Of the 4,000 included patients, 2,484 (62%) had clinical deterioration confirmed by chart review. Sepsis was the most common cause of deterioration (41%; n=1,021), followed by arrhythmia (19%; n=473), while liver failure had the highest in-hospital mortality (41%). The most common diagnostic tests ordered were complete blood counts (47% of events), followed by chest x-rays (42%), and cultures (40%), while the most common medication orders were antimicrobials (46%), followed by fluid boluses (34%), and antiarrhythmics (19%). CONCLUSIONS: We found that sepsis was the most common cause of deterioration, while liver failure had the highest mortality. Complete blood counts and chest x-rays were the most common diagnostic tests ordered, and antimicrobials and fluid boluses were the most common medication interventions. These results provide important insights for clinical decision-making at the bedside, training of rapid response teams, and the development of institutional treatment pathways for clinical deterioration. KEY POINTS: Question: What are the most common diagnoses, diagnostic test orders, and treatments for ward patients experiencing clinical deterioration? Findings: In manual chart review of 2,484 encounters with deterioration across four health systems, we found that sepsis was the most common cause of clinical deterioration, followed by arrhythmias, while liver failure had the highest mortality. Complete blood counts and chest x-rays were the most common diagnostic test orders, while antimicrobials and fluid boluses were the most common treatments. Meaning: Our results provide new insights into clinical deterioration events, which can inform institutional treatment pathways, rapid response team training, and patient care.

9.
Resusc Plus ; 17: 100540, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38260119

ABSTRACT

Background and Objective: The Children's Early Warning Tool (CEWT), developed in Australia, is widely used in many countries to monitor the risk of deterioration in hospitalized children. Our objective was to compare CEWT prediction performance against a version of the Bedside Pediatric Early Warning Score (Bedside PEWS), Between the Flags (BTF), and the pediatric Calculated Assessment of Risk and Triage (pCART). Methods: We conducted a retrospective observational study of all patient admissions to the Comer Children's Hospital at the University of Chicago between 2009-2019. We compared performance for predicting the primary outcome of a direct ward-to-intensive care unit (ICU) transfer within the next 12 h using the area under the receiver operating characteristic curve (AUC). Alert rates at various score thresholds were also compared. Results: Of 50,815 ward admissions, 1,874 (3.7%) experienced the primary outcome. Among patients in Cohort 1 (years 2009-2017, on which the machine learning-based pCART was trained), CEWT performed slightly worse than Bedside PEWS but better than BTF (CEWT AUC 0.74 vs. Bedside PEWS 0.76, P < 0.001; vs. BTF 0.66, P < 0.001), while pCART performed best for patients in Cohort 2 (years 2018-2019, pCART AUC 0.84 vs. CEWT AUC 0.79, P < 0.001; vs. BTF AUC 0.67, P < 0.001; vs. Bedside PEWS 0.80, P < 0.001). Sensitivity, specificity, and positive predictive values varied across all four tools at the examined thresholds for alerts. Conclusion: CEWT has good discrimination for predicting which patients will likely be transferred to the ICU, while pCART performed the best.

10.
Chest ; 165(3): 529-539, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37748574

ABSTRACT

BACKGROUND: Trajectories of bedside vital signs have been used to identify sepsis subphenotypes with distinct outcomes and treatment responses. The objective of this study was to validate the vitals trajectory model in a multicenter cohort of patients hospitalized with COVID-19 and to evaluate the clinical characteristics and outcomes of the resulting subphenotypes. RESEARCH QUESTION: Can the trajectory of routine bedside vital signs identify COVID-19 subphenotypes with distinct clinical characteristics and outcomes? STUDY DESIGN AND METHODS: The study included adult patients admitted with COVID-19 to four academic hospitals in the Emory Healthcare system between March 1, 2020, and May 31, 2022. Using a validated group-based trajectory model, we classified patients into previously defined vital sign trajectories using oral temperature, heart rate, respiratory rate, and systolic and diastolic BP measured in the first 8 h of hospitalization. Clinical characteristics, biomarkers, and outcomes were compared between subphenotypes. Heterogeneity of treatment effect to tocilizumab was evaluated. RESULTS: The 7,065 patients hospitalized with COVID-19 were classified into four subphenotypes: group A (n = 1,429, 20%): high temperature, heart rate, and respiratory rate, with hypotension; group B (n = 1,454, 21%): high temperature, heart rate, and respiratory rate, with hypertension; group C (n = 2,996, 42%): low temperature, heart rate, and respiratory rate, with normotension; and group D (n = 1,186, 17%): low temperature, heart rate, and respiratory rate, with hypotension. Groups A and D had higher ORs of mechanical ventilation, vasopressors, and 30-day inpatient mortality (P < .001). On comparing patients receiving tocilizumab (n = 55) with those who met criteria for tocilizumab but were admitted before its use (n = 461), there was significant heterogeneity of treatment effect across subphenotypes in the association of tocilizumab with 30-day mortality (P = .001). INTERPRETATION: By using bedside vital signs available in even low-resource settings, we found novel subphenotypes associated with distinct manifestations of COVID-19, which could lead to preemptive and targeted treatments.


Subject(s)
COVID-19, Adult, Humans, COVID-19/diagnosis, COVID-19/therapy, Biomarkers, Artificial Respiration, Heart Rate, Vital Signs
11.
Ann Surg Oncol ; 31(1): 488-498, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37782415

ABSTRACT

BACKGROUND: While lower socioeconomic status has been shown to correlate with worse outcomes in cancer care, data correlating neighborhood-level metrics with outcomes are scarce. We aim to explore the association between neighborhood disadvantage and both short- and long-term postoperative outcomes in patients undergoing pancreatectomy for pancreatic ductal adenocarcinoma (PDAC). PATIENTS AND METHODS: We retrospectively analyzed 243 patients who underwent resection for PDAC at a single institution between 1 January 2010 and 15 September 2021. To measure neighborhood disadvantage, the cohort was divided into tertiles by Area Deprivation Index (ADI). Short-term outcomes of interest were minor complications, major complications, unplanned readmission within 30 days, prolonged hospitalization, and delayed gastric emptying (DGE). The long-term outcome of interest was overall survival. Logistic regression was used to test short-term outcomes; Cox proportional hazards models and Kaplan-Meier method were used for long-term outcomes. RESULTS: The median ADI of the cohort was 49 (IQR 32-64.5). On adjusted analysis, the high-ADI group demonstrated greater odds of suffering a major complication (odds ratio [OR], 2.78; 95% confidence interval [CI], 1.26-6.40; p = 0.01) and of an unplanned readmission (OR, 3.09; 95% CI, 1.16-9.28; p = 0.03) compared with the low-ADI group. There were no significant differences between groups in the odds of minor complications, prolonged hospitalization, or DGE (all p > 0.05). High ADI did not confer an increased hazard of death (p = 0.63). CONCLUSIONS: We found that worse neighborhood disadvantage is associated with a higher risk of major complication and unplanned readmission after pancreatectomy for PDAC.


Subject(s)
Pancreatic Ductal Carcinoma, Pancreatic Neoplasms, Humans, Pancreatectomy/adverse effects, Pancreatectomy/methods, Retrospective Studies, Pancreatic Neoplasms/pathology, Pancreatic Ductal Carcinoma/pathology, Neighborhood Characteristics
12.
JAMIA Open ; 6(4): ooad109, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38144168

ABSTRACT

Objectives: To develop and externally validate machine learning models using structured and unstructured electronic health record data to predict postoperative acute kidney injury (AKI) across inpatient settings. Materials and Methods: Data for adult postoperative admissions to the Loyola University Medical Center (2009-2017) were used for model development and admissions to the University of Wisconsin-Madison (2009-2020) were used for validation. Structured features included demographics, vital signs, laboratory results, and nurse-documented scores. Unstructured text from clinical notes was converted into concept unique identifiers (CUIs) using the clinical Text Analysis and Knowledge Extraction System. The primary outcome was the development of Kidney Disease: Improving Global Outcomes stage 2 AKI within 7 days after leaving the operating room. We derived unimodal extreme gradient boosting machines (XGBoost) and elastic net logistic regression (GLMNET) models using structured-only data and multimodal models combining structured data with CUI features. Model comparison was performed using the area under the receiver operating characteristic curve (AUROC), with DeLong's test for statistical differences. Results: The study cohort included 138 389 adult patient admissions (mean [SD] age 58 [16] years; 11 506 [8%] African-American; and 70 826 [51%] female) across the 2 sites. Of those, 2959 (2.1%) developed stage 2 AKI or higher. Across all data types, XGBoost outperformed GLMNET (mean AUROC 0.81 [95% confidence interval (CI), 0.80-0.82] vs 0.78 [95% CI, 0.77-0.79]). The multimodal XGBoost model incorporating CUIs parameterized as term frequency-inverse document frequency (TF-IDF) showed the highest discrimination performance (AUROC 0.82 [95% CI, 0.81-0.83]) over unimodal models (AUROC 0.79 [95% CI, 0.78-0.80]). Discussion: A multimodality approach with structured data and TF-IDF weighting of CUIs increased model performance over structured data-only models. Conclusion: These findings highlight the predictive power of CUIs when merged with structured data for clinical prediction models, which may improve the detection of postoperative AKI.
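
To make the TF-IDF parameterization of CUIs mentioned above more concrete, the following Python sketch weights each CUI in a note by its within-note frequency times the log of its inverse document frequency. This is one common textbook variant of TF-IDF and only an assumption about the general idea; the abstract does not state which formulation or software the authors used. The function name `tfidf` and the placeholder tokens (`CUI_A`, `CUI_B`, ...) are hypothetical and do not correspond to real UMLS identifiers.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical notes, already mapped from free text to placeholder CUI tokens.
notes = [
    ["CUI_A", "CUI_B", "CUI_A"],
    ["CUI_A", "CUI_C"],
    ["CUI_A", "CUI_B", "CUI_D", "CUI_D"],
]
weights = tfidf(notes)
for row in weights:
    print({term: round(w, 3) for term, w in row.items()})
```

A CUI that appears in every note (here `CUI_A`) receives a weight of zero, while a CUI concentrated in a single note is weighted up. The resulting per-note weights could then be appended to the structured features as additional model inputs.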

13.
JAMIA Open ; 6(4): ooad092, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37942470

ABSTRACT

Objectives: Substance misuse is a complex and heterogeneous set of conditions associated with high mortality and regional/demographic variations. Existing data systems are siloed and have been ineffective in curtailing the substance misuse epidemic. Therefore, we aimed to build a novel informatics platform, the Substance Misuse Data Commons (SMDC), by integrating multiple data modalities to provide a unified record of information crucial to improving outcomes in substance misuse patients. Materials and Methods: The SMDC was created by linking electronic health record (EHR) data from adult cases of substance (alcohol, opioid, nonopioid drug) misuse at the University of Wisconsin hospitals to socioeconomic and state agency data. To ensure private and secure data exchange, Privacy-Preserving Record Linkage (PPRL) and Honest Broker services were utilized. The overlap in mortality reporting among the EHR, state Vital Statistics, and a commercial national data source was assessed. Results: The SMDC included data from 36 522 patients experiencing 62 594 healthcare encounters. Over half of patients were linked to the statewide ambulance database and prescription drug monitoring program. Chronic diseases accounted for most underlying causes of death, while drug-related overdoses constituted 8%. Our analysis of mortality revealed a 49.1% overlap across the 3 data sources. Nonoverlapping deaths were associated with poor socioeconomic indicators. Discussion: Through PPRL, the SMDC enabled the longitudinal integration of multimodal data. Combining death data from local, state, and national sources enhanced mortality tracking and exposed disparities. Conclusion: The SMDC provides a comprehensive resource for clinical providers and policymakers to inform interventions targeting substance misuse-related hospitalizations, overdoses, and death.

14.
Proc Conf Assoc Comput Linguist Meet ; 2023: 461-467, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37583489

ABSTRACT

The BioNLP Workshop 2023 launched a shared task on Problem List Summarization (ProbSum) in January 2023. The aim of this shared task is to attract future research efforts in building NLP models for real-world diagnostic decision support applications, where a system generating relevant and accurate diagnoses will augment the healthcare providers' decision-making process and improve the quality of care for patients. The goal for participants is to develop models that generate a list of diagnoses and problems using input from the daily care notes collected from the hospitalization of critically ill patients. Eight teams submitted their final systems to the shared task leaderboard. In this paper, we describe the tasks, datasets, evaluation metrics, and baseline systems. Additionally, we summarize the techniques used by the participating teams and the results of their evaluation.

15.
Lancet Respir Med ; 11(11): 965-974, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37633303

ABSTRACT

BACKGROUND: In sepsis and acute respiratory distress syndrome (ARDS), heterogeneity has contributed to difficulty identifying effective pharmacotherapies. In ARDS, two molecular phenotypes (hypoinflammatory and hyperinflammatory) have consistently been identified, with divergent outcomes and treatment responses. In this study, we sought to derive molecular phenotypes in critically ill adults with sepsis, determine their overlap with previous ARDS phenotypes, and evaluate whether they respond differently to treatment in completed sepsis trials. METHODS: We used clinical data and plasma biomarkers from two prospective sepsis cohorts, the Validating Acute Lung Injury biomarkers for Diagnosis (VALID) study (N=1140) and the Early Assessment of Renal and Lung Injury (EARLI) study (N=818), in latent class analysis (LCA) to identify the optimal number of classes in each cohort independently. We used validated models trained to classify ARDS phenotypes to evaluate concordance of sepsis and ARDS phenotypes. We applied these models retrospectively to the previously published Prospective Recombinant Human Activated Protein C Worldwide Evaluation in Severe Sepsis and Septic Shock (PROWESS-SHOCK) trial and Vasopressin and Septic Shock Trial (VASST) to assign phenotypes and evaluate heterogeneity of treatment effect. FINDINGS: A two-class model best fit both VALID and EARLI (p<0·0001). In VALID, 804 (70·5%) of the 1140 patients were classified as hypoinflammatory and 336 (29·5%) as hyperinflammatory; in EARLI, 530 (64·8%) of 818 were hypoinflammatory and 288 (35·2%) hyperinflammatory. We observed higher plasma pro-inflammatory cytokines, more vasopressor use, more bacteraemia, lower protein C, and higher mortality in the hyperinflammatory than in the hypoinflammatory phenotype (p<0·0001 for all). Classifier models indicated strong concordance between sepsis phenotypes and previously identified ARDS phenotypes (area under the curve 0·87-0·96, depending on the model). 
Findings were similar excluding participants with both sepsis and ARDS. In PROWESS-SHOCK, 1142 (68·0%) of 1680 patients had the hypoinflammatory phenotype and 538 (32·0%) had the hyperinflammatory phenotype, and response to activated protein C differed by phenotype (p=0·0043). In VASST, phenotype proportions were similar to other cohorts; however, no treatment interaction with the type of vasopressor was observed (p=0·72). INTERPRETATION: Molecular phenotypes previously identified in ARDS are also identifiable in multiple sepsis cohorts and respond differently to activated protein C. Molecular phenotypes could represent a treatable trait in critical illness beyond the patient's syndromic diagnosis. FUNDING: US National Institutes of Health.


Subject(s)
Respiratory Distress Syndrome , Sepsis , Septic Shock , Adult , Humans , Septic Shock/diagnosis , Septic Shock/drug therapy , Protein C/therapeutic use , Retrospective Studies , Prospective Studies , Sepsis/diagnosis , Sepsis/drug therapy , Sepsis/complications , Phenotype , Biomarkers , Vasoconstrictor Agents/therapeutic use , Randomized Controlled Trials as Topic
16.
Proc Conf Assoc Comput Linguist Meet ; 2023(ClinicalNLP): 78-85, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37492270

ABSTRACT

Generative artificial intelligence (AI) is a promising direction for augmenting clinical diagnostic decision support and reducing diagnostic errors, a leading contributor to medical errors. To further the development of clinical AI systems, the Diagnostic Reasoning Benchmark (DR.BENCH) was introduced as a comprehensive generative AI framework comprising six tasks that represent key components of clinical reasoning. We present a comparative analysis of in-domain versus out-of-domain language models as well as multi-task versus single-task training with a focus on the problem summarization task in DR.BENCH (Gao et al., 2023). We demonstrate that a multi-task, clinically trained language model outperforms its general-domain counterpart by a large margin, establishing a new state-of-the-art performance, with a ROUGE-L score of 28.55. This research underscores the value of domain-specific training for optimizing clinical diagnostic reasoning tasks.
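
As a reading aid for the ROUGE-L figure reported above, the following is a minimal sketch of how a ROUGE-L F-score can be computed from the longest common subsequence (LCS) of tokens. Whitespace tokenization, lowercasing, and the balanced F1 weighting are simplifying assumptions made here, and the function names are hypothetical; the authors' actual evaluation setup is not described in the abstract.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L as a balanced F-score (assumption: beta = 1)."""
    ref = reference.lower().split()    # assumption: whitespace tokens
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Scores from this sketch fall between 0 and 1; published ROUGE-L values such as 28.55 are usually the same quantity multiplied by 100.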

17.
J Surg Res ; 291: 7-16, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37329635

ABSTRACT

INTRODUCTION: Weight gain among young adults continues to increase. Identifying adults at high risk for weight gain and intervening before they gain weight could have a major public health impact. Our objective was to develop and test electronic health record-based machine learning models to predict weight gain in young adults with overweight/class 1 obesity. METHODS: Seven machine learning models were assessed, including three regression models, random forest, single-layer neural network, gradient-boosted decision trees, and support vector machine (SVM) models. Four categories of predictors were included: 1) demographics; 2) obesity-related health conditions; 3) laboratory data and vital signs; and 4) neighborhood-level variables. The cohort was split 60:40 for model training and validation. Area under the receiver operating characteristic curves (AUC) were calculated to determine model accuracy at predicting high-risk individuals, defined by ≥ 10% total body weight gain within 2 y. Variable importance was measured via generalized analysis of variance procedures. RESULTS: Of the 24,183 patients (mean [SD] age, 32.0 [6.3] y; 55.1% females) in the study, 14.2% gained ≥10% total body weight. Area under the receiver operating characteristic curves varied from 0.557 (SVM) to 0.675 (gradient-boosted decision trees). Age, sex, and baseline body mass index were the most important predictors among the models except SVM and neural network. CONCLUSIONS: Our machine learning models performed similarly and had modest accuracy for identifying young adults at risk of weight gain. Future models may need to incorporate behavioral and/or genetic information to enhance model accuracy.
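
The AUC values quoted above can be read as the probability that a randomly chosen patient who gained weight receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes that quantity directly from pairwise comparisons. It is illustrative only: the function name is hypothetical and the study's own software is not stated in the abstract.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = event, e.g. >=10% weight gain)
    scores: iterable of predicted risks, same length as labels
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level ranking, which puts the reported range of 0.557 to 0.675 in context. The pairwise loop is quadratic in the cohort size, so a rank-based implementation would be preferable for a cohort of tens of thousands of patients.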


Subject(s)
Machine Learning , Weight Gain , Female , Humans , Young Adult , Adult , Male , Neural Networks, Computer , Electronic Health Records , Obesity/complications , Obesity/diagnosis
18.
Crit Care Med ; 51(12): 1697-1705, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37378460

ABSTRACT

OBJECTIVES: To identify and validate novel COVID-19 subphenotypes with potential heterogeneous treatment effects (HTEs) using electronic health record (EHR) data and 33 unique biomarkers. DESIGN: Retrospective cohort study of adults presenting for acute care, with analysis of biomarkers from residual blood collected during routine clinical care. Latent profile analysis (LPA) of biomarker and EHR data identified subphenotypes of COVID-19 inpatients, which were validated using a separate cohort of patients. HTE for glucocorticoid use among subphenotypes was evaluated using both an adjusted logistic regression model and propensity matching analysis for in-hospital mortality. SETTING: Emergency departments from four medical centers. PATIENTS: Patients diagnosed with COVID-19 based on International Classification of Diseases, 10th Revision codes and laboratory test results. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Biomarker levels generally paralleled illness severity, with higher levels among more severely ill patients. LPA of 522 COVID-19 inpatients from three sites identified two profiles: profile 1 (n = 332), with higher levels of albumin and bicarbonate, and profile 2 (n = 190), with higher inflammatory markers. Profile 2 patients had higher median length of stay (7.4 vs 4.1 d; p < 0.001) and in-hospital mortality compared with profile 1 patients (25.8% vs 4.8%; p < 0.001). These were validated in a separate, single-site cohort (n = 192), which demonstrated similar outcome differences. HTE was observed (p = 0.03), with glucocorticoid treatment associated with increased mortality for profile 1 patients (odds ratio = 4.54). CONCLUSIONS: In this multicenter study combining EHR data with research biomarker analysis of patients with COVID-19, we identified novel profiles with divergent clinical outcomes and differential treatment responses.


Subject(s)
COVID-19 , Adult , Humans , Retrospective Studies , Glucocorticoids/therapeutic use , Biomarkers , Hospital Mortality
19.
JMIR Med Inform ; 11: e44977, 2023 Apr 20.
Article in English | MEDLINE | ID: mdl-37079367

ABSTRACT

BACKGROUND: The clinical narrative in electronic health records (EHRs) carries valuable information for predictive analytics; however, its free-text form is difficult to mine and analyze for clinical decision support (CDS). Large-scale clinical natural language processing (NLP) pipelines have focused on data warehouse applications for retrospective research efforts. There remains a paucity of evidence for implementing NLP pipelines at the bedside for health care delivery. OBJECTIVE: We aimed to detail a hospital-wide, operational pipeline to implement a real-time NLP-driven CDS tool and describe a protocol for an implementation framework with a user-centered design of the CDS tool. METHODS: The pipeline integrated a previously trained open-source convolutional neural network model for screening opioid misuse that leveraged EHR notes mapped to standardized medical vocabularies in the Unified Medical Language System. A sample of 100 adult encounters were reviewed by a physician informaticist for silent testing of the deep learning algorithm before deployment. An end user interview survey was developed to examine the user acceptability of a best practice alert (BPA) to provide the screening results with recommendations. The planned implementation also included a human-centered design with user feedback on the BPA, an implementation framework with cost-effectiveness, and a noninferiority patient outcome analysis plan. RESULTS: The pipeline was a reproducible workflow with a shared pseudocode for a cloud service to ingest, process, and store clinical notes as Health Level 7 messages from a major EHR vendor in an elastic cloud computing environment. Feature engineering of the notes used an open-source NLP engine, and the features were fed into the deep learning algorithm, with the results returned as a BPA in the EHR. 
On-site silent testing of the deep learning algorithm demonstrated a sensitivity of 93% (95% CI 66%-99%) and specificity of 92% (95% CI 84%-96%), similar to published validation studies. Before deployment, approvals were received across hospital committees for inpatient operations. Five interviews were conducted; they informed the development of an educational flyer and further modified the BPA to exclude certain patients and allow the refusal of recommendations. The longest delay in pipeline development was because of cybersecurity approvals, especially because of the exchange of protected health information between the Microsoft (Microsoft Corp) and Epic (Epic Systems Corp) cloud vendors. In silent testing, the resultant pipeline provided a BPA to the bedside within minutes of a provider entering a note in the EHR. CONCLUSIONS: The components of the real-time NLP pipeline were detailed with open-source tools and pseudocode for other health systems to benchmark. The deployment of medical artificial intelligence systems in routine clinical care presents an important yet unfulfilled opportunity, and our protocol aimed to close the gap in the implementation of artificial intelligence-driven CDS. TRIAL REGISTRATION: ClinicalTrials.gov NCT05745480; https://www.clinicaltrials.gov/ct2/show/NCT05745480.
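
Sensitivity and specificity with 95% confidence intervals, as reported for the silent testing above, can be derived from a 2x2 confusion table. The sketch below uses the Wilson score interval, which is an assumption made here because the abstract does not say which interval method the authors used. The function names are hypothetical, and any counts passed in are the reader's own, not the study's data.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    return tp / (tp + fn), tn / (tn + fp)
```

For example, `wilson_interval(tp, tp + fn)` gives an interval for sensitivity and `wilson_interval(tn, tn + fp)` gives one for specificity. With only a small number of positive cases in a 100-encounter sample, the sensitivity interval is necessarily wide, which is consistent with the broad range reported.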

20.
J Biomed Inform ; 142: 104346, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37061012

ABSTRACT

Daily progress notes are a common note type in the electronic health record (EHR) where healthcare providers document the patient's daily progress and treatment plans. The EHR is designed to document all the care provided to patients, but it also enables note bloat with extraneous information that distracts from the diagnoses and treatment plans. Applications of natural language processing (NLP) in the EHR are a growing field, with most methods focused on information extraction. Few tasks use NLP methods for downstream diagnostic decision support. We introduced the 2022 National NLP Clinical Challenge (N2C2) Track 3: Progress Note Understanding - Assessment and Plan Reasoning as one step towards a new suite of tasks. The Assessment and Plan Reasoning task focuses on the most critical components of progress notes, the Assessment and Plan subsections, where health problems and diagnoses are contained. The goal of the task was to develop and evaluate NLP systems that automatically predict causal relations between the overall status of the patient contained in the Assessment section and each component of the Plan section, which contains the diagnoses and treatment plans. More broadly, the task aimed to identify and prioritize diagnoses as the first steps in diagnostic decision support, helping to find the most relevant information in long documents such as daily progress notes. We present the results of the 2022 N2C2 Track 3 and provide a description of the data, evaluation, participation and system performance.


Subject(s)
Electronic Health Records , Information Storage and Retrieval , Humans , Natural Language Processing , Health Personnel
...