Results 1 - 12 of 12
1.
AMIA Jt Summits Transl Sci Proc ; 2024: 95-104, 2024.
Article in English | MEDLINE | ID: mdl-38827052

ABSTRACT

Access to real-world data streams like electronic medical records (EMRs) has accelerated the development of supervised machine learning (ML) models for clinical applications. However, few studies investigate the differential impact of particular features in the EMR on model performance under temporal dataset shift. To explain how features in the EMR impact models over time, this study aggregates features into feature groups by their source (e.g. medication orders, diagnosis codes and lab results) and feature categories based on their reflection of patient pathophysiology or healthcare processes. We adapt Shapley values to explain feature groups' and feature categories' marginal contribution to initial and sustained model performance. We investigate three standard clinical prediction tasks and find that while feature contributions to initial performance differ across tasks, pathophysiological features help mitigate temporal discrimination deterioration. These results provide interpretable insights on how specific feature groups contribute to model performance and robustness to temporal dataset shift.
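
As an illustration of the Shapley attribution described above, the following minimal sketch computes exact Shapley values over feature groups. It is not the study's implementation: the function name `group_shapley`, the group names, and the AUROC numbers in `toy_auroc` are all hypothetical, and the value function is assumed to return the performance of a model trained on a given subset of groups.

```python
from itertools import combinations
from math import factorial


def group_shapley(groups, value):
    """Exact Shapley value of each feature group.

    `value` maps a frozenset of group names to a performance score,
    for example the AUROC of a model trained on those groups only.
    """
    n = len(groups)
    phi = {}
    for g in groups:
        others = [h for h in groups if h != g]
        total = 0.0
        for k in range(n):  # size of the coalition that g joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain in performance from adding group g
                total += weight * (value(s | {g}) - value(s))
        phi[g] = total
    return phi


# Hypothetical performance table (made-up numbers, not study results)
toy_auroc = {
    frozenset(): 0.50,
    frozenset({"labs"}): 0.70,
    frozenset({"meds"}): 0.60,
    frozenset({"labs", "meds"}): 0.76,
}
phi = group_shapley(["labs", "meds"], toy_auroc.__getitem__)
```

The attributions sum to the gap between the full model and the empty baseline, which is the property that makes Shapley values convenient for splitting performance among groups. Exact enumeration grows exponentially with the number of groups, so it is only practical for a small number of them.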

2.
AMIA Jt Summits Transl Sci Proc ; 2024: 182-189, 2024.
Article in English | MEDLINE | ID: mdl-38827068

ABSTRACT

This study explored the efficacy of electronic phenotyping in data labeling for machine learning with a focus on urinary tract infections (UTIs). We contrasted labels from electronic phenotyping against previously published labels such as urine culture positivity. In comparison, electronic phenotyping showed the potential to enhance specificity in UTI labeling while maintaining similar sensitivity and was easily scaled for application to a large dataset suitable for machine learning, which we used to train and validate a machine learning model. Electronic phenotyping offers a valuable method for machine learning label generation in healthcare, with potential benefits for patient care and antimicrobial stewardship. Further research will expand its application and optimize techniques for increased performance.

3.
J Am Med Inform Assoc ; 30(9): 1532-1542, 2023 08 18.
Article in English | MEDLINE | ID: mdl-37369008

ABSTRACT

OBJECTIVE: Healthcare institutions are establishing frameworks to govern and promote the implementation of accurate, actionable, and reliable machine learning models that integrate with clinical workflow. Such governance frameworks require an accompanying technical framework to deploy models in a resource-efficient, safe, and high-quality manner. Here we present DEPLOYR, a technical framework for enabling real-time deployment and monitoring of researcher-created models into a widely used electronic medical record system. MATERIALS AND METHODS: We discuss core functionality and design decisions, including mechanisms to trigger inference based on actions within electronic medical record software, modules that collect real-time data to make inferences, mechanisms that close the loop by displaying inferences back to end-users within their workflow, monitoring modules that track performance of deployed models over time, silent deployment capabilities, and mechanisms to prospectively evaluate a deployed model's impact. RESULTS: We demonstrate the use of DEPLOYR by silently deploying and prospectively evaluating 12 machine learning models trained using electronic medical record data that predict laboratory diagnostic results, triggered by clinician button-clicks in Stanford Health Care's electronic medical record. DISCUSSION: Our study highlights the need for and feasibility of such silent deployment, because prospectively measured performance varies from retrospective estimates. When possible, we recommend using prospectively estimated performance measures during silent trials to make final go decisions for model deployment. CONCLUSION: Machine learning applications in healthcare are extensively researched, but successful translations to the bedside are rare. By describing DEPLOYR, we aim to inform machine learning deployment best practices and help bridge the model implementation gap.


Subject(s)
Electronic Health Records, Software, Retrospective Studies, Machine Learning
4.
Article in English | MEDLINE | ID: mdl-37350883

ABSTRACT

When evaluating the performance of clinical machine learning models, one must consider the deployment population. When the population of patients with observed labels is only a subset of the deployment population (label selection), standard model performance estimates on the observed population may be misleading. In this study, we describe three classes of label selection and simulate five causally distinct scenarios to assess how particular selection mechanisms bias a suite of commonly reported binary machine learning model performance metrics. Simulations reveal that when selection is affected by observed features, naive estimates of model discrimination may be misleading. When selection is affected by labels, naive estimates of calibration fail to reflect reality. We borrow traditional weighting estimators from the causal inference literature and find that when selection probabilities are properly specified, they recover full population estimates. We then tackle the real-world task of monitoring the performance of deployed machine learning models whose interactions with clinicians feed back and affect the selection mechanism of the labels. We train three machine learning models to flag low-yield laboratory diagnostics, and simulate their intended consequence of reducing wasteful laboratory utilization. We find that naive estimates of AUROC on the observed population undershoot actual performance by up to 20%. Such a disparity could be large enough to lead to the wrongful termination of a successful clinical decision support tool. We propose an altered deployment procedure, one that combines injected randomization with traditional weighted estimates, and find it recovers true model performance.
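
The weighting idea above can be sketched as an inverse-probability-weighted AUROC. This is a minimal illustration, not the study's estimator: the function names `weighted_auroc` and `ipw_auroc` are hypothetical, the example numbers are made up, and each observed patient's selection probability is assumed to be known and correctly specified.

```python
def weighted_auroc(scores, labels, weights):
    """AUROC in which each positive/negative pair counts with weight w_pos * w_neg."""
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    num = den = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            pair = wp * wn
            den += pair
            if sp > sn:
                num += pair        # correctly ordered pair
            elif sp == sn:
                num += 0.5 * pair  # ties count as half
    return num / den


def ipw_auroc(scores, labels, selection_probs):
    """Reweight each observed patient by 1 / P(label observed)."""
    weights = [1.0 / p for p in selection_probs]
    return weighted_auroc(scores, labels, weights)


# Made-up example: four patients whose labels were observed
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
selection_probs = [1.0, 0.25, 1.0, 0.5]  # assumed known

naive = weighted_auroc(scores, labels, [1.0] * len(scores))
adjusted = ipw_auroc(scores, labels, selection_probs)
```

Patients who were unlikely to have their label observed stand in for the similar patients who were never labeled, so they receive larger weights. The pairwise loop is quadratic in the number of patients and is written for clarity rather than speed.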

5.
AMIA Annu Symp Proc ; 2023: 1007-1016, 2023.
Article in English | MEDLINE | ID: mdl-38222438

ABSTRACT

Low-yield repetitive laboratory diagnostics burden patients and inflate cost of care. In this study, we assess whether stability in repeated laboratory diagnostic measurements is predictable with uncertainty estimates using electronic health record data available before the diagnostic is ordered. We use probabilistic regression to predict a distribution of plausible values, allowing use-time customization for various definitions of "stability" given dynamic ranges and clinical scenarios. After converting distributions into "stability" scores, the models achieve a sensitivity of 29% for white blood cells, 60% for hemoglobin, 100% for platelets, 54% for potassium, 99% for albumin and 35% for creatinine for predicting stability at 90% precision, suggesting those fractions of repetitive tests could be reduced with low risk of missing important changes. The findings demonstrate the feasibility of using electronic health record data to identify low-yield repetitive tests and offer personalized guidance for better usage of testing while ensuring high quality care.
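
One way to picture the conversion from a predicted distribution to a "stability" score is sketched below. It assumes, purely for illustration, that the model outputs a Gaussian predictive distribution and that "stable" means the next result falls within a fixed tolerance of the previous one; the abstract does not specify either choice, and the function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def stability_score(mu, sigma, previous, tolerance):
    """Probability that the next result lies within previous +/- tolerance.

    mu, sigma: mean and standard deviation of the predicted distribution
    (the Gaussian form is an assumption of this sketch).
    """
    lower = previous - tolerance
    upper = previous + tolerance
    return normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)


# Made-up example: predicted hemoglobin 10.0 g/dL with sd 1.0,
# previous value 10.0, tolerance 1.0
score = stability_score(10.0, 1.0, 10.0, 1.0)
```

Because the tolerance is an argument rather than something fixed at training time, the same predicted distribution can be rescored under different definitions of stability at use time.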


Subject(s)
Clinical Laboratory Techniques, Hemoglobins, Humans
6.
AMIA Annu Symp Proc ; 2023: 1201-1208, 2023.
Article in English | MEDLINE | ID: mdl-38222372

ABSTRACT

In analyzing direct hospitalization cost and clinical data from an academic medical center, commonly used metrics such as diagnosis-related group (DRG) weight explain approximately 37% of cost variability, but a substantial amount of variation remains unaccounted for by case mix index (CMI) alone. Using CMI as a benchmark, we isolate and target individual DRGs with higher than expected average costs for specific quality improvement efforts. While DRGs summarize hospitalization care after discharge, a predictive model using only information known before admission explained up to 60% of cost variability for two DRGs with a high excess cost burden. This level of variability likely reflects underlying patient factors that are not modifiable (e.g., age and prior comorbidities) and therefore less useful for health systems to target for intervention. However, the remaining unexplained variation can be inspected in further studies to discover operational factors that health systems can target to improve quality and value for their patients. Since DRG weights represent the expected resource consumption for a specific hospitalization type relative to the average hospitalization, the data-driven approach we demonstrate can be utilized by any health institution to quantify excess costs and potential savings among DRGs.


Subject(s)
Diagnosis-Related Groups, Hospitalization, Humans, Costs and Cost Analysis, Patient Discharge, Academic Medical Centers
7.
Commun Med (Lond) ; 2: 38, 2022.
Article in English | MEDLINE | ID: mdl-35603264

ABSTRACT

Background: The Centers for Disease Control and Prevention identify antibiotic prescribing stewardship as the most important action to combat increasing antibiotic resistance. Clinicians balance broad empiric antibiotic coverage vs. precision coverage targeting only the most likely pathogens. We investigate the utility of machine learning-based clinical decision support for antibiotic prescribing stewardship. Methods: In this retrospective multi-site study, we developed machine learning models that predict antibiotic susceptibility patterns (personalized antibiograms) using electronic health record data of 8342 infections from Stanford emergency departments and 15,806 uncomplicated urinary tract infections from Massachusetts General Hospital and Brigham & Women's Hospital in Boston. We assessed the trade-off between broad-spectrum and precise antibiotic prescribing using linear programming. Results: We find in Stanford data that personalized antibiograms reallocate clinician antibiotic selections with a coverage rate (fraction of infections covered by treatment) of 85.9%, similar to clinician performance (84.3%, p = 0.11). In the Boston dataset, the personalized antibiograms' coverage rate is 90.4%, a significant improvement over clinicians (88.1%, p < 0.0001). Personalized antibiograms achieve similar coverage to the clinician benchmark with narrower antibiotics. With Stanford data, personalized antibiograms maintain clinician coverage rates while narrowing 69% of empiric vancomycin+piperacillin/tazobactam prescriptions to piperacillin/tazobactam. In the Boston dataset, personalized antibiograms maintain clinician coverage rates while narrowing 48% of ciprofloxacin to trimethoprim/sulfamethoxazole. Conclusions: Precision empiric antibiotic prescribing with personalized antibiograms could improve patient safety and antibiotic stewardship by reducing unnecessary use of broad-spectrum antibiotics that breed a growing tide of resistant organisms.

8.
J Am Med Inform Assoc ; 28(11): 2423-2432, 2021 10 12.
Article in English | MEDLINE | ID: mdl-34402507

ABSTRACT

OBJECTIVE: To develop prediction models for intensive care unit (ICU) vs non-ICU level-of-care need within 24 hours of inpatient admission for emergency department (ED) patients using electronic health record data. MATERIALS AND METHODS: Using records of 41 654 ED visits to a tertiary academic center from 2015 to 2019, we tested 4 algorithms (feed-forward neural networks, regularized regression, random forests, and gradient-boosted trees) to predict ICU vs non-ICU level-of-care within 24 hours and at the 24th hour following admission. Simple-feature models included patient demographics, Emergency Severity Index (ESI), and vital sign summary. Complex-feature models added all vital signs, lab results, and counts of diagnosis, imaging, procedures, medications, and lab orders. RESULTS: The best-performing model, a gradient-boosted tree using a full feature set, achieved an AUROC of 0.88 (95%CI: 0.87-0.89) and AUPRC of 0.65 (95%CI: 0.63-0.68) for predicting ICU care need within 24 hours of admission. The logistic regression model using ESI achieved an AUROC of 0.67 (95%CI: 0.65-0.70) and AUPRC of 0.37 (95%CI: 0.35-0.40). Using a discrimination threshold, such as 0.6, the positive predictive value, negative predictive value, sensitivity, and specificity were 85%, 89%, 30%, and 99%, respectively. Vital signs were the most important predictors. DISCUSSION AND CONCLUSIONS: Undertriaging admitted ED patients who subsequently require ICU care is common and associated with poorer outcomes. Machine learning models using readily available electronic health record data predict subsequent need for ICU admission with good discrimination, substantially better than the benchmarking ESI system. The results could be used in a multitiered clinical decision-support system to improve ED triage.
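
For readers less familiar with the threshold-based metrics reported above, the sketch below shows how positive predictive value, negative predictive value, sensitivity, and specificity follow from a score threshold. The function name `threshold_metrics` and the example data are hypothetical and unrelated to the study's results.

```python
def threshold_metrics(scores, labels, threshold):
    """Confusion-matrix metrics when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Made-up example with six patients and a threshold of 0.6
metrics = threshold_metrics(
    scores=[0.9, 0.7, 0.5, 0.3, 0.2, 0.1],
    labels=[1, 0, 1, 0, 1, 0],
    threshold=0.6,
)
```

Raising the threshold generally trades sensitivity for specificity and positive predictive value, which matches the pattern of low sensitivity and high specificity reported at the 0.6 threshold.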


Subject(s)
Hospital Emergency Service, Triage, Hospitalization, Hospitals, Humans, Intensive Care Units, Machine Learning, Retrospective Studies
9.
J Biomed Inform ; 113: 103637, 2021 01.
Article in English | MEDLINE | ID: mdl-33290879

ABSTRACT

Widespread adoption of electronic health records (EHRs) has fueled the use of machine learning to build prediction models for various clinical outcomes. However, this process is often constrained by having a relatively small number of patient records for training the model. We demonstrate that using patient representation schemes inspired by techniques in natural language processing can increase the accuracy of clinical prediction models by transferring information learned from the entire patient population to the task of training a specific model, where only a subset of the population is relevant. Such patient representation schemes enable a 3.5% mean improvement in AUROC on five prediction tasks compared to standard baselines, with the average improvement rising to 19% when only a small number of patient records are available for training the clinical prediction model.


Subject(s)
Electronic Health Records, Statistical Models, Humans, Machine Learning, Natural Language Processing, Prognosis
10.
NPJ Digit Med ; 3: 95, 2020.
Article in English | MEDLINE | ID: mdl-32695885

ABSTRACT

There is substantial interest in using presenting symptoms to prioritize testing for COVID-19 and establish symptom-based surveillance. However, little is currently known about the specificity of COVID-19 symptoms. To assess the feasibility of symptom-based screening for COVID-19, we used data from tests for common respiratory viruses and SARS-CoV-2 in our health system to measure the ability to correctly classify virus test results based on presenting symptoms. Based on these results, symptom-based screening may not be an effective strategy to identify individuals who should be tested for SARS-CoV-2 infection or to obtain a leading indicator of new COVID-19 cases.

11.
AMIA Jt Summits Transl Sci Proc ; 2020: 108-115, 2020.
Article in English | MEDLINE | ID: mdl-32477629

ABSTRACT

Up to 50% of antibiotic use in hospital settings is suboptimal. We build machine learning models trained on electronic health record data to minimize wasteful use of antibiotics. Our classifiers flag no-growth blood and urine microbial cultures with high precision. Further, we build models that predict the likelihood of bacterial susceptibility to sets of antibiotics. These models contain decision thresholds that separate subgroups of patients whose susceptibility rates to narrow-spectrum antibiotics equal overall susceptibility rates to broader-spectrum drugs. Retroactively analyzing these thresholds on our one-year test set, we find that 14% of patients infected with Escherichia coli and empirically treated with piperacillin/tazobactam could have been treated with ceftriaxone with coverage equal to the overall susceptibility rate of piperacillin/tazobactam. Similarly, 13% of the same cohort could have been treated with cefazolin, a first-generation cephalosporin.
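
The decision-threshold idea above can be sketched as a search for the largest high-scoring subgroup whose observed susceptibility to the narrow-spectrum drug still matches the overall rate for the broader drug. This is a simplified illustration under stated assumptions, not the authors' procedure: the function name `narrowing_threshold` is hypothetical, the data are made up, and "equal" is relaxed to "at least".

```python
def narrowing_threshold(pred_narrow, susceptible_narrow, broad_rate):
    """Find the lowest score cutoff whose subgroup keeps coverage >= broad_rate.

    pred_narrow: predicted probability of susceptibility to the narrow drug
    susceptible_narrow: observed susceptibility (1 or 0) to the narrow drug
    broad_rate: overall susceptibility rate to the broader-spectrum drug
    Returns (threshold, fraction of patients at or above it), or
    (None, 0.0) if no subgroup qualifies.
    """
    ranked = sorted(zip(pred_narrow, susceptible_narrow), reverse=True)
    best = (None, 0.0)
    hits = 0
    for i, (score, susceptible) in enumerate(ranked, start=1):
        hits += susceptible
        # Only evaluate at boundaries between distinct scores, so the
        # subgroup is exactly "score >= threshold"
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if hits / i >= broad_rate:
            best = (score, i / len(ranked))
    return best


# Made-up example: five patients, broad-spectrum coverage of 90%
threshold, fraction = narrowing_threshold(
    pred_narrow=[0.95, 0.9, 0.8, 0.6, 0.4],
    susceptible_narrow=[1, 1, 1, 0, 1],
    broad_rate=0.9,
)
```

In practice the threshold would be chosen on held-out data and then checked on a separate test period, as the retrospective one-year test set in the abstract suggests.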

12.
AMIA Annu Symp Proc ; 2020: 953-962, 2020.
Article in English | MEDLINE | ID: mdl-33936471

ABSTRACT

High quality patient care through timely, precise and efficacious management depends not only on the clinical presentation of a patient, but the context of the care environment to which they present. Understanding and improving factors that affect streamlined workflow, such as provider or department busyness or experience, are essential to improving these care processes, but have been difficult to measure with traditional approaches and clinical data sources. In this exploratory data analysis, we aim to determine whether such contextual factors can be captured for important clinical processes by taking advantage of non-traditional data sources like EHR audit logs which passively track the electronic behavior of clinical teams. Our results illustrate the potential of defining multiple measures of contextual factors and their correlation with key care processes. We illustrate this using thrombolytic (tPA) treatment for ischemic stroke as an example process, but the measurement approaches can be generalized to multiple scenarios.


Subject(s)
Stroke, Female, Humans, Information Storage and Retrieval, Male, Middle Aged, Patient Care, Stroke/therapy, Workflow