ABSTRACT
Rationale: Care of emergency department (ED) patients with pneumonia can be challenging. Clinical decision support may decrease unnecessary variation and improve care. Objectives: To report patient outcomes and processes of care after deployment of electronic pneumonia clinical decision support (ePNa): a comprehensive, open loop, real-time clinical decision support embedded within the electronic health record. Methods: We conducted a pragmatic, stepped-wedge, cluster-controlled trial with deployment at 2-month intervals in 16 community hospitals. ePNa extracts real-time and historical data to guide diagnosis, risk stratification, microbiological studies, site of care, and antibiotic therapy. We included all adult ED patients with pneumonia over the course of 3 years identified by International Classification of Diseases, 10th Revision discharge coding confirmed by chest imaging. Measurements and Main Results: The median age of the 6,848 patients was 67 years (interquartile range, 50-79), and 48% were female; 64.8% were hospital admitted. Unadjusted mortality was 8.6% before and 4.8% after deployment. A mixed effects logistic regression model adjusting for severity of illness with hospital cluster as the random effect showed an adjusted odds ratio of 0.62 (0.49-0.79; P < 0.001) for 30-day all-cause mortality after deployment. Lower mortality was consistent across hospital clusters. ePNa-concordant antibiotic prescribing increased from 83.5% to 90.2% (P < 0.001). The mean time from ED admission to first antibiotic was 159.4 (156.9-161.9) minutes at baseline and 150.9 (144.1-157.8) minutes after deployment (P < 0.001). Outpatient disposition from the ED increased from 29.2% to 46.9%, whereas 7-day secondary hospital admission was unchanged (5.2% vs. 6.1%). ePNa was used by ED clinicians in 67% of eligible patients. Conclusions: ePNa deployment was associated with improved processes of care and lower mortality. 
Clinical trial registered with www.clinicaltrials.gov (NCT03358342).
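As a rough check on the size of the effect, the unadjusted mortality figures reported above (8.6% before and 4.8% after deployment) can be converted to an unadjusted odds ratio. The sketch below is illustrative arithmetic only; it is not the trial's mixed effects logistic regression, and the function names are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_ratio(p_after: float, p_before: float) -> float:
    """Odds of the outcome after deployment relative to before."""
    return odds(p_after) / odds(p_before)


# Unadjusted 30-day mortality as reported in the abstract.
unadjusted_or = odds_ratio(p_after=0.048, p_before=0.086)
print(f"unadjusted odds ratio: {unadjusted_or:.2f}")
```

The unadjusted value (about 0.54) is further from 1 than the reported adjusted odds ratio of 0.62, which is what one would expect if part of the crude difference reflects differences in severity of illness between periods.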
Subject(s)
Decision Support Systems, Clinical; Pneumonia; Adult; Aged; Anti-Bacterial Agents/therapeutic use; Emergency Service, Hospital; Female; Hospitalization; Humans; Male; Pneumonia/diagnosis
ABSTRACT
BACKGROUND: Risk adjustment models are employed to prevent adverse selection, anticipate budgetary reserve needs, and offer care management services to high-risk individuals. We aimed to address two unknowns about risk adjustment: whether machine learning (ML) and inclusion of social determinants of health (SDH) indicators improve prospective risk adjustment for health plan payments. METHODS: We employed a 2-by-2 factorial design comparing: (i) linear regression versus ML (gradient boosting) and (ii) demographics and diagnostic codes alone, versus additional ZIP code-level SDH indicators. Healthcare claims from privately-insured US adults (2016-2017), and Census data were used for analysis. Data from 1.02 million adults were used for derivation, and data from 0.26 million to assess performance. Model performance was measured using coefficient of determination (R²), discrimination (C-statistic), and mean absolute error (MAE) for the overall population, and predictive ratio and net compensation for vulnerable subgroups. We provide 95% confidence intervals (CI) around each performance measure. RESULTS: Linear regression without SDH indicators achieved moderate determination (R² 0.327, 95% CI: 0.300, 0.353), error ($6992; 95% CI: $6889, $7094), and discrimination (C-statistic 0.703; 95% CI: 0.701, 0.705). ML without SDH indicators improved all metrics (R² 0.388; 95% CI: 0.357, 0.420; error $6637; 95% CI: $6539, $6735; C-statistic 0.717; 95% CI: 0.715, 0.718), reducing misestimation of cost by $3.5 M per 10,000 members. Among people living in areas with high poverty, high wealth inequality, or high prevalence of uninsured, SDH indicators reduced underestimation of cost, improving the predictive ratio by 3% (~$200/person/year). CONCLUSIONS: ML improved risk adjustment models and the incorporation of SDH indicators reduced underpayment in several vulnerable populations.
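The per-10,000-member figure follows from the difference between the two reported mean absolute errors, and the predictive ratio is conventionally the total predicted cost of a subgroup divided by its total actual cost. The sketch below works through both; the subgroup costs are made-up illustrative numbers, and the helper name is hypothetical rather than taken from the study.

```python
from typing import Sequence

# Mean absolute errors (USD per person) reported in the abstract.
mae_linear = 6992.0
mae_ml = 6637.0
members = 10_000

# Reduction in absolute misestimation across a 10,000-member plan.
reduction_per_10k = (mae_linear - mae_ml) * members
print(f"reduction per 10,000 members: ${reduction_per_10k:,.0f}")


def predictive_ratio(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Total predicted cost over total actual cost for one subgroup.

    A value below 1 means the model underestimates the subgroup's cost.
    """
    return sum(predicted) / sum(actual)


# Hypothetical subgroup of three members (illustrative values only).
ratio = predictive_ratio(predicted=[4000.0, 6500.0, 8500.0],
                         actual=[5000.0, 7000.0, 8000.0])
print(f"predictive ratio: {ratio:.2f}")
```

The computed reduction is $3,550,000, consistent with the roughly $3.5 M quoted in the abstract. In the toy subgroup the ratio is 0.95, meaning costs are underestimated by 5%.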
Subject(s)
Health Promotion/economics; Health Promotion/statistics & numerical data; Insurance, Health/economics; Insurance, Health/statistics & numerical data; Machine Learning/economics; Machine Learning/statistics & numerical data; Social Determinants of Health/economics; Social Determinants of Health/statistics & numerical data; Adult; Cost-Benefit Analysis; Female; Humans; Male; Middle Aged; Prospective Studies; Risk Adjustment
ABSTRACT
Lack of diagnosis coding is a barrier to leveraging veterinary notes for medical and public health research. Previous work has been limited to developing specialized rule-based or customized supervised learning models to predict diagnosis codes, an approach that is tedious and not easily transferable. In this work, we show that open-source large language models (LLMs) pretrained on a general corpus can achieve reasonable performance in a zero-shot setting. Alpaca-7B can achieve a zero-shot F1 of 0.538 on CSU test data and 0.389 on PP test data, two standard benchmarks for coding from veterinary notes. Furthermore, with appropriate fine-tuning, the performance of LLMs can be substantially boosted, exceeding that of strong state-of-the-art supervised models. VetLLM, which is fine-tuned from Alpaca-7B using just 5000 veterinary notes, can achieve an F1 of 0.747 on CSU test data and 0.637 on PP test data. Notably, our fine-tuning is data-efficient: a model fine-tuned on 200 notes can outperform supervised models trained with more than 100,000 notes. The findings demonstrate the great potential of leveraging LLMs for language processing tasks in medicine, and we advocate this new paradigm for processing clinical text.
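For readers unfamiliar with how an F1 score is computed when each note can carry several diagnosis codes, the sketch below shows one common variant. The abstract does not state the averaging scheme, so micro-averaging is an assumption here, and the code labels and function name are hypothetical.

```python
from typing import Sequence, Set


def micro_f1(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 over per-note sets of diagnosis codes."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # codes predicted and present in the gold set
        fp += len(p - g)   # codes predicted but absent from the gold set
        fn += len(g - p)   # gold codes the model missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two hypothetical notes with illustrative code labels.
gold_codes = [{"neoplasia"}, {"dermatitis", "otitis"}]
pred_codes = [{"neoplasia"}, {"dermatitis"}]
score = micro_f1(gold_codes, pred_codes)
print(f"micro-F1: {score:.3f}")
```

Here precision is 1.0 and recall is 2/3, giving an F1 of 0.8.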
Subject(s)
Camelids, New World; Humans; Animals; Natural Language Processing; Computational Biology; Language
ABSTRACT
Low-yield repetitive laboratory diagnostics burden patients and inflate cost of care. In this study, we assess whether stability in repeated laboratory diagnostic measurements is predictable with uncertainty estimates using electronic health record data available before the diagnostic is ordered. We use probabilistic regression to predict a distribution of plausible values, allowing use-time customization for various definitions of "stability" given dynamic ranges and clinical scenarios. After converting distributions into "stability" scores, the models achieve a sensitivity of 29% for white blood cells, 60% for hemoglobin, 100% for platelets, 54% for potassium, 99% for albumin and 35% for creatinine for predicting stability at 90% precision, suggesting those fractions of repetitive tests could be reduced with low risk of missing important changes. The findings demonstrate the feasibility of using electronic health record data to identify low-yield repetitive tests and offer personalized guidance for better usage of testing while ensuring high quality care.
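One way to turn a predicted distribution into a "stability" score is to compute the probability that the next result falls within a chosen band around the previous result. The sketch below assumes a Gaussian predictive distribution and a symmetric tolerance band; both are illustrative assumptions, since the abstract specifies only "probabilistic regression", and the function names and example values are hypothetical.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def stability_score(pred_mean: float, pred_std: float,
                    previous_value: float, tolerance: float) -> float:
    """Probability that the next value lies within +/- tolerance of the
    previous value, under a Gaussian predictive distribution (assumed)."""
    lower = previous_value - tolerance
    upper = previous_value + tolerance
    return (normal_cdf(upper, pred_mean, pred_std)
            - normal_cdf(lower, pred_mean, pred_std))


# Hypothetical hemoglobin example (g/dL): previous result 12.0,
# predicted mean 12.0 with standard deviation 0.5, tolerance 1.0.
score = stability_score(pred_mean=12.0, pred_std=0.5,
                        previous_value=12.0, tolerance=1.0)
print(f"stability score: {score:.3f}")
```

Because the tolerance is an argument rather than part of the model, the same predicted distribution can be rescored for a different definition of stability at use time. A score threshold could then be chosen on held-out data so that flagged tests reach a target precision such as 90%.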
Subject(s)
Clinical Laboratory Techniques; Hemoglobins; Humans
ABSTRACT
PURPOSE: Patients with pneumonia often present to the emergency department (ED) and require prompt diagnosis and treatment. Clinical decision support systems for the diagnosis and management of pneumonia are commonly utilized in EDs to improve patient care. The purpose of this study is to investigate whether a deep learning model for detecting radiographic pneumonia and pleural effusions can improve functionality of a clinical decision support system (CDSS) for pneumonia management (ePNa) operating in 20 EDs. MATERIALS AND METHODS: In this retrospective cohort study, a dataset of 7434 prior chest radiographic studies from 6551 ED patients was used to develop and validate a deep learning model to identify radiographic pneumonia, pleural effusions, and evidence of multilobar pneumonia. Model performance was evaluated against 3 radiologists' adjudicated interpretation and compared with performance of the natural language processing of radiology reports used by ePNa. RESULTS: The deep learning model achieved an area under the receiver operating characteristic curve of 0.833 (95% confidence interval [CI]: 0.795, 0.868) for detecting radiographic pneumonia, 0.939 (95% CI: 0.911, 0.962) for detecting pleural effusions and 0.847 (95% CI: 0.800, 0.890) for identifying multilobar pneumonia. On all 3 tasks, the model achieved higher agreement with the adjudicated radiologist interpretation compared with ePNa. CONCLUSIONS: A deep learning model demonstrated higher agreement with radiologists than the ePNa CDSS in detecting radiographic pneumonia and related findings. Incorporating deep learning models into pneumonia CDSS could enhance diagnostic performance and improve pneumonia management.
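The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive study receives a higher model score than a randomly chosen negative one. The sketch below computes it that way on toy data; the labels and scores are invented for illustration and the function name is hypothetical.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of positive/negative pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = radiographic pneumonia present, 0 = absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = auroc(toy_labels, toy_scores)
print(f"AUROC: {auc:.2f}")
```

Three of the four positive/negative pairs are ordered correctly, so the toy AUROC is 0.75. The pairwise loop is quadratic in the number of studies, which is fine for illustration but slower than rank-based implementations on large datasets.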