1.
J Biomed Inform; 86: 109-119, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30195660

ABSTRACT

OBJECTIVE: Evaluate the quality of clinical order practice patterns machine-learned from clinician cohorts stratified by patient mortality outcomes.

MATERIALS AND METHODS: Inpatient electronic health records from 2010 to 2013 were extracted from a tertiary academic hospital. Clinicians (n = 1822) were stratified into low-mortality (21.8%, n = 397) and high-mortality (6.0%, n = 110) extremes using a two-sided P-value score quantifying the deviation of observed from expected 30-day patient mortality rates. Three patient cohorts were assembled: patients seen by low-mortality clinicians, patients seen by high-mortality clinicians, and an unfiltered crowd of all clinicians (n = 1046, 1046, and 5230 after propensity score matching, respectively). Predicted order lists were automatically generated from recommender system algorithms trained on each patient cohort and evaluated against (i) real-world practice patterns reflected in patient cases with better-than-expected mortality outcomes and (ii) reference standards derived from clinical practice guidelines.

RESULTS: Across six common admission diagnoses, order lists learned from the crowd demonstrated the greatest alignment with guideline references (AUROC range = 0.86-0.91), performing on par with or better than those learned from low-mortality clinicians (0.79-0.84, P < 10⁻⁵) or manually authored hospital order sets (0.65-0.77, P < 10⁻³). The same trend was observed when evaluating model predictions against better-than-expected patient cases, with the crowd model (AUROC mean = 0.91) outperforming the low-mortality model (0.87, P < 10⁻¹⁶) and order set benchmarks (0.78, P < 10⁻³⁵).

DISCUSSION: Whether machine-learning models are trained on all clinicians or a subset of experts illustrates a bias-variance tradeoff in data usage. Defining robust metrics to assess quality against internal reference standards (e.g., practice patterns from better-than-expected patient cases) or external ones (e.g., clinical practice guidelines) is critical for assessing decision support content.

CONCLUSION: Learning relevant decision support content from all clinicians is at least as robust as, if not more robust than, learning from a select subgroup of clinicians favored by patient outcomes.
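The clinician stratification above hinges on a two-sided significance test of each clinician's observed vs. expected 30-day mortality rate. A minimal sketch of that idea in Python follows, assuming per-clinician patient and death counts and a single expected rate; the function name, threshold, and example numbers are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch: flag clinicians whose observed 30-day mortality
# deviates significantly from the expected rate, using a two-sided
# binomial test. All names and thresholds here are hypothetical.
from scipy.stats import binomtest

def stratify_clinician(n_patients: int, n_deaths: int,
                       expected_rate: float, alpha: float = 0.05) -> str:
    """Return 'low', 'high', or 'typical' mortality stratum."""
    p_value = binomtest(n_deaths, n_patients, expected_rate,
                        alternative="two-sided").pvalue
    if p_value >= alpha:
        return "typical"
    return "low" if n_deaths / n_patients < expected_rate else "high"

# Example: 200 patients, 4 deaths, 6% expected mortality -> 'low'
print(stratify_clinician(200, 4, 0.06))
```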


Subject(s)
Data Mining; Decision Support Systems, Clinical; Electronic Health Records; Mortality; Pattern Recognition, Automated; Algorithms; Area Under Curve; Decision Making; Evidence-Based Medicine; Hospitalization; Humans; Inpatients; Machine Learning; Practice Guidelines as Topic; Practice Patterns, Physicians'; ROC Curve; Regression Analysis; Treatment Outcome
2.
AMIA Jt Summits Transl Sci Proc; 2019: 515-523, 2019.
Article in English | MEDLINE | ID: mdl-31259006

ABSTRACT

A primary focus for reducing waste in healthcare expenditure is identifying and discouraging unnecessary repeat lab tests. A machine learning model that could reliably predict low-information lab tests could provide personalized, real-time predictions to discourage over-testing. To this end, we apply six standard machine learning algorithms to six years (2008-2014) of inpatient data from a tertiary academic center to predict when the next measurement of a lab test is likely to be the "same" as the previous one. Of the 13 common inpatient lab tests selected for this analysis, several are predictably stable in many cases. This points to potential areas where machine learning approaches may identify and prevent unneeded testing before it occurs, and suggests a methodological framework for how these tasks can be accomplished.
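As a concrete illustration of the prediction task described above, the sketch below trains one standard classifier on a binary "next result is the same" label. The features, labels, and data are synthetic placeholders; the paper's actual feature engineering and six-algorithm comparison are not reproduced here.

```python
# A minimal sketch, assuming a feature matrix X (e.g., prior result
# values, time since last test) and labels y (1 = next measurement is
# the "same" as the previous). Synthetic data stands in for EHR data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                               # placeholder features
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)  # placeholder labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```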

3.
JAMA Netw Open; 2(9): e1910967, 2019 Sep 4.
Article in English | MEDLINE | ID: mdl-31509205

ABSTRACT

Importance: Laboratory testing is an important target for high-value care initiatives, constituting the highest volume of medical procedures. Prior studies have found that up to half of all inpatient laboratory tests may be medically unnecessary, but a systematic method to identify these unnecessary tests in individual cases is lacking.

Objective: To systematically identify low-yield inpatient laboratory testing through personalized predictions.

Design, Setting, and Participants: In this retrospective diagnostic study with multivariable prediction models, 116 637 inpatients treated at Stanford University Hospital from January 1, 2008, to December 31, 2017, a total of 60 929 inpatients treated at the University of Michigan from January 1, 2015, to December 31, 2018, and 13 940 inpatients treated at the University of California, San Francisco, from January 1 to December 31, 2018, were assessed.

Main Outcomes and Measures: Diagnostic accuracy measures, including sensitivity, specificity, negative predictive values (NPVs), positive predictive values (PPVs), and area under the receiver operating characteristic curve (AUROC), of machine learning models when predicting whether inpatient laboratory tests yield a normal result as defined by local laboratory reference ranges.

Results: In the recent data sets (July 1, 2014, to June 30, 2017) from Stanford University Hospital (including 22 664 female inpatients with a mean [SD] age of 58.8 [19.0] years and 22 016 male inpatients with a mean [SD] age of 59.0 [18.1] years), among the top 20 highest-volume tests, 792 397 were repeats of orders within 24 hours, including tests that are physiologically unlikely to yield new information that quickly (eg, white blood cell differential, glycated hemoglobin, and serum albumin level). The best-performing machine learning models predicted normal results with an AUROC of 0.90 or greater for 12 stand-alone laboratory tests (eg, sodium AUROC, 0.92 [95% CI, 0.91-0.93]; sensitivity, 98%; specificity, 35%; PPV, 66%; NPV, 93%; lactate dehydrogenase AUROC, 0.93 [95% CI, 0.93-0.94]; sensitivity, 96%; specificity, 65%; PPV, 71%; NPV, 95%; and troponin I AUROC, 0.92 [95% CI, 0.91-0.93]; sensitivity, 88%; specificity, 79%; PPV, 67%; NPV, 93%) and 10 common laboratory test components (eg, hemoglobin AUROC, 0.94 [95% CI, 0.92-0.95]; sensitivity, 99%; specificity, 17%; PPV, 90%; NPV, 81%; creatinine AUROC, 0.96 [95% CI, 0.96-0.97]; sensitivity, 93%; specificity, 83%; PPV, 79%; NPV, 94%; and urea nitrogen AUROC, 0.95 [95% CI, 0.94-0.96]; sensitivity, 87%; specificity, 89%; PPV, 77%; NPV, 94%).

Conclusions and Relevance: The findings suggest that low-yield diagnostic testing is common and can be systematically identified through data-driven methods and patient context-aware predictions. Machine learning models appear able to explicitly quantify the level of uncertainty and the expected information gained from diagnostic tests, with the potential to encourage useful testing and discourage low-value testing that incurs direct costs and indirect harms.
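The diagnostic accuracy measures reported above (sensitivity, specificity, PPV, NPV, AUROC) can all be derived from model scores and true labels at a chosen operating threshold. A brief sketch with made-up numbers, not the study's data:

```python
# Compute the reported diagnostic-accuracy measures from scores and
# labels. The arrays and 0.5 threshold are illustrative placeholders.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])   # 1 = normal lab result
y_score = np.array([0.9, 0.2, 0.3, 0.7, 0.6, 0.95, 0.3, 0.6])
threshold = 0.5

auroc = roc_auc_score(y_true, y_score)
y_pred = (y_score >= threshold).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"AUROC={auroc:.2f}")
print(f"sensitivity={tp / (tp + fn):.2f}  specificity={tn / (tn + fp):.2f}")
print(f"PPV={tp / (tp + fp):.2f}  NPV={tn / (tn + fn):.2f}")
```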


Subject(s)
Clinical Laboratory Techniques/statistics & numerical data; Hospitalization; Machine Learning; Adult; Aged; Area Under Curve; Blood Urea Nitrogen; Female; Glycated Hemoglobin; Hemoglobins; Humans; L-Lactate Dehydrogenase; Leukocyte Count; Male; Middle Aged; Predictive Value of Tests; ROC Curve; Retrospective Studies; Sensitivity and Specificity; Troponin I
4.
Data Brief; 21: 1669-1673, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30505898

ABSTRACT

In this data article, we learn clinical order patterns from inpatient electronic health record (EHR) data at a tertiary academic center from three different cohorts of providers: (1) clinicians with lower-than-expected patient mortality rates, (2) clinicians with higher-than-expected patient mortality rates, and (3) an unfiltered population of clinicians. We extract and make public the order patterns learned from each clinician cohort for six common admission diagnoses (e.g., pneumonia, chest pain). We also share a reusable reference standard, or benchmark, for evaluating automatically learned clinical order patterns for each admission diagnosis, based on a manual review of the clinical practice literature. The data shared in this article can support further study, evaluation, and translation of data-driven clinical decision support (CDS) systems. Further interpretation and discussion of these data can be found in Wang et al. (2018).
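As a hint of how the shared benchmark might be used, the sketch below scores a ranked, learned order list against a guideline-derived reference set, treating rank position as the prediction score. The order names and scoring scheme are hypothetical placeholders, not the released data format.

```python
# Score a ranked order list against a reference standard by treating
# earlier rank as a higher score; orders and reference are made up.
from sklearn.metrics import roc_auc_score

reference = {"blood culture", "chest x-ray", "ceftriaxone", "azithromycin"}
ranked_orders = ["chest x-ray", "cbc", "blood culture", "bmp",
                 "ceftriaxone", "mri brain", "azithromycin", "ekg"]

labels = [1 if order in reference else 0 for order in ranked_orders]
scores = [len(ranked_orders) - i for i in range(len(ranked_orders))]
print("AUROC vs. reference:", roc_auc_score(labels, scores))
```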
