Results 1 - 4 of 4
1.
J Biomed Inform; 86: 109-119, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30195660

ABSTRACT

OBJECTIVE: Evaluate the quality of clinical order practice patterns machine-learned from clinician cohorts stratified by patient mortality outcomes.

MATERIALS AND METHODS: Inpatient electronic health records from 2010 to 2013 were extracted from a tertiary academic hospital. Clinicians (n = 1822) were stratified into low-mortality (21.8%, n = 397) and high-mortality (6.0%, n = 110) extremes using a two-sided P-value score quantifying the deviation of observed vs. expected 30-day patient mortality rates. Three patient cohorts were assembled: patients seen by low-mortality clinicians, patients seen by high-mortality clinicians, and an unfiltered crowd of patients seen by all clinicians (n = 1046, 1046, and 5230 after propensity score matching, respectively). Predicted order lists were automatically generated from recommender system algorithms trained on each patient cohort and evaluated against (i) real-world practice patterns reflected in patient cases with better-than-expected mortality outcomes and (ii) reference standards derived from clinical practice guidelines.

RESULTS: Across six common admission diagnoses, order lists learned from the crowd demonstrated the greatest alignment with guideline references (AUROC range = 0.86-0.91), performing on par with or better than those learned from low-mortality clinicians (0.79-0.84, P < 10⁻⁵) or manually authored hospital order sets (0.65-0.77, P < 10⁻³). The same trend was observed when evaluating model predictions against better-than-expected patient cases, with the crowd model (AUROC mean = 0.91) outperforming the low-mortality model (0.87, P < 10⁻¹⁶) and order set benchmarks (0.78, P < 10⁻³⁵).

DISCUSSION: Whether machine-learning models are trained on all clinicians or a subset of experts illustrates a bias-variance tradeoff in data usage. Defining robust metrics that assess quality against internal reference standards (e.g., practice patterns from better-than-expected patient cases) or external ones (e.g., clinical practice guidelines) is critical for assessing decision support content.

CONCLUSION: Learning relevant decision support content from all clinicians is as robust as, if not more robust than, learning from a select subgroup of clinicians favored by patient outcomes.
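The stratification step here scores each clinician by a two-sided P value for the deviation of observed vs. expected 30-day mortality. The abstract does not specify the underlying test, so the sketch below assumes a simple binomial test against a per-clinician expected rate; the column names (n_patients, n_deaths, expected_rate) and the significance cutoff are illustrative, not the study's actual pipeline.

```python
# Minimal sketch of clinician stratification by mortality deviation.
# Assumes a binomial test; the paper's exact P-value score is not given here.
import pandas as pd
from scipy.stats import binomtest

def stratify_clinicians(df: pd.DataFrame, alpha: float = 0.05) -> pd.DataFrame:
    """df: one row per clinician with columns n_patients, n_deaths, expected_rate."""
    records = []
    for row in df.itertuples():
        # Two-sided test: did this clinician's observed death count deviate
        # significantly from the expected 30-day mortality rate?
        p = binomtest(row.n_deaths, row.n_patients, row.expected_rate,
                      alternative="two-sided").pvalue
        group = "unfiltered"
        if p < alpha:
            group = ("low_mortality"
                     if row.n_deaths / row.n_patients < row.expected_rate
                     else "high_mortality")
        records.append({"clinician": row.Index, "p_value": p, "group": group})
    return pd.DataFrame(records)
```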


Subjects
Data Mining, Clinical Decision Support Systems, Electronic Health Records, Mortality, Automated Pattern Recognition, Algorithms, Area Under Curve, Decision Making, Evidence-Based Medicine, Hospitalization, Humans, Inpatients, Machine Learning, Practice Guidelines as Topic, Physicians' Practice Patterns, ROC Curve, Regression Analysis, Treatment Outcome
2.
AMIA Jt Summits Transl Sci Proc; 2019: 515-523, 2019.
Article in English | MEDLINE | ID: mdl-31259006

ABSTRACT

A primary focus for reducing waste in healthcare expenditure is identifying and discouraging unnecessary repeat lab tests. A machine learning model that could reliably flag low-information lab tests could provide personalized, real-time predictions to discourage over-testing. To this end, we apply six standard machine learning algorithms to six years (2008-2014) of inpatient data from a tertiary academic center to predict when the next measurement of a lab test is likely to be the "same" as the previous one. Of the 13 common inpatient lab tests selected for this analysis, several are predictably stable in many cases. This result points to potential areas where machine learning approaches may identify and prevent unneeded testing before it occurs, and to a methodological framework for accomplishing these tasks.
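As a rough illustration of the task described above, the sketch below frames "next result is the same as the previous one" as a binary classification problem on synthetic data. The paper's six algorithms and actual EHR features are not reproduced; a single gradient-boosted classifier and two made-up features stand in for them.

```python
# Sketch only: synthetic stand-in for the repeat-lab prediction task.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(100, 15, 5000),   # previous lab value (synthetic)
    rng.exponential(24, 5000),   # hours since previous measurement (synthetic)
])
y = (rng.random(5000) < 0.6).astype(int)  # 1 = next result "same" as previous

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```

On real data, a well-calibrated probability from such a model could back the personalized, real-time "this repeat is likely uninformative" prompt the abstract envisions.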

3.
JAMA Netw Open; 2(9): e1910967, 2019 Sep 4.
Article in English | MEDLINE | ID: mdl-31509205

ABSTRACT

Importance: Laboratory testing is an important target for high-value care initiatives, constituting the highest volume of medical procedures. Prior studies have found that up to half of all inpatient laboratory tests may be medically unnecessary, but a systematic method to identify these unnecessary tests in individual cases is lacking.

Objective: To systematically identify low-yield inpatient laboratory testing through personalized predictions.

Design, Setting, and Participants: In this retrospective diagnostic study with multivariable prediction models, 116 637 inpatients treated at Stanford University Hospital from January 1, 2008, to December 31, 2017, 60 929 inpatients treated at the University of Michigan from January 1, 2015, to December 31, 2018, and 13 940 inpatients treated at the University of California, San Francisco, from January 1 to December 31, 2018, were assessed.

Main Outcomes and Measures: Diagnostic accuracy measures, including sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and area under the receiver operating characteristic curve (AUROC), of machine learning models when predicting whether inpatient laboratory tests yield a normal result as defined by local laboratory reference ranges.

Results: In the recent data sets (July 1, 2014, to June 30, 2017) from Stanford University Hospital (including 22 664 female inpatients with a mean [SD] age of 58.8 [19.0] years and 22 016 male inpatients with a mean [SD] age of 59.0 [18.1] years), among the top 20 highest-volume tests, 792 397 orders were repeats within 24 hours, including tests that are physiologically unlikely to yield new information that quickly (eg, white blood cell differential, glycated hemoglobin, and serum albumin level). The best-performing machine learning models predicted normal results with an AUROC of 0.90 or greater for 12 stand-alone laboratory tests (eg, sodium AUROC, 0.92 [95% CI, 0.91-0.93]; sensitivity, 98%; specificity, 35%; PPV, 66%; NPV, 93%; lactate dehydrogenase AUROC, 0.93 [95% CI, 0.93-0.94]; sensitivity, 96%; specificity, 65%; PPV, 71%; NPV, 95%; and troponin I AUROC, 0.92 [95% CI, 0.91-0.93]; sensitivity, 88%; specificity, 79%; PPV, 67%; NPV, 93%) and 10 common laboratory test components (eg, hemoglobin AUROC, 0.94 [95% CI, 0.92-0.95]; sensitivity, 99%; specificity, 17%; PPV, 90%; NPV, 81%; creatinine AUROC, 0.96 [95% CI, 0.96-0.97]; sensitivity, 93%; specificity, 83%; PPV, 79%; NPV, 94%; and urea nitrogen AUROC, 0.95 [95% CI, 0.94-0.96]; sensitivity, 87%; specificity, 89%; PPV, 77%; NPV, 94%).

Conclusions and Relevance: The findings suggest that low-yield diagnostic testing is common and can be systematically identified through data-driven methods and patient context-aware predictions. Machine learning models appear able to quantify explicitly the level of uncertainty and the expected information gained from diagnostic tests, with the potential to encourage useful testing and discourage low-value testing that incurs direct costs and indirect harms.
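The diagnostic accuracy measures reported above (sensitivity, specificity, PPV, NPV, AUROC) all derive from a model's predicted probabilities plus a decision threshold. A minimal sketch follows; the 0.5 threshold is an arbitrary assumption for illustration, not a value from the study.

```python
# Computes the accuracy measures reported in the abstract from predicted
# probabilities of a "normal" result. Threshold choice is illustrative.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def diagnostic_accuracy(y_true, y_prob, threshold=0.5):
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "auroc": roc_auc_score(y_true, y_prob),  # threshold-independent
    }
```

Note the threshold trades sensitivity against specificity (eg, the sodium model's 98% sensitivity at 35% specificity), while AUROC summarizes performance across all thresholds.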


Subjects
Clinical Laboratory Techniques/statistics & numerical data, Hospitalization, Machine Learning, Adult, Aged, Area Under Curve, Blood Urea Nitrogen, Female, Glycated Hemoglobin, Hemoglobins, Humans, L-Lactate Dehydrogenase, Leukocyte Count, Male, Middle Aged, Predictive Value of Tests, ROC Curve, Retrospective Studies, Sensitivity and Specificity, Troponin I
4.
Data Brief; 21: 1669-1673, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30505898

ABSTRACT

In this data article, we learn clinical order patterns from inpatient electronic health record (EHR) data at a tertiary academic center for three different cohorts of providers: (1) clinicians with lower-than-expected patient mortality rates, (2) clinicians with higher-than-expected patient mortality rates, and (3) an unfiltered population of clinicians. We extract and make public the order patterns learned from each clinician cohort for six common admission diagnoses (e.g., pneumonia and chest pain). We also share a reusable reference standard, or benchmark, for evaluating automatically learned clinical order patterns for each admission diagnosis, based on a manual review of the clinical practice literature. The data shared in this article can support further study, evaluation, and translation of data-driven clinical decision support (CDS) systems. Further interpretation and discussion of these data can be found in Wang et al. (2018).
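The article ships data rather than code, but the evaluation it supports, scoring automatically learned order patterns against a guideline-derived reference standard, can be sketched as below. The order names, scores, and data structures are entirely hypothetical; only the AUROC-against-reference-standard framing comes from the companion study (Wang et al., 2018).

```python
# Hypothetical sketch: score a learned order list against a manually curated
# reference standard. All data shown here is made up for illustration.
from sklearn.metrics import roc_auc_score

learned_scores = {          # order -> association score from one clinician cohort
    "blood cultures": 0.91,
    "chest x-ray": 0.88,
    "echocardiogram": 0.42,
}
reference_standard = {"blood cultures", "chest x-ray"}  # guideline-endorsed orders

orders = sorted(learned_scores)
labels = [int(o in reference_standard) for o in orders]   # 1 if endorsed
scores = [learned_scores[o] for o in orders]
print("AUROC vs. reference standard:", roc_auc_score(labels, scores))
```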
