ABSTRACT
We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.
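The three metrics reported above (AUC, PPV, and sensitivity at a classification threshold) can be illustrated with a minimal sketch. This is not the study's code: it omits cross-validation, the function names are hypothetical, and the scores and adjudicated labels are invented toy data.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv_and_sensitivity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """PPV and sensitivity when events scoring >= threshold are flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sensitivity


# Hypothetical model scores and chart-review labels (1 = adjudicated event).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

print(auc(scores, labels))                       # 0.875
print(ppv_and_sensitivity(scores, labels, 0.5))  # (0.75, 0.75)
```

Raising the threshold in this sketch generally trades sensitivity for PPV, which is the trade-off a single operating point such as the one reported above summarizes.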
Subjects
Anaphylaxis , Natural Language Processing , Humans , Anaphylaxis/diagnosis , Anaphylaxis/epidemiology , Machine Learning , Algorithms , Emergency Service, Hospital , Electronic Health Records
ABSTRACT
INTRODUCTION: This study examined change in tobacco use over 4 years among the general population of patients in six diverse health care organizations using electronic medical record data. METHODS: The study cohort (N = 34 393) included all patients aged 18 years or older who were identified as smokers in 2007 and who then had at least one primary care visit in each of the following 4 years. RESULTS: In the 4 years following 2007, this patient cohort had a median of 13 primary care visits, and 38.6% of the patients quit smoking at least once. At the end of the fourth follow-up year, 15.4% had stopped smoking for 1 year or more. Smokers were more likely to become long-term quitters if they were 65 or older (OR = 1.32, 95% CI = [1.16, 1.49]) or had a diagnosis of cancer (1.26 [1.12, 1.41]), cardiovascular disease (1.22 [1.09, 1.37]), asthma (1.15 [1.06, 1.25]), or diabetes (1.17 [1.09, 1.27]). Characteristics associated with lower likelihood of becoming a long-term quitter were female gender (0.90 [0.84, 0.95]), black race (0.84 [0.75, 0.94]), and non-Hispanic ethnicity (0.50 [0.43, 0.59]). CONCLUSIONS: Among smokers who regularly used these care systems, one in seven had achieved long-term cessation after 4 years. This study shows the practicality of using electronic medical records for monitoring patient smoking status over time. Similar methods could be used to assess tobacco use in any health care organization to evaluate the impact of environmental and organizational programs.
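For readers less familiar with the "OR [95% CI]" notation used above, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. It is an illustration only: the abstract does not say how its estimates were derived, the function name is hypothetical, and the counts are invented.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: long-term quitters vs. non-quitters, by age group.
or_, lo, hi = odds_ratio_ci(a=200, b=800, c=150, d=850)
print(f"OR = {or_:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

An interval that excludes 1, as every interval quoted above does, indicates an association that is statistically distinguishable from no effect at the chosen confidence level.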
Subjects
Delivery of Health Care/trends , Electronic Health Records/trends , Population Surveillance , Smoking Cessation/methods , Tobacco Use/trends , Tobacco Use/therapy , Adult , Aged , Cohort Studies , Delivery of Health Care/methods , Female , Humans , Longitudinal Studies , Male , Middle Aged , Population Surveillance/methods , Primary Health Care/methods , Primary Health Care/trends , Smoking/epidemiology , Smoking/therapy , Smoking/trends , Tobacco Use/epidemiology
ABSTRACT
OBJECTIVE: To present a general framework providing high-level guidance to developers of computable algorithms for identifying patients with specific clinical conditions (phenotypes) through a variety of approaches, including but not limited to machine learning and natural language processing methods to incorporate rich electronic health record data. MATERIALS AND METHODS: Drawing on extensive prior phenotyping experiences and insights derived from 3 algorithm development projects conducted specifically for this purpose, our team with expertise in clinical medicine, statistics, informatics, pharmacoepidemiology, and healthcare data science methods conceptualized stages of development and corresponding sets of principles, strategies, and practical guidelines for improving the algorithm development process. RESULTS: We propose 5 stages of algorithm development and corresponding principles, strategies, and guidelines: (1) assessing fitness-for-purpose, (2) creating gold standard data, (3) feature engineering, (4) model development, and (5) model evaluation. DISCUSSION AND CONCLUSION: This framework is intended to provide practical guidance and serve as a basis for future elaboration and extension.
Subjects
Algorithms , Electronic Health Records , Natural Language Processing , Phenotype , Humans , Machine Learning
ABSTRACT
Comparative effectiveness research (CER) has the potential to transform the current health care delivery system by identifying the most effective medical and surgical treatments, diagnostic tests, disease prevention methods, and ways to deliver care for specific clinical conditions. To be successful, such research requires the identification, capture, aggregation, integration, and analysis of disparate data sources held by different institutions with diverse representations of the relevant clinical events. In an effort to address these diverse demands, there have been multiple new designs and implementations of informatics platforms that provide access to electronic clinical data and the governance infrastructure required for interinstitutional CER. The goal of this manuscript is to help investigators understand why these informatics platforms are required and to compare and contrast 6 large-scale, recently funded, CER-focused informatics platform development efforts. We utilized an 8-dimension, sociotechnical model of health information technology to help guide our work. We identified 6 generic steps that are necessary in any distributed, multi-institutional CER project: data identification, extraction, modeling, aggregation, analysis, and dissemination. We expect that over the next several years these projects will provide answers to many important, and heretofore unanswerable, clinical research questions.
Subjects
Comparative Effectiveness Research , Medical Informatics/organization & administration , Outcome and Process Assessment, Health Care , Data Collection/methods , Humans , Medical Informatics/statistics & numerical data , Medical Records Systems, Computerized , Quality Assurance, Health Care , Quality Improvement , Registries , United States
ABSTRACT
Objective: Opioid surveillance in response to the opioid epidemic will benefit from scalable, automated algorithms for identifying patients with clinically documented signs of problem prescription opioid use. Existing algorithms lack accuracy. We sought to develop a high-sensitivity, high-specificity classification algorithm based on widely available structured health data to identify patients receiving chronic extended-release/long-acting (ER/LA) therapy with evidence of problem use to support subsequent epidemiologic investigations. Methods: Outpatient medical records of a probability sample of 2,000 Kaiser Permanente Washington patients receiving ≥60 days' supply of ER/LA opioids in a 90-day period from 1 January 2006 to 30 June 2015 were manually reviewed to determine the presence of clinically documented signs of problem use and used as a reference standard for algorithm development. Using 1,400 patients as training data, we constructed candidate predictors from demographic, enrollment, encounter, diagnosis, procedure, and medication data extracted from medical claims records or the equivalent from electronic health record (EHR) systems, and we used adaptive least absolute shrinkage and selection operator (LASSO) regression to develop a model. We evaluated this model in a comparable 600-patient validation set. We compared this model to ICD-9 diagnostic codes for opioid abuse, dependence, and poisoning. This study was registered with ClinicalTrials.gov as study NCT02667262 on 28 January 2016. Results: We operationalized 1,126 potential predictors characterizing patient demographics, procedures, diagnoses, timing, dose, and location of medication dispensing. The final model incorporating 53 predictors had a sensitivity of 0.582 at positive predictive value (PPV) of 0.572. ICD-9 codes for opioid abuse, dependence, and poisoning had a sensitivity of 0.390 at PPV of 0.599 in the same cohort. 
Conclusions: Scalable methods using widely available structured EHR/claims data to accurately identify problem opioid use among patients receiving long-term ER/LA therapy were unsuccessful. This approach may be useful for identifying patients needing clinical evaluation.
ABSTRACT
INTRODUCTION: Brief smoking-cessation interventions in primary care settings are effective, but delivery of these services remains low. The Centers for Medicare and Medicaid Services' Meaningful Use (MU) of Electronic Health Record (EHR) Incentive Program could increase rates of smoking assessment and cessation assistance among vulnerable populations. This study examined whether smoking status assessment, cessation assistance, and odds of being a current smoker changed after Stage 1 MU implementation. METHODS: EHR data were extracted from 26 community health centers with an EHR in place by June 15, 2009. Adjusted odds ratios (AORs) were computed for each binary outcome (smoking status assessment, counseling given, smoking-cessation medications ordered/discussed, current smoking status), comparing 2010 (pre-MU), 2012 (MU preparation), and 2014 (MU fully implemented) for pregnant and non-pregnant patients. RESULTS: Non-pregnant patients had decreased odds of current smoking over time; odds for all other outcomes increased except for medication orders from 2010 to 2012. Among pregnant patients, odds of assessment and counseling increased across all years. Odds of discussing or ordering cessation medications increased from 2010 to each of the other 2 study years; however, medication orders alone did not change over time, and current smoking only decreased from 2010 to 2012. Compared with non-pregnant patients, a lower percentage of pregnant patients were provided counseling. CONCLUSIONS: Findings suggest that incentives for MU of EHRs increase the odds of smoking assessment and cessation assistance, which could lead to decreased smoking rates among vulnerable populations. Continued efforts to provide cessation assistance to pregnant patients are warranted.
Subjects
Centers for Medicare and Medicaid Services, U.S./statistics & numerical data , Electronic Health Records/statistics & numerical data , Meaningful Use/statistics & numerical data , Smoking Cessation/methods , Smoking/therapy , Adult , Aged , Counseling/statistics & numerical data , Female , Humans , Male , Middle Aged , Pregnancy , Primary Health Care/methods , Primary Health Care/statistics & numerical data , Smoking/epidemiology , United States/epidemiology , Young Adult
ABSTRACT
OBJECTIVES: Comparative effectiveness research (CER) requires the capture and analysis of data from disparate sources, often from a variety of institutions with diverse electronic health record (EHR) implementations. In this paper we describe the CER Hub, a web-based informatics platform for developing and conducting research studies that combine comprehensive electronic clinical data from multiple health care organizations. METHODS: The CER Hub platform implements a data processing pipeline that employs informatics standards for data representation and web-based tools for developing study-specific data processing applications, providing standardized access to the patient-centric electronic health record (EHR) across organizations. RESULTS: The CER Hub is being used to conduct two CER studies utilizing data from six geographically distributed and demographically diverse health systems. These foundational studies address the effectiveness of medications for controlling asthma and the effectiveness of smoking cessation services delivered in primary care. DISCUSSION: The CER Hub includes four key capabilities: the ability to process and analyze both free-text and coded clinical data in the EHR; a data processing environment supported by distributed data and study governance processes; a clinical data-interchange format for facilitating standardized extraction of clinical data from EHRs; and a library of shareable clinical data processing applications. CONCLUSION: CER requires coordinated and scalable methods for extracting, aggregating, and analyzing complex, multi-institutional clinical data. By offering a range of informatics tools integrated into a framework for conducting studies using EHR data, the CER Hub provides a solution to the challenges of multi-institutional research using electronic medical record data.
Subjects
Comparative Effectiveness Research/standards , Electronic Health Records/organization & administration , Information Storage and Retrieval/standards , Meaningful Use/organization & administration , Medical Informatics/standards , Medical Record Linkage/standards , Guidelines as Topic , Internet/standards , Medical Record Linkage/methods , Natural Language Processing , Quality Assurance, Health Care/methods , United States
ABSTRACT
BACKGROUND: Numerous population-based surveys indicate that overweight and obese patients can benefit from lifestyle counseling during routine clinical care. PURPOSE: To determine if natural language processing (NLP) could be applied to information in the electronic health record (EHR) to automatically assess delivery of weight management-related counseling in clinical healthcare encounters. METHODS: The MediClass system with NLP capabilities was used to identify weight-management counseling in EHRs. Knowledge for the NLP application was derived from the 5As framework for behavior counseling: Ask (evaluate weight and related disease), Advise at-risk patients to lose weight, Assess patients' readiness to change behavior, Assist through discussion of weight-loss methods and programs, and Arrange follow-up efforts including referral. Using samples of EHR data between January 1, 2007, and March 31, 2011, from two health systems, the accuracy of the MediClass processor for identifying these counseling elements was evaluated in postpartum visits of 600 women with gestational diabetes mellitus (GDM) compared to manual chart review as the gold standard. Data were analyzed in 2013. RESULTS: Mean sensitivity and specificity for each of the 5As compared to the gold standard were at or above 85%, with the exception of sensitivity for Assist, which was 40% and 60%, respectively, in the two health systems. The automated method identified many valid Assist cases not identified in the gold standard. CONCLUSIONS: The MediClass processor has performance capability sufficiently similar to human abstractors to permit automated assessment of counseling for weight loss in postpartum encounter records.
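The per-element comparison against chart review described above can be sketched as follows. This is an illustrative reconstruction, not the MediClass evaluation code: the function names are hypothetical and the visits are invented toy data, with each visit represented as the set of 5As elements found in it.

```python
import math
from typing import Dict, List, Set, Tuple

ELEMENTS = ("Ask", "Advise", "Assess", "Assist", "Arrange")


def sens_spec(auto: List[bool], gold: List[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of automated flags against gold-standard flags."""
    tp = sum(1 for a, g in zip(auto, gold) if a and g)
    tn = sum(1 for a, g in zip(auto, gold) if not a and not g)
    fp = sum(1 for a, g in zip(auto, gold) if a and not g)
    fn = sum(1 for a, g in zip(auto, gold) if not a and g)
    sensitivity = tp / (tp + fn) if tp + fn else math.nan
    specificity = tn / (tn + fp) if tn + fp else math.nan
    return sensitivity, specificity


def per_element(
    auto_visits: List[Set[str]], gold_visits: List[Set[str]]
) -> Dict[str, Tuple[float, float]]:
    """Evaluate each 5As element separately across paired visits."""
    return {
        element: sens_spec(
            [element in visit for visit in auto_visits],
            [element in visit for visit in gold_visits],
        )
        for element in ELEMENTS
    }


# Hypothetical visits: elements found by chart review vs. by the automated method.
gold_visits = [{"Ask", "Advise"}, {"Ask"}, {"Assist"}, set()]
auto_visits = [{"Ask", "Advise"}, {"Ask", "Assist"}, set(), set()]

result = per_element(auto_visits, gold_visits)
for element, (sensitivity, specificity) in result.items():
    print(element, sensitivity, specificity)
```

In the toy data the automated method flags an Assist case that chart review did not record. A comparison like this counts it as a false positive even if the flag is clinically valid, which is the situation the RESULTS section notes for Assist.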
Subjects
Counseling/organization & administration , Electronic Health Records/organization & administration , Life Style , Overweight/therapy , Referral and Consultation , Adult , Diabetes, Gestational/epidemiology , Female , Health Behavior , Humans , Natural Language Processing , Obesity/therapy , Overweight/epidemiology , Pregnancy , Racial Groups
ABSTRACT
Comparative effectiveness research (CER) studies involving multiple institutions with diverse electronic health records (EHRs) depend on high quality data. To ensure uniformity of data derived from different EHR systems and implementations, the CER Hub informatics platform developed a quality assurance (QA) process using tools and data formats available through the CER Hub. The QA process, implemented here in a study of smoking cessation services in primary care, used the 'emrAdapter' tool programmed with a set of quality checks to query large samples of primary care encounter records extracted in accord with the CER Hub common data framework. The tool, deployed to each study site, generated error reports indicating data problems to be fixed locally and aggregate data sharable with the central site for quality review. Across the CER Hub network of six health systems, data completeness and correctness issues were prevalent in the first iteration and were considerably improved after three iterations of the QA process. A common issue encountered was incomplete mapping of local EHR data values to those defined by the common data framework. A highly automated and distributed QA process helped to ensure the correctness and completeness of patient care data extracted from EHRs for a multi-institution CER study in smoking cessation.
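The kind of check described above, flagging local EHR values that were never mapped to the common data framework, might look like the sketch below. It does not reproduce the emrAdapter tool or the CER Hub data format: the field names, value sets, and function name are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical value sets standing in for a common data framework.
ALLOWED_VALUES: Dict[str, Set[str]] = {
    "smoking_status": {"current", "former", "never", "unknown"},
    "sex": {"F", "M", "U"},
}


def check_records(records: Iterable[dict]) -> Tuple[List[str], Counter]:
    """Return a record-level error report and aggregate counts.

    The error report stays at the local site; only the aggregate counts,
    which carry no record-level data, would be shared for central review.
    """
    errors: List[str] = []
    summary: Counter = Counter()
    for index, record in enumerate(records):
        for field, allowed in ALLOWED_VALUES.items():
            value = record.get(field)
            if value is None or value == "":
                errors.append(f"record {index}: {field} is missing")
                summary[(field, "missing")] += 1
            elif value not in allowed:
                errors.append(f"record {index}: {field}={value!r} is not mapped")
                summary[(field, "unmapped")] += 1
    return errors, summary


# Hypothetical extracted encounter records from one site.
records = [
    {"smoking_status": "current", "sex": "F"},
    {"smoking_status": "Smoker - 1 ppd", "sex": "M"},  # local value, never mapped
    {"sex": "female"},                                 # missing field, unmapped value
]

errors, summary = check_records(records)
for line in errors:
    print(line)
print(dict(summary))
```

Separating the detailed report from the aggregate summary mirrors the split described above between problems fixed locally and data shared with the central site.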
Subjects
Comparative Effectiveness Research , Datasets as Topic/standards , Electronic Health Records/standards , Smoking Cessation , Humans , Internet , Medical Records Systems, Computerized , Quality Control
ABSTRACT
In an experiment to investigate cognitive skill differences between clinicians and lay persons, eight individuals in each group were asked to determine if an explicit concept existed in an ambulatory encounter note (a simple task) or if the concept could be inferred from the same note (a complex task). Subjects answered questions, highlighted text used to answer each question, and commented on their reasoning for selecting specific text. Quantitative results were mixed for expert vs. non-expert task performance on simple vs. complex tasks. Qualitative analysis revealed that data ambiguity obscured quantifiable skill differences between groups. In addition, this analysis offered new insight into whether a concept identification task is simple or complex. We present this case study to demonstrate the value of mixed method approaches to task-based performance study design and evaluation. We discuss the results in terms of their implications for evaluating meaningful use of technologies.