ABSTRACT
PURPOSE: The US Food and Drug Administration's Sentinel Innovation Center aimed to establish a query-ready, quality-checked distributed data network containing electronic health records (EHRs) linked with insurance claims data for at least 10 million individuals to expand the utility of real-world data for regulatory decision-making. METHODS: In this report, we describe the resulting network, the Real-World Evidence Data Enterprise (RWE-DE), including data from two commercial EHR-claims linked assets collectively termed the Commercial Network covering 21 million lives, and four academic partner institutions collectively termed the Development Network covering 4.5 million lives. RESULTS: We discuss provenance and completeness of the data converted to the Sentinel Common Data Model (SCDM), describe patient populations, and report on EHR-claims linkage characterization for all contributing data sources. Further, we introduce a standardized process to store free-text notes in the Development Network for efficient retrieval as needed. CONCLUSIONS: Finally, we outline typical use cases for the RWE-DE where it can broaden the types of questions that can be addressed by the Sentinel system.
Subject(s)
Electronic Health Records , United States Food and Drug Administration , United States , Humans , Electronic Health Records/statistics & numerical data , Insurance Claim Review , Sentinel Surveillance
ABSTRACT
BACKGROUND: Sequencing Mendelian arrhythmia genes in individuals without an indication for arrhythmia genetic testing can identify carriers of pathogenic or likely pathogenic (P/LP) variants. However, the extent to which these variants are associated with clinically meaningful phenotypes before or after return of variant results is unclear. In addition, the majority of discovered variants are currently classified as variants of uncertain significance, limiting clinical actionability. METHODS: The eMERGE-III study (Electronic Medical Records and Genomics Phase III) is a multicenter prospective cohort that included 21 846 participants without previous indication for cardiac genetic testing. Participants were sequenced for 109 Mendelian disease genes, including 10 linked to arrhythmia syndromes. Variant carriers were assessed with electronic health record-derived phenotypes and follow-up clinical examination. Selected variants of uncertain significance (n=50) were characterized in vitro with automated electrophysiology experiments in HEK293 cells. RESULTS: As previously reported, 3.0% of participants had P/LP variants in the 109 genes. Herein, we report 120 participants (0.6%) with P/LP arrhythmia variants. Compared with noncarriers, arrhythmia P/LP carriers had a significantly higher burden of arrhythmia phenotypes in their electronic health records. Fifty-four participants had variant results returned. Nineteen of these 54 participants had inherited arrhythmia syndrome diagnoses (primarily long-QT syndrome), and 12 of these 19 diagnoses were made only after variant results were returned (0.05%). After in vitro functional evaluation of 50 variants of uncertain significance, we reclassified 11 variants: 3 to likely benign and 8 to P/LP. CONCLUSIONS: Genome sequencing in a large population without indication for arrhythmia genetic testing identified phenotype-positive carriers of variants in congenital arrhythmia syndrome disease genes. 
As the genomes of large numbers of people are sequenced, the disease risk from rare variants in arrhythmia genes can be assessed by integrating genomic screening, electronic health record phenotypes, and in vitro functional studies. REGISTRATION: URL: https://www.clinicaltrials.gov; Unique identifier: NCT03394859.
Subject(s)
Arrhythmias, Cardiac , Genetic Testing , Arrhythmias, Cardiac/diagnosis , Arrhythmias, Cardiac/genetics , Genetic Predisposition to Disease , Genetic Testing/methods , Genomics , HEK293 Cells , Humans , Phenotype , Prospective Studies
ABSTRACT
We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.
Subject(s)
Anaphylaxis , Natural Language Processing , Humans , Anaphylaxis/diagnosis , Anaphylaxis/epidemiology , Machine Learning , Algorithms , Emergency Service, Hospital , Electronic Health Records
ABSTRACT
BACKGROUND: Acute pancreatitis is a serious gastrointestinal disease that is an important target for drug safety surveillance. Little is known about the accuracy of ICD-10 codes for acute pancreatitis in the United States, or their performance in specific clinical settings. We conducted a validation study to assess the accuracy of acute pancreatitis ICD-10 diagnosis codes in inpatient, emergency department (ED), and outpatient settings. METHODS: We reviewed electronic medical records for encounters with acute pancreatitis diagnosis codes in an integrated healthcare system from October 2015 to December 2019. Trained abstractors and physician adjudicators determined whether events met criteria for acute pancreatitis. RESULTS: Out of 1,844 eligible events, we randomly sampled 300 for review. Across all clinical settings, 182 events met validation criteria for an overall positive predictive value (PPV) of 61% (95% confidence interval [CI] = 55, 66%). The PPV was 87% (95% CI = 79, 92%) for inpatient codes, but only 45% for ED (95% CI = 35, 54%) and outpatient (95% CI = 34, 55%) codes. ED and outpatient encounters accounted for 43% of validated events. Acute pancreatitis codes from any encounter type with lipase >3 times the upper limit of normal had a PPV of 92% (95% CI = 86, 95%) and identified 85% of validated events (95% CI = 79, 89%), while codes with lipase <3 times the upper limit of normal had a PPV of only 22% (95% CI = 16, 30%). CONCLUSIONS: These results suggest that ICD-10 codes accurately identified acute pancreatitis in the inpatient setting, but not in the ED and outpatient settings. Laboratory data substantially improved algorithm performance.
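As an illustration of how the overall PPV in this abstract can be reproduced from the reported counts (182 validated events out of 300 reviewed), the following minimal sketch computes the proportion and a 95% confidence interval. The abstract does not say which interval method the authors used; the Wilson score interval here is an assumption chosen for illustration, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not state its CI method; Wilson is
    used here only as a common, reasonable choice.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 182 of 300 sampled events were validated.
validated, reviewed = 182, 300
ppv = validated / reviewed
low, high = wilson_interval(validated, reviewed)
print(f"PPV = {ppv:.0%} (95% CI = {low:.0%}, {high:.0%})")
```

Under these assumptions the result rounds to the figures quoted in the abstract (61%, with an interval of about 55% to 66%). A weighted estimate or a different interval method could give slightly different bounds.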
Subject(s)
Delivery of Health Care, Integrated , Pancreatitis , Adult , Humans , United States/epidemiology , Acute Disease , Pancreatitis/diagnosis , Pancreatitis/epidemiology , International Classification of Diseases , Predictive Value of Tests , Lipase
ABSTRACT
Carotid artery atherosclerotic disease (CAAD) is a risk factor for stroke. We used a genome-wide association (GWAS) approach to discover genetic variants associated with CAAD in participants in the electronic Medical Records and Genomics (eMERGE) Network. We identified adult CAAD cases with unilateral or bilateral carotid artery stenosis and controls without evidence of stenosis from electronic health records at eight eMERGE sites. We performed GWAS with a model adjusting for age, sex, study site, and genetic principal components of ancestry. In eMERGE we found 1793 CAAD cases and 17,958 controls. Two loci reached genome-wide significance, on chr6 in LPA (rs10455872, odds ratio [OR] (95% confidence interval [CI]) = 1.50 (1.30-1.73), p = 2.1 × 10⁻⁸) and on chr7, an intergenic single nucleotide variant (SNV; rs6952610, OR (95% CI) = 1.25 (1.16-1.36), p = 4.3 × 10⁻⁸). The chr7 association remained significant in the presence of the LPA SNV as a covariate. The LPA SNV was also associated with coronary heart disease (CHD; 4199 cases and 11,679 controls) in this study (OR (95% CI) = 1.27 (1.13-1.43), p = 5 × 10⁻⁵) but the chr7 SNV was not (OR (95% CI) = 1.03 (0.97-1.09), p = .37). Both variants replicated in UK Biobank. Elevated lipoprotein(a) concentrations ([Lp(a)]) and LPA variants associated with elevated [Lp(a)] have previously been associated with CAAD and CHD, including rs10455872. With electronic health record phenotypes in eMERGE and UKB, we replicated a previously known association and identified a novel locus associated with CAAD.
Subject(s)
Carotid Stenosis , Genome-Wide Association Study , Electronic Health Records , Genetic Predisposition to Disease , Genomics , Humans , Lipoprotein(a)/genetics , Models, Genetic , Polymorphism, Single Nucleotide
ABSTRACT
As clinical testing for Mendelian causes of colorectal cancer (CRC) is largely driven by recognition of family history and early age of onset, the rates of such findings among individuals with prevalent CRC not recognized to have these features are largely unknown. We evaluated actionable genomic findings in community-based participants ascertained by three phenotypes: (1) CRC, (2) one or more adenomatous colon polyps, and (3) control participants over age 59 years without CRC or colon polyps. These participants underwent sequencing for a panel of genes that included colorectal cancer/polyp (CRC/P)-associated and actionable incidental findings genes. Those with CRC had a 3.8% rate of positive results (pathogenic or likely pathogenic) for a CRC-associated gene variant, despite generally being older at CRC onset (mean 72 years). Those ascertained for polyps had a 0.8% positive rate and those with no CRC/P had a positive rate of 0.2%. Though incidental finding rates unrelated to colon cancer were similar for all groups, our positive rate for cardiovascular findings exceeds disease prevalence, suggesting variant interpretation challenges or low penetrance in these genes. The rate of HFE c.845G>A (p.Cys282Tyr) homozygotes in the CRC group reinforces a previously reported, but relatively unexplored, association between hemochromatosis and CRC. These results in a general clinical population suggest that current testing strategies could be improved in order to better detect Mendelian CRC-associated conditions. These data also underscore the need for additional functional and familial evidence to clarify the pathogenicity and penetrance of variants deemed pathogenic or likely pathogenic, particularly among the actionable genes associated with cardiovascular disease.
Subject(s)
Colonic Polyps/genetics , Colorectal Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged
ABSTRACT
BACKGROUND: Currently available medications for chronic osteoarthritis pain are only moderately effective, and their use is limited in many patients because of serious adverse effects and contraindications. The primary surgical option for osteoarthritis is total joint replacement (TJR). The objectives of this study were to describe the treatment history of patients with osteoarthritis receiving prescription pain medications and/or intra-articular corticosteroid injections, and to estimate the incidence of TJR in these patients. METHODS: This retrospective, multicenter, cohort study utilized health plan administrative claims data (January 1, 2013, through December 31, 2019) of adult patients with osteoarthritis in the Innovation in Medical Evidence Development and Surveillance Distributed Database, a subset of the US FDA Sentinel Distributed Database. Patients were analyzed in two cohorts: those with prevalent use of "any pain medication" (prescription non-steroidal anti-inflammatory drugs [NSAIDs], opioids, and/or intra-articular corticosteroid injections) using only the first qualifying dispensing (index date); and those with prevalent use of "each specific pain medication class" with all qualifying treatment episodes identified. RESULTS: Among 1 992 670 prevalent users of "any pain medication", pain medications prescribed on the index date were NSAIDs (596 624 [29.9%] patients), opioids (1 161 806 [58.3%]), and intra-articular corticosteroids (323 459 [16.2%]). Further, 92 026 patients received multiple pain medications on the index date, including 71 632 (3.6%) receiving both NSAIDs and opioids. Altogether, 20.6% of patients used an NSAID at any time following an opioid index dispensing and 17.2% used an opioid following an NSAID index dispensing. 
The TJR incidence rates per 100 person-years (95% confidence interval [CI]) were 3.21 (95% CI: 3.20-3.23) in the "any pain medication" user cohort, and among those receiving "each specific pain medication class" were NSAIDs, 4.63 (95% CI: 4.58-4.67); opioids, 7.45 (95% CI: 7.40-7.49); and intra-articular corticosteroids, 8.05 (95% CI: 7.97-8.13). CONCLUSIONS: In patients treated with prescription medications for osteoarthritis pain, opioids were more commonly prescribed at index than NSAIDs and intra-articular corticosteroid injections. Of the pain medication classes examined, the incidence of TJR was highest in patients receiving intra-articular corticosteroids and lowest in patients receiving NSAIDs.
Subject(s)
Arthroplasty, Replacement , Chronic Pain , Osteoarthritis , Adrenal Cortex Hormones/adverse effects , Adult , Analgesics, Opioid/therapeutic use , Anti-Inflammatory Agents, Non-Steroidal , Arthroplasty, Replacement/adverse effects , Chronic Pain/drug therapy , Chronic Pain/epidemiology , Cohort Studies , Humans , Incidence , Osteoarthritis/drug therapy , Osteoarthritis/epidemiology , Osteoarthritis/surgery , Retrospective Studies
ABSTRACT
BACKGROUND: Patients and their loved ones often report symptoms or complaints of cognitive decline that clinicians note in free clinical text, but no structured screening or diagnostic data are recorded. These symptoms/complaints may be signals that predict who will go on to be diagnosed with mild cognitive impairment (MCI) and ultimately develop Alzheimer's Disease or related dementias. Our objective was to develop a natural language processing system and prediction model for identification of MCI from clinical text in the absence of screening or other structured diagnostic information. METHODS: There were two populations of patients: 1794 participants in the Adult Changes in Thought (ACT) study and 2391 patients in the general population of Kaiser Permanente Washington. All individuals had standardized cognitive assessment scores. We excluded patients with a diagnosis of Alzheimer's Disease or dementia, or use of donepezil. We manually annotated 10,391 clinic notes to train the NLP model. Standard Python code was used to extract phrases from notes and map each phrase to a cognitive functioning concept. Concepts derived from the NLP system were used to predict future MCI. The prediction model was trained on the ACT cohort and 60% of the general population cohort with 40% withheld for validation. We used a least absolute shrinkage and selection operator (LASSO) logistic regression approach to fit a prediction model with MCI as the prediction target. Using the predicted case status from the LASSO model and known MCI from standardized scores, we constructed receiver operating characteristic curves to measure model performance. RESULTS: Chart abstraction identified 42 MCI concepts. Prediction model performance in the validation data set was modest with an area under the curve of 0.67. Setting the cutoff for correct classification at 0.60, the classifier yielded sensitivity of 1.7%, specificity of 99.7%, PPV of 70% and NPV of 70.5% in the validation cohort.
DISCUSSION AND CONCLUSION: Although the sensitivity of the machine learning model was poor, negative predictive value was high, an important characteristic of models used for population-based screening. While an AUC of 0.67 is generally considered moderate performance, it is also comparable to several tests that are widely used in clinical practice.
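To make the relationship between the four reported metrics concrete, the sketch below derives PPV and NPV from sensitivity, specificity, and prevalence using Bayes' rule. The sensitivity (1.7%) and specificity (99.7%) come from the abstract. The MCI prevalence in the validation cohort is not reported there, so the value of 0.30 is an assumption, chosen because it roughly reproduces the published PPV and NPV. The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are from the abstract.
# ASSUMPTION: prevalence of 0.30 is not reported; it is an illustrative value.
ppv, npv = predictive_values(sensitivity=0.017, specificity=0.997, prevalence=0.30)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these inputs both values come out near 70%, in line with the abstract. Both PPV and NPV depend on prevalence, so they would change if the same classifier were applied to a population with a different proportion of MCI.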
Subject(s)
Alzheimer Disease , Cognitive Dysfunction , Alzheimer Disease/diagnosis , Cognitive Dysfunction/diagnosis , Humans , Machine Learning , Mass Screening , Natural Language Processing
ABSTRACT
INTRODUCTION: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. METHODS: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. RESULTS: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. DISCUSSION AND CONCLUSION: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.
Subject(s)
Algorithms , Electronic Health Records , Genomics , Humans , Knowledge Bases , Phenotype
ABSTRACT
Background: Most states have legalized medical cannabis, yet little is known about how medical cannabis use is documented in patients' electronic health records (EHRs). We used natural language processing (NLP) to calculate the prevalence of clinician-documented medical cannabis use among adults in an integrated health system in Washington State where medical and recreational use are legal. Methods: We analyzed EHRs of patients ≥18 years old screened for past-year cannabis use (November 1, 2017-October 31, 2018), to identify clinician-documented medical cannabis use. We defined medical use as any documentation of cannabis that was recommended by a clinician or described by the clinician or patient as intended to manage health conditions or symptoms. We developed and applied an NLP system that included NLP-assisted manual review to identify such documentation in encounter notes. Results: Medical cannabis use was documented for 16,684 (5.6%) of 299,597 outpatient encounters with routine screening for cannabis use among 203,489 patients seeing 1,274 clinicians. The validated NLP system identified 54% of documentation and NLP-assisted manual review the remainder. Language documenting reasons for cannabis use included 125 terms indicating medical use, 28 terms indicating non-medical use and 41 ambiguous terms. Implicit documentation of medical use (e.g., "edible THC nightly for lumbar pain") was more common than explicit (e.g., "continues medical cannabis use"). Conclusions: Clinicians use diverse and often ambiguous language to document patients' reasons for cannabis use. Automating extraction of documentation about patients' cannabis use could facilitate clinical decision support and epidemiological investigation but will require large amounts of gold standard training data.
Subject(s)
Medical Marijuana , Natural Language Processing , Adolescent , Adult , Documentation , Humans , Medical Marijuana/therapeutic use , Patient Reported Outcome Measures , Primary Health Care
ABSTRACT
BACKGROUND: Abdominal aortic aneurysm (AAA) is an important cause of cardiovascular mortality; however, its genetic determinants remain incompletely defined. In total, 10 previously identified risk loci explain a small fraction of AAA heritability. METHODS: We performed a genome-wide association study in the Million Veteran Program testing ≈18 million DNA sequence variants with AAA (7642 cases and 172 172 controls) in veterans of European ancestry with independent replication in up to 4972 cases and 99 858 controls. We then used mendelian randomization to examine the causal effects of blood pressure on AAA. We examined the association of AAA risk variants with aneurysms in the lower extremity, cerebral, and iliac arterial beds, and derived a genome-wide polygenic risk score (PRS) to identify a subset of the population at greater risk for disease. RESULTS: Through a genome-wide association study, we identified 14 novel loci, bringing the total number of known significant AAA loci to 24. In our mendelian randomization analysis, we demonstrate that a genetic increase of 10 mm Hg in diastolic blood pressure (odds ratio, 1.43 [95% CI, 1.24-1.66]; P=1.6×10⁻⁶), as opposed to systolic blood pressure (odds ratio, 1.06 [95% CI, 0.97-1.15]; P=0.2), likely has a causal relationship with AAA development. We observed that 19 of 24 AAA risk variants associate with aneurysms in at least 1 other vascular territory. A 29-variant PRS was strongly associated with AAA (odds ratio_PRS, 1.26 [95% CI, 1.18-1.36]; P_PRS=2.7×10⁻¹¹ per SD increase in PRS), independent of family history and smoking risk factors (odds ratio_PRS+family history+smoking, 1.24 [95% CI, 1.14-1.35]; P_PRS=1.27×10⁻⁶). Using this PRS, we identified a subset of the population with AAA prevalence greater than that observed in screening trials informing current guidelines.
CONCLUSIONS: We identify novel AAA genetic associations with therapeutic implications and identify a subset of the population at significantly increased genetic risk of AAA independent of family history. Our data suggest that extending current screening guidelines to include testing to identify those with high polygenic AAA risk, once the cost of genotyping becomes comparable with that of screening ultrasound, would significantly increase the yield of current screening at reasonable cost.
Subject(s)
Aortic Aneurysm, Abdominal/genetics , Humans , Veterans
ABSTRACT
BACKGROUND: Anaphylaxis is a life-threatening allergic reaction that is difficult to identify accurately with administrative data. We conducted a population-based validation study to assess the accuracy of ICD-10 diagnosis codes for anaphylaxis in outpatient, emergency department, and inpatient settings. METHODS: In an integrated healthcare system in Washington State, we obtained medical records from healthcare encounters with anaphylaxis diagnosis codes (potential events) from October 2015 to December 2018. To capture events missed by anaphylaxis diagnosis codes, we also obtained records on a sample of serious allergic and drug reactions. Two physicians determined whether potential events met established clinical criteria for anaphylaxis (validated events). RESULTS: Out of 239 potential events with anaphylaxis diagnosis codes, the overall positive predictive value (PPV) for validated events was 64% (95% CI = 58 to 70). The PPV decreased with increasing age. Common precipitants for anaphylaxis were food (39%), medications (35%), and insect bite or sting (12%). The sensitivity of emergency department and inpatient anaphylaxis diagnosis codes for all validated events was 58% (95% CI = 51 to 65), but sensitivity increased to 95% (95% CI = 74 to 99) when outpatient diagnosis codes were included. Using information from all validated events and sampling weights, the incidence rate for anaphylaxis was 3.6 events per 10,000 person-years (95% CI = 3.1 to 4.0). CONCLUSIONS: In this population-based setting, ICD-10 diagnosis codes for anaphylaxis from emergency department and inpatient settings had moderate PPV and sensitivity for validated events. These findings have implications for epidemiologic studies that seek to estimate risks of anaphylaxis using electronic health data.
Subject(s)
Anaphylaxis , Anaphylaxis/diagnosis , Anaphylaxis/epidemiology , Electronic Health Records , Humans , International Classification of Diseases , Predictive Value of Tests , Washington/epidemiology
ABSTRACT
The Electronic Medical Records and Genomics (eMERGE) network is a network of medical centers with electronic medical records linked to existing biorepository samples for genomic discovery and genomic medicine research. The network sought to unify the genetic results from 78 Illumina and Affymetrix genotype array batches from 12 contributing medical centers for joint association analysis of 83,717 human participants. In this report, we describe the imputation of eMERGE results and methods to create the unified imputed merged set of genome-wide variant genotype data. We imputed the data using the Michigan Imputation Server, which provides a missing single-nucleotide variant genotype imputation service using the minimac3 imputation algorithm with the Haplotype Reference Consortium genotype reference set. We describe the quality control and filtering steps used in the generation of this data set and suggest generalizable quality thresholds for imputation and phenotype association studies. To test the merged imputed genotype set, we replicated a previously reported chromosome 6 HLA-B herpes zoster (shingles) association and discovered a novel zoster-associated locus in an epigenetic binding site near the terminus of chromosome 3 (3p29).
Subject(s)
Electronic Health Records , Genetic Predisposition to Disease , Genome-Wide Association Study , Herpes Zoster/genetics , Algorithms , Black People/genetics , Chromosomes, Human/genetics , Female , Haplotypes/genetics , Homozygote , Humans , Male , Phenotype , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , White People/genetics
ABSTRACT
Individuals participating in biobanks and other large research projects are increasingly asked to provide broad consent for open-ended research use and widespread sharing of their biosamples and data. We assessed willingness to participate in a biobank using different consent and data sharing models, hypothesizing that willingness would be higher under more restrictive scenarios. Perceived benefits, concerns, and information needs were also assessed. In this experimental survey, individuals from 11 US healthcare systems in the Electronic Medical Records and Genomics (eMERGE) Network were randomly allocated to one of three hypothetical scenarios: tiered consent and controlled data sharing; broad consent and controlled data sharing; or broad consent and open data sharing. Of 82,328 eligible individuals, exactly 13,000 (15.8%) completed the survey. Overall, 66% (95% CI: 63%-69%) of population-weighted respondents stated they would be willing to participate in a biobank; willingness and attitudes did not differ between respondents in the three scenarios. Willingness to participate was associated with self-identified white race, higher educational attainment, lower religiosity, perceiving more research benefits, fewer concerns, and fewer information needs. Most (86%, CI: 84%-87%) participants would want to know what would happen if a researcher misused their health information; fewer (51%, CI: 47%-55%) would worry about their privacy. The concern that the use of broad consent and open data sharing could adversely affect participant recruitment is not supported by these findings. Addressing potential participants' concerns and information needs and building trust and relationships with communities may increase acceptance of broad consent and wide data sharing in biobank research.
Subject(s)
Biological Specimen Banks/ethics , Information Dissemination/ethics , Informed Consent/ethics , Public Opinion , Adolescent , Adult , Aged , Biomedical Research/ethics , Electronic Health Records/ethics , Female , Genome, Human , Genomics , Humans , Male , Middle Aged , Privacy , Socioeconomic Factors , United States , Young Adult
ABSTRACT
BACKGROUND: Primary care providers prescribe most long-term opioid therapy and are increasingly asked to taper the opioid doses of these patients to safer levels. A recent systematic review suggests that multiple interventions may facilitate opioid taper, but many of these are not feasible within the usual primary care practice. OBJECTIVE: To determine if opioid taper plans documented by primary care providers in the electronic health record are associated with significant and sustained opioid dose reductions among patients on long-term opioid therapy. DESIGN: A nested case-control design was used to compare cases (patients with a sustained opioid taper defined as average daily opioid dose of ≤ 30 mg morphine equivalent (MME) or a 50% reduction in MME) to controls (patients matched to cases on year and quarter of cohort entry, sex, and age group, who had not achieved a sustained taper). Each case was matched with four controls. PARTICIPANTS: Two thousand four hundred nine patients receiving a ≥ 60-day supply of opioids with an average daily dose of ≥ 50 MME during 2011-2015. MAIN MEASURES: Opioid taper plans documented in prescription instructions or clinical notes within the electronic health record identified through natural language processing; opioid dosing, patient characteristics, and taper plan components also abstracted from the electronic health record. KEY RESULTS: Primary care taper plans were associated with an increased likelihood of sustained opioid taper after adjusting for all patient covariates and near peak dose (OR = 3.63 [95% CI 2.96-4.46], p < 0.0001). Both taper plans in prescription instructions (OR = 4.03 [95% CI 3.19-5.09], p < 0.0001) and in clinical notes (OR = 2.82 [95% CI 2.00-3.99], p < 0.0001) were associated with sustained taper. CONCLUSIONS: These results suggest that planning for opioid taper during primary care visits may facilitate significant and sustained opioid dose reduction.
Subject(s)
Analgesics, Opioid , Drug Tapering , Electronic Health Records , Analgesics, Opioid/adverse effects , Case-Control Studies , Humans , Primary Health Care
ABSTRACT
Genome-wide association studies (GWAS) have identified several risk variants for late-onset Alzheimer's disease (LOAD). These common variants have replicable but small effects on LOAD risk and generally do not have obvious functional effects. Low-frequency coding variants, not detected by GWAS, are predicted to include functional variants with larger effects on risk. To identify low-frequency coding variants with large effects on LOAD risk, we carried out whole-exome sequencing (WES) in 14 large LOAD families and follow-up analyses of the candidate variants in several large LOAD case-control data sets. A rare variant in PLD3 (phospholipase D3; Val232Met) segregated with disease status in two independent families and doubled risk for Alzheimer's disease in seven independent case-control series with a total of more than 11,000 cases and controls of European descent. Gene-based burden analyses in 4,387 cases and controls of European descent and 302 African American cases and controls, with complete sequence data for PLD3, reveal that several variants in this gene increase risk for Alzheimer's disease in both populations. PLD3 is highly expressed in brain regions that are vulnerable to Alzheimer's disease pathology, including hippocampus and cortex, and is expressed at significantly lower levels in neurons from Alzheimer's disease brains compared to control brains. Overexpression of PLD3 leads to a significant decrease in intracellular amyloid-β precursor protein (APP) and extracellular Aβ42 and Aβ40 (the 42- and 40-residue isoforms of the amyloid-β peptide), and knockdown of PLD3 leads to a significant increase in extracellular Aβ42 and Aβ40. Together, our genetic and functional data indicate that carriers of PLD3 coding variants have a twofold increased risk for LOAD and that PLD3 influences APP processing. This study provides an example of how densely affected families may help to identify rare variants with large effects on risk for disease or other complex traits.
Subject(s)
Alzheimer Disease/genetics; Genetic Predisposition to Disease/genetics; Genetic Variation/genetics; Phospholipase D/genetics; Black or African American/genetics; Age of Onset; Aged; Aged, 80 and over; Alzheimer Disease/metabolism; Amyloid beta-Peptides/metabolism; Amyloid beta-Protein Precursor/metabolism; Brain/metabolism; Case-Control Studies; Europe/ethnology; Exome/genetics; Female; Humans; Male; Peptide Fragments/metabolism; Phospholipase D/deficiency; Phospholipase D/metabolism; Protein Processing, Post-Translational/genetics; Proteolysis
ABSTRACT
BACKGROUND: The extent to which obesity and genetics determine postoperative complications is incompletely understood. METHODS: We performed a retrospective study using two population cohorts with electronic health record (EHR) data. The first included 736,726 adults with body mass index (BMI) recorded between 1990 and 2017 at Vanderbilt University Medical Center. The second cohort consisted of 65,174 individuals from 12 institutions contributing EHR and genome-wide genotyping data to the Electronic Medical Records and Genomics (eMERGE) Network. Pairwise logistic regression analyses were used to measure the association of BMI categories with postoperative complications derived from International Classification of Diseases, Ninth Revision (ICD-9) codes, including postoperative infection, incisional hernia, and intestinal obstruction. A genetic risk score was constructed from 97 obesity-risk single-nucleotide polymorphisms for a Mendelian randomization study to determine the association of genetic risk of obesity with postoperative complications. Logistic regression analyses were adjusted for sex, age, site, and race/principal components. RESULTS: Individuals with overweight or obese BMI (≥25 kg/m²) had increased risk of incisional hernia (odds ratio [OR] 1.7-5.5, p < 3.1 × 10⁻²⁰), and people with obesity (BMI ≥ 30 kg/m²) had increased risk of postoperative infection (OR 1.2-2.3, p < 2.5 × 10⁻⁵). In the eMERGE cohort, genetically predicted BMI was associated with incisional hernia (OR 2.1 [95% CI 1.8-2.5], p = 1.4 × 10⁻⁶) and postoperative infection (OR 1.6 [95% CI 1.4-1.9], p = 3.1 × 10⁻⁶). Association findings were similar after limitation of the cohorts to those who underwent abdominal procedures. CONCLUSIONS: Clinical and Mendelian randomization studies suggest that obesity, as measured by BMI, is associated with the development of postoperative incisional hernia and infection.
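As an illustration of the genetic risk score mentioned in the METHODS above, the following is a minimal sketch of one common construction: a weighted sum of risk-allele dosages. The abstract does not state how the 97 variants were weighted, so the weighting scheme, the function name `genetic_risk_score`, and all numeric values below are assumptions for illustration only.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per SNP).

    Assumption: each SNP contributes dosage * per-allele effect size.
    The study's actual weighting is not given in the abstract.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-allele effect sizes for three SNPs (not study values).
weights = [0.5, 0.25, 0.125]

# Hypothetical genotypes for two individuals.
person_a = [2, 1, 0]
person_b = [0, 1, 2]

score_a = genetic_risk_score(person_a, weights)
score_b = genetic_risk_score(person_b, weights)
print(score_a, score_b)
```

In a Mendelian randomization setting, a score of this kind would then serve as the genetic instrument for BMI in the regression on each complication.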
Subject(s)
Mendelian Randomization Analysis/methods; Obesity/complications; Postoperative Complications/genetics; Adult; Body Mass Index; Female; Humans; Logistic Models; Male; Middle Aged; Polymorphism, Single Nucleotide; Postoperative Complications/etiology; Retrospective Studies; Risk Factors
ABSTRACT
Resting-state white blood cell (WBC) count is a marker of inflammation and immune system health. There is evidence that WBC count is not fixed over time and that there is heterogeneity in WBC trajectory that is associated with morbidity and mortality. Latent class mixed modeling (LCMM) is a method that can identify unobserved heterogeneity in longitudinal data and attempts to classify individuals into groups based on a linear model of repeated measurements. We applied LCMM to repeated WBC count measures derived from electronic medical records of participants in the National Human Genome Research Institute (NHGRI) Electronic Medical Records and Genomics (eMERGE) Network study, revealing two WBC count trajectory phenotypes. Advancing these phenotypes to GWAS, we found genetic associations between trajectory class membership and regions on chromosome 1p34.3 and chromosome 11q13.4. The chromosome 1 region contains CSF3R, which encodes the granulocyte colony-stimulating factor receptor. This protein is a major factor in neutrophil stimulation and proliferation. The associated region on chromosome 11 contains the genes RNF169 and XRRA1, both involved in the regulation of double-strand break DNA repair.
Subject(s)
Leukocyte Count/methods; Leukocytes/classification; Adult; Aged; Databases, Genetic; Electronic Health Records; Female; Genome-Wide Association Study; Humans; Latent Class Analysis; Male; Middle Aged; Phenotype; Polymorphism, Single Nucleotide/genetics; Proteins/genetics; Receptors, Colony-Stimulating Factor/genetics; Ubiquitin-Protein Ligases/genetics
ABSTRACT
BACKGROUND: Proteomic approaches allow measurement of thousands of proteins in a single specimen, which can accelerate biomarker discovery. However, applying these technologies to massive biobanks is not currently feasible because of the practical barriers and costs of implementing such assays at scale. To overcome these challenges, we used a "virtual proteomic" approach, linking genetically predicted protein levels to clinical diagnoses in >40 000 individuals. METHODS: We used genome-wide association data from the Framingham Heart Study (n=759) to construct genetic predictors for 1129 plasma protein levels. We validated the genetic predictors for 268 proteins and used them to compute predicted protein levels in 41 288 genotyped individuals in the Electronic Medical Records and Genomics (eMERGE) cohort. We tested associations for each predicted protein with 1128 clinical phenotypes. Lead associations were validated with directly measured protein levels and either low-density lipoprotein cholesterol or subclinical atherosclerosis in the MDCS (Malmö Diet and Cancer Study; n=651). RESULTS: In the virtual proteomic analysis in eMERGE, 55 proteins were associated with 89 distinct diagnoses at a false discovery rate q<0.1. Among these, 13 associations involved lipid (n=7) or atherosclerosis (n=6) phenotypes. We tested each association for validation in MDCS using directly measured protein levels. At Bonferroni-adjusted significance thresholds, levels of apolipoprotein E isoforms were associated with hyperlipidemia, and circulating C-type lectin domain family 1 member B and platelet-derived growth factor receptor-β predicted subclinical atherosclerosis. Odds ratios for carotid atherosclerosis were 1.31 (95% CI, 1.08-1.58; P=0.006) per 1-SD increment in C-type lectin domain family 1 member B and 0.79 (0.66-0.94; P=0.008) per 1-SD increment in platelet-derived growth factor receptor-β.
CONCLUSIONS: We demonstrate a biomarker discovery paradigm to identify candidate biomarkers of cardiovascular and other diseases.
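To make the false discovery rate threshold (q<0.1) in the RESULTS above concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment. The abstract does not say which FDR procedure was used, so Benjamini-Hochberg is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the original input order.

    Assumption: the study's FDR procedure is not named in the abstract;
    BH is used here only as a common example.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for four protein-phenotype tests (not study data).
p = [0.01, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)

# Tests retained at the q < 0.1 threshold used in the abstract.
significant = [i for i, qi in enumerate(q) if qi < 0.1]
print(q, significant)
```

Each q-value can be read as the smallest FDR level at which that test would be declared significant, which is why a fixed cutoff such as 0.1 can be applied directly to the adjusted values.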