Results 1 - 20 of 34
1.
Am J Hum Genet ; 106(5): 707-716, 2020 05 07.
Article in English | MEDLINE | ID: mdl-32386537

ABSTRACT

Because polygenic risk scores (PRSs) for coronary heart disease (CHD) are derived mainly from European ancestry (EA) cohorts, their validity in African ancestry (AA) and Hispanic ethnicity (HE) individuals is unclear. We investigated associations of "restricted" and genome-wide PRSs with CHD in three major racial and ethnic groups in the U.S. The eMERGE cohort (mean age 48 ± 14 years, 58% female) included 45,645 EA, 7,597 AA, and 2,493 HE individuals. We assessed two restricted PRSs (PRS_Tikkanen and PRS_Tada; 28 and 50 variants, respectively) and two genome-wide PRSs (PRS_metaGRS and PRS_LDPred; 1.7 M and 6.6 M variants, respectively) derived from EA cohorts. Over a median follow-up of 11.1 years, 2,652 incident CHD events occurred. Hazard and odds ratios for the association of PRSs with CHD were similar in EA and HE cohorts but lower in AA cohorts. Genome-wide PRSs were more strongly associated with CHD than restricted PRSs were. PRS_metaGRS, the best-performing PRS, was associated with CHD in all three cohorts; hazard ratios (95% CI) per 1 SD increase were 1.53 (1.46-1.60), 1.53 (1.23-1.90), and 1.27 (1.13-1.43) for incident CHD in EA, HE, and AA individuals, respectively. The hazard ratios were comparable in the EA and HE cohorts (p_interaction = 0.77) but were significantly attenuated in AA individuals (p_interaction = 2.9 × 10^-3). These results highlight the potential clinical utility of PRSs for CHD as well as the need to assemble diverse cohorts to generate ancestry- and ethnicity-specific PRSs.


Subjects
African Americans/genetics, Coronary Disease/genetics, European Continental Ancestry Group/genetics, Genetic Predisposition to Disease, Hispanic Americans/genetics, Multifactorial Inheritance/genetics, Cohort Studies, Female, Humans, Male, Middle Aged, Odds Ratio
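
Note on entry 1: a polygenic risk score is, in general, a weighted sum of risk-allele dosages, and a hazard ratio "per 1 SD increase" refers to the score after standardization. The sketch below illustrates only that general idea. The variant names, weights, and genotypes are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)

def standardize(scores):
    """Rescale scores to mean 0 and SD 1 so effects can be read 'per 1 SD'."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Hypothetical per-variant weights and genotypes, for illustration only.
weights = {"rsA": 0.10, "rsB": 0.05, "rsC": 0.20}
cohort = [
    {"rsA": 0, "rsB": 1, "rsC": 0},
    {"rsA": 1, "rsB": 2, "rsC": 1},
    {"rsA": 2, "rsB": 2, "rsC": 2},
]

raw = [polygenic_score(person, weights) for person in cohort]
z = standardize(raw)  # standardized scores, one per person
```

Standardizing within a cohort is one common choice. Whether the study standardized within each ancestry group or across the pooled cohort is not stated in the abstract.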
2.
World J Surg ; 44(1): 84-94, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31605180

ABSTRACT

BACKGROUND: The extent to which obesity and genetics determine postoperative complications is incompletely understood. METHODS: We performed a retrospective study using two population cohorts with electronic health record (EHR) data. The first included 736,726 adults with body mass index (BMI) recorded between 1990 and 2017 at Vanderbilt University Medical Center. The second cohort consisted of 65,174 individuals from 12 institutions contributing EHR and genome-wide genotyping data to the Electronic Medical Records and Genomics (eMERGE) Network. Pairwise logistic regression analyses were used to measure the association of BMI categories with postoperative complications derived from International Classification of Disease-9 codes, including postoperative infection, incisional hernia, and intestinal obstruction. A genetic risk score was constructed from 97 obesity-risk single-nucleotide polymorphisms for a Mendelian randomization study to determine the association of genetic risk of obesity with postoperative complications. Logistic regression analyses were adjusted for sex, age, site, and race/principal components. RESULTS: Individuals with overweight or obese BMI (≥25 kg/m^2) had increased risk of incisional hernia (odds ratio [OR] 1.7-5.5, p < 3.1 × 10^-20), and people with obesity (BMI ≥ 30 kg/m^2) had increased risk of postoperative infection (OR 1.2-2.3, p < 2.5 × 10^-5). In the eMERGE cohort, genetically predicted BMI was associated with incisional hernia (OR 2.1 [95% CI 1.8-2.5], p = 1.4 × 10^-6) and postoperative infection (OR 1.6 [95% CI 1.4-1.9], p = 3.1 × 10^-6). Association findings were similar after limitation of the cohorts to those who underwent abdominal procedures. CONCLUSIONS: Clinical and Mendelian randomization studies suggest that obesity, as measured by BMI, is associated with the development of postoperative incisional hernia and infection.

3.
Pediatrics ; 144(6)2019 12.
Article in English | MEDLINE | ID: mdl-31699831

ABSTRACT

OBJECTIVES: Proton pump inhibitors (PPIs) are often used in pediatrics to treat common gastrointestinal disorders, and there are growing concerns for infectious adverse events. Because CYP2C19 inactivates PPIs, genetic variants that increase CYP2C19 function may decrease PPI exposure and infections. We tested the hypothesis that CYP2C19 metabolizer phenotypes are associated with infection event rates in children exposed to PPIs. METHODS: This retrospective biorepository cohort study included individuals aged 0 to 36 months at the time of PPI exposure. Respiratory tract and gastrointestinal tract infection events were identified by using International Classification of Diseases codes in the year after the first PPI mention. Variants defining CYP2C19 *2, *3, *4, *8, *9, and *17 were genotyped, and all individuals were classified as CYP2C19 poor or intermediate, normal metabolizers (NMs), or rapid or ultrarapid metabolizers (RM/UMs). Infection rates were compared by using univariate and multivariate analyses. RESULTS: In all, 670 individuals were included (median age 7 months; 44% girls). CYP2C19 NMs (n = 267; 40%) had a higher infection rate than RM/UMs (n = 220; 33%; median 2 vs 1 infections per person per year; P = .03). There was no difference between poor or intermediate (n = 183; 27%) and NMs. In multivariable analysis of NMs and RM/UMs adjusting for age, sex, PPI dose, and comorbidities, CYP2C19 metabolizer status remained a significant risk factor for infection events (odds ratio 0.70 [95% confidence interval 0.50-0.97] for RM/UMs versus NMs). CONCLUSIONS: PPI therapy is associated with higher infection rates in children with normal CYP2C19 function than in those with increased CYP2C19 function, highlighting this adverse effect of PPI therapy and the relevance of CYP2C19 genotypes to PPI therapeutic decision-making.


Subjects
Cytochrome P-450 CYP2C19/genetics, Infections/chemically induced, Infections/genetics, Phenotype, Proton Pump Inhibitors/adverse effects, Cohort Studies, Female, Humans, Infant, Infections/diagnosis, Male, Retrospective Studies, Risk Factors
4.
Nat Genet ; 51(10): 1459-1474, 2019 10.
Article in English | MEDLINE | ID: mdl-31578528

ABSTRACT

Elevated serum urate levels cause gout and correlate with cardiometabolic diseases via poorly understood mechanisms. We performed a trans-ancestry genome-wide association study of serum urate in 457,690 individuals, identifying 183 loci (147 previously unknown) that improve the prediction of gout in an independent cohort of 334,880 individuals. Serum urate showed significant genetic correlations with many cardiometabolic traits, with genetic causality analyses supporting a substantial role for pleiotropy. Enrichment analysis, fine-mapping of urate-associated loci and colocalization with gene expression in 47 tissues implicated the kidney and liver as the main target organs and prioritized potentially causal genes and variants, including the transcriptional master regulators in the liver and kidney, HNF1A and HNF4A. Experimental validation showed that HNF4A transactivated the promoter of ABCG2, encoding a major urate transporter, in kidney cells, and that HNF4A p.Thr139Ile is a functional variant. Transcriptional coregulation within and across organs may be a general mechanism underlying the observed pleiotropy between urate and cardiometabolic traits.


Subjects
Cardiovascular Diseases/blood, Genetic Markers, Gout/blood, Metabolic Diseases/blood, Single Nucleotide Polymorphism, Signal Transduction, Uric Acid/blood, ATP Binding Cassette Transporter Subfamily G Member 2/genetics, Cardiovascular Diseases/epidemiology, Cardiovascular Diseases/genetics, Cohort Studies, Genetic Loci, Genetic Predisposition to Disease, Genome-Wide Association Study, Gout/epidemiology, Gout/genetics, Hepatocyte Nuclear Factor 1-alpha/genetics, Hepatocyte Nuclear Factor 4/genetics, Humans, Kidney/metabolism, Kidney/pathology, Liver/metabolism, Liver/pathology, Metabolic Diseases/epidemiology, Metabolic Diseases/genetics, Neoplasm Proteins/genetics, Organ Specificity
5.
Nat Commun ; 10(1): 4130, 2019 09 11.
Article in English | MEDLINE | ID: mdl-31511532

ABSTRACT

Increased levels of the urinary albumin-to-creatinine ratio (UACR) are associated with higher risk of kidney disease progression and cardiovascular events, but underlying mechanisms are incompletely understood. Here, we conduct trans-ethnic (n = 564,257) and European-ancestry specific meta-analyses of genome-wide association studies of UACR, including ancestry- and diabetes-specific analyses, and identify 68 UACR-associated loci. Genetic correlation analyses and risk score associations in an independent electronic medical records database (n = 192,868) reveal connections with proteinuria, hyperlipidemia, gout, and hypertension. Fine-mapping and trans-Omics analyses with gene expression in 47 tissues and plasma protein levels implicate genes potentially operating through differential expression in kidney (including TGFB1, MUC1, PRKCI, and OAF), and allow coupling of UACR associations to altered plasma OAF concentrations. Knockdown of OAF and PRKCI orthologs in Drosophila nephrocytes reduces albumin endocytosis. Silencing fly PRKCI further impairs slit diaphragm formation. These results generate a priority list of genes and pathways for translational research to reduce albuminuria.


Subjects
Albuminuria/genetics, Chromosome Mapping, Genome-Wide Association Study, Meta-Analysis as Topic, Animals, Creatinine/urine, Diabetes Mellitus/genetics, Diabetes Mellitus/urine, Drosophila melanogaster/genetics, Gene Expression Regulation, Genetic Loci, Genetic Predisposition to Disease, Humans, Phenomics, Risk Factors
6.
BMC Med ; 17(1): 135, 2019 07 17.
Article in English | MEDLINE | ID: mdl-31311600

ABSTRACT

BACKGROUND: Non-alcoholic fatty liver disease (NAFLD) is a common chronic liver illness with a genetically heterogeneous background that can be accompanied by considerable morbidity and attendant health care costs. The pathogenesis and progression of NAFLD are complex, with many unanswered questions. We conducted genome-wide association studies (GWASs) using both adult and pediatric participants from the Electronic Medical Records and Genomics (eMERGE) Network to identify novel genetic contributors to this condition. METHODS: First, a natural language processing (NLP) algorithm was developed, tested, and deployed at each site to identify 1106 NAFLD cases and 8571 controls and histological data from liver tissue in 235 available participants. These include 1242 pediatric participants (396 cases, 846 controls). The algorithm included billing codes, text queries, laboratory values, and medication records. Next, GWASs were performed on NAFLD cases and controls and case-only analyses using histologic scores and liver function tests adjusting for age, sex, site, ancestry, principal components (PCs), and body mass index (BMI). RESULTS: Consistent with previous results, a robust association was detected for the PNPLA3 gene cluster in participants with European ancestry. At the PNPLA3-SAMM50 region, three SNPs, rs738409, rs738408, and rs3747207, showed the strongest association (best SNP rs738409, p = 1.70 × 10^-20). This effect was consistent in both pediatric (p = 9.92 × 10^-6) and adult (p = 9.73 × 10^-15) cohorts. Additionally, this variant was also associated with disease severity and NAFLD Activity Score (NAS) (p = 3.94 × 10^-8, beta = 0.85). PheWAS analysis linked this locus to a spectrum of liver diseases beyond NAFLD, with a novel negative correlation with gout (p = 1.09 × 10^-4). We also identified novel loci for NAFLD disease severity, including one novel locus for NAS near IL17RA (rs5748926, p = 3.80 × 10^-8), and another near ZFP90-CDH1 for fibrosis (rs698718, p = 2.74 × 10^-11). Post-GWAS and gene-based analyses identified more than 300 genes that were used for functional and pathway enrichment analyses. CONCLUSIONS: In summary, this study demonstrates clear confirmation of a previously described NAFLD risk locus and several novel associations. Further collaborative studies including an ethnically diverse population with well-characterized liver histologic features of NAFLD are needed to further validate the novel findings.


Subjects
Non-alcoholic Fatty Liver Disease/genetics, Adult, Aged, Body Mass Index, Case-Control Studies, Community Networks/organization & administration, Community Networks/statistics & numerical data, Disease Progression, Electronic Health Records/organization & administration, Electronic Health Records/statistics & numerical data, Female, Genetic Predisposition to Disease, Genome-Wide Association Study, Genomics/organization & administration, Genomics/statistics & numerical data, Humans, Lipase/genetics, Male, Membrane Proteins/genetics, Middle Aged, Morbidity, Non-alcoholic Fatty Liver Disease/epidemiology, Phenotype, Single Nucleotide Polymorphism, Signal Transduction/genetics
7.
J Biomed Inform ; 96: 103253, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31325501

ABSTRACT

BACKGROUND: Implementing clinical phenotypes across a network is labor intensive and potentially error prone. Use of a common data model may facilitate the process. METHODS: Electronic Medical Records and Genomics (eMERGE) sites implemented the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) Common Data Model across their electronic health record (EHR)-linked DNA biobanks. Two previously implemented eMERGE phenotypes were converted to OMOP and implemented across the network. RESULTS: It was feasible to implement the common data model across sites, with laboratory data producing the greatest challenge due to local encoding. Sites were then able to execute the OMOP phenotype in less than one day, as opposed to weeks of effort to manually implement an eMERGE phenotype in their bespoke research EHR databases. Of the sites that could compare the current OMOP phenotype implementation with the original eMERGE phenotype implementation, specific agreement ranged from 100% to 43%, with disagreements due to the original phenotype, the OMOP phenotype, changes in data, and issues in the databases. Using the OMOP query as a standard comparison revealed differences in the original implementations despite starting from the same definitions, code lists, flowcharts, and pseudocode. CONCLUSION: Using a common data model can dramatically speed phenotype implementation at the cost of having to populate that data model, though this will produce a net benefit as the number of phenotype implementations increases. Inconsistencies among the implementations of the original queries point to a potential benefit of using a common data model so that actual phenotype code and logic can be shared, mitigating human error in reinterpretation of a narrative phenotype definition.
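
Note on entry 7: one way to compare the cases selected by two phenotype implementations is positive specific agreement, 2a / (2a + b + c), where a counts patients selected by both and b and c count patients selected by only one. The sketch below assumes this definition; the abstract does not spell out the exact formula used, and the patient identifiers are hypothetical.

```python
def positive_specific_agreement(cases_a, cases_b):
    """Positive specific agreement between two case lists: 2a / (2a + b + c)."""
    set_a, set_b = set(cases_a), set(cases_b)
    both = len(set_a & set_b)        # a: selected by both implementations
    discordant = len(set_a ^ set_b)  # b + c: selected by exactly one
    denominator = 2 * both + discordant
    if denominator == 0:
        raise ValueError("neither implementation selected any cases")
    return 2 * both / denominator

# Hypothetical patient identifiers, for illustration only.
original_cases = {"p01", "p02", "p03", "p04"}
omop_cases = {"p03", "p04", "p05"}
agreement = positive_specific_agreement(original_cases, omop_cases)
```

Here two patients are shared and three are discordant, so the agreement is 4/7. Identical case lists give 1.0.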

8.
NPJ Genom Med ; 4: 3, 2019.
Article in English | MEDLINE | ID: mdl-30774981

ABSTRACT

We conducted an electronic health record (EHR)-based phenome-wide association study (PheWAS) to discover pleiotropic effects of variants in three lipoprotein metabolism genes: PCSK9, APOB, and LDLR. Using high-density genotype data, we tested the associations of variants in the three genes with 1232 EHR-derived binary phecodes in 51,700 European-ancestry (EA) individuals and 585 phecodes in 10,276 African-ancestry (AA) individuals; 457 PCSK9, 730 APOB, and 720 LDLR variants were filtered by imputation quality (r^2 > 0.4), minor allele frequency (>1%), linkage disequilibrium (r^2 < 0.3), and association with LDL-C levels, yielding a set of two PCSK9, three APOB, and five LDLR variants in EA but no variants in AA. Cases and controls were defined for each phecode using the PheWAS package in R. Logistic regression assuming an additive genetic model was used with adjustment for age, sex, and the first two principal components. Significant associations were tested in additional cohorts from Vanderbilt University (n = 29,713), the Marshfield Clinic Personalized Medicine Research Project (n = 9562), and UK Biobank (n = 408,455). We identified one PCSK9, two APOB, and two LDLR variants significantly associated with an examined phecode. Only one of the variants was associated with a non-lipid disease phecode ("myopia"), but this association was not significant in the replication cohorts. In this large-scale PheWAS, we did not find LDL-C-related variants in PCSK9, APOB, and LDLR to be associated with non-lipid-related phenotypes including diabetes, neurocognitive disorders, or cataracts.
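
Note on entry 8: the variant-filtering thresholds quoted in the abstract (imputation r^2 > 0.4, minor allele frequency > 1%, pairwise linkage disequilibrium r^2 < 0.3) can be sketched as a two-stage filter. This is an illustrative sketch only: the variant records, the greedy pruning order, and the treatment of missing LD pairs as unlinked are all assumptions, and the LDL-C association step is omitted.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str
    info_r2: float  # imputation quality
    maf: float      # minor allele frequency

def passes_qc(v, min_r2=0.4, min_maf=0.01):
    """Keep variants above the imputation-quality and frequency thresholds."""
    return v.info_r2 > min_r2 and v.maf > min_maf

def ld_prune(variants, pair_r2, max_r2=0.3):
    """Greedy pruning: keep a variant only if its LD r^2 with every
    already-kept variant is below max_r2. Missing pairs count as 0 (assumption)."""
    kept = []
    for v in variants:
        if all(pair_r2.get(frozenset((v.name, k.name)), 0.0) < max_r2 for k in kept):
            kept.append(v)
    return kept

# Hypothetical variants and LD values, for illustration only.
variants = [
    Variant("v1", 0.90, 0.200),
    Variant("v2", 0.30, 0.200),  # fails imputation quality
    Variant("v3", 0.80, 0.005),  # fails minor allele frequency
    Variant("v4", 0.95, 0.100),  # in high LD with v1
    Variant("v5", 0.70, 0.050),
]
pair_r2 = {frozenset(("v1", "v4")): 0.8}

qc_passed = [v for v in variants if passes_qc(v)]
selected = [v.name for v in ld_prune(qc_passed, pair_r2)]
```

With these inputs, v2 and v3 fail the quality filters and v4 is pruned for LD with v1, leaving v1 and v5.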

9.
Pediatr Res ; 85(5): 602-606, 2019 04.
Article in English | MEDLINE | ID: mdl-30661084

ABSTRACT

BACKGROUND: There are few and conflicting data on the role of cytochrome P450 2D6 (CYP2D6) polymorphisms in relation to risperidone adverse events (AEs) in children. This study assessed the association between CYP2D6 metabolizer status and risk for risperidone AEs in children. METHODS: Children ≤18 years with at least 4 weeks of risperidone exposure were identified using BioVU, a de-identified DNA biobank linked to electronic health record data. The primary outcome of this study was AEs. After DNA sequencing, individuals were classified as CYP2D6 poor, intermediate, normal, or ultrarapid metabolizers. RESULTS: For analysis, the 257 individuals were grouped as poor/intermediate metabolizers (n = 33, 13%) and normal/ultrarapid metabolizers (n = 224, 87%). AEs were more common in poor/intermediate vs. normal/ultrarapid metabolizers (15/33, 46% vs. 61/224, 27%, P = 0.04). In multivariate analysis adjusting for age, sex, race, and initial dose, poor/intermediate metabolizers had increased AE risk (adjusted odds ratio 2.4, 95% confidence interval 1.1-5.1, P = 0.03). CONCLUSION: Children with CYP2D6 poor or intermediate metabolizer phenotypes are at greater risk for risperidone AEs. Pre-prescription genotyping could identify this high-risk subset for an alternate therapy, risperidone dose reduction, and/or increased monitoring for AEs.


Subjects
Cytochrome P-450 CYP2D6/genetics, Pharmacogenetics, Genetic Polymorphism, Risperidone/adverse effects, Adolescent, Alleles, Child, Electronic Health Records, Female, Genotype, Humans, Male, Phenotype, Retrospective Studies, Risk, Treatment Outcome
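
Note on entry 9: the counts reported in the abstract (15 of 33 vs. 61 of 224 with adverse events) are enough to compute an unadjusted odds ratio, which comes to about 2.2 and sits close to the adjusted estimate of 2.4. The sketch below shows that arithmetic with a Woolf (log-scale) 95% confidence interval. This interval method is a standard textbook choice assumed here for illustration, not necessarily what the authors used.

```python
import math

def odds_ratio_with_ci(events_1, total_1, events_0, total_0, z=1.96):
    """Unadjusted odds ratio and Woolf confidence interval from 2x2 counts."""
    a, b = events_1, total_1 - events_1  # group 1: events, non-events
    c, d = events_0, total_0 - events_0  # group 0: events, non-events
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell makes this estimator undefined")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Counts from the abstract: poor/intermediate vs. normal/ultrarapid metabolizers.
unadjusted_or, ci_low, ci_high = odds_ratio_with_ci(15, 33, 61, 224)
```

The unadjusted figure ignores age, sex, race, and initial dose, so it is a rough cross-check rather than a substitute for the multivariable result.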
10.
Genes Immun ; 20(7): 555-565, 2019 09.
Article in English | MEDLINE | ID: mdl-30459343

ABSTRACT

Resting-state white blood cell (WBC) count is a marker of inflammation and immune system health. There is evidence that WBC count is not fixed over time and that there is heterogeneity in WBC trajectory that is associated with morbidity and mortality. Latent class mixed modeling (LCMM) is a method that can identify unobserved heterogeneity in longitudinal data and attempts to classify individuals into groups based on a linear model of repeated measurements. We applied LCMM to repeated WBC count measures derived from electronic medical records of participants of the National Human Genome Research Institute (NHGRI) electronic MEdical Record and GEnomics (eMERGE) network study, revealing two WBC count trajectory phenotypes. Advancing these phenotypes to GWAS, we found genetic associations between trajectory class membership and regions on chromosome 1p34.3 and chromosome 11q13.4. The chromosome 1 region contains CSF3R, which encodes the granulocyte colony-stimulating factor receptor. This protein is a major factor in neutrophil stimulation and proliferation. The associated region on chromosome 11 contains the genes RNF169 and XRRA1, both involved in the regulation of double-strand break DNA repair.


Subjects
Leukocyte Count/methods, Leukocytes/classification, Adult, Aged, Genetic Databases, Electronic Health Records, Female, Genome-Wide Association Study, Humans, Latent Class Analysis, Male, Middle Aged, Phenotype, Single Nucleotide Polymorphism/genetics, Proteins/genetics, Colony-Stimulating Factor Receptors/genetics, Ubiquitin-Protein Ligases/genetics
11.
Genet Epidemiol ; 43(1): 63-81, 2019 02.
Article in English | MEDLINE | ID: mdl-30298529

ABSTRACT

The Electronic Medical Records and Genomics (eMERGE) network is a network of medical centers with electronic medical records linked to existing biorepository samples for genomic discovery and genomic medicine research. The network sought to unify the genetic results from 78 Illumina and Affymetrix genotype array batches from 12 contributing medical centers for joint association analysis of 83,717 human participants. In this report, we describe the imputation of eMERGE results and methods to create the unified imputed merged set of genome-wide variant genotype data. We imputed the data using the Michigan Imputation Server, which provides a missing single-nucleotide variant genotype imputation service using the minimac3 imputation algorithm with the Haplotype Reference Consortium genotype reference set. We describe the quality control and filtering steps used in the generation of this data set and suggest generalizable quality thresholds for imputation and phenotype association studies. To test the merged imputed genotype set, we replicated a previously reported chromosome 6 HLA-B herpes zoster (shingles) association and discovered a novel zoster-associated locus in an epigenetic binding site near the terminus of chromosome 3 (3p29).


Subjects
Electronic Health Records, Genetic Predisposition to Disease, Genome-Wide Association Study, Herpes Zoster/genetics, African Continental Ancestry Group/genetics, Algorithms, Human Chromosomes/genetics, European Continental Ancestry Group/genetics, Female, Haplotypes/genetics, Homozygote, Humans, Male, Phenotype, Single Nucleotide Polymorphism/genetics, Principal Component Analysis
12.
Cell Host Microbe ; 24(2): 308-323.e6, 2018 08 08.
Article in English | MEDLINE | ID: mdl-30092202

ABSTRACT

Pathogens have been a strong driving force for natural selection. Therefore, understanding how human genetic differences impact infection-related cellular traits can mechanistically link genetic variation to disease susceptibility. Here we report the Hi-HOST Phenome Project (H2P2): a catalog of cellular genome-wide association studies (GWAS) comprising 79 infection-related phenotypes in response to 8 pathogens in 528 lymphoblastoid cell lines. Seventeen loci surpass genome-wide significance for infection-associated phenotypes ranging from pathogen replication to cytokine production. We combined H2P2 with clinical association data from patients to identify a SNP near CXCL10 as a risk factor for inflammatory bowel disease. A SNP in the transcriptional repressor ZBTB20 demonstrated pleiotropy, likely through suppression of multiple target genes, and was associated with viral hepatitis. These data are available on a web portal to facilitate interpreting human genome variation through the lens of cell biology and should serve as a rich resource for the research community.


Subjects
Computational Biology/methods, Genetic Predisposition to Disease, Genetic Variation, Human Genome, Genome-Wide Association Study/methods, Infections, Phenotype, Monoclonal Antibodies, Cell Line, Chemokine CXCL10/genetics, Cytokines/genetics, Cytokines/metabolism, DNA Mutational Analysis, DNA Replication, Data Collection, Genetic Databases, Electronic Health Records, Genetic Pleiotropy, Genome-Wide Association Study/instrumentation, Human Viral Hepatitis, Humans, Inflammatory Bowel Diseases, Nerve Tissue Proteins/genetics, Risk Factors, Transcription Factors/genetics, Web Browser
13.
J Am Med Inform Assoc ; 25(11): 1540-1546, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30124903

ABSTRACT

Electronic health record (EHR) algorithms for defining patient cohorts are commonly shared as free-text descriptions that require human intervention both to interpret and implement. We developed the Phenotype Execution and Modeling Architecture (PhEMA, http://projectphema.org) to author and execute standardized computable phenotype algorithms. With PhEMA, we converted an algorithm for benign prostatic hyperplasia, developed for the electronic Medical Records and Genomics network (eMERGE), into a standards-based computable format. Eight sites (7 within eMERGE) received the computable algorithm, and 6 successfully executed it against local data warehouses and/or i2b2 instances. Blinded random chart review of cases selected by the computable algorithm shows PPV ≥90%, and 3 out of 5 sites had >90% overlap of selected cases when comparing the computable algorithm to their original eMERGE implementation. This case study demonstrates potential use of PhEMA computable representations to automate phenotyping across different EHR systems, but also highlights some ongoing challenges.


Subjects
Algorithms, Electronic Health Records, Phenotype, Prostatic Hyperplasia/diagnosis, Data Warehousing, Factual Databases, Genomics, Humans, Male, Organizational Case Studies, Prostatic Hyperplasia/genetics
14.
Bioinformatics ; 34(17): 2988-2996, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29912272

ABSTRACT

Motivation: Phenome-wide association studies (PheWAS) have been used to discover many genotype-phenotype relationships and have the potential to identify therapeutic and adverse drug outcomes using longitudinal data within electronic health records (EHRs). However, the statistical methods for PheWAS applied to longitudinal EHR medication data have not been established. Results: In this study, we developed methods to address two challenges faced with reuse of EHR for this purpose: confounding by indication, and low exposure and event rates. We used Monte Carlo simulation to assess propensity score (PS) methods, focusing on two of the most commonly used methods, PS matching and PS adjustment, to address confounding by indication. We also compared two logistic regression approaches (the default of Wald versus Firth's penalized maximum likelihood, PML) to address complete separation due to sparse data with low exposure and event rates. PS adjustment resulted in greater power than PS matching, while controlling Type I error at 0.05. The PML method provided reasonable P-values, even in cases with complete separation, with well controlled Type I error rates. Using PS adjustment and the PML method, we identify novel latent drug effects in pediatric patients exposed to two common antibiotic drugs, ampicillin and gentamicin. Availability and implementation: R packages PheWAS and EHR are available at https://github.com/PheWAS/PheWAS and at CRAN (https://www.r-project.org/), respectively. The R script for data processing and the main analysis is available at https://github.com/choileena/EHR. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Datasets as Topic, Drug Discovery, Drug-Related Side Effects and Adverse Reactions, Electronic Health Records, Humans, Logistic Models, Probability
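
Note on entry 14: the abstract contrasts ordinary maximum likelihood with Firth's penalized maximum likelihood under complete separation. In the special case of a single binary exposure and a binary outcome (a 2x2 table), the Firth estimate of the log odds ratio coincides with the ordinary estimate after adding 0.5 to every cell, which makes the contrast easy to see. The sketch below covers only that special case, with hypothetical counts. It is not the multivariable procedure the authors ran in R.

```python
import math

def ml_log_odds_ratio(a, b, c, d):
    """Ordinary ML log odds ratio; infinite when separation empties a cell."""
    numerator, denominator = a * d, b * c
    if numerator == 0 and denominator == 0:
        raise ValueError("odds ratio undefined for this table")
    if denominator == 0:
        return math.inf
    if numerator == 0:
        return -math.inf
    return math.log(numerator / denominator)

def firth_log_odds_ratio_2x2(a, b, c, d):
    """Firth-type estimate for a 2x2 table: add 0.5 to each cell first."""
    return math.log(((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)))

# Hypothetical sparse table: every exposed patient had the event.
# a = exposed with event, b = exposed without, c = unexposed with, d = unexposed without.
a, b, c, d = 4, 0, 1, 9
ml_estimate = ml_log_odds_ratio(a, b, c, d)            # diverges under separation
firth_estimate = firth_log_odds_ratio_2x2(a, b, c, d)  # stays finite
```

The ordinary estimate diverges because no exposed patient lacks the event, while the penalized estimate remains finite, which is the behavior the abstract describes for sparse exposure and event data.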
15.
Arthritis Res Ther ; 20(1): 69, 2018 04 10.
Article in English | MEDLINE | ID: mdl-29636090

ABSTRACT

BACKGROUND: African Americans with systemic lupus erythematosus (SLE) have increased renal disease compared to Caucasians, but differences in other comorbidities have not been well-described. We used an electronic health record (EHR) technique to test for differences in comorbidities in African Americans compared to Caucasians with SLE. METHODS: We used a de-identified EHR with 2.8 million subjects to identify SLE cases using a validated algorithm. We performed phenome-wide association studies (PheWAS) comparing African American to Caucasian SLE cases and African American SLE cases to matched non-SLE controls. Controls were age, sex, and race matched to SLE cases. For multiple testing, a false discovery rate (FDR) p value of 0.05 was used. RESULTS: We identified 270 African Americans and 715 Caucasians with SLE and 1425 matched African American controls. Compared to Caucasians with SLE adjusting for age and sex, African Americans with SLE had more comorbidities in every organ system. The most striking included hypertension (odds ratio [OR] = 4.25, FDR p = 5.49 × 10^-15), renal dialysis (OR = 10.90, FDR p = 8.75 × 10^-14), and pneumonia (OR = 3.57, FDR p = 2.32 × 10^-8). Compared to the African American matched controls without SLE, African Americans with SLE were more likely to have comorbidities in every organ system. The most significant codes were renal and cardiac, and included renal failure (OR = 9.55, FDR p = 2.26 × 10^-40) and hypertensive heart and renal disease (OR = 8.08, FDR p = 1.78 × 10^-22). Adjusting for race, age, and sex in a model including both African American and Caucasian SLE cases and controls, SLE was independently associated with renal, cardiovascular, and infectious diseases (all p < 0.01). CONCLUSIONS: African Americans with SLE have an increased comorbidity burden compared to Caucasians with SLE and matched controls. This increase in comorbidities in African Americans with SLE highlights the need to monitor for cardiovascular and infectious complications.


Subjects
Comorbidity, Systemic Lupus Erythematosus/epidemiology, Adult, African Americans, European Continental Ancestry Group, Female, Humans, Male, Middle Aged
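
Note on entry 15: a PheWAS tests many phecodes at once, which is why the abstract reports FDR p values. The abstract does not name the FDR procedure, so the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment and uses hypothetical p-values.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five phecodes, for illustration only.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in fdr_p]
```

In this example the third and fourth tests have raw p-values below 0.05 but adjusted values just above it, so only the first two remain significant at an FDR of 0.05.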
16.
Arthritis Care Res (Hoboken) ; 70(11): 1630-1636, 2018 11.
Article in English | MEDLINE | ID: mdl-29481723

ABSTRACT

OBJECTIVE: Phenome-wide association studies (PheWAS) scan across billing codes in the electronic health record (EHR) and re-purpose clinical EHR data for research. In this study, we examined whether PheWAS could function as an EHR-based discovery tool for systemic lupus erythematosus (SLE) and identified novel clinical associations in male versus female patients with SLE. METHODS: We used a de-identified version of the Vanderbilt University Medical Center EHR, which includes more than 2.8 million subjects. We performed EHR-based PheWAS to compare SLE patients with age-, sex-, and race-matched control subjects and to compare male SLE patients with female SLE patients, controlling for multiple testing using a false discovery rate (FDR) P value of 0.05. RESULTS: We identified 1,097 patients with SLE and 5,735 matched control subjects. In a comparison of patients with SLE and matched controls, SLE patients were shown to be more likely to have International Classification of Diseases, Ninth Revision codes related to the SLE disease criteria. In the PheWAS of male versus female SLE patients, with adjustment for age and race, male patients were shown to be more likely to have atrial fibrillation (odds ratio 4.50, false discovery rate P = 3.23 × 10^-3). Chart review confirmed atrial fibrillation, with the majority of patients developing atrial fibrillation after the SLE diagnosis and having multiple risk factors for atrial fibrillation. After adjustment for age, sex, race, and coronary artery disease, SLE disease status was shown to be significantly associated with atrial fibrillation (P = 0.002). CONCLUSION: Using PheWAS to compare male and female patients with SLE, we identified a novel association of an increased incidence of atrial fibrillation in male patients. SLE disease status was shown to be independently associated with atrial fibrillation, even after adjustment for age, sex, race, and coronary artery disease. These results demonstrate the utility of PheWAS as an EHR-based discovery tool for SLE.


Subjects
Atrial Fibrillation/etiology , Lupus Erythematosus, Systemic/complications , Adult , Aged , Humans , Male , Middle Aged , Retrospective Studies , Sex Characteristics
17.
PLoS One ; 12(7): e0175508, 2017.
Article in English | MEDLINE | ID: mdl-28686612

ABSTRACT

OBJECTIVE: To compare three groupings of Electronic Health Record (EHR) billing codes for their ability to represent clinically meaningful phenotypes and to replicate known genetic associations. The three tested coding systems were the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, the Agency for Healthcare Research and Quality Clinical Classification Software for ICD-9-CM (CCS), and manually curated "phecodes" designed to facilitate phenome-wide association studies (PheWAS) in EHRs. METHODS AND MATERIALS: We selected 100 disease phenotypes and compared the ability of each coding system to accurately represent them without performing additional groupings. The 100 phenotypes included 25 randomly chosen clinical phenotypes pursued in prior genome-wide association studies (GWAS) and another 75 common disease phenotypes mentioned across free-text problem lists from 189,289 individuals. We then evaluated the performance of each coding system to replicate known associations for 440 SNP-phenotype pairs. RESULTS: Out of the 100 tested clinical phenotypes, phecodes exactly matched 83, compared to 53 for ICD-9-CM and 32 for CCS. ICD-9-CM codes were typically too detailed (requiring custom groupings) while CCS codes were often not granular enough. Among 440 tested known SNP-phenotype associations, use of phecodes replicated 153 SNP-phenotype pairs compared to 143 for ICD-9-CM and 139 for CCS. Phecodes also generally produced stronger odds ratios and lower p-values for known associations than ICD-9-CM and CCS. Finally, evaluation of several SNPs via PheWAS identified novel potential signals, some seen only when using the phecode approach. Among them, rs7318369 in PEPD was associated with gastrointestinal hemorrhage. CONCLUSION: Our results suggest that the phecode groupings better align with clinical diseases mentioned in clinical practice or for genomic studies. ICD-9-CM, CCS, and phecode groupings all worked for PheWAS-type studies, though the phecode groupings produced superior results.
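
To put the reported counts on a common scale, the short sketch below converts them into percentages. The counts come from the abstract above; the variable and function names are illustrative only.

```python
# Counts reported in the abstract above.
TOTAL_PHENOTYPES = 100   # tested clinical phenotypes
TOTAL_SNP_PAIRS = 440    # tested known SNP-phenotype associations

exact_matches = {"phecode": 83, "ICD-9-CM": 53, "CCS": 32}
replicated_pairs = {"phecode": 153, "ICD-9-CM": 143, "CCS": 139}


def as_percent(count, total):
    """Express count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


summary = {
    system: {
        "exact_match_pct": as_percent(exact_matches[system], TOTAL_PHENOTYPES),
        "replication_pct": as_percent(replicated_pairs[system], TOTAL_SNP_PAIRS),
    }
    for system in exact_matches
}

for system, row in summary.items():
    print(f"{system}: exact match {row['exact_match_pct']}%, "
          f"replication {row['replication_pct']}%")
```

The gap between coding systems is much wider for exact phenotype matching than for replication of known associations, which is consistent with the abstract's conclusion that all three groupings worked for PheWAS-type studies.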


Subjects
Computational Biology/methods , Electronic Health Records , Genome-Wide Association Study/methods , Genomics , Humans , Phenotype , Polymorphism, Single Nucleotide , Software
18.
J Am Med Inform Assoc ; 24(1): 162-171, 2017 01.
Article in English | MEDLINE | ID: mdl-27497800

ABSTRACT

OBJECTIVE: Phenotyping algorithms applied to electronic health record (EHR) data enable investigators to identify large cohorts for clinical and genomic research. Algorithm development is often iterative, depends on fallible investigator intuition, and is time- and labor-intensive. We developed and evaluated 4 types of phenotyping algorithms and categories of EHR information to identify hypertensive individuals and controls and provide a portable module for implementation at other sites. MATERIALS AND METHODS: We reviewed the EHRs of 631 individuals followed at Vanderbilt for hypertension status. We developed features and phenotyping algorithms of increasing complexity. Input categories included International Classification of Diseases, Ninth Revision (ICD9) codes, medications, vital signs, narrative-text search results, and Unified Medical Language System (UMLS) concepts extracted using natural language processing (NLP). We developed a module and tested portability by replicating 10 of the best-performing algorithms at the Marshfield Clinic. RESULTS: Random forests using billing codes, medications, vitals, and concepts had the best performance with a median area under the receiver operator characteristic curve (AUC) of 0.976. Normalized sums of all 4 categories also performed well (0.959 AUC). The best non-NLP algorithm combined normalized ICD9 codes, medications, and blood pressure readings with a median AUC of 0.948. Blood pressure cutoffs or ICD9 code counts alone had AUCs of 0.854 and 0.908, respectively. Marshfield Clinic results were similar. CONCLUSION: This work shows that billing codes or blood pressure readings alone yield good hypertension classification performance. However, even simple combinations of input categories improve performance. The most complex algorithms classified hypertension with excellent recall and precision.
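
The abstract does not define how the "normalized sums" were computed. As a rough sketch under stated assumptions, the code below min-max scales each input category across the cohort, sums the scaled values per patient, and scores the result with a rank-based AUC. The scaling choice, the function names, and the four-patient toy data are all hypothetical and are not the authors' implementation.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalized_sum_scores(feature_columns):
    """Sum the normalized categories per patient.

    feature_columns maps a category name to a list of per-patient values;
    all lists must have the same length and patient order.
    """
    normalized = [min_max_normalize(col) for col in feature_columns.values()]
    return [sum(per_patient) for per_patient in zip(*normalized)]


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy cohort of four patients (values invented for illustration).
features = {
    "icd9_code_count": [5, 0, 3, 1],
    "medication_count": [2, 0, 1, 0],
    "high_bp_readings": [4, 1, 3, 0],
    "nlp_concept_count": [6, 0, 2, 1],
}
labels = [1, 0, 1, 0]  # 1 = hypertensive on chart review, 0 = control

scores = normalized_sum_scores(features)
toy_auc = auc(scores, labels)
```

Because every category is rescaled before summing, no single high-volume input (for example, raw code counts) dominates the score, which is one plausible reason simple combinations could outperform any single category.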


Subjects
Algorithms , Electronic Health Records , Hypertension/diagnosis , Machine Learning , Aged , Blood Pressure Determination , Clinical Coding , Female , Humans , Information Storage and Retrieval/methods , Male , Middle Aged , Natural Language Processing , Phenotype , ROC Curve
19.
Arthritis Rheumatol ; 69(2): 291-300, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27589350

ABSTRACT

OBJECTIVE: The differences between seronegative and seropositive rheumatoid arthritis (RA) have not been widely reported. We performed electronic health record (EHR)-based phenome-wide association studies (PheWAS) to identify disease associations in seropositive and seronegative RA. METHODS: A validated algorithm identified RA subjects from the de-identified version of the Vanderbilt University Medical Center EHR. Serotypes were determined by rheumatoid factor (RF) and anti-cyclic citrullinated peptide antibody (ACPA) values. We tested EHR-derived phenotypes using PheWAS comparing seropositive RA and seronegative RA, yielding disease associations. PheWAS was also performed in RF-positive versus RF-negative subjects and ACPA-positive versus ACPA-negative subjects. Following PheWAS, select phenotypes were then manually reviewed, and fibromyalgia was specifically evaluated using a validated algorithm. RESULTS: A total of 2,199 RA individuals with either RF or ACPA testing were identified. Of these, 1,382 patients (63%) were classified as seropositive. Seronegative RA was associated with myalgia and myositis (odds ratio [OR] 2.1, P = 3.7 × 10⁻¹⁰) and back pain. A manual review of the health record showed that among subjects coded for Myalgia and Myositis, ~80% had fibromyalgia. Follow-up with a specific EHR algorithm for fibromyalgia confirmed that seronegative RA was associated with fibromyalgia (OR 1.8, P = 4.0 × 10⁻⁶). Seropositive RA was associated with chronic airway obstruction (OR 2.2, P = 1.4 × 10⁻⁴) and tobacco use (OR 2.2, P = 7.0 × 10⁻⁴). CONCLUSION: This PheWAS of RA patients identifies a strong association between seronegativity and fibromyalgia. It also affirms relationships between seropositivity and chronic airway obstruction and between seropositivity and tobacco use. These findings demonstrate the utility of the PheWAS approach to discover novel phenotype associations within different subgroups of a disease.
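
For readers unfamiliar with the odds ratios quoted above, the sketch below computes an unadjusted odds ratio and a Wald-type 95% confidence interval from a 2×2 table. This is a simplification: the abstract does not state how the ORs were estimated, and PheWAS analyses typically use regression with covariate adjustment. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (log-scale) confidence interval.

    a: group 1 with the phenotype      b: group 1 without the phenotype
    c: group 2 with the phenotype      d: group 2 without the phenotype
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not from the study): 20 of 100 patients in group 1
# and 10 of 100 patients in group 2 carry the phenotype code.
or_value, ci_lower, ci_upper = odds_ratio_with_ci(20, 80, 10, 90)
```

An interval that includes 1, as in this toy example, would not indicate a statistically significant association at the 5% level.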


Subjects
Arthritis, Rheumatoid/complications , Arthritis, Rheumatoid/genetics , Fibromyalgia/complications , Fibromyalgia/genetics , Arthritis, Rheumatoid/blood , Female , Genetic Association Studies , Humans , Male , Middle Aged , Phenotype , Serologic Tests
20.
Arthritis Care Res (Hoboken) ; 69(5): 687-693, 2017 05.
Article in English | MEDLINE | ID: mdl-27390187

ABSTRACT

OBJECTIVE: To study systemic lupus erythematosus (SLE) in the electronic health record (EHR), we must accurately identify patients with SLE. Our objective was to develop and validate novel EHR algorithms that use International Classification of Diseases, Ninth Revision (ICD-9), Clinical Modification codes, laboratory testing, and medications to identify SLE patients. METHODS: We used Vanderbilt's Synthetic Derivative, a de-identified version of the EHR, with 2.5 million subjects. We selected all individuals with at least 1 SLE ICD-9 code (710.0), yielding 5,959 individuals. To create a training set, 200 subjects were randomly selected for chart review. A subject was defined as a case if diagnosed with SLE by a rheumatologist, nephrologist, or dermatologist. Positive predictive values (PPVs) and sensitivity were calculated for combinations of code counts of the SLE ICD-9 code, a positive antinuclear antibody (ANA), ever use of medications, and a keyword of "lupus" in the problem list. The algorithms with the highest PPV were each internally validated using a random set of 100 individuals from the remaining 5,759 subjects. RESULTS: The algorithm with the highest PPV at 95% in the training set and 91% in the validation set was 3 or more counts of the SLE ICD-9 code, ANA positive (≥1:40), and ever use of both disease-modifying antirheumatic drugs and steroids, while excluding individuals with systemic sclerosis and dermatomyositis ICD-9 codes. CONCLUSION: We developed and validated the first EHR algorithm that incorporates laboratory values and medications with the SLE ICD-9 code to identify patients with SLE accurately.
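
The highest-PPV rule described above can be sketched as a simple filter plus a PPV calculation. This is a minimal illustration, not the authors' code: the `Patient` structure and function names are hypothetical, the medication flags are assumed to be precomputed, and the exclusion codes 710.1 (systemic sclerosis) and 710.3 (dermatomyositis) are an assumption, since the abstract names the conditions but not the specific codes.

```python
from dataclasses import dataclass
from typing import List, Optional

SLE_CODE = "710.0"                   # SLE ICD-9 code given in the abstract
EXCLUSION_CODES = {"710.1", "710.3"}  # assumed codes for systemic sclerosis, dermatomyositis


@dataclass
class Patient:
    icd9_codes: List[str]        # one entry per coded occurrence
    ana_titer: Optional[int]     # reciprocal titer (40 means 1:40); None if never tested
    ever_dmard: bool             # ever used a disease-modifying antirheumatic drug
    ever_steroid: bool           # ever used a steroid


def meets_sle_algorithm(p: Patient) -> bool:
    """Apply the rule: >=3 SLE codes, ANA >= 1:40, both drug classes, no exclusions."""
    sle_count = p.icd9_codes.count(SLE_CODE)
    ana_positive = p.ana_titer is not None and p.ana_titer >= 40
    excluded = any(code in EXCLUSION_CODES for code in p.icd9_codes)
    return (sle_count >= 3 and ana_positive
            and p.ever_dmard and p.ever_steroid and not excluded)


def positive_predictive_value(flagged, true_case):
    """PPV = true positives / all flagged; NaN when nothing is flagged."""
    tp = sum(1 for f, t in zip(flagged, true_case) if f and t)
    fp = sum(1 for f, t in zip(flagged, true_case) if f and not t)
    return tp / (tp + fp) if (tp + fp) else float("nan")
```

In the study's design, `true_case` would correspond to chart-review status (diagnosis by a rheumatologist, nephrologist, or dermatologist), and the PPV would be computed first on the training set and then on the validation set.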


Subjects
Algorithms , Electronic Health Records/statistics & numerical data , Lupus Erythematosus, Systemic/diagnosis , Adult , Aged , Antibodies, Antinuclear/blood , Antirheumatic Agents/therapeutic use , Female , Humans , International Classification of Diseases/statistics & numerical data , Lupus Erythematosus, Systemic/blood , Lupus Erythematosus, Systemic/drug therapy , Male , Middle Aged , Predictive Value of Tests , Reproducibility of Results , Sensitivity and Specificity