Results 1 - 20 of 282
2.
Nat Rev Genet ; 2020 Mar 31.
Article in English | MEDLINE | ID: mdl-32235907

ABSTRACT

Accurate prediction of disease risk based on the genetic make-up of an individual is essential for effective prevention and personalized treatment. Nevertheless, to date, individual genetic variants from genome-wide association studies have achieved only moderate prediction of disease risk. The aggregation of genetic variants under a polygenic model shows promising improvements in prediction accuracies. Increasingly, electronic health records (EHRs) are being linked to patient genetic data in biobanks, which provides new opportunities for developing and applying polygenic risk scores in the clinic, to systematically examine and evaluate patient susceptibilities to disease. However, the heterogeneous nature of EHR data brings forth many practical challenges along every step of designing and implementing risk prediction strategies. In this Review, we present the unique considerations for using genotype and phenotype data from biobank-linked EHRs for polygenic risk prediction.
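
The aggregation of variants under a polygenic model, as described in the abstract above, is commonly computed as a weighted sum of effect-allele dosages. The following is a minimal sketch under that assumption, not the method of the Review itself; the variant names, weights, and dosages are hypothetical.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 1.0}
# Hypothetical dosages (0, 1 or 2 copies of the effect allele) for one person.
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_risk_score(person, weights)
print(score)
```

Variants missing from either input are skipped here; a real pipeline would also need to handle strand alignment and imputed dosages, which this sketch leaves out.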

3.
J Clin Endocrinol Metab ; 105(6)2020 Jun 01.
Article in English | MEDLINE | ID: mdl-31917831

ABSTRACT

CONTEXT: As many as 75% of patients with polycystic ovary syndrome (PCOS) are estimated to be unidentified in clinical practice. OBJECTIVE: Utilizing polygenic risk prediction, we aim to identify the phenome-wide comorbidity patterns characteristic of PCOS to improve accurate diagnosis and preventive treatment. DESIGN, PATIENTS, AND METHODS: Leveraging the electronic health records (EHRs) of 124 852 individuals, we developed a PCOS risk prediction algorithm by combining polygenic risk scores (PRS) with PCOS component phenotypes into a polygenic and phenotypic risk score (PPRS). We evaluated its predictive capability across different ancestries and performed a PRS-based phenome-wide association study (PheWAS) to assess the phenomic expression of the heightened risk of PCOS. RESULTS: The integrated polygenic prediction improved the average performance (pseudo-R²) for PCOS detection by 0.228 (61.5-fold), 0.224 (58.8-fold), and 0.211 (57.0-fold) over the null model across European, African, and multi-ancestry participants, respectively. The subsequent PRS-powered PheWAS identified a high level of shared biology between PCOS and a range of metabolic and endocrine outcomes, especially with obesity and diabetes: "morbid obesity", "type 2 diabetes", "hypercholesterolemia", "disorders of lipid metabolism", "hypertension", and "sleep apnea" reached phenome-wide significance. CONCLUSIONS: Our study has expanded the methodological utility of PRS in patient stratification and risk prediction, especially in a multifactorial condition like PCOS, across different genetic origins. By utilizing the individual genome-phenome data available from the EHR, our approach also demonstrates that polygenic prediction by PRS can provide valuable opportunities to discover the pleiotropic phenomic network associated with PCOS pathogenesis.

4.
Int J Cardiol ; 298: 107-113, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31447229

ABSTRACT

BACKGROUND: Cardiovascular disease is the leading cause of death in the United States. Consequently, individuals who are genetically predisposed for high risk of cardiovascular disease would benefit most from prevention and early intervention approaches. Among common health risk factors affecting adult populations, we evaluated 23 cardiovascular disease-related traits, including BMI, glucose levels and lipid profiling to determine their associations with low-frequency recurrent copy number variations (CNV) (population frequency < 5%). RESULTS: We examined 10,619 unrelated subjects of European ancestry from the Electronic Medical Records and Genomics (eMERGE) Network who were genotyped with 657,366 markers genome-wide on the Illumina Infinium Quad 660 array. We performed CNV calling based on array marker intensity and evaluated data quality, ancestry stratification, and relatedness to ensure unbiased association discovery. Using a segment-based scoring approach, we assessed the association of all CNVs with each trait. In this large genome-wide analysis of low-frequency CNVs, we observed 11 novel genome-wide significant associations of low-frequency CNVs with major cardiovascular disease traits. CONCLUSION: In one of the largest genome-wide studies for low-frequency recurrent CNVs, we identified 11 loci associated with cardiovascular disease and related traits at the genome-wide significance level that may serve as biomarkers for prevention and early intervention studies in subjects who are at elevated risk. Our study further supports the role of low-frequency recurrent CNVs in the pathogenesis of common complex disease traits.

5.
Clin Pharmacol Ther ; 107(1): 203-210, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31306493

ABSTRACT

Pharmacogenomics (PGx) decision support and return of results is an active area of precision medicine. One challenge of implementing PGx is extracting genomic variants and assigning haplotypes in order to apply prescribing recommendations and information from the Clinical Pharmacogenetics Implementation Consortium (CPIC), the US Food and Drug Administration (FDA), the Pharmacogenomics Knowledgebase (PharmGKB), etc. Pharmacogenomics Clinical Annotation Tool (PharmCAT) (i) extracts variants specified in guidelines from a genetic data set derived from sequencing or genotyping technologies, (ii) infers haplotypes and diplotypes, and (iii) generates a report containing genotype/diplotype-based annotations and guideline recommendations. We describe PharmCAT and a pilot validation project comparing results for 1000 Genomes Project sequences of Coriell samples with corresponding Genetic Testing Reference Materials Coordination Program (GeT-RM) sample characterization. PharmCAT was highly concordant with the GeT-RM data. PharmCAT is available in GitHub to evaluate, test, and report results back to the community. As precision medicine becomes more prevalent, our ability to consistently, accurately, and clearly define and report PGx annotations and prescribing recommendations is critical.

6.
Genet Med ; 22(1): 102-111, 2020 01.
Article in English | MEDLINE | ID: mdl-31383942

ABSTRACT

PURPOSE: "Genome-first" approaches, in which genetic sequencing is agnostically linked to associated phenotypes, can enhance our understanding of rare variants' contributions to disease. Loss-of-function variants in LMNA cause a range of rare diseases, including cardiomyopathy. METHODS: We leveraged exome sequencing from 11,451 unselected individuals in the Penn Medicine Biobank to associate rare variants in LMNA with diverse electronic health record (EHR)-derived phenotypes. We used Rare Exome Variant Ensemble Learner (REVEL) to annotate rare missense variants, clustered predicted deleterious and loss-of-function variants into a "gene burden" (N = 72 individuals), and performed a phenome-wide association study (PheWAS). Major findings were replicated in DiscovEHR. RESULTS: The LMNA gene burden was significantly associated with primary cardiomyopathy (p = 1.78E-11) and cardiac conduction disorders (p = 5.27E-07). Most patients had not been clinically diagnosed with LMNA cardiomyopathy. We also noted an association with chronic kidney disease (p = 1.13E-06). Regression analyses on echocardiography and serum labs revealed that LMNA variant carriers had dilated cardiomyopathy and primary renal disease. CONCLUSION: Pathogenic LMNA variants are an underdiagnosed cause of cardiomyopathy. We also find that LMNA loss of function may be a primary cause of renal disease. Finally, we show the value of aggregating rare, annotated variants into a gene burden and using PheWAS to identify novel ontologies for pleiotropic human genes.
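
The "gene burden" idea in the abstract above, collapsing qualifying rare variants into a single carrier indicator per individual, can be sketched as follows. This is an illustrative assumption about the collapsing step only; the variant and patient identifiers are hypothetical, and the annotation step that decides which variants qualify is not shown.

```python
from typing import Dict, Set


def gene_burden(carriers_by_variant: Dict[str, Set[str]], qualifying: Set[str]) -> Set[str]:
    """Return the individuals carrying at least one qualifying variant."""
    burden: Set[str] = set()
    for variant, carriers in carriers_by_variant.items():
        if variant in qualifying:
            burden |= carriers
    return burden


carriers_by_variant = {
    "var1": {"p01", "p02"},  # hypothetical predicted loss-of-function variant
    "var2": {"p02", "p03"},  # hypothetical missense variant scored as deleterious
    "var3": {"p04"},         # hypothetical variant scored as benign; excluded
}
qualifying = {"var1", "var2"}

burden_carriers = gene_burden(carriers_by_variant, qualifying)
print(sorted(burden_carriers))
```

The resulting carrier set could then serve as the binary exposure in a phenome-wide scan, with each phenotype tested against carrier status.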

7.
Article in English | MEDLINE | ID: mdl-31822125

ABSTRACT

Mitochondrial DNA (mtDNA) haplogroup has been associated with disease risk and longevity. Among persons with HIV (PWH), mtDNA haplogroup has been associated with AIDS progression, neuropathy, cognitive impairment, and gait speed decline. We sought to determine whether haplogroup is associated with frailty and its components among older PWH. A cross-sectional analysis was performed of AIDS Clinical Trials Group A5322 (HAILO) participants with available genome-wide genotype and frailty assessments. Multivariable logistic regression models adjusted for age, gender, education, smoking, hepatitis C, and prior use of didanosine/stavudine. Among 634 participants, 81% were male, 49% non-Hispanic white, 31% non-Hispanic black, and 20% Hispanic. Mean age was 51.0 (standard deviation 7.5) years and median nadir CD4 count was 212 (interquartile range 72, 324) cells/µL; 6% were frail, 7% had slow gait, and 21% weak grip. H haplogroup participants were more likely to be frail/prefrail (p = .064), have slow gait (p = .09), or weak grip (p = .017) compared with non-H haplogroup participants (not all comparisons reached statistical significance). In adjusted analyses, PWH with haplogroup H had a greater odds of being frail versus nonfrail [odds ratio (OR) 4.0 (95% confidence interval 1.0-15.4)] and having weak grip [OR 2.1 (1.1, 4.1)], but not slow gait [OR 1.6 (0.5, 5.0)] compared with non-H haplogroup. Among black and Hispanic participants, haplogroup was not significantly associated with frailty, grip, or gait. Among antiretroviral therapy (ART)-treated PWH, mtDNA haplogroup H was independently associated with weak grip and frailty. This association could represent a mechanism of weakness and frailty in the setting of HIV and ART.

8.
Pac Symp Biocomput ; 25: 743-747, 2020.
Article in English | MEDLINE | ID: mdl-31797645

ABSTRACT

Translational bioinformatics (TBI) is focused on the integration of biomedical data science and informatics. This combination is extremely powerful for scientific discovery as well as translation into clinical practice. Several topics where TBI research is at the leading edge are 1) the use of large-scale biobanks linked to electronic health records, 2) pharmacogenomics, and 3) artificial intelligence and machine learning. This perspective discusses these three topics and points to the important elements for driving precision medicine into the future.

9.
Article in English | MEDLINE | ID: mdl-31504375

ABSTRACT

AIMS: Clopidogrel is prescribed for the prevention of atherothrombotic events. While investigations have identified genetic determinants of inter-individual variability in on-treatment platelet inhibition (e.g. CYP2C19*2), evidence that these variants have clinical utility to predict major adverse cardiovascular events remains controversial. METHODS AND RESULTS: We assessed the impact of 31 candidate gene polymorphisms on ADP-stimulated platelet reactivity in 3,391 clopidogrel-treated coronary artery disease patients of the International Clopidogrel Pharmacogenomics Consortium (ICPC). The influence of these polymorphisms on cardiovascular events (CVE) was tested in 2,134 ICPC patients (N = 129 events) in whom clinical event data were available. Several variants were associated with on-treatment ADP-stimulated platelet reactivity (CYP2C19*2, P = 8.8 × 10⁻⁵⁴; CES1 G143E, P = 1.3 × 10⁻¹⁶; CYP2C19*17, P = 9.5 × 10⁻¹⁰; CYP2B6 1294+53C>T, P = 3.0 × 10⁻⁴; CYP2B6 516G>T, P = 1.0 × 10⁻³; CYP2C9*2, P = 1.2 × 10⁻³; and CYP2C9*3, P = 1.5 × 10⁻³). While no individual variant was associated with CVEs, generation of a pharmacogenomic polygenic response score (PgxRS) revealed that patients who carried a greater number of alleles associated with increased on-treatment platelet reactivity were more likely to experience CVEs (β = 0.17, SE 0.06, P = 0.01) and cardiovascular-related death (β = 0.43, SE 0.16, P = 0.007). Patients who carried 8 or more risk alleles were significantly more likely to experience CVEs (OR = 1.78, 95% CI 1.14-2.76, P = 0.01) and cardiovascular death (OR = 4.39, 95% CI 1.35-14.27, P = 0.01) compared to patients who carried 6 or fewer of these alleles. CONCLUSION: Several polymorphisms impact clopidogrel response and PgxRS is a predictor of cardiovascular outcomes. Additional investigations that identify novel determinants of clopidogrel response and validate polygenic models may facilitate future precision medicine strategies.

10.
BioData Min ; 12: 14, 2019.
Article in English | MEDLINE | ID: mdl-31320928

ABSTRACT

Background: The principal line of investigation in Genome Wide Association Studies (GWAS) is the identification of main effects, that is, individual Single Nucleotide Polymorphisms (SNPs) which are associated with the trait of interest, independent of other factors. A variety of methods have been proposed to this end, mostly statistical in nature and differing in assumptions and type of model employed. Moreover, for a given model, there may be multiple choices for the SNP genotype encoding. As an alternative to statistical methods, machine learning methods are often applicable. Typically, for a given GWAS, a single approach is selected and utilized to identify potential SNPs of interest. Even when multiple GWAS are combined through meta-analyses within a consortium, each GWAS is typically analyzed with a single approach and the resulting summary statistics are then utilized in meta-analyses. Results: In this work we use as case studies a Type 2 Diabetes (T2D) and a breast cancer GWAS to explore a diversity of applicable approaches spanning different methods and encoding choices. We assess similarity of these approaches based on the derived ranked lists of SNPs and, for each GWAS, we identify a subset of representative approaches that we use as an ensemble to derive a union list of top SNPs. Among these are SNPs which are identified by multiple approaches as well as several SNPs identified by only one or a few of the less frequently used approaches. The latter include SNPs from established loci and SNPs which have other supporting lines of evidence in terms of their potential relevance to the traits. Conclusions: Not every main effect analysis method is suitable for every GWAS, but for each GWAS there are typically multiple applicable methods and encoding options. We suggest a workflow for a single GWAS, extensible to multiple GWAS from consortia, where representative approaches are selected among a pool of suitable options, to yield a more comprehensive set of SNPs, potentially including SNPs that would typically be missed with the most popular analyses, but that could provide additional valuable insights for follow-up.
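
The final step of the workflow described above, deriving a union list of top SNPs from several ranked lists, can be sketched as below. This covers only the union step and assumes the representative approaches have already been chosen; the approach names, SNP identifiers, and the cutoff `k` are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def union_of_top_snps(ranked: Dict[str, List[str]], k: int) -> Counter:
    """Count, for each SNP, how many approaches place it in their top k."""
    support: Counter = Counter()
    for snps in ranked.values():
        support.update(set(snps[:k]))
    return support


# Hypothetical ranked SNP lists, best first, one per analysis approach.
ranked = {
    "logistic_additive": ["rs1", "rs2", "rs3"],
    "logistic_dominant": ["rs2", "rs1", "rs4"],
    "random_forest": ["rs5", "rs2", "rs1"],
}

support = union_of_top_snps(ranked, k=2)
for snp, n in sorted(support.items()):
    print(snp, n)
```

SNPs with a count of one are those nominated by a single approach, which is the group the abstract flags as easy to miss when only the most popular analysis is run.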

11.
BMC Med ; 17(1): 135, 2019 07 17.
Article in English | MEDLINE | ID: mdl-31311600

ABSTRACT

BACKGROUND: Non-alcoholic fatty liver disease (NAFLD) is a common chronic liver illness with a genetically heterogeneous background that can be accompanied by considerable morbidity and attendant health care costs. The pathogenesis and progression of NAFLD are complex with many unanswered questions. We conducted genome-wide association studies (GWASs) using both adult and pediatric participants from the Electronic Medical Records and Genomics (eMERGE) Network to identify novel genetic contributors to this condition. METHODS: First, a natural language processing (NLP) algorithm was developed, tested, and deployed at each site to identify 1106 NAFLD cases and 8571 controls and histological data from liver tissue in 235 available participants. These include 1242 pediatric participants (396 cases, 846 controls). The algorithm included billing codes, text queries, laboratory values, and medication records. Next, GWASs were performed on NAFLD cases and controls and case-only analyses using histologic scores and liver function tests adjusting for age, sex, site, ancestry, PC, and body mass index (BMI). RESULTS: Consistent with previous results, a robust association was detected for the PNPLA3 gene cluster in participants with European ancestry. At the PNPLA3-SAMM50 region, three SNPs, rs738409, rs738408, and rs3747207, showed strongest association (best SNP rs738409 p = 1.70 × 10⁻²⁰). This effect was consistent in both pediatric (p = 9.92 × 10⁻⁶) and adult (p = 9.73 × 10⁻¹⁵) cohorts. Additionally, this variant was also associated with disease severity and NAFLD Activity Score (NAS) (p = 3.94 × 10⁻⁸, beta = 0.85). PheWAS analysis linked this locus to a spectrum of liver diseases beyond NAFLD with a novel negative correlation with gout (p = 1.09 × 10⁻⁴). We also identified novel loci for NAFLD disease severity, including one novel locus for NAS score near IL17RA (rs5748926, p = 3.80 × 10⁻⁸), and another near ZFP90-CDH1 for fibrosis (rs698718, p = 2.74 × 10⁻¹¹). Post-GWAS and gene-based analyses identified more than 300 genes that were used for functional and pathway enrichment analyses. CONCLUSIONS: In summary, this study demonstrates clear confirmation of a previously described NAFLD risk locus and several novel associations. Further collaborative studies including an ethnically diverse population with well-characterized liver histologic features of NAFLD are needed to further validate the novel findings.


Subjects
Non-alcoholic Fatty Liver Disease/genetics; Adult; Aged; Body Mass Index; Case-Control Studies; Community Networks/organization & administration; Community Networks/statistics & numerical data; Disease Progression; Electronic Health Records/organization & administration; Electronic Health Records/statistics & numerical data; Female; Genetic Predisposition to Disease; Genome-Wide Association Study; Genomics/organization & administration; Genomics/statistics & numerical data; Humans; Lipase/genetics; Male; Membrane Proteins/genetics; Middle Aged; Morbidity; Non-alcoholic Fatty Liver Disease/epidemiology; Phenotype; Polymorphism, Single Nucleotide; Signal Transduction/genetics
12.
Circulation ; 140(1): 42-54, 2019 Jul 02.
Article in English | MEDLINE | ID: mdl-31216868

ABSTRACT

BACKGROUND: Truncating variants in the Titin gene (TTNtvs) are common in individuals with idiopathic dilated cardiomyopathy (DCM). However, a comprehensive genomics-first evaluation of the impact of TTNtvs in different clinical contexts, and the evaluation of modifiers such as genetic ancestry, has not been performed. METHODS: We reviewed whole exome sequence data for >71 000 individuals (61 040 from the Geisinger MyCode Community Health Initiative (2007 to present) and 10 273 from the PennMedicine BioBank (2013 to present)) to identify anyone with TTNtvs. We further selected individuals with TTNtvs in exons highly expressed in the heart (proportion spliced in [PSI] >0.9). Using linked electronic health records, we evaluated associations of TTNtvs with diagnoses and quantitative echocardiographic measures, including subanalyses for individuals with and without DCM diagnoses. We also reviewed data from the Jackson Heart Study to validate specific analyses for individuals of African ancestry. RESULTS: Identified with a TTNtv in a highly expressed exon (hiPSI) were 1.2% of individuals in PennMedicine BioBank and 0.6% at Geisinger. The presence of a hiPSI TTNtv was associated with increased odds of DCM in individuals of European ancestry (odds ratio [95% CI]: 18.7 [9.1-39.4] {PennMedicine BioBank} and 10.8 [7.0-16.0] {Geisinger}). hiPSI TTNtvs were not associated with DCM in individuals of African ancestry, despite a high DCM prevalence (odds ratio, 1.8 [0.2-13.7]; P=0.57). Among 244 individuals of European ancestry with DCM in PennMedicine BioBank, hiPSI TTNtv carriers had lower left ventricular ejection fraction (β=-12%, P=3×10⁻⁷), and increased left ventricular diameter (β=0.65 cm, P=9×10⁻³). In the Geisinger cohort, hiPSI TTNtv carriers without a cardiomyopathy diagnosis had more atrial fibrillation (odds ratio, 2.4 [1.6-3.6]) and heart failure (odds ratio, 3.8 [2.4-6.0]), and lower left ventricular ejection fraction (β=-3.4%, P=1×10⁻⁷). CONCLUSIONS: Individuals of European ancestry with hiPSI TTNtv have an abnormal cardiac phenotype characterized by lower left ventricular ejection fraction, irrespective of the clinical manifestation of cardiomyopathy. Associations with arrhythmias, including atrial fibrillation, were observed even when controlling for cardiomyopathy diagnosis. In contrast, no association between hiPSI TTNtvs and DCM was discerned among individuals of African ancestry. Given these findings, clinical identification of hiPSI TTNtv carriers may alter clinical management strategies.

13.
BMC Med Genomics ; 12(1): 59, 2019 05 03.
Article in English | MEDLINE | ID: mdl-31053132

ABSTRACT

BACKGROUND: Endometrial cancer (EMCA) is the fifth most common cancer among women in the world. Identification of potentially pathogenic germline variants from individuals with EMCA will help characterize genetic features that underlie the disease and potentially predispose individuals to its pathogenesis. METHODS: The Geisinger Health System's (GHS) DiscovEHR cohort includes exome sequencing on over 50,000 consenting patients, 297 of whom have evidence of an EMCA diagnosis in their electronic health record. Here, rare variants were annotated as potentially pathogenic. RESULTS: Eight genes were identified as having increased burden in the EMCA cohort relative to the non-cancer control cohort. None of the eight genes had an increased burden in the other hormone-related cancer cohort from GHS, suggesting they can help characterize the underlying genetic variation that gives rise to EMCA. Comparing GHS to The Cancer Genome Atlas (TCGA) EMCA germline data illustrated 34 genes with potentially pathogenic variation and eight unique potentially pathogenic variants that were present in both studies. Thus, similar germline variation among genes can be observed in unique EMCA cohorts and could help prioritize genes to investigate for future work. CONCLUSION: In summary, this systematic characterization of potentially pathogenic germline variants describes the genetic underpinnings of EMCA through the use of data from a single hospital system.


Subjects
Electronic Health Records; Endometrial Neoplasms/genetics; Germ-Line Mutation; Adult; Aged; Aged, 80 and over; Cohort Studies; Endometrial Neoplasms/pathology; Female; Humans; Middle Aged; Whole Exome Sequencing
14.
BMC Med Genomics ; 12(1): 65, 2019 May 22.
Article in English | MEDLINE | ID: mdl-31118041

ABSTRACT

Following publication of the original article [1], the authors reported that Fig. 1 was not correctly processed during the production process. The correct Fig. 1 is given below.

15.
BioData Min ; 12: 10, 2019.
Article in English | MEDLINE | ID: mdl-31114635

ABSTRACT

Characterizing how variation at the level of individual nucleotides contributes to traits and diseases has been an area of growing interest since the completion of sequencing the first human genome. Understanding how a single nucleotide polymorphism (SNP) leads to a pathogenic phenotype on a genome-wide scale is a fruitful endeavor for anyone interested in developing diagnostic tests, therapeutics, or simply wanting to understand the etiology of a disease or trait. To this end, many datasets and algorithms have been developed as resources/tools to annotate SNPs. One of the most common practices is to annotate coding SNPs that affect the protein sequence. Synonymous variants are often grouped as one type of variant; however, there are in fact many tools available to dissect their effects on gene expression. More recently, large consortiums like ENCODE and GTEx have made it possible to annotate non-coding regions. Although annotating variants is a common technique among human geneticists, the constant advances in tools and biology surrounding SNPs require an updated summary of what is known and the trajectory of the field. This review will discuss the history behind SNP annotation, commonly used tools, and newer strategies for SNP annotation. Additionally, we will comment on the caveats that distinguish approaches from one another, along with gaps in the current state of knowledge, and potential future directions. We do not intend for this to be a comprehensive review for any specific area of SNP annotation, but rather it will be an excellent resource for those unfamiliar with computational tools used to functionally characterize SNPs. In summary, this review will help illustrate how each SNP annotation method impacts the way in which the genetic and molecular etiology of a disease is explored in-silico.

16.
Sci Rep ; 9(1): 6077, 2019 Apr 15.
Article in English | MEDLINE | ID: mdl-30988330

ABSTRACT

Benign prostatic hyperplasia (BPH) results in a significant public health burden due to the morbidity caused by the disease and many of the available remedies. As many as 70% of men over 70 will develop BPH. Few studies have been conducted to discover the genetic determinants of BPH risk. Understanding the biological basis for this condition may provide necessary insight for development of novel pharmaceutical therapies or risk prediction. We have evaluated SNP-based heritability of BPH in two cohorts and conducted a genome-wide association study (GWAS) of BPH risk using 2,656 cases and 7,763 controls identified from the Electronic Medical Records and Genomics (eMERGE) network. SNP-based heritability estimates suggest that roughly 60% of the phenotypic variation in BPH is accounted for by genetic factors. We used logistic regression to model BPH risk as a function of principal components of ancestry, age, and imputed genotype data, with meta-analysis performed using METAL. The top result was on chromosome 22 in SYN3 at rs2710383 (p-value = 4.6 × 10⁻⁷; Odds Ratio = 0.69, 95% confidence interval = 0.55-0.83). Other suggestive signals were near genes GLGC, UNCA13, SORCS1 and between BTBD3 and SPTLC3. We also evaluated genetically-predicted gene expression in prostate tissue. The most significant result was with increasing predicted expression of ETV4 (chr17; p-value = 0.0015). Overexpression of this gene has been associated with poor prognosis in prostate cancer. In conclusion, although there were no genome-wide significant variants identified for BPH susceptibility, we present evidence supporting the heritability of this phenotype, have identified suggestive signals, and evaluated the association between BPH and genetically-predicted gene expression in prostate.

17.
Pac Symp Biocomput ; 24: 272-283, 2019.
Article in English | MEDLINE | ID: mdl-30864329

ABSTRACT

The link between cardiovascular diseases and neurological disorders has been widely observed in the aging population. Disease prevention and treatment rely on understanding the potential genetic nexus of multiple diseases in these categories. In this study, we were interested in detecting pleiotropy, or the phenomenon in which a genetic variant influences more than one phenotype. Marker-phenotype association approaches can be grouped into univariate, bivariate, and multivariate categories based on the number of phenotypes considered at one time. Here we applied one statistical method per category followed by an eQTL colocalization analysis to identify potential pleiotropic variants that contribute to the link between cardiovascular and neurological diseases. We performed our analyses on ~530,000 common SNPs coupled with 65 electronic health record (EHR)-based phenotypes in 43,870 unrelated European adults from the Electronic Medical Records and Genomics (eMERGE) network. There were 31 variants identified by all three methods that showed significant associations across late onset cardiac- and neurologic- diseases. We further investigated functional implications of gene expression on the detected "lead SNPs" via colocalization analysis, providing a deeper understanding of the discovered associations. In summary, we present the framework and landscape for detecting potential pleiotropy using univariate, bivariate, multivariate, and colocalization methods. Further exploration of these potentially pleiotropic genetic variants will work toward understanding disease causing mechanisms across cardiovascular and neurological diseases and may assist in considering disease prevention as well as drug repositioning in future research.


Subjects
Cardiovascular Diseases/genetics; Genetic Pleiotropy; Nervous System Diseases/genetics; Adult; Aged; Computational Biology; Electronic Health Records; Female; Genetic Association Studies; Genetic Predisposition to Disease; Humans; Male; Middle Aged; Multivariate Analysis; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci
18.
Pac Symp Biocomput ; 24: 296-307, 2019.
Article in English | MEDLINE | ID: mdl-30864331

ABSTRACT

Transcriptome-wide association studies (TWAS) have recently gained great attention due to their ability to prioritize complex trait-associated genes and promote potential therapeutics development for complex human diseases. TWAS integrates genotypic data with expression quantitative trait loci (eQTLs) to predict genetically regulated gene expression components and associates predictions with a trait of interest. As such, TWAS can prioritize genes whose differential expressions contribute to the trait of interest and provide mechanistic explanation of complex trait(s). Tissue-specific eQTL information grants TWAS the ability to perform association analysis on tissues whose gene expression profiles are otherwise hard to obtain, such as liver and heart. However, as eQTLs are tissue context-dependent, whether and how the tissue-specificity of eQTLs influences TWAS gene prioritization has not been fully investigated. In this study, we addressed this question by adopting two distinct TWAS methods, PrediXcan and UTMOST, which assume single tissue and integrative tissue effects of eQTLs, respectively. Thirty-eight baseline laboratory traits in 4,360 antiretroviral treatment-naïve individuals from the AIDS Clinical Trials Group (ACTG) studies comprised the input dataset for TWAS. We performed TWAS in a tissue-specific manner and obtained a total of 430 significant gene-trait associations (q-value < 0.05) across multiple tissues. Single tissue-based analysis by PrediXcan contributed 116 of the 430 associations including 64 unique gene-trait pairs in 28 tissues. Integrative tissue-based analysis by UTMOST found the other 314 significant associations that include 50 unique gene-trait pairs across all 44 tissues. Both analyses were able to replicate some associations identified in past variant-based genome-wide association studies (GWAS), such as high-density lipoprotein (HDL) and CETP (PrediXcan, q-value = 3.2e-16). Both analyses also identified novel associations. Moreover, single tissue-based and integrative tissue-based analysis shared 11 of 103 unique gene-trait pairs, for example, PSRC1-low-density lipoprotein (PrediXcan's lowest q-value = 8.5e-06; UTMOST's lowest q-value = 1.8e-05). This study suggests that single tissue-based analysis may have performed better at discovering gene-trait associations when combining results from all tissues. Integrative tissue-based analysis was better at prioritizing genes in multiple tissues and in trait-related tissue. Additional exploration is needed to confirm this conclusion. Finally, although single tissue-based and integrative tissue-based analysis shared significant novel discoveries, tissue context-dependency of eQTLs impacted TWAS gene prioritization. This study provides preliminary data to support continued work on tissue context-dependency of eQTL studies and TWAS.


Subjects
Gene Expression Profiling/statistics & numerical data; Genome-Wide Association Study/statistics & numerical data; Organ Specificity/genetics; Quantitative Trait Loci; Transcriptome; Anti-HIV Agents/therapeutic use; Computational Biology; Gene Expression Profiling/methods; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Genotype; HIV Infections/drug therapy; HIV Infections/genetics; Humans; Pharmacogenomic Variants; Polymorphism, Single Nucleotide
19.
Per Med ; 16(3): 247-257, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30760118

ABSTRACT

Personalized medicine is being realized by our ability to measure biological and environmental information about patients. Much of these data are being stored in electronic health records yielding big data that presents challenges for its management and analysis. Here, we review several areas of knowledge that are necessary for next-generation scientists to fully realize the potential of biomedical big data. We begin with an overview of big data and its storage and management. We then review statistics and data science as foundational topics followed by a core curriculum of artificial intelligence, machine learning and natural language processing that are needed to develop predictive models for clinical decision making. We end with some specific training recommendations for preparing next-generation scientists for biomedical big data.


Subjects
Data Science/methods; Precision Medicine/methods; Big Data; Clinical Decision-Making; Data Mining; Electronic Health Records; Humans; Machine Learning; Natural Language Processing
20.
NPJ Genom Med ; 4: 3, 2019.
Article in English | MEDLINE | ID: mdl-30774981

ABSTRACT

We conducted an electronic health record (EHR)-based phenome-wide association study (PheWAS) to discover pleiotropic effects of variants in three lipoprotein metabolism genes PCSK9, APOB, and LDLR. Using high-density genotype data, we tested the associations of variants in the three genes with 1232 EHR-derived binary phecodes in 51,700 European-ancestry (EA) individuals and 585 phecodes in 10,276 African-ancestry (AA) individuals; 457 PCSK9, 730 APOB, and 720 LDLR variants were filtered by imputation quality (r² > 0.4), minor allele frequency (>1%), linkage disequilibrium (r² < 0.3), and association with LDL-C levels, yielding a set of two PCSK9, three APOB, and five LDLR variants in EA but no variants in AA. Cases and controls were defined for each phecode using the PheWAS package in R. Logistic regression assuming an additive genetic model was used with adjustment for age, sex, and the first two principal components. Significant associations were tested in additional cohorts from Vanderbilt University (n = 29,713), the Marshfield Clinic Personalized Medicine Research Project (n = 9562), and UK Biobank (n = 408,455). We identified one PCSK9, two APOB, and two LDLR variants significantly associated with an examined phecode. Only one of the variants was associated with a non-lipid disease phecode ("myopia"), but this association was not significant in the replication cohorts. In this large-scale PheWAS we did not find LDL-C-related variants in PCSK9, APOB, and LDLR to be associated with non-lipid-related phenotypes including diabetes, neurocognitive disorders, or cataracts.
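
The first two variant filters named in the abstract above (imputation quality r² > 0.4 and minor allele frequency > 1%) can be sketched as a simple predicate. This is an illustrative sketch only: the `Variant` record and the example values are hypothetical, and the linkage-disequilibrium pruning and LDL-C association steps are deliberately not implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    name: str
    imputation_r2: float  # imputation quality score
    maf: float            # minor allele frequency, as a fraction


def passes_qc(v: Variant, min_r2: float = 0.4, min_maf: float = 0.01) -> bool:
    """Keep variants above both the imputation-quality and MAF thresholds."""
    return v.imputation_r2 > min_r2 and v.maf > min_maf


# Hypothetical variants; the names and values are placeholders.
variants: List[Variant] = [
    Variant("v1", imputation_r2=0.95, maf=0.20),
    Variant("v2", imputation_r2=0.30, maf=0.20),   # fails imputation quality
    Variant("v3", imputation_r2=0.90, maf=0.005),  # fails MAF
]

kept = [v.name for v in variants if passes_qc(v)]
print(kept)
```

Both comparisons are strict, matching the ">" signs in the abstract, so a variant sitting exactly on a threshold is dropped.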
