ABSTRACT
Transcriptome-wide association study (TWAS) tools have been applied to conduct proteome-wide association studies (PWASs) by integrating proteomics data with genome-wide association study (GWAS) summary data. The genetic effects of PWAS-identified significant genes are potentially mediated through genetically regulated protein abundance, thus informing the underlying disease mechanisms better than GWAS loci. However, existing TWAS/PWAS tools are limited by considering only one statistical model. We propose an omnibus PWAS pipeline to account for multiple statistical models and demonstrate improved performance by simulation and application studies of Alzheimer disease (AD) dementia. We employ the Aggregated Cauchy Association Test to derive omnibus PWAS (PWAS-O) p values from PWAS p values obtained by three existing tools that assume complementary statistical models: TIGAR, PrediXcan, and FUSION. Our simulation studies demonstrated improved power, with well-calibrated type I error, for PWAS-O over all three individual tools. We applied PWAS-O to study AD dementia with reference proteomic data profiled from the dorsolateral prefrontal cortex of postmortem brains from individuals of European ancestry. We identified 43 risk genes, including 5 not identified by previous studies, which are interconnected through a protein-protein interaction network that includes the well-known AD risk genes TOMM40, APOC1, and APOC2. We also validated causal genetic effects mediated through the proteome for 27 (63%) PWAS-O risk genes, providing insights into the underlying biological mechanisms of AD dementia and highlighting promising targets for therapeutic development. PWAS-O can be easily applied to study other complex diseases.
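The Cauchy combination step named in this abstract can be sketched as follows. This is a minimal illustration of the general aggregated Cauchy association test formula, not the authors' pipeline: the function name `acat`, the equal default weights, and the numerical cutoffs for extreme values are assumptions, and the example p values are invented.

```python
import math


def acat(p_values, weights=None):
    """Combine p values with the aggregated Cauchy association test.

    Hypothetical helper for illustration only. Each p value must lie
    strictly between 0 and 1; weights default to equal weights.
    """
    if not p_values:
        raise ValueError("need at least one p value")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    total_weight = sum(weights)
    statistic = 0.0
    for p, w in zip(p_values, weights):
        if not 0.0 < p < 1.0:
            raise ValueError("p values must be strictly between 0 and 1")
        if p < 1e-15:
            # Assumed cutoff: for tiny p, tan((0.5 - p) * pi) ~ 1 / (p * pi).
            statistic += w / (p * math.pi)
        else:
            statistic += w * math.tan((0.5 - p) * math.pi)
    statistic /= total_weight

    # Upper tail of the standard Cauchy distribution.
    if statistic > 1e15:
        # Assumed cutoff: the tail is ~ 1 / (t * pi) for very large t.
        return 1.0 / (statistic * math.pi)
    return 0.5 - math.atan(statistic) / math.pi


# Invented p values for one gene from three tools (illustrative only).
omnibus_p = acat([0.01, 0.5, 0.9])
print(omnibus_p)
```

Because the statistic is dominated by the smallest p values, one strong signal from a single tool can carry the omnibus result even when the other tools are uninformative.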
Subjects
Alzheimer Disease , Genetic Predisposition to Disease , Genome-Wide Association Study , Proteome , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Humans , Proteome/genetics , Proteome/metabolism , Proteomics/methods , Apolipoprotein C-I/genetics , Apolipoprotein C-I/metabolism , Polymorphism, Single Nucleotide , Risk Factors , Transcriptome , Mitochondrial Precursor Protein Import Complex Proteins
ABSTRACT
Although genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits, a large fraction of heritability remains unexplained. Integrative analysis that incorporates additional information, such as expression quantitative trait locus (eQTL) data, into sequencing studies (denoted as transcriptome-wide association study [TWAS]) can aid the discovery of trait-associated genetic variants. However, general TWAS methods incorporate only one eQTL-derived weight (e.g., cis-effect), and thus can suffer a substantial loss of power when the single estimated cis-effect is not predictive of the effect size of a genetic variant, when the estimated cis-effect contains estimation errors, or when the data are not consistent with the model assumption. In this study, we propose an omnibus test (OT) that uses a Cauchy association test to integrate association evidence from three traditional tests (burden test, quadratic test, and adaptive test) using GWAS summary data with multiple eQTL-derived weights. The p value of the proposed test can be calculated analytically, so the test is fast and efficient. We applied our proposed test to two schizophrenia (SCZ) GWAS summary data sets and two lipid trait (HDL) GWAS summary data sets. Compared with the three traditional tests, our proposed OT can identify more trait-associated genes.
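As a rough illustration of one ingredient mentioned in this abstract, a burden-type gene test can be computed from GWAS summary statistics and a single set of eQTL-derived weights. The sketch below uses the commonly cited summary-statistic form Z = w'z / sqrt(w'Rw) and assumes standardized genotypes and a known LD correlation matrix R. The function names and all numbers are hypothetical, and this is not the authors' omnibus test.

```python
import math


def burden_z(weights, gwas_z, ld):
    """Weighted burden Z score from per-variant GWAS Z scores.

    weights: eQTL-derived weights, one per variant (assumed inputs)
    gwas_z:  per-variant GWAS Z scores
    ld:      LD correlation matrix as a list of rows
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching sizes")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum under the null: w' R w.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0.0:
        raise ValueError("w' R w must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided normal p value for a Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Invented example: three variants in moderate LD.
w = [0.4, -0.1, 0.3]
z = [2.5, -0.8, 1.9]
R = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.3], [0.1, 0.3, 1.0]]
gene_z = burden_z(w, z, R)
print(gene_z, two_sided_p(gene_z))
```

Repeating this with several weight sets yields several p values per gene, which is the kind of input a Cauchy-based combination then aggregates.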
Subjects
Genes , Genome-Wide Association Study , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , Computer Simulation , Humans , Lipoproteins, HDL/metabolism , Models, Genetic , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Schizophrenia/genetics
ABSTRACT
BACKGROUND: Integrating functional annotations into SNP-set association studies has proven to be a powerful analysis strategy. Statistical methods for such integration have been developed for continuous and binary phenotypes; however, SNP-set integrative approaches for time-to-event or survival outcomes are lacking. METHODS: We here propose IEHC, an integrative eQTL (expression quantitative trait loci) hierarchical Cox regression, for SNP-set based survival association analysis by modeling effect sizes of genetic variants as a function of eQTL in a hierarchical manner. Three p-value combination tests are developed to examine the joint effects of eQTL and genetic variants after a novel decorrelated modification of statistics for the two components. An omnibus test (IEHC-ACAT) is further adapted to aggregate the strengths of all available tests. RESULTS: Simulations demonstrated that the IEHC joint tests were more powerful if both eQTL and genetic variants contributed to the association signal, while IEHC-ACAT was robust and often outperformed other approaches across various simulation scenarios. When applying IEHC to ten TCGA cancers by incorporating eQTL from relevant tissues of GTEx, we revealed that substantial correlations existed between the two types of effect sizes of genetic variants from TCGA and GTEx, and identified 21 (9 unique) cancer-associated genes which would otherwise be missed by approaches not incorporating eQTL. CONCLUSION: IEHC represents a flexible, robust, and powerful approach to integrate functional omics information to enhance the power of identifying association signals for the survival risk of complex human cancers.
Subjects
Genome-Wide Association Study , Quantitative Trait Loci , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Proportional Hazards Models , Quantitative Trait Loci/genetics
ABSTRACT
BACKGROUND: Transcriptome-wide association study (TWAS) is an influential tool for identifying genes associated with complex diseases whose genetic effects are likely mediated through the transcriptome. TWAS utilizes reference genetic and transcriptomic data to estimate effect sizes of genetic variants on gene expression (i.e., effect sizes of a broad sense of expression quantitative trait loci, eQTL). These estimated effect sizes are employed as variant weights in gene-based association tests, facilitating the mapping of risk genes with genome-wide association study (GWAS) data. However, most existing TWAS of Alzheimer's disease (AD) dementia are limited to studying only cis-eQTL proximal to the test gene. To overcome this limitation, we applied the Bayesian Genome-wide TWAS (BGW-TWAS) method to leverage both cis- and trans-eQTL of brain and blood tissues, in order to enhance mapping of risk genes for AD dementia. METHODS: We first applied BGW-TWAS to the Genotype-Tissue Expression (GTEx) V8 dataset to estimate cis- and trans-eQTL effect sizes of the prefrontal cortex, cortex, and whole blood tissues. Estimated eQTL effect sizes were integrated with the summary data of the most recent GWAS of AD dementia to obtain BGW-TWAS (i.e., gene-based association test) p-values of AD dementia per gene per tissue type. Then we used the aggregated Cauchy association test to combine TWAS p-values across the three tissues to obtain omnibus TWAS p-values per gene. RESULTS: We identified 85 genes in prefrontal cortex, 82 in cortex, and 76 in whole blood that were significantly associated with AD dementia. By combining BGW-TWAS p-values across these three tissues, we obtained 141 significant risk genes, including 34 genes primarily due to trans-eQTL and 35 mapped risk genes in GWAS Catalog. With these 141 significant risk genes, we detected functional clusters composed of both known mapped GWAS risk genes of AD in GWAS Catalog and our identified TWAS risk genes by protein-protein interaction network analysis, as well as several enriched phenotypes related to AD. CONCLUSION: We applied BGW-TWAS and aggregated Cauchy test methods to integrate both cis- and trans-eQTL data of brain and blood tissues with GWAS summary data, identifying 141 TWAS risk genes of AD dementia. These identified risk genes provide novel insights into the underlying biological mechanisms of AD dementia and potential gene targets for therapeutics development.
Subjects
Alzheimer Disease , Bayes Theorem , Brain , Genetic Predisposition to Disease , Genome-Wide Association Study , Quantitative Trait Loci , Transcriptome , Humans , Alzheimer Disease/genetics , Alzheimer Disease/blood , Genome-Wide Association Study/methods , Brain/metabolism , Genetic Predisposition to Disease/genetics , Quantitative Trait Loci/genetics , Polymorphism, Single Nucleotide , Gene Expression Profiling/methods
ABSTRACT
Openness-weighted association study (OWAS) is a method that leverages the in silico prediction of chromatin accessibility to prioritize genome-wide association study (GWAS) signals, and can provide novel insights into the roles of non-coding variants in complex diseases. A prerequisite for applying OWAS is to choose a trait-related cell type beforehand. However, for most complex traits, the trait-relevant cell types remain elusive. In addition, many complex traits involve multiple related cell types. To address these issues, we develop OWAS-joint, an efficient framework that aggregates predicted chromatin accessibility across multiple cell types to prioritize disease-associated genomic segments. In simulation studies, we demonstrate that OWAS-joint achieves greater statistical power than OWAS. Moreover, the heritability explained by OWAS-joint segments is higher than or comparable to that explained by OWAS segments. OWAS-joint segments also have high replication rates in independent replication cohorts. Applying the method to six complex human traits, we demonstrate the advantages of OWAS-joint over a single-cell-type OWAS approach. We highlight that OWAS-joint enhances the biological interpretation of disease mechanisms, especially for non-coding regions.
Subjects
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Chromatin , Genome-Wide Association Study/methods , Genomics , Humans , Phenotype
ABSTRACT
Abundant genome-wide association study (GWAS) findings have reflected the sharing of genetic variants among multiple phenotypes. Exploring the association between genetic variants and multiple traits can provide novel insights into the biological mechanism of complex human traits. In this article, we proposed to apply the generalized Berk-Jones (GBJ) test and the generalized higher criticism (GHC) test to identify the genetic variants that affect multiple traits based on GWAS summary statistics. To be more robust to different gene-multiple traits association patterns across the whole genome, we proposed an omnibus test (OMNI) by using the aggregated Cauchy association test. We conducted extensive simulation studies to investigate the type I error rates and compare the powers of the proposed tests (i.e., the GBJ, GHC and OMNI tests) and the existing tests (i.e., the minimum of the p-values (MinP) and the cross-phenotype association test (CPASSOC)) in a wide range of simulation settings. We found that all of these methods could control the type I error rates well and that the proposed OMNI test has robust power. We applied those methods to the summary statistics dataset from the Global Lipids Genetics Consortium and identified 19 new genetic variants that were missed by the original single trait association analysis.
ABSTRACT
BACKGROUND: Multiple genes were previously identified to be associated with cervical cancer; however, the genetic architecture of cervical cancer remains unknown and many potential causal genes are yet to be discovered. METHODS: To explore potential causal genes related to cervical cancer, a two-stage causal inference approach was proposed within the framework of Mendelian randomization, where gene expression was treated as the exposure, with methylations located within the promoter regions of genes serving as instrumental variables. Five prediction models were first utilized to characterize the relationship between expression and methylations for each gene; then, the methylation-regulated gene expression (MReX) was obtained and the association was evaluated via a Cox mixed-effect model based on MReX. We further implemented the aggregated Cauchy association test (ACAT) combination to take advantage of the respective strengths of these prediction models while accounting for dependency among the p-values. RESULTS: A total of 14 potential causal genes were discovered to be associated with the survival risk of cervical cancer in TCGA when the five prediction models were separately employed. The total number of potential causal genes was brought to 23 when conducting ACAT. Some of the newly discovered genes may be novel (e.g., YJEFN3, SPATA5L1, IMMP1L, C5orf55, PPIP5K2, ZNF330, CRYZL1, PPM1A, ESCO2, ZNF605, ZNF225, ZNF266, FICD, and OSTC). Functional analyses showed that these genes were enriched in tumor-associated pathways. Additionally, four genes (i.e., COL6A1, SYDE1, ESCO2, and GIPC1) were differentially expressed between tumor and normal tissues. CONCLUSION: Our study discovered promising candidate genes that were causally associated with the survival risk of cervical cancer and thus provided new insights into the genetic etiology of cervical cancer.