Search | VHL Regional Portal

Analyzing and reconciling colocalization and transcriptome-wide association studies from the perspective of inferential reproducibility.

Hukku, Abhay; Sampson, Matthew G; Luca, Francesca; Pique-Regi, Roger; Wen, Xiaoquan.

Am J Hum Genet ; 109(5): 825-837, 2022 05 05.

Article in English | MEDLINE | ID: mdl-35523146

ABSTRACT

Transcriptome-wide association studies and colocalization analysis are popular computational approaches for integrating genetic-association data from molecular and complex traits. They show the unique ability to go beyond variant-level genetic-association evidence and implicate critical functional units, e.g., genes, in disease etiology. However, in practice, when the two approaches are applied to the same molecular and complex-trait data, the inference results can be markedly different. This paper systematically investigates the inferential reproducibility between the two approaches through theoretical derivation, numerical experiments, and analyses of four complex trait GWAS and GTEx eQTL data. We identify two classes of inconsistent inference results. We find that the first class of inconsistent results (i.e., genes with strong colocalization but weak transcriptome-wide association study [TWAS] signals) might suggest an interesting biological phenomenon, i.e., horizontal pleiotropy; thus, the two approaches are truly complementary. The inconsistency in the second class (i.e., genes with weak colocalization but strong TWAS signals) can be understood and effectively reconciled. To this end, we propose a computational approach for locus-level colocalization analysis. We demonstrate that the joint TWAS and locus-level colocalization analysis improves specificity and sensitivity for implicating biologically relevant genes.

Subject(s)

Genome-Wide Association Study; Transcriptome; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Humans; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Reproducibility of Results; Transcriptome/genetics

Probabilistic colocalization of genetic variants from complex and molecular traits: promise and limitations.

Hukku, Abhay; Pividori, Milton; Luca, Francesca; Pique-Regi, Roger; Im, Hae Kyung; Wen, Xiaoquan.

Am J Hum Genet ; 108(1): 25-35, 2021 01 07.

Article in English | MEDLINE | ID: mdl-33308443

ABSTRACT

Colocalization analysis has emerged as a powerful tool to uncover the overlap of causal variants responsible for both molecular and complex disease phenotypes. The findings from colocalization analysis yield insights into the molecular pathways of complex diseases. In this paper, we conduct an in-depth investigation of the promise and limitations of the available colocalization analysis approaches. Focusing on variant-level colocalization approaches, we first establish the connections between various existing methods. We proceed to discuss the impacts of various controllable analytical factors and uncontrollable practical factors on outcomes of colocalization analysis through realistic simulations and real data examples. We identify a single analytical factor, the specification of prior enrichment levels, which can lead to severe inflation of false-positive colocalization findings. Meanwhile, many other analytical and practical factors combine to diminish power. Consequently, we recommend the following strategies for the best practice of colocalization analysis: (1) estimating the prior enrichment level from the observed data and (2) separating fine-mapping and colocalization analysis. Our analysis of 4,091 complex traits and the multi-tissue expression quantitative trait loci (eQTL) data from the GTEx (v.8) suggests that colocalizations of molecular QTLs and causal complex trait associations are widespread. However, only a small proportion can be confidently identified from currently available data due to a lack of power. Our findings set a benchmark for current and future integrative genetic association analysis applications.

Subject(s)

Genome-Wide Association Study/methods; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Genetic Predisposition to Disease/genetics; Humans; Linkage Disequilibrium/genetics; Phenotype

BAGSE: a Bayesian hierarchical model approach for gene set enrichment analysis.

Hukku, Abhay; Quick, Corbin; Luca, Francesca; Pique-Regi, Roger; Wen, Xiaoquan.

Bioinformatics ; 36(6): 1689-1695, 2020 03 01.

Article in English | MEDLINE | ID: mdl-31702789

ABSTRACT

MOTIVATION: Gene set enrichment analysis has been shown to be effective in identifying relevant biological pathways underlying complex diseases. Existing approaches lack the ability to quantify the enrichment levels accurately, which prevents the enrichment information from being further utilized in both upstream and downstream analyses. A modernized and rigorous approach for gene set enrichment analysis that emphasizes both hypothesis testing and enrichment estimation is much needed. RESULTS: We propose a novel computational method, Bayesian Analysis of Gene Set Enrichment (BAGSE), for gene set enrichment analysis. BAGSE is built on a Bayesian hierarchical model and fully accounts for the uncertainty embedded in the association evidence of individual genes. We adopt an empirical Bayes inference framework to fit the proposed hierarchical model by implementing an efficient EM algorithm. Through simulation studies, we illustrate that BAGSE yields accurate enrichment quantification while achieving power similar to that of state-of-the-art methods. Further simulation studies show that BAGSE can effectively utilize the enrichment information to improve the power in gene discovery. Finally, we demonstrate the application of BAGSE in analyzing real data from a differential expression experiment and a transcriptome-wide association study. Our results indicate that the proposed statistical framework is effective in aiding the discovery of potentially causal pathways and gene networks. AVAILABILITY AND IMPLEMENTATION: BAGSE is implemented using the C++ programming language and is freely available from https://github.com/xqwen/bagse/. Simulated and real data used in this paper are also available at the GitHub repository for reproducibility purposes. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms; Transcriptome; Bayes Theorem; Probability; Reproducibility of Results
