Estimating colocalization probability from limited summary statistics.

King, Emily A; Dunbar, Fengjiao; Davis, Justin Wade; Degner, Jacob F

Affiliation

King EA; AbbVie Genomics Research Center, North Chicago, IL, USA.
Dunbar F; AbbVie Genomics Research Center, North Chicago, IL, USA.
Davis JW; AbbVie Genomics Research Center, North Chicago, IL, USA.
Degner JF; AbbVie Genomics Research Center, North Chicago, IL, USA. Jacob.Degner@abbvie.com.

BMC Bioinformatics ; 22(1): 254, 2021 May 17.

Article in English | MEDLINE | ID: mdl-34000989

ABSTRACT

BACKGROUND: Colocalization is a statistical method used in genetics to determine whether the same variant is causal for multiple phenotypes, for example, complex traits and gene expression. It provides stronger mechanistic evidence than shared significance, which can be produced through separate causal variants in linkage disequilibrium. Current colocalization methods require full summary statistics for both traits, limiting their use with the majority of reported GWAS associations (e.g., those in the GWAS Catalog). We propose a new approximation to the popular coloc method that can be applied when limited summary statistics are available. Our method (POint EstiMation of Colocalization, POEMColoc) imputes missing summary statistics for one or both traits using LD structure in a reference panel, and performs colocalization using the imputed summary statistics.

RESULTS: We evaluate the performance of POEMColoc using real (UK Biobank phenotypes and GTEx eQTL) and simulated datasets. We show good correlation between posterior probabilities of colocalization computed from imputed and observed datasets and similar accuracy in simulation. We evaluate scenarios that might reduce performance and show that multiple independent causal variants in a region and imputation from a limited subset of typed variants have a larger effect, while mismatched ancestry in the reference panel has a modest effect. Further, we find that POEMColoc is a better approximation of coloc when the imputed association statistics are from a well-powered study (e.g., relatively larger sample size or effect size). Applying POEMColoc to estimate colocalization of GWAS Catalog entries and GTEx eQTL, we find evidence for colocalization of 150,000 trait-gene-tissue triplets.

CONCLUSIONS: We find that colocalization analysis performed with full summary statistics can be closely approximated when only the summary statistics of the top SNP are available for one or both traits. When applied to the full GWAS Catalog and GTEx eQTL, we find that colocalized trait-gene pairs are enriched in tissues relevant to disease etiology and for matches to approved drug mechanisms. The POEMColoc R package is available at https://github.com/AbbVie-ComputationalGenomics/POEMColoc.
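
The two steps named in the abstract (impute summary statistics from the top SNP using LD, then run a coloc-style calculation) can be illustrated with a minimal Python sketch. This is not the POEMColoc implementation, which is an R package. It assumes, for illustration only, that the expected z-score at each SNP is the top SNP's z-score scaled by that SNP's LD correlation r with the top SNP (a common single-causal-variant approximation that may differ from the rule POEMColoc uses), that every SNP has the same known standard error, and that the priors follow the commonly cited coloc defaults. The function names, the toy LD vector, and the toy z-scores are all hypothetical.

```python
import math

def impute_z(z_top, r_with_top):
    """Assumed imputation rule: E[z_i] = r_i * z_top under one causal variant."""
    return [r * z_top for r in r_with_top]

def log_abf(z, se, prior_sd=0.15):
    """Wakefield approximate Bayes factor for one SNP, on the log scale."""
    v, w = se ** 2, prior_sd ** 2
    shrink = w / (v + w)
    return 0.5 * (math.log(1.0 - shrink) + shrink * z ** 2)

def log_sum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_diff(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 under a single-causal-variant model."""
    s1, s2 = log_sum(labf1), log_sum(labf2)
    s12 = log_sum([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                   # H0: no association
        math.log(p1) + s1,                                     # H1: trait 1 only
        math.log(p2) + s2,                                     # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                   # H4: shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(x - total) for x in log_h]

# Toy region of four SNPs; values are made up for illustration.
r_with_top = [1.0, 0.8, 0.3, 0.05]       # LD correlation with trait 1's top SNP
z_trait1 = impute_z(6.0, r_with_top)     # trait 1: only the top SNP was reported
z_trait2 = [5.5, 4.0, 1.0, 0.2]          # trait 2: full summary statistics
se = 0.05                                # assumed common standard error

labf1 = [log_abf(z, se) for z in z_trait1]
labf2 = [log_abf(z, se) for z in z_trait2]
pp = coloc_posteriors(labf1, labf2)
print("PP.H4 =", pp[4])
```

In this toy setup the two traits peak at the same SNP, so most posterior mass should fall on H4; reversing `z_trait2` so that it peaks at the other end of the region should shift the mass toward H3.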

Subject(s)

Genome-Wide Association Study; Quantitative Trait Loci; Linkage Disequilibrium; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Probability

Keywords

Colocalization; Expression quantitative trait locus; GTEx; GWAS; GWAS catalog; Genome-wide association study; eQTL

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Quantitative Trait Loci / Genome-Wide Association Study | Language: English | Journal: BMC Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2021 | Document type: Article | Country of affiliation: United States | Country of publication: United Kingdom
