ABSTRACT
Understanding breast cancer genetic risk relies on identifying causal variants and candidate target genes in risk loci identified by genome-wide association studies (GWAS), which remains challenging. Since most loci fall in active gene regulatory regions, we developed a novel approach that pinpoints the variants with greater regulatory potential in the disease's tissue of origin. Through genome-wide differential allelic expression (DAE) analysis, using microarray data from 64 normal breast tissue samples, we mapped the variants associated with DAE (daeQTLs). Then, we intersected these with GWAS data to reveal candidate risk regulatory variants and analysed their cis-acting regulatory potential. Finally, we validated our approach by extensive functional analysis of the 5q14.1 breast cancer risk locus. We observed widespread gene expression regulation by cis-acting variants in breast tissue, with 65% of expressed coding and noncoding genes displaying DAE (daeGenes). We identified over 54,000 daeQTLs for 6761 (26%) daeGenes, including 385 daeGenes harbouring variants previously associated with breast cancer risk. We found 1431 daeQTLs mapped to 93 different loci in strong linkage disequilibrium with risk-associated variants (risk-daeQTLs), suggesting a link between risk-causing variants and cis-regulation. There were 122 risk-daeQTLs with stronger cis-acting potential in active regulatory regions with protein binding evidence. These variants mapped to 41 risk loci, of which 29 had no previous report of target genes and were candidates for regulating the expression levels of 65 genes. As validation, we identified and functionally characterised five candidate causal variants at the 5q14.1 risk locus targeting the ATG10 and ATP6AP1L genes, likely acting via modulation of alternative transcription and transcription factor binding.
Our study demonstrates the power of DAE analysis and daeQTL mapping to identify causal regulatory variants and target genes at breast cancer risk loci, including those with complex regulatory landscapes. It additionally provides a genome-wide resource of variants associated with DAE for future functional studies.
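The intersection step described above, flagging daeQTLs that are in strong linkage disequilibrium with GWAS risk variants, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `find_risk_daeqtls`, the pairwise r² lookup table, the toy SNP identifiers, and the r² ≥ 0.8 cut-off are all assumptions made for illustration, since the abstract does not state the threshold used.

```python
from typing import Dict, FrozenSet, Iterable, Set


def find_risk_daeqtls(
    daeqtls: Iterable[str],
    risk_snps: Iterable[str],
    r2: Dict[FrozenSet[str], float],
    threshold: float = 0.8,  # assumed cut-off for "strong" LD, not from the source
) -> Set[str]:
    """Return daeQTLs that are risk variants themselves or in strong LD with one."""
    risk = set(risk_snps)
    hits: Set[str] = set()
    for snp in daeqtls:
        if snp in risk:
            hits.add(snp)
            continue
        # Pairs missing from the lookup table are treated as not in LD.
        if any(r2.get(frozenset((snp, r)), 0.0) >= threshold for r in risk):
            hits.add(snp)
    return hits


# Toy, made-up identifiers and r² values for illustration only.
toy_r2 = {
    frozenset(("snpA", "riskX")): 0.93,
    frozenset(("snpB", "riskX")): 0.41,
}
toy_hits = find_risk_daeqtls(["snpA", "snpB", "riskX"], ["riskX"], toy_r2)
```

Keying the lookup by `frozenset` makes the pair order irrelevant, which matches the symmetry of r².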
Subjects
Alleles; Breast Neoplasms; Genetic Predisposition to Disease; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Humans; Breast Neoplasms/genetics; Female; Gene Expression Regulation, Neoplastic; Gene Expression Profiling
ABSTRACT
Most breast cancer (BC) risk-associated single-nucleotide polymorphisms (raSNPs) identified in genome-wide association studies (GWAS) are believed to cis-regulate the expression of genes. We hypothesise that cis-regulatory variants contributing to disease risk may be affecting microRNA (miRNA) genes and/or miRNA binding. To test this, we adapted two miRNA-binding prediction algorithms, TargetScan and miRanda, to perform allele-specific queries, and integrated differential allelic expression (DAE) and expression quantitative trait loci (eQTL) data, to query 150 genome-wide significant (P ≤ 5 × 10⁻⁸) raSNPs, plus proxies. We found that no raSNP mapped to a miRNA gene, suggesting that altered miRNA targeting is an unlikely mechanism involved in BC risk. Also, 11.5% (6 out of 52) of the raSNPs located in 3'-untranslated regions of putative miRNA target genes were predicted to alter miRNA::mRNA (messenger RNA) pair binding stability in five candidate target genes. Of these, we propose RNF115, at locus 1q21.1, as a strong novel target gene associated with BC risk, and reinforce the role of miRNA-mediated cis-regulation at locus 19p13.11. We believe that integrating allele-specific querying in miRNA-binding prediction, and data supporting cis-regulation of expression, improves the identification of candidate target genes in BC risk, as well as in other common cancers and complex diseases.
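The idea of an allele-specific miRNA-binding query can be illustrated with a deliberately simplified sketch. This is not TargetScan or miRanda, and it does not estimate binding stability: it only checks whether a canonical seed-match site (complementary to miRNA positions 2-8) overlaps the SNP position for each allele. The function names, the toy 3'-UTR sequence, and the SNP position are hypothetical; the miRNA shown is the commonly cited let-7a sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the mRNA motif complementary to miRNA positions 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def has_site_at(utr: str, pos: int, site: str) -> bool:
    """True if `site` occurs in `utr` in a window covering 0-based index `pos`."""
    start = utr.find(site)
    while start != -1:
        if start <= pos < start + len(site):
            return True
        start = utr.find(site, start + 1)
    return False


def allele_specific_sites(utr: str, pos: int, ref: str, alt: str, mirna: str) -> dict:
    """Report, per allele, whether a seed-match site overlaps the SNP position."""
    utr = utr.upper().replace("T", "U")
    site = seed_site(mirna)
    result = {}
    for allele in (ref, alt):
        base = allele.upper().replace("T", "U")
        # Substitute the allele at the SNP position before searching.
        seq = utr[:pos] + base + utr[pos + 1:]
        result[allele] = has_site_at(seq, pos, site)
    return result


# Toy example: a made-up 3'-UTR fragment carrying a let-7 seed-match site.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
toy_result = allele_specific_sites("AAACTACCTCAAA", 5, "A", "G", LET7A)
```

In this toy case the reference allele keeps the site and the alternative allele disrupts it, which is the kind of allelic difference an allele-specific query is meant to surface.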