ABSTRACT
Most existing expression quantitative trait locus (eQTL) mapping studies have focused on individuals of European ancestry, and other populations, including those with African ancestry, remain underrepresented. The lack of large-scale, well-powered eQTL mapping studies in populations with African ancestry can both impede the dissemination of eQTL mapping results that would otherwise benefit individuals with African ancestry and hinder the comparative analysis needed to understand how gene regulation is shaped through evolution. We fill this critical knowledge gap by performing a large-scale, in-depth eQTL mapping study on 1,032 African Americans (AA) and 801 European Americans (EA) in the GENOA cohort. We identified a total of 354,931 eSNPs in AA and 371,309 eSNPs in EA, with 112,316 eSNPs overlapping between the two. We found that eQTL-harboring genes (eGenes) are enriched in metabolic pathways and tend to have higher SNP heritability than non-eGenes. We found that eGenes that are common to the two populations tend to be less conserved than eGenes that are unique to one population, which in turn are less conserved than non-eGenes. Through conditional analysis, we found that eGenes in AA tend to harbor more independent eQTLs than eGenes in EA, suggesting potentially diverse genetic architecture underlying expression variation in the two populations. Finally, the large sample sizes in GENOA allow us to construct accurate expression prediction models in both AA and EA, facilitating powerful transcriptome-wide association studies. Overall, our results represent an important step toward revealing the genetic architecture underlying expression variation in African Americans.
Subjects
Black or African American/genetics , Gene Expression Regulation/genetics , Quantitative Trait Loci/genetics , White People/genetics , Chromosome Mapping/methods , Cohort Studies , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Humans , Male , Polymorphism, Single Nucleotide/genetics , Transcriptome/genetics
ABSTRACT
Genome-wide association studies (GWASs) have primarily focused on the association between individual genetic markers and risk of disease. We applied a novel approach that integrates skin expression-related single-nucleotide polymorphisms (eSNPs) and pathway analysis for GWAS of basal cell carcinoma (BCC) to identify potential novel biological pathways. We evaluated the associations between 70,932 skin eSNPs and risk of BCC among 2,323 cases and 7,275 controls of European ancestry, and then assigned the eSNPs to the pathways defined by the KEGG, GO, and BioCarta databases. Three KEGG pathways (colorectal cancer, actin cytoskeleton, and BCC), two GO pathways (cellular component disassembly in apoptosis, and nucleus organization), and four BioCarta pathways (Ras signaling, T cell receptor signaling, natural killer cell-mediated cytotoxicity, and links between Pyk2 and MAP kinases) showed significant association with BCC risk, with p-value < 0.05 and FDR < 0.2. These pathways also ranked at the top in sensitivity analyses. Two positive controls in KEGG, the hedgehog pathway and the BCC pathway, showed significant association with BCC risk in both the main and sensitivity analyses. Our results indicate that SNPs that are undetectable by conventional GWASs are significantly associated with BCC when tested as pathways. Biological studies of these gene groups suggest their potential roles in the etiology of BCC.
Subjects
Carcinoma, Basal Cell/genetics , Gene Expression Profiling/methods , Genetic Predisposition to Disease/genetics , Skin Neoplasms/genetics , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide , Transcriptome
ABSTRACT
Reconstructing biological networks using high-throughput technologies has the potential to produce condition-specific interactomes. But are these reconstructed networks a reliable source of biological interactions? Do some network inference methods offer dramatically improved performance on certain types of networks? To facilitate the use of network inference methods in systems biology, we report a large-scale simulation study comparing the ability of Markov chain Monte Carlo (MCMC) samplers to reverse engineer Bayesian networks. The MCMC samplers we investigated included foundational and state-of-the-art Metropolis-Hastings and Gibbs sampling approaches, as well as novel samplers we have designed. To enable a comprehensive comparison, we simulated gene expression and genetics data from known network structures under a range of biologically plausible scenarios. We examine the overall quality of network inference via different methods, as well as how their performance is affected by network characteristics. Our simulations reveal that network size, edge density, and strength of gene-to-gene signaling are major parameters that differentiate the performance of various samplers. Specifically, more recent samplers including our novel methods outperform traditional samplers for highly interconnected large networks with strong gene-to-gene signaling. Our newly developed samplers show comparable or superior performance to the top existing methods. Moreover, this performance gain is strongest in networks with biologically oriented topology, which indicates that our novel samplers are suitable for inferring biological networks. The performance of MCMC samplers in this simulation framework can guide the choice of methods for network reconstruction using systems genetics data.
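As a rough illustration of what reverse engineering a Bayesian network with a Metropolis-Hastings sampler can look like, the sketch below runs a simple structure MCMC on simulated data. It is not one of the samplers compared in the study. The edge-toggle proposal, the BIC score used as a stand-in for the log posterior of a linear-Gaussian network, and all function names (`is_acyclic`, `bic_score`, `structure_mcmc`) are assumptions made for this example.

```python
import numpy as np

def is_acyclic(adj):
    """True if the directed graph with boolean adjacency matrix adj has no cycle."""
    remaining = list(range(adj.shape[0]))
    while remaining:
        # Nodes with no incoming edge from any node still in the graph.
        sources = [v for v in remaining if not adj[remaining, v].any()]
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining = [v for v in remaining if v not in sources]
    return True

def bic_score(data, adj):
    """BIC-style score of a linear-Gaussian network (assumed stand-in for the log posterior)."""
    n, p = data.shape
    score = 0.0
    for j in range(p):
        parents = np.flatnonzero(adj[:, j])
        X = np.column_stack([np.ones(n), data[:, parents]])
        beta, *_ = np.linalg.lstsq(X, data[:, j], rcond=None)
        resid = data[:, j] - X @ beta
        sigma2 = resid @ resid / n
        score += -0.5 * n * np.log(sigma2) - 0.5 * X.shape[1] * np.log(n)
    return score

def structure_mcmc(data, n_iter=2000, seed=0):
    """Metropolis-Hastings over DAGs; returns the sampled frequency of each directed edge."""
    rng = np.random.default_rng(seed)
    p = data.shape[1]
    adj = np.zeros((p, p), dtype=bool)  # start from the empty graph
    current = bic_score(data, adj)
    edge_counts = np.zeros((p, p))
    for _ in range(n_iter):
        # Symmetric proposal: pick an ordered pair (i, j) and toggle the edge i -> j.
        i, j = rng.choice(p, size=2, replace=False)
        proposal = adj.copy()
        proposal[i, j] = not proposal[i, j]
        if is_acyclic(proposal):  # cyclic proposals are rejected outright
            new = bic_score(data, proposal)
            if np.log(rng.random()) < new - current:
                adj, current = proposal, new
        edge_counts += adj
    return edge_counts / n_iter

# Simulated "expression" data from a known chain: gene0 -> gene1 -> gene2.
rng = np.random.default_rng(1)
g0 = rng.normal(size=200)
g1 = 2.0 * g0 + rng.normal(size=200)
g2 = 2.0 * g1 + rng.normal(size=200)
data = np.column_stack([g0, g1, g2])

freq = structure_mcmc(data)
print(np.round(freq, 2))
```

Because the toggle proposal is symmetric, the acceptance step only needs the score difference. Edge direction is generally not identifiable from observational Gaussian data alone, so it is more informative to look at `freq[i, j] + freq[j, i]` for each gene pair than at single directed entries.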