Search | VHL Search Portal

1.

Genetic Control of Expression and Splicing in Developing Human Brain Informs Disease Mechanisms.

Walker, Rebecca L; Ramaswami, Gokul; Hartl, Christopher; Mancuso, Nicholas; Gandal, Michael J; de la Torre-Ubieta, Luis; Pasaniuc, Bogdan; Stein, Jason L; Geschwind, Daniel H.

Cell ; 179(3): 750-771.e22, 2019 Oct 17.

Article in English | MEDLINE | ID: mdl-31626773

ABSTRACT

Tissue-specific regulatory regions harbor substantial genetic risk for disease. Because brain development is a critical epoch for neuropsychiatric disease susceptibility, we characterized the genetic control of the transcriptome in 201 mid-gestational human brains, identifying 7,962 expression quantitative trait loci (eQTL) and 4,635 spliceQTL (sQTL), including several thousand prenatal-specific regulatory regions. We show that significant genetic liability for neuropsychiatric disease lies within prenatal eQTL and sQTL. Integration of eQTL and sQTL with genome-wide association studies (GWAS) via transcriptome-wide association identified dozens of novel candidate risk genes, highlighting shared and stage-specific mechanisms in schizophrenia (SCZ). Gene network analysis revealed that SCZ and autism spectrum disorder (ASD) affect distinct developmental gene co-expression modules. Yet, in each disorder, common and rare genetic variation converges within modules, which in ASD implicates superficial cortical neurons. More broadly, these data, available as a web browser and our analyses, demonstrate the genetic mechanisms by which developmental events have a widespread influence on adult anatomical and behavioral phenotypes.

Subject(s)

Autism Spectrum Disorder/genetics , Quantitative Trait Loci/genetics , Schizophrenia/genetics , Transcriptome/genetics , Autism Spectrum Disorder/metabolism , Autism Spectrum Disorder/pathology , Brain/growth & development , Brain/metabolism , Female , Fetus/metabolism , Gene Expression Regulation, Developmental , Genetic Predisposition to Disease , Genome-Wide Association Study , Gestational Age , Humans , Male , Neurons/metabolism , Polymorphism, Single Nucleotide/genetics , RNA Splicing/genetics , Schizophrenia/metabolism , Schizophrenia/pathology

2.

Genetic Control of Expression and Splicing in Developing Human Brain Informs Disease Mechanisms.

Walker, Rebecca L; Ramaswami, Gokul; Hartl, Christopher; Mancuso, Nicholas; Gandal, Michael J; de la Torre-Ubieta, Luis; Pasaniuc, Bogdan; Stein, Jason L; Geschwind, Daniel H.

Cell ; 181(3): 745, 2020 Apr 30.

Article in English | MEDLINE | ID: mdl-32359439

3.

Genetic Control of Expression and Splicing in Developing Human Brain Informs Disease Mechanisms.

Walker, Rebecca L; Ramaswami, Gokul; Hartl, Christopher; Mancuso, Nicholas; Gandal, Michael J; Torre-Ubieta, Luis de la; Pasaniuc, Bogdan; Stein, Jason L; Geschwind, Daniel H.

Cell ; 181(2): 484, 2020 Apr 16.

Article in English | MEDLINE | ID: mdl-32302575

4.

Novel insight into the etiology of ischemic stroke gained by integrative multiome-wide association study.

Jung, Junghyun; Lu, Zeyun; de Smith, Adam; Mancuso, Nicholas.

Hum Mol Genet ; 33(2): 170-181, 2024 Jan 07.

Article in English | MEDLINE | ID: mdl-37824084

ABSTRACT

Stroke, characterized by sudden neurological deficits, is the second leading cause of death worldwide. Although genome-wide association studies (GWAS) have successfully identified many genomic regions associated with ischemic stroke (IS), the genes underlying risk and their regulatory mechanisms remain elusive. Here, we integrate a large-scale GWAS (N = 1,296,908) for IS together with molecular QTL data, including mRNA, splicing, enhancer RNA (eRNA), and protein expression data from up to 50 tissues (total N = 11,588). We identify 136 genes/eRNA/proteins associated with IS risk across 60 independent genomic regions and find IS risk is most enriched for eQTLs in arterial and brain-related tissues. Focusing on IS-relevant tissues, we prioritize 9 genes/proteins using probabilistic fine-mapping TWAS analyses. In addition, we discover that blood cell traits, particularly reticulocyte cells, have shared genetic contributions with IS using TWAS-based pheWAS and genetic correlation analysis. Lastly, we integrate our findings with a large-scale pharmacological database and identify a secondary bile acid, deoxycholic acid, as a potential therapeutic component. Our work highlights IS risk genes/splicing-sites/enhancer activity/proteins with their phenotypic consequences using relevant tissues and identifies potential therapeutic candidates for IS.

Subject(s)

Ischemic Stroke , Transcriptome , Humans , Genome-Wide Association Study , Ischemic Stroke/genetics , Genomics , Phenotype , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide/genetics

5.

Novel breast cancer susceptibility loci under linkage peaks identified in African ancestry consortia.

Ochs-Balcom, Heather M; Preus, Leah; Du, Zhaohui; Elston, Robert C; Teerlink, Craig C; Jia, Guochong; Guo, Xingyi; Cai, Qiuyin; Long, Jirong; Ping, Jie; Li, Bingshan; Stram, Daniel O; Shu, Xiao-Ou; Sanderson, Maureen; Gao, Guimin; Ahearn, Thomas; Lunetta, Kathryn L; Zirpoli, Gary; Troester, Melissa A; Ruiz-Narváez, Edward A; Haddad, Stephen A; Figueroa, Jonine; John, Esther M; Bernstein, Leslie; Hu, Jennifer J; Ziegler, Regina G; Nyante, Sarah; Bandera, Elisa V; Ingles, Sue A; Mancuso, Nicholas; Press, Michael F; Deming, Sandra L; Rodriguez-Gil, Jorge L; Yao, Song; Ogundiran, Temidayo O; Ojengbede, Oladosu; Bolla, Manjeet K; Dennis, Joe; Dunning, Alison M; Easton, Douglas F; Michailidou, Kyriaki; Pharoah, Paul D P; Sandler, Dale P; Taylor, Jack A; Wang, Qin; O'Brien, Katie M; Weinberg, Clarice R; Kitahara, Cari M; Blot, William; Nathanson, Katherine L.

Hum Mol Genet ; 33(8): 687-697, 2024 Apr 08.

Article in English | MEDLINE | ID: mdl-38263910

ABSTRACT

BACKGROUND: Expansion of genome-wide association studies across population groups is needed to improve our understanding of shared and unique genetic contributions to breast cancer. We performed association and replication studies guided by a priori linkage findings from African ancestry (AA) relative pairs. METHODS: We performed fixed-effect inverse-variance weighted meta-analysis under three significant AA breast cancer linkage peaks (3q26-27, 12q22-23, and 16q21-22) in 9,241 AA cases and 10,193 AA controls. We examined associations with overall breast cancer as well as estrogen receptor (ER)-positive and negative subtypes (193,132 SNPs). We replicated associations in the African-ancestry Breast Cancer Genetic Consortium (AABCG). RESULTS: In AA women, we identified two associations on chr12q for overall breast cancer (rs1420647, OR = 1.15, p = 2.50 × 10⁻⁶; rs12322371, OR = 1.14, p = 3.15 × 10⁻⁶), and one for ER-negative breast cancer (rs77006600, OR = 1.67, p = 3.51 × 10⁻⁶). On chr3, we identified two associations with ER-negative disease (rs184090918, OR = 3.70, p = 1.23 × 10⁻⁵; rs76959804, OR = 3.57, p = 1.77 × 10⁻⁵) and on chr16q we identified an association with ER-negative disease (rs34147411, OR = 1.62, p = 8.82 × 10⁻⁶). In the replication study, the chr3 associations were significant and effect sizes were larger (rs184090918, OR: 6.66, 95% CI: 1.43, 31.01; rs76959804, OR: 5.24, 95% CI: 1.70, 16.16). CONCLUSION: The two chr3 SNPs are upstream to open chromatin ENSR00000710716, a regulatory feature that is actively regulated in mammary tissues, providing evidence that variants in this chr3 region may have a regulatory role in our target organ. Our study provides support for breast cancer variant discovery using prioritization based on linkage evidence.

Subject(s)

Black People , Breast Neoplasms , Genetic Predisposition to Disease , Female , Humans , Black People/genetics , Breast Neoplasms/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide
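
The fixed-effect inverse-variance weighted meta-analysis named in the abstract above can be illustrated with a minimal sketch. This is the textbook estimator, not the authors' pipeline; the function name `ivw_fixed_effect` and the per-study numbers are hypothetical and chosen only for illustration.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Combine per-study effect estimates (e.g. log odds ratios)
    using inverse-variance weights under a fixed-effect model."""
    weights = [1.0 / se ** 2 for se in ses]          # w_i = 1 / SE_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)                      # SE of the pooled estimate
    return beta, se

# Hypothetical log odds ratios and standard errors from two studies.
beta, se = ivw_fixed_effect([0.10, 0.20], [0.05, 0.05])
odds_ratio = math.exp(beta)                          # back to the OR scale
```

With equal standard errors the pooled estimate is the plain average of the study estimates, and its standard error shrinks by a factor of the square root of the number of studies.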

6.

A scalable approach to characterize pleiotropy across thousands of human diseases and complex traits using GWAS summary statistics.

Zhang, Zixuan; Jung, Junghyun; Kim, Artem; Suboc, Noah; Gazal, Steven; Mancuso, Nicholas.

Am J Hum Genet ; 110(11): 1863-1874, 2023 Nov 02.

Article in English | MEDLINE | ID: mdl-37879338

ABSTRACT

Genome-wide association studies (GWASs) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra-large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N = 420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (p = 2.58E-10) and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest shared etiologies between rheumatoid arthritis and periodontal condition in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWASs.

Subject(s)

Arthritis, Rheumatoid , Genome-Wide Association Study , Male , Humans , Genome-Wide Association Study/methods , Multifactorial Inheritance , Phenotype , Brain , Arthritis, Rheumatoid/genetics , Polymorphism, Single Nucleotide/genetics , Genetic Pleiotropy

7.

Tree-based QTL mapping with expected local genetic relatedness matrices.

Link, Vivian; Schraiber, Joshua G; Fan, Caoqi; Dinh, Bryan; Mancuso, Nicholas; Chiang, Charleston W K; Edge, Michael D.

Am J Hum Genet ; 110(12): 2077-2091, 2023 Dec 07.

Article in English | MEDLINE | ID: mdl-38065072

ABSTRACT

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWASs) are a powerful way to find genetic loci associated with phenotypes. GWASs are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix (local eGRM) given the ARG. Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to analyze two chromosomes containing known body size loci in a sample of Native Hawaiians. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.

Subject(s)

Genetics, Population , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Chromosome Mapping/methods , Models, Genetic , Phenotype , Quantitative Trait Loci/genetics , Native Hawaiian or Other Pacific Islander/genetics

8.

Estimating heritability explained by local ancestry and evaluating stratification bias in admixture mapping from summary statistics.

Chan, Tsz Fung; Rui, Xinyue; Conti, David V; Fornage, Myriam; Graff, Mariaelisa; Haessler, Jeffrey; Haiman, Christopher; Highland, Heather M; Jung, Su Yon; Kenny, Eimear E; Kooperberg, Charles; Le Marchand, Loic; North, Kari E; Tao, Ran; Wojcik, Genevieve; Gignoux, Christopher R; Chiang, Charleston W K; Mancuso, Nicholas.

Am J Hum Genet ; 110(11): 1853-1862, 2023 Nov 02.

Article in English | MEDLINE | ID: mdl-37875120

ABSTRACT

The heritability explained by local ancestry markers in an admixed population (h_γ²) provides crucial insight into the genetic architecture of a complex disease or trait. Estimation of h_γ² can be susceptible to biases due to population structure in ancestral populations. Here, we present heritability estimation from admixture mapping summary statistics (HAMSTA), an approach that uses summary statistics from admixture mapping to infer heritability explained by local ancestry while adjusting for biases due to ancestral stratification. Through extensive simulations, we demonstrate that HAMSTA h_γ² estimates are approximately unbiased and are robust to ancestral stratification compared to existing approaches. In the presence of ancestral stratification, we show a HAMSTA-derived sampling scheme provides a calibrated family-wise error rate (FWER) of ~5% for admixture mapping, unlike existing FWER estimation approaches. We apply HAMSTA to 20 quantitative phenotypes of up to 15,988 self-reported African American individuals in the Population Architecture using Genomics and Epidemiology (PAGE) study. We observe ĥ_γ² in the 20 phenotypes range from 0.0025 to 0.033 (mean ĥ_γ² = 0.012 ± 9.2 × 10⁻⁴), which translates to ĥ² ranging from 0.062 to 0.85 (mean ĥ² = 0.30 ± 0.023). Across these phenotypes we find little evidence of inflation due to ancestral population stratification in current admixture mapping studies (mean inflation factor of 0.99 ± 0.001). Overall, HAMSTA provides a fast and powerful approach to estimate genome-wide heritability and evaluate biases in test statistics of admixture mapping studies.

Subject(s)

Black or African American , Genetics, Population , Humans , Chromosome Mapping , Phenotype , Polymorphism, Single Nucleotide/genetics

9.

The motif composition of variable number tandem repeats impacts gene expression.

Lu, Tsung-Yu; Smaruj, Paulina N; Fudenberg, Geoffrey; Mancuso, Nicholas; Chaisson, Mark J P.

Genome Res ; 33(4): 511-524, 2023 Apr.

Article in English | MEDLINE | ID: mdl-37037626

ABSTRACT

Understanding the impact of DNA variation on human traits is a fundamental question in human genetics. Variable number tandem repeats (VNTRs) make up ~3% of the human genome but are often excluded from association analysis owing to poor read mappability or divergent repeat content. Although methods exist to estimate VNTR length from short-read data, it is known that VNTRs vary in both length and repeat (motif) composition. Here, we use a repeat-pangenome graph (RPGG) constructed on 35 haplotype-resolved assemblies to detect variation in both VNTR length and repeat composition. We align population-scale data from the Genotype-Tissue Expression (GTEx) Consortium to examine how variations in sequence composition may be linked to expression, including cases independent of overall VNTR length. We find that 9,422 out of 39,125 VNTRs are associated with nearby gene expression through motif variations, of which only 23.4% are accessible from length. Fine-mapping identifies 174 genes to be likely driven by variation in certain VNTR motifs and not overall length. We highlight two genes, CACNA1C and RNF213, that have expression associated with motif variation, showing the utility of RPGG analysis as a new approach for trait association in multiallelic and highly variable loci.

Subject(s)

Adenosine Triphosphatases , Minisatellite Repeats , Humans , Minisatellite Repeats/genetics , Phenotype , Haplotypes , Gene Expression , Adenosine Triphosphatases/genetics , Ubiquitin-Protein Ligases/genetics

10.

Hierarchical joint analysis of marginal summary statistics-Part II: High-dimensional instrumental analysis of omics data.

Jiang, Lai; Shen, Jiayi; Darst, Burcu F; Haiman, Christopher A; Mancuso, Nicholas; Conti, David V.

Genet Epidemiol ; 2024 Jun 17.

Article in English | MEDLINE | ID: mdl-38887957

ABSTRACT

Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we leverage the flexibility of our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), to a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants, situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effect for high-dimensional risk factors on an outcome by incorporating estimates from association analyses of single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristics curve (AUC), a lower mean-squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS for prostate cancer with more than 140,000 men and high dimensional publicly available summary data for metabolites and transcriptomes.

11.

A genealogical estimate of genetic relationships.

Fan, Caoqi; Mancuso, Nicholas; Chiang, Charleston W K.

Am J Hum Genet ; 109(5): 812-824, 2022 May 05.

Article in English | MEDLINE | ID: mdl-35417677

ABSTRACT

The application of genetic relationships among individuals, characterized by a genetic relationship matrix (GRM), has far-reaching effects in human genetics. However, the current standard to calculate the GRM treats linked markers as independent and does not explicitly model the underlying genealogical history of the study sample. Here, we propose a coalescent-informed framework, namely the expected GRM (eGRM), to infer the expected relatedness between pairs of individuals given an ancestral recombination graph (ARG) of the sample. Through extensive simulations, we show that the eGRM is an unbiased estimate of latent pairwise genome-wide relatedness and is robust when computed with ARG inferred from incomplete genetic data. As a result, the eGRM better captures the structure of a population than the canonical GRM, even when using the same genetic information. More importantly, our framework allows a principled approach to estimate the eGRM at different time depths of the ARG, thereby revealing the time-varying nature of population structure in a sample. When applied to SNP array genotypes from a population sample from Northern and Eastern Finland, we find that clustering analysis with the eGRM reveals population structure driven by subpopulations that would not be apparent via the canonical GRM and that temporally the population model is consistent with recent divergence and expansion. Taken together, our proposed eGRM provides a robust tree-centric estimate of relatedness with wide application to genetic studies.

Subject(s)

Genome , Models, Genetic , Finland , Genetics, Population , Genotype , Humans
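
For context on the "current standard" GRM that the abstract above contrasts with the eGRM, the following is a minimal sketch of the commonly used canonical estimator, in which each marker is standardized by its sample allele frequency and treated as independent. It does not implement the eGRM or use an ARG; the function name `canonical_grm` and the toy genotypes are hypothetical.

```python
def canonical_grm(genotypes):
    """Canonical GRM from a list of individuals, each a list of 0/1/2
    allele counts. Assumes at least one polymorphic marker."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency of each marker.
    freqs = [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]
    # Monomorphic markers have zero variance and are skipped.
    used = [i for i in range(m) if 0.0 < freqs[i] < 1.0]
    grm = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            total = 0.0
            for i in used:
                p = freqs[i]
                # Product of centered genotypes, scaled by marker variance.
                total += ((genotypes[j][i] - 2 * p)
                          * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[j][k] = total / len(used)
    return grm

# Hypothetical toy data: two individuals, two markers.
grm = canonical_grm([[0, 2], [2, 0]])
```

Each marker contributes one term to the average regardless of linkage to its neighbors, which is the independence assumption the abstract points out.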

12.

Multi-ancestry fine-mapping improves precision to identify causal genes in transcriptome-wide association studies.

Lu, Zeyun; Gopalan, Shyamalika; Yuan, Dong; Conti, David V; Pasaniuc, Bogdan; Gusev, Alexander; Mancuso, Nicholas.

Am J Hum Genet ; 109(8): 1388-1404, 2022 Aug 04.

Article in English | MEDLINE | ID: mdl-35931050

ABSTRACT

Transcriptome-wide association studies (TWASs) are a powerful approach to identify genes whose expression is associated with complex disease risk. However, non-causal genes can exhibit association signals due to confounding by linkage disequilibrium (LD) patterns and eQTL pleiotropy at genomic risk regions, which necessitates fine-mapping of TWAS signals. Here, we present MA-FOCUS, a multi-ancestry framework for the improved identification of genes underlying traits of interest. We demonstrate that by leveraging differences in ancestry-specific patterns of LD and eQTL signals, MA-FOCUS consistently outperforms single-ancestry fine-mapping approaches with equivalent total sample sizes across multiple metrics. We perform TWASs for 15 blood traits using genome-wide summary statistics (average n_EA = 511k, n_AA = 13k) and lymphoblastoid cell line eQTL data from cohorts of primarily European and African continental ancestries. We recapitulate evidence demonstrating shared genetic architectures for eQTL and blood traits between the two ancestry groups and observe that gene-level effects correlate 20% more strongly across ancestries than SNP-level effects. Lastly, we perform fine-mapping using MA-FOCUS and find evidence that genes at TWAS risk regions are more likely to be shared across ancestries than they are to be ancestry specific. Using multiple lines of evidence to validate our findings, we find that gene sets produced by MA-FOCUS are more enriched in hematopoietic categories than alternative approaches (p = 2.36 × 10⁻¹⁵). Our work demonstrates that including and appropriately accounting for genetic diversity can drive more profound insights into the genetic architecture of complex traits.

Subject(s)

Genome-Wide Association Study , Transcriptome , Humans , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Transcriptome/genetics

13.

Investigating DNA methylation as a mediator of genetic risk in childhood acute lymphoblastic leukemia.

Xu, Keren; Li, Shaobo; Pandey, Priyatama; Kang, Alice Y; Morimoto, Libby M; Mancuso, Nicholas; Ma, Xiaomei; Metayer, Catherine; Wiemels, Joseph L; de Smith, Adam J.

Hum Mol Genet ; 31(21): 3741-3756, 2022 Oct 28.

Article in English | MEDLINE | ID: mdl-35717575

ABSTRACT

Genome-wide association studies have identified a growing number of single nucleotide polymorphisms (SNPs) associated with childhood acute lymphoblastic leukemia (ALL), yet the functional roles of most SNPs are unclear. Multiple lines of evidence suggest that epigenetic mechanisms may mediate the impact of heritable genetic variation on phenotypes. Here, we investigated whether DNA methylation mediates the effect of genetic risk loci for childhood ALL. We performed an epigenome-wide association study (EWAS) including 808 childhood ALL cases and 919 controls from California-based studies using neonatal blood DNA. For differentially methylated CpG positions (DMPs), we next conducted association analysis with 23 known ALL risk SNPs followed by causal mediation analyses addressing the significant SNP-DMP pairs. DNA methylation at CpG cg01139861, in the promoter region of IKZF1, mediated the effects of the intronic IKZF1 risk SNP rs78396808, with the average causal mediation effect (ACME) explaining ~30% of the total effect (ACME P = 0.0031). In analyses stratified by self-reported race/ethnicity, the mediation effect was only significant in Latinos, explaining ~41% of the total effect of rs78396808 on ALL risk (ACME P = 0.0037). Conditional analyses confirmed the presence of at least three independent genetic risk loci for childhood ALL at IKZF1, with rs78396808 unique to non-European populations. We also demonstrated that the most significant DMP in the EWAS, CpG cg13344587 at gene ARID5B (P = 8.61 × 10⁻¹⁰), was entirely confounded by the ARID5B ALL risk SNP rs7090445. Our findings provide new insights into the functional pathways of ALL risk SNPs and the DNA methylation differences associated with risk of childhood ALL.

Subject(s)

DNA Methylation , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Humans , DNA Methylation/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Transcription Factors/genetics

14.

H3K27ac HiChIP in prostate cell lines identifies risk genes for prostate cancer susceptibility.

Giambartolomei, Claudia; Seo, Ji-Heui; Schwarz, Tommer; Freund, Malika Kumar; Johnson, Ruth Dolly; Spisak, Sandor; Baca, Sylvan C; Gusev, Alexander; Mancuso, Nicholas; Pasaniuc, Bogdan; Freedman, Matthew L.

Am J Hum Genet ; 108(12): 2284-2300, 2021 Dec 02.

Article in English | MEDLINE | ID: mdl-34822763

ABSTRACT

Genome-wide association studies (GWASs) have identified more than 200 prostate cancer (PrCa) risk regions, which provide potential insights into causal mechanisms. Multiple lines of evidence show that a significant proportion of PrCa risk can be explained by germline causal variants that dysregulate nearby target genes in prostate-relevant tissues, thus altering disease risk. The traditional approach to explore this hypothesis has been correlating GWAS variants with steady-state transcript levels, referred to as expression quantitative trait loci (eQTLs). In this work, we assess the utility of chromosome conformation capture (3C) coupled with immunoprecipitation (HiChIP) to identify target genes for PrCa GWAS risk loci. We find that interactome data confirm previously reported PrCa target genes identified through GWAS/eQTL overlap (e.g., MLPH). Interestingly, HiChIP identifies links between PrCa GWAS variants and genes well-known to play a role in prostate cancer biology (e.g., AR) that are not detected by eQTL-based methods. HiChIP predicted enhancer elements at the AR and NKX3-1 prostate cancer risk loci, and both were experimentally confirmed to regulate expression of the corresponding genes through CRISPR interference (CRISPRi) perturbation in LNCaP cells. Our results demonstrate that looping data harbor additional information beyond eQTLs and expand the number of PrCa GWAS loci that can be linked to candidate susceptibility genes.

Subject(s)

Chromatin Immunoprecipitation Sequencing , Genetic Predisposition to Disease , Genome-Wide Association Study , Histone Code/genetics , Prostatic Neoplasms/genetics , Cell Line, Tumor , Chromosomes, Human , Clustered Regularly Interspaced Short Palindromic Repeats , Genetic Techniques , Humans , Male , Quantitative Trait Loci

15.

twas_sim, a Python-based tool for simulation and power analysis of transcriptome-wide association analysis.

Wang, Xinran; Lu, Zeyun; Bhattacharya, Arjun; Pasaniuc, Bogdan; Mancuso, Nicholas.

Bioinformatics ; 39(5), 2023 May 04.

Article in English | MEDLINE | ID: mdl-37099718

ABSTRACT

SUMMARY: Genome-wide association studies (GWASs) have identified numerous genetic variants associated with complex disease risk; however, most of these associations are non-coding, complicating identifying their proximal target gene. Transcriptome-wide association studies (TWASs) have been proposed to mitigate this gap by integrating expression quantitative trait loci (eQTL) data with GWAS data. Numerous methodological advancements have been made for TWAS, yet each approach requires ad hoc simulations to demonstrate feasibility. Here, we present twas_sim, a computationally scalable and easily extendable tool for simplified performance evaluation and power analysis for TWAS methods. AVAILABILITY AND IMPLEMENTATION: Software and documentation are available at https://github.com/mancusolab/twas_sim.

Subject(s)

Genome-Wide Association Study , Transcriptome , Humans , Genome-Wide Association Study/methods , Gene Expression Profiling , Computer Simulation , Software , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease

16.

Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies.

Feng, Helian; Mancuso, Nicholas; Gusev, Alexander; Majumdar, Arunabha; Major, Megan; Pasaniuc, Bogdan; Kraft, Peter.

PLoS Genet ; 17(4): e1008973, 2021 Apr.

Article in English | MEDLINE | ID: mdl-33831007

ABSTRACT

Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values. Consequently, TWAS power can be low when expression quantitative trait locus (eQTL) data used to train the genetic predictors have small sample sizes, or when data from causally relevant tissues are not available. Here, we propose to address these issues by integrating multiple tissues in the TWAS using sparse canonical correlation analysis (sCCA). We show that sCCA-TWAS combined with single-tissue TWAS using an aggregate Cauchy association test (ACAT) outperforms traditional single-tissue TWAS. In empirically motivated simulations, the sCCA+ACAT approach yielded the highest power to detect a gene associated with phenotype, even when expression in the causal tissue was not directly measured, while controlling the Type I error when there is no association between gene expression and phenotype. For example, when gene expression explains 2% of the variability in outcome, and the GWAS sample size is 20,000, the average power difference between the ACAT combined test of sCCA features and single-tissue, versus single-tissue combined with Generalized Berk-Jones (GBJ) method, single-tissue combined with S-MultiXcan, UTMOST, or summarizing cross-tissue expression patterns using Principal Component Analysis (PCA) approaches was 5%, 8%, 5% and 38%, respectively. The gain in power is likely due to sCCA cross-tissue features being more likely to be detectably heritable. When applied to publicly available summary statistics from 10 complex traits, the sCCA+ACAT test was able to increase the number of testable genes and identify on average an additional 400 gene-trait associations that single-trait TWAS missed. Our results suggest that aggregating eQTL data across multiple tissues using sCCA can improve the sensitivity of TWAS while controlling for the false positive rate.

Subject(s)

Genome-Wide Association Study/statistics & numerical data , Models, Genetic , Multivariate Analysis , Transcriptome/genetics , Computer Simulation , Gene Expression Regulation/genetics , Genetic Predisposition to Disease , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics
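
The aggregate Cauchy association test (ACAT) mentioned in the abstract above combines several p-values by mapping each to a Cauchy-distributed quantity and averaging. The following is a minimal sketch of that standard combination rule only; it does not reproduce the sCCA step or the authors' software, and the function name `acat` and the input p-values are hypothetical.

```python
import math

def acat(pvalues, weights=None):
    """Combine p-values in (0, 1) with the Cauchy combination rule.
    Equal weights are assumed when none are given."""
    if weights is None:
        weights = [1.0 / len(pvalues)] * len(pvalues)
    # Each p-value maps to a standard Cauchy variate under the null.
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvalues))
    # Convert the weighted sum back to a p-value via the Cauchy tail.
    return 0.5 - math.atan(t / sum(weights)) / math.pi

# Hypothetical per-feature TWAS p-values for one gene.
combined = acat([0.01, 0.20, 0.50])
```

A single p-value passed through `acat` is returned unchanged, which is a quick sanity check on the transform. P-values extremely close to 0 or 1 need special handling for numerical stability, which this sketch omits.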

17.

Localizing Components of Shared Transethnic Genetic Architecture of Complex Traits from GWAS Summary Data.

Shi, Huwenbo; Burch, Kathryn S; Johnson, Ruth; Freund, Malika K; Kichaev, Gleb; Mancuso, Nicholas; Manuel, Astrid M; Dong, Natalie; Pasaniuc, Bogdan.

Am J Hum Genet ; 106(6): 805-817, 2020 Jun 04.

Article in English | MEDLINE | ID: mdl-32442408

ABSTRACT

Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and slight upward-bias with out-of-sample LD. We analyze nine complex traits in individuals of East Asian and European ancestry, restricting to common SNPs (MAF > 5%), and find that most common causal SNPs are shared by both populations. Using the genome-wide estimates as priors in an empirical Bayes framework, we perform fine-mapping and observe that high-posterior SNPs (for both the population-specific and shared causal configurations) have highly correlated effects in East Asians and Europeans. In population-specific GWAS risk regions, we observe a 2.8× enrichment of shared high-posterior SNPs, suggesting that population-specific GWAS risk regions harbor shared causal SNPs that are undetected in the other GWASs due to differences in LD, allele frequencies, and/or sample size. Finally, we report enrichments of shared high-posterior SNPs in 53 tissue-specific functional categories and find evidence that SNP-heritability enrichments are driven largely by many low-effect common SNPs.

Subject(s)

Ethnicity/genetics , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Bayes Theorem , Europe/ethnology , Asia, Eastern/ethnology , Gene Frequency , Humans , Linkage Disequilibrium , Organ Specificity/genetics

18.

Genetically regulated multi-omics study for symptom clusters of posttraumatic stress disorder highlights pleiotropy with hematologic and cardio-metabolic traits.

Pathak, Gita A; Singh, Kritika; Wendt, Frank R; Fleming, Tyne W; Overstreet, Cassie; Koller, Dora; Tylee, Daniel S; De Angelis, Flavio; Cabrera Mendoza, Brenda; Levey, Daniel F; Koenen, Karestan C; Krystal, John H; Pietrzak, Robert H; O' Donell, Christopher; Gaziano, J Michael; Falcone, Guido; Stein, Murray B; Gelernter, Joel; Pasaniuc, Bogdan; Mancuso, Nicholas; Davis, Lea K; Polimanti, Renato.

Mol Psychiatry ; 27(3): 1394-1404, 2022 03.

Article in English | MEDLINE | ID: mdl-35241783

ABSTRACT

Posttraumatic stress disorder (PTSD) is a psychiatric disorder that may arise in response to a severe traumatic event and is diagnosed based on three main symptom clusters (reexperiencing, avoidance, and hyperarousal) per the Diagnostic and Statistical Manual of Mental Disorders (version DSM-IV-TR). In this study, we characterized the biological heterogeneity of PTSD symptom clusters by performing a multi-omics investigation integrating genetically regulated gene, splicing, and protein expression in dorsolateral prefrontal cortex tissue within a sample of US veterans enrolled in the Million Veteran Program (N total = 186,689). We identified 30 genes in 19 regions across the three PTSD symptom clusters. We found nine genes to have cell-type specific expression, and over-representation of miRNA families: miR-148, 30, and 8. A gene-drug target prioritization approach highlighted cyclooxygenase and acetylcholine compounds. Next, we tested the molecular-profile based phenome-wide impact of identified genes with respect to 1678 phenotypes derived from the Electronic Health Records of the Vanderbilt University biorepository (N = 70,439). Lastly, we tested for local genetic correlation across PTSD symptom clusters, which highlighted metabolic (e.g., obesity, diabetes, vascular health) and laboratory traits (e.g., neutrophil, eosinophil, tau protein, creatine kinase). Overall, this study finds comprehensive genomic evidence, including clinical and regulatory profiles between PTSD, hematologic and cardiometabolic traits, that supports comorbidities observed in epidemiologic studies of PTSD.

Subject(s)

Stress Disorders, Post-Traumatic , Veterans , Diagnostic and Statistical Manual of Mental Disorders , Humans , Phenotype , Stress Disorders, Post-Traumatic/psychology , Syndrome , Veterans/psychology

19.

Multitrait transcriptome-wide association study (TWAS) tests.

Feng, Helian; Mancuso, Nicholas; Pasaniuc, Bogdan; Kraft, Peter.

Genet Epidemiol ; 45(6): 563-576, 2021 09.

Article in English | MEDLINE | ID: mdl-34082479

ABSTRACT

Multitrait tests can improve power to detect associations between individual single-nucleotide polymorphisms (SNPs) and several related traits. Here, we develop methods for multi-SNP transcriptome-wide association (TWAS) tests to test the association between predicted gene expression levels and multiple phenotypes. We show that the correlation in TWAS test statistics for multiple phenotypes has the same form as multitrait statistics for the single-SNP setting. Thus, established methods for combining single-SNP test statistics across multiple traits can be extended directly to the TWAS setting. We performed an extensive evaluation across eight multitrait methods in simulations that varied gene-phenotype effect sizes in addition to the underlying covariance structure among the phenotypes. We found that all multitrait TWAS tests have well-calibrated Type I error (except ASSET, which can have a slightly elevated or depressed Type I error rate). Our results show that multitrait TWAS can improve statistical power compared with multiple single-trait TWAS followed by Bonferroni correction. To illustrate our approach to real data, we conducted a multitrait TWAS of four circulating lipid traits from the Global Lipids Genetics Consortium. We found that our multitrait Wald TWAS approach identified 506 genes associated with lipid levels compared with 87 identified through Bonferroni-corrected single-trait TWAS. Overall, we find that our proposed multitrait TWAS framework outperforms single-trait approaches to identify new genetic associations, especially for functionally correlated phenotypes and phenotypes with overlapping genome-wide association studies samples, leading to insights into the genetic architecture of multiple phenotypes.

Subject(s)

Genome-Wide Association Study , Transcriptome , Humans , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci

20.

A transcriptome-wide association study identifies novel candidate susceptibility genes for prostate cancer risk.

Liu, Duo; Zhu, Jingjing; Zhou, Dan; Nikas, Emily G; Mitanis, Nikos T; Sun, Yanfa; Wu, Chong; Mancuso, Nicholas; Cox, Nancy J; Wang, Liang; Freedland, Stephen J; Haiman, Christopher A; Gamazon, Eric R; Nikas, Jason B; Wu, Lang.

Int J Cancer ; 150(1): 80-90, 2022 01 01.

Article in English | MEDLINE | ID: mdl-34520569

ABSTRACT

A large proportion of heritability for prostate cancer risk remains unknown. A transcriptome-wide association study (TWAS) combined with validation comparing overall levels will help to identify candidate genes potentially playing a role in prostate cancer development. Using data from the Genotype-Tissue Expression Project, we built genetic models to predict normal prostate tissue gene expression using the statistical framework PrediXcan, a modified version of the unified test for molecular signatures and Joint-Tissue Imputation. We applied these prediction models to the genetic data of 79,194 prostate cancer cases and 61,112 controls to investigate the associations of genetically determined gene expression with prostate cancer risk. Focusing on associated genes, we compared their expression in prostate tumor vs normal prostate tissue, compared methylation of CpG sites located at these loci in prostate tumor vs normal tissue, and assessed the correlations between the differentiated genes' expression and the methylation of corresponding CpG sites, by analyzing The Cancer Genome Atlas (TCGA) data. We identified 573 genes showing an association with prostate cancer risk at a false discovery rate (FDR) ≤ 0.05, including 451 novel genes and 122 previously reported genes. Of the 573 genes, 152 showed differential expression in prostate tumor vs normal tissue samples. At loci of 57 genes, 151 CpG sites showed differential methylation in prostate tumor vs normal tissue samples. Of these, 20 CpG sites were correlated with expression of 11 corresponding genes. In this TWAS, we identified novel candidate susceptibility genes for prostate cancer risk, providing new insights into prostate cancer genetics and biology.

Subject(s)

Biomarkers, Tumor/genetics , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Prostatic Neoplasms/pathology , Transcriptome , Case-Control Studies , DNA Methylation , Follow-Up Studies , Genome-Wide Association Study , Humans , Male , Prognosis , Prostatic Neoplasms/epidemiology , Prostatic Neoplasms/genetics , Quantitative Trait Loci , United States/epidemiology
