1.
Cell ; 179(3): 750-771.e22, 2019 10 17.
Article in English | MEDLINE | ID: mdl-31626773

ABSTRACT

Tissue-specific regulatory regions harbor substantial genetic risk for disease. Because brain development is a critical epoch for neuropsychiatric disease susceptibility, we characterized the genetic control of the transcriptome in 201 mid-gestational human brains, identifying 7,962 expression quantitative trait loci (eQTL) and 4,635 spliceQTL (sQTL), including several thousand prenatal-specific regulatory regions. We show that significant genetic liability for neuropsychiatric disease lies within prenatal eQTL and sQTL. Integration of eQTL and sQTL with genome-wide association studies (GWAS) via transcriptome-wide association identified dozens of novel candidate risk genes, highlighting shared and stage-specific mechanisms in schizophrenia (SCZ). Gene network analysis revealed that SCZ and autism spectrum disorder (ASD) affect distinct developmental gene co-expression modules. Yet, in each disorder, common and rare genetic variation converges within modules, which in ASD implicates superficial cortical neurons. More broadly, these data, available as a web browser and our analyses, demonstrate the genetic mechanisms by which developmental events have a widespread influence on adult anatomical and behavioral phenotypes.


Subjects
Autism Spectrum Disorder/genetics , Quantitative Trait Loci/genetics , Schizophrenia/genetics , Transcriptome/genetics , Autism Spectrum Disorder/metabolism , Autism Spectrum Disorder/pathology , Brain/growth & development , Brain/metabolism , Female , Fetus/metabolism , Gene Expression Regulation, Developmental , Genetic Predisposition to Disease , Genome-Wide Association Study , Gestational Age , Humans , Male , Neurons/metabolism , Polymorphism, Single Nucleotide/genetics , RNA Splicing/genetics , Schizophrenia/metabolism , Schizophrenia/pathology
4.
Am J Hum Genet ; 111(10): 2117-2128, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39191255

ABSTRACT

Multi-ancestry genome-wide association studies (GWASs) have highlighted the existence of variants with ancestry-specific effect sizes. Understanding where and why these ancestry-specific effects occur is fundamental to understanding the genetic basis of human diseases and complex traits. Here, we characterized genes differentially expressed across ancestries (ancDE genes) at the cell-type level by leveraging single-cell RNA-sequencing data in peripheral blood mononuclear cells for 21 individuals with East Asian (EAS) ancestry and 23 individuals with European (EUR) ancestry (172,385 cells); then, we tested whether variants surrounding those genes were enriched in disease variants with ancestry-specific effect sizes by leveraging ancestry-matched GWASs of 31 diseases and complex traits (average n ∼ 90,000 and ∼ 267,000 in EAS and EUR, respectively). We observed that ancDE genes tended to be cell-type specific and enriched in genes interacting with the environment and in variants with ancestry-specific disease effect sizes, which suggests cell-type-specific, gene-by-environment interactions shared between regulatory and disease architectures. Finally, we illustrated how different environments might have led to ancestry-specific myeloid cell leukemia 1 (MCL1) expression in B cells and ancestry-specific allele effect sizes in lymphocyte count GWASs for variants surrounding MCL1. Our results imply that large single-cell and GWAS datasets from diverse ancestries are required to improve our understanding of human diseases.


Subjects
Gene-Environment Interaction , Genome-Wide Association Study , White People , Humans , White People/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Asian People/genetics , Leukocytes, Mononuclear/metabolism , Single-Cell Analysis , Myeloid Cell Leukemia Sequence 1 Protein/genetics , Gene Expression Regulation
5.
Hum Mol Genet ; 33(2): 170-181, 2024 Jan 07.
Article in English | MEDLINE | ID: mdl-37824084

ABSTRACT

Stroke, characterized by sudden neurological deficits, is the second leading cause of death worldwide. Although genome-wide association studies (GWAS) have successfully identified many genomic regions associated with ischemic stroke (IS), the genes underlying risk and their regulatory mechanisms remain elusive. Here, we integrate a large-scale GWAS (N = 1 296 908) for IS together with molecular QTL data, including mRNA, splicing, enhancer RNA (eRNA), and protein expression data from up to 50 tissues (total N = 11 588). We identify 136 genes/eRNA/proteins associated with IS risk across 60 independent genomic regions and find IS risk is most enriched for eQTLs in arterial and brain-related tissues. Focusing on IS-relevant tissues, we prioritize 9 genes/proteins using probabilistic fine-mapping TWAS analyses. In addition, we discover that blood cell traits, particularly reticulocyte cells, have shared genetic contributions with IS using TWAS-based pheWAS and genetic correlation analysis. Lastly, we integrate our findings with a large-scale pharmacological database and identify a secondary bile acid, deoxycholic acid, as a potential therapeutic component. Our work highlights IS risk genes/splicing-sites/enhancer activity/proteins with their phenotypic consequences using relevant tissues and identifies potential therapeutic candidates for IS.


Subjects
Ischemic Stroke , Transcriptome , Humans , Genome-Wide Association Study , Ischemic Stroke/genetics , Genomics , Phenotype , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide/genetics
6.
Hum Mol Genet ; 33(8): 687-697, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38263910

ABSTRACT

BACKGROUND: Expansion of genome-wide association studies across population groups is needed to improve our understanding of shared and unique genetic contributions to breast cancer. We performed association and replication studies guided by a priori linkage findings from African ancestry (AA) relative pairs. METHODS: We performed fixed-effect inverse-variance weighted meta-analysis under three significant AA breast cancer linkage peaks (3q26-27, 12q22-23, and 16q21-22) in 9241 AA cases and 10 193 AA controls. We examined associations with overall breast cancer as well as estrogen receptor (ER)-positive and negative subtypes (193,132 SNPs). We replicated associations in the African-ancestry Breast Cancer Genetic Consortium (AABCG). RESULTS: In AA women, we identified two associations on chr12q for overall breast cancer (rs1420647, OR = 1.15, p = 2.50 × 10⁻⁶; rs12322371, OR = 1.14, p = 3.15 × 10⁻⁶), and one for ER-negative breast cancer (rs77006600, OR = 1.67, p = 3.51 × 10⁻⁶). On chr3, we identified two associations with ER-negative disease (rs184090918, OR = 3.70, p = 1.23 × 10⁻⁵; rs76959804, OR = 3.57, p = 1.77 × 10⁻⁵) and on chr16q we identified an association with ER-negative disease (rs34147411, OR = 1.62, p = 8.82 × 10⁻⁶). In the replication study, the chr3 associations were significant and effect sizes were larger (rs184090918, OR: 6.66, 95% CI: 1.43, 31.01; rs76959804, OR: 5.24, 95% CI: 1.70, 16.16). CONCLUSION: The two chr3 SNPs are upstream to open chromatin ENSR00000710716, a regulatory feature that is actively regulated in mammary tissues, providing evidence that variants in this chr3 region may have a regulatory role in our target organ. Our study provides support for breast cancer variant discovery using prioritization based on linkage evidence.
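
The fixed-effect inverse-variance weighted meta-analysis named in the METHODS can be sketched as follows. This is a minimal illustration of the general technique, not the authors' pipeline: the function name `ivw_fixed_effect` and the two per-study log odds ratios and standard errors are hypothetical values chosen only to show the arithmetic.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Pool per-study effect estimates using inverse-variance weights.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors, same order as betas
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]           # w_i = 1 / se_i^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null: P(|Z| > |z|).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p


# Hypothetical inputs: log odds ratios from two studies with equal precision.
betas = [math.log(1.15), math.log(1.10)]
ses = [0.05, 0.05]
pooled_beta, pooled_se, p = ivw_fixed_effect(betas, ses)
print(f"pooled OR = {math.exp(pooled_beta):.3f}, SE = {pooled_se:.4f}, p = {p:.2e}")
```

With equal standard errors the pooled estimate reduces to the simple mean of the study estimates, and the pooled standard error shrinks by a factor of √2, which is a quick sanity check on the weighting.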


Subjects
Black People , Breast Neoplasms , Genetic Predisposition to Disease , Female , Humans , Black People/genetics , Breast Neoplasms/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide
7.
Am J Hum Genet ; 110(11): 1863-1874, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37879338

ABSTRACT

Genome-wide association studies (GWASs) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra-large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N = 420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (p = 2.58E-10) and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest shared etiologies between rheumatoid arthritis and periodontal condition in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWASs.


Subjects
Arthritis, Rheumatoid , Genome-Wide Association Study , Male , Humans , Genome-Wide Association Study/methods , Multifactorial Inheritance , Phenotype , Brain , Arthritis, Rheumatoid/genetics , Polymorphism, Single Nucleotide/genetics , Genetic Pleiotropy
8.
Am J Hum Genet ; 110(12): 2077-2091, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38065072

ABSTRACT

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWASs) are a powerful way to find genetic loci associated with phenotypes. GWASs are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix (local eGRM) given the ARG. Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to analyze two chromosomes containing known body size loci in a sample of Native Hawaiians. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.


Subjects
Genetics, Population , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Chromosome Mapping/methods , Models, Genetic , Phenotype , Quantitative Trait Loci/genetics , Native Hawaiian or Other Pacific Islander/genetics
9.
Am J Hum Genet ; 110(11): 1853-1862, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37875120

ABSTRACT

The heritability explained by local ancestry markers in an admixed population (h_γ²) provides crucial insight into the genetic architecture of a complex disease or trait. Estimation of h_γ² can be susceptible to biases due to population structure in ancestral populations. Here, we present heritability estimation from admixture mapping summary statistics (HAMSTA), an approach that uses summary statistics from admixture mapping to infer heritability explained by local ancestry while adjusting for biases due to ancestral stratification. Through extensive simulations, we demonstrate that HAMSTA h_γ² estimates are approximately unbiased and are robust to ancestral stratification compared to existing approaches. In the presence of ancestral stratification, we show a HAMSTA-derived sampling scheme provides a calibrated family-wise error rate (FWER) of ∼5% for admixture mapping, unlike existing FWER estimation approaches. We apply HAMSTA to 20 quantitative phenotypes of up to 15,988 self-reported African American individuals in the Population Architecture using Genomics and Epidemiology (PAGE) study. We observe ĥ_γ² in the 20 phenotypes range from 0.0025 to 0.033 (mean ĥ_γ² = 0.012 ± 9.2 × 10⁻⁴), which translates to ĥ² ranging from 0.062 to 0.85 (mean ĥ² = 0.30 ± 0.023). Across these phenotypes we find little evidence of inflation due to ancestral population stratification in current admixture mapping studies (mean inflation factor of 0.99 ± 0.001). Overall, HAMSTA provides a fast and powerful approach to estimate genome-wide heritability and evaluate biases in test statistics of admixture mapping studies.


Subjects
Black or African American , Genetics, Population , Humans , Chromosome Mapping , Phenotype , Polymorphism, Single Nucleotide/genetics
10.
Genome Res ; 33(4): 511-524, 2023 04.
Article in English | MEDLINE | ID: mdl-37037626

ABSTRACT

Understanding the impact of DNA variation on human traits is a fundamental question in human genetics. Variable number tandem repeats (VNTRs) make up ∼3% of the human genome but are often excluded from association analysis owing to poor read mappability or divergent repeat content. Although methods exist to estimate VNTR length from short-read data, it is known that VNTRs vary in both length and repeat (motif) composition. Here, we use a repeat-pangenome graph (RPGG) constructed on 35 haplotype-resolved assemblies to detect variation in both VNTR length and repeat composition. We align population-scale data from the Genotype-Tissue Expression (GTEx) Consortium to examine how variations in sequence composition may be linked to expression, including cases independent of overall VNTR length. We find that 9422 out of 39,125 VNTRs are associated with nearby gene expression through motif variations, of which only 23.4% are accessible from length. Fine-mapping identifies 174 genes to be likely driven by variation in certain VNTR motifs and not overall length. We highlight two genes, CACNA1C and RNF213, that have expression associated with motif variation, showing the utility of RPGG analysis as a new approach for trait association in multiallelic and highly variable loci.


Subjects
Adenosine Triphosphatases , Minisatellite Repeats , Humans , Minisatellite Repeats/genetics , Phenotype , Haplotypes , Gene Expression , Adenosine Triphosphatases/genetics , Ubiquitin-Protein Ligases/genetics
11.
Genet Epidemiol ; 48(7): 291-309, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38887957

ABSTRACT

Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we leverage the flexibility of our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), to a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants, situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effect for high-dimensional risk factors on an outcome by incorporating estimates from association analyses of single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristics curve (AUC), a lower mean-squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS for prostate cancer with more than 140,000 men and high dimensional publicly available summary data for metabolites and transcriptomes.


Subjects
Polymorphism, Single Nucleotide , Prostatic Neoplasms , Humans , Prostatic Neoplasms/genetics , Male , Genome-Wide Association Study/methods , Models, Statistical , Mendelian Randomization Analysis , ROC Curve , Computer Simulation
12.
Am J Hum Genet ; 109(5): 812-824, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35417677

ABSTRACT

The application of genetic relationships among individuals, characterized by a genetic relationship matrix (GRM), has far-reaching effects in human genetics. However, the current standard to calculate the GRM treats linked markers as independent and does not explicitly model the underlying genealogical history of the study sample. Here, we propose a coalescent-informed framework, namely the expected GRM (eGRM), to infer the expected relatedness between pairs of individuals given an ancestral recombination graph (ARG) of the sample. Through extensive simulations, we show that the eGRM is an unbiased estimate of latent pairwise genome-wide relatedness and is robust when computed with ARG inferred from incomplete genetic data. As a result, the eGRM better captures the structure of a population than the canonical GRM, even when using the same genetic information. More importantly, our framework allows a principled approach to estimate the eGRM at different time depths of the ARG, thereby revealing the time-varying nature of population structure in a sample. When applied to SNP array genotypes from a population sample from Northern and Eastern Finland, we find that clustering analysis with the eGRM reveals population structure driven by subpopulations that would not be apparent via the canonical GRM and that temporally the population model is consistent with recent divergence and expansion. Taken together, our proposed eGRM provides a robust tree-centric estimate of relatedness with wide application to genetic studies.
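
For orientation, the canonical GRM that this abstract contrasts with the eGRM can be sketched as below. This illustrates only the standard marker-based estimator that treats SNPs as independent, not the eGRM itself, which requires an ancestral recombination graph. The function name `canonical_grm` and the toy genotype matrix are hypothetical, and dropping monomorphic SNPs is an assumption made here to avoid division by zero.

```python
def canonical_grm(genotypes):
    """Canonical genetic relationship matrix from allele counts.

    genotypes: list of individuals, each a list of 0/1/2 allele counts per SNP.
    Entry (j, k) averages, over polymorphic SNPs i,
        (x_ij - 2 p_i) * (x_ik - 2 p_i) / (2 p_i (1 - p_i)),
    where p_i is the sample allele frequency of SNP i.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(ind[i] for ind in genotypes) / (2.0 * n) for i in range(m)]
    # Assumption: monomorphic SNPs carry no information and are skipped.
    keep = [i for i in range(m) if 0.0 < freqs[i] < 1.0]
    if not keep:
        raise ValueError("no polymorphic SNPs in the sample")
    grm = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            total = 0.0
            for i in keep:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p)
                          * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[j][k] = grm[k][j] = total / len(keep)
    return grm


# Hypothetical toy data: 3 individuals typed at 3 SNPs.
toy = [[0, 1, 2],
       [2, 1, 0],
       [1, 1, 1]]
for row in canonical_grm(toy):
    print([round(v, 3) for v in row])
```

Each SNP contributes to the average independently of its neighbours, which is the treatment of linked markers that the abstract identifies as a limitation of the current standard.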


Subjects
Genome , Models, Genetic , Finland , Genetics, Population , Genotype , Humans
13.
Am J Hum Genet ; 109(8): 1388-1404, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35931050

ABSTRACT

Transcriptome-wide association studies (TWASs) are a powerful approach to identify genes whose expression is associated with complex disease risk. However, non-causal genes can exhibit association signals due to confounding by linkage disequilibrium (LD) patterns and eQTL pleiotropy at genomic risk regions, which necessitates fine-mapping of TWAS signals. Here, we present MA-FOCUS, a multi-ancestry framework for the improved identification of genes underlying traits of interest. We demonstrate that by leveraging differences in ancestry-specific patterns of LD and eQTL signals, MA-FOCUS consistently outperforms single-ancestry fine-mapping approaches with equivalent total sample sizes across multiple metrics. We perform TWASs for 15 blood traits using genome-wide summary statistics (average n_EA = 511k, n_AA = 13k) and lymphoblastoid cell line eQTL data from cohorts of primarily European and African continental ancestries. We recapitulate evidence demonstrating shared genetic architectures for eQTL and blood traits between the two ancestry groups and observe that gene-level effects correlate 20% more strongly across ancestries than SNP-level effects. Lastly, we perform fine-mapping using MA-FOCUS and find evidence that genes at TWAS risk regions are more likely to be shared across ancestries than they are to be ancestry specific. Using multiple lines of evidence to validate our findings, we find that gene sets produced by MA-FOCUS are more enriched in hematopoietic categories than alternative approaches (p = 2.36 × 10⁻¹⁵). Our work demonstrates that including and appropriately accounting for genetic diversity can drive more profound insights into the genetic architecture of complex traits.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Transcriptome/genetics
14.
Diabetologia ; 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39349773

ABSTRACT

AIMS/HYPOTHESIS: Several studies have reported associations between specific proteins and type 2 diabetes risk in European populations. To better understand the role played by proteins in type 2 diabetes aetiology across diverse populations, we conducted a large proteome-wide association study using genetic instruments across four racial and ethnic groups: African; Asian; Hispanic/Latino; and European. METHODS: Genome and plasma proteome data from the Multi-Ethnic Study of Atherosclerosis (MESA) study involving 182 African, 69 Asian, 284 Hispanic/Latino and 409 European individuals residing in the USA were used to establish protein prediction models by using potentially associated cis- and trans-SNPs. The models were applied to genome-wide association study summary statistics of 250,127 type 2 diabetes cases and 1,222,941 controls from different racial and ethnic populations. RESULTS: We identified three, 44 and one protein associated with type 2 diabetes risk in Asian, European and Hispanic/Latino populations, respectively. Meta-analysis identified 40 proteins associated with type 2 diabetes risk across the populations, including well-established as well as novel proteins not yet implicated in type 2 diabetes development. CONCLUSIONS/INTERPRETATION: Our study improves our understanding of the aetiology of type 2 diabetes in diverse populations. DATA AVAILABILITY: The summary statistics of multi-ethnic type 2 diabetes GWAS of MVP, DIAMANTE, Biobank Japan and other studies are available from The database of Genotypes and Phenotypes (dbGaP) under accession number phs001672.v3.p1. MESA genetic, proteome and covariate data can be accessed through dbGaP under phs000209.v13.p3. All code is available on GitHub ( https://github.com/Arthur1021/MESA-1K-PWAS ).

15.
Hum Mol Genet ; 31(21): 3741-3756, 2022 10 28.
Article in English | MEDLINE | ID: mdl-35717575

ABSTRACT

Genome-wide association studies have identified a growing number of single nucleotide polymorphisms (SNPs) associated with childhood acute lymphoblastic leukemia (ALL), yet the functional roles of most SNPs are unclear. Multiple lines of evidence suggest that epigenetic mechanisms may mediate the impact of heritable genetic variation on phenotypes. Here, we investigated whether DNA methylation mediates the effect of genetic risk loci for childhood ALL. We performed an epigenome-wide association study (EWAS) including 808 childhood ALL cases and 919 controls from California-based studies using neonatal blood DNA. For differentially methylated CpG positions (DMPs), we next conducted association analysis with 23 known ALL risk SNPs followed by causal mediation analyses addressing the significant SNP-DMP pairs. DNA methylation at CpG cg01139861, in the promoter region of IKZF1, mediated the effects of the intronic IKZF1 risk SNP rs78396808, with the average causal mediation effect (ACME) explaining ~30% of the total effect (ACME P = 0.0031). In analyses stratified by self-reported race/ethnicity, the mediation effect was only significant in Latinos, explaining ~41% of the total effect of rs78396808 on ALL risk (ACME P = 0.0037). Conditional analyses confirmed the presence of at least three independent genetic risk loci for childhood ALL at IKZF1, with rs78396808 unique to non-European populations. We also demonstrated that the most significant DMP in the EWAS, CpG cg13344587 at gene ARID5B (P = 8.61 × 10⁻¹⁰), was entirely confounded by the ARID5B ALL risk SNP rs7090445. Our findings provide new insights into the functional pathways of ALL risk SNPs and the DNA methylation differences associated with risk of childhood ALL.


Subjects
DNA Methylation , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Humans , DNA Methylation/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Transcription Factors/genetics
16.
Am J Hum Genet ; 108(12): 2284-2300, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34822763

ABSTRACT

Genome-wide association studies (GWASs) have identified more than 200 prostate cancer (PrCa) risk regions, which provide potential insights into causal mechanisms. Multiple lines of evidence show that a significant proportion of PrCa risk can be explained by germline causal variants that dysregulate nearby target genes in prostate-relevant tissues, thus altering disease risk. The traditional approach to explore this hypothesis has been correlating GWAS variants with steady-state transcript levels, referred to as expression quantitative trait loci (eQTLs). In this work, we assess the utility of chromosome conformation capture (3C) coupled with immunoprecipitation (HiChIP) to identify target genes for PrCa GWAS risk loci. We find that interactome data confirm previously reported PrCa target genes identified through GWAS/eQTL overlap (e.g., MLPH). Interestingly, HiChIP identifies links between PrCa GWAS variants and genes well-known to play a role in prostate cancer biology (e.g., AR) that are not detected by eQTL-based methods. HiChIP predicted enhancer elements at the AR and NKX3-1 prostate cancer risk loci, and both were experimentally confirmed to regulate expression of the corresponding genes through CRISPR interference (CRISPRi) perturbation in LNCaP cells. Our results demonstrate that looping data harbor additional information beyond eQTLs and expand the number of PrCa GWAS loci that can be linked to candidate susceptibility genes.


Subjects
Chromatin Immunoprecipitation Sequencing , Genetic Predisposition to Disease , Genome-Wide Association Study , Histone Code/genetics , Prostatic Neoplasms/genetics , Cell Line, Tumor , Chromosomes, Human , Clustered Regularly Interspaced Short Palindromic Repeats , Genetic Techniques , Humans , Male , Quantitative Trait Loci
17.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-37099718

ABSTRACT

SUMMARY: Genome-wide association studies (GWASs) have identified numerous genetic variants associated with complex disease risk; however, most of these associations are non-coding, complicating identifying their proximal target gene. Transcriptome-wide association studies (TWASs) have been proposed to mitigate this gap by integrating expression quantitative trait loci (eQTL) data with GWAS data. Numerous methodological advancements have been made for TWAS, yet each approach requires ad hoc simulations to demonstrate feasibility. Here, we present twas_sim, a computationally scalable and easily extendable tool for simplified performance evaluation and power analysis for TWAS methods. AVAILABILITY AND IMPLEMENTATION: Software and documentation are available at https://github.com/mancusolab/twas_sim.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Genome-Wide Association Study/methods , Gene Expression Profiling , Computer Simulation , Software , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
18.
PLoS Genet ; 17(4): e1008973, 2021 04.
Article in English | MEDLINE | ID: mdl-33831007

ABSTRACT

Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values. Consequently, TWAS power can be low when expression quantitative trait locus (eQTL) data used to train the genetic predictors have small sample sizes, or when data from causally relevant tissues are not available. Here, we propose to address these issues by integrating multiple tissues in the TWAS using sparse canonical correlation analysis (sCCA). We show that sCCA-TWAS combined with single-tissue TWAS using an aggregate Cauchy association test (ACAT) outperforms traditional single-tissue TWAS. In empirically motivated simulations, the sCCA+ACAT approach yielded the highest power to detect a gene associated with phenotype, even when expression in the causal tissue was not directly measured, while controlling the Type I error when there is no association between gene expression and phenotype. For example, when gene expression explains 2% of the variability in outcome, and the GWAS sample size is 20,000, the average power difference between the ACAT combined test of sCCA features and single-tissue, versus single-tissue combined with Generalized Berk-Jones (GBJ) method, single-tissue combined with S-MultiXcan, UTMOST, or summarizing cross-tissue expression patterns using Principal Component Analysis (PCA) approaches was 5%, 8%, 5% and 38%, respectively. The gain in power is likely due to sCCA cross-tissue features being more likely to be detectably heritable. When applied to publicly available summary statistics from 10 complex traits, the sCCA+ACAT test was able to increase the number of testable genes and identify on average an additional 400 gene-trait associations that single-trait TWAS missed. Our results suggest that aggregating eQTL data across multiple tissues using sCCA can improve the sensitivity of TWAS while controlling for the false positive rate.
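
The Cauchy combination step behind ACAT can be sketched as follows. This shows only the generic p-value combination rule, not the sCCA-TWAS pipeline: the function name `acat`, the default of equal weights, and the example p-values are assumptions made for illustration.

```python
import math


def acat(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Each p-value is mapped to a standard Cauchy variate tan((0.5 - p) * pi);
    the weighted mean of those variates is mapped back to a p-value.
    P-values must lie strictly between 0 and 1.
    """
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must be strictly between 0 and 1")
    if weights is None:
        # Assumption for this sketch: equal weights across tests.
        weights = [1.0] * len(pvalues)
    wsum = sum(weights)
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvalues)) / wsum
    return 0.5 - math.atan(t) / math.pi


# Hypothetical p-values, e.g. one per single-tissue or sCCA-feature test.
print(acat([0.001, 0.8, 0.35]))
```

A single input p-value is returned unchanged, and one very small p-value dominates the combined result, which is why this rule suits settings where only one of several tissue-level tests is expected to carry signal.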


Subjects
Genome-Wide Association Study/statistics & numerical data , Models, Genetic , Multivariate Analysis , Transcriptome/genetics , Computer Simulation , Gene Expression Regulation/genetics , Genetic Predisposition to Disease , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics
19.
Am J Hum Genet ; 106(6): 805-817, 2020 06 04.
Article in English | MEDLINE | ID: mdl-32442408

ABSTRACT

Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and slight upward-bias with out-of-sample LD. We analyze nine complex traits in individuals of East Asian and European ancestry, restricting to common SNPs (MAF > 5%), and find that most common causal SNPs are shared by both populations. Using the genome-wide estimates as priors in an empirical Bayes framework, we perform fine-mapping and observe that high-posterior SNPs (for both the population-specific and shared causal configurations) have highly correlated effects in East Asians and Europeans. In population-specific GWAS risk regions, we observe a 2.8× enrichment of shared high-posterior SNPs, suggesting that population-specific GWAS risk regions harbor shared causal SNPs that are undetected in the other GWASs due to differences in LD, allele frequencies, and/or sample size. Finally, we report enrichments of shared high-posterior SNPs in 53 tissue-specific functional categories and find evidence that SNP-heritability enrichments are driven largely by many low-effect common SNPs.


Subjects
Ethnicity/genetics , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Bayes Theorem , Europe/ethnology , East Asia/ethnology , Gene Frequency , Humans , Linkage Disequilibrium , Organ Specificity/genetics
20.
Mol Psychiatry ; 27(3): 1394-1404, 2022 03.
Article in English | MEDLINE | ID: mdl-35241783

ABSTRACT

Posttraumatic stress disorder (PTSD) is a psychiatric disorder that may arise in response to a severe traumatic event and is diagnosed based on three main symptom clusters (reexperiencing, avoidance, and hyperarousal) per the Diagnostic Manual of Mental Disorders (version DSM-IV-TR). In this study, we characterized the biological heterogeneity of PTSD symptom clusters by performing a multi-omics investigation integrating genetically regulated gene, splicing, and protein expression in dorsolateral prefrontal cortex tissue within a sample of US veterans enrolled in the Million Veteran Program (N total = 186,689). We identified 30 genes in 19 regions across the three PTSD symptom clusters. We found nine genes to have cell-type specific expression, and over-representation of miRNA-families - miR-148, 30, and 8. Gene-drug target prioritization approach highlighted cyclooxygenase and acetylcholine compounds. Next, we tested molecular-profile based phenome-wide impact of identified genes with respect to 1678 phenotypes derived from the Electronic Health Records of the Vanderbilt University biorepository (N = 70,439). Lastly, we tested for local genetic correlation across PTSD symptom clusters which highlighted metabolic (e.g., obesity, diabetes, vascular health) and laboratory traits (e.g., neutrophil, eosinophil, tau protein, creatinine kinase). Overall, this study finds comprehensive genomic evidence including clinical and regulatory profiles between PTSD, hematologic and cardiometabolic traits, that support comorbidities observed in epidemiologic studies of PTSD.


Subjects
Stress Disorders, Post-Traumatic , Veterans , Diagnostic and Statistical Manual of Mental Disorders , Humans , Phenotype , Stress Disorders, Post-Traumatic/psychology , Syndrome , Veterans/psychology