1.
Am J Hum Genet ; 2024 May 08.
Article in English | MEDLINE | ID: mdl-38733992

ABSTRACT

Splicing-based transcriptome-wide association studies (splicing-TWASs) of breast cancer have the potential to identify susceptibility genes. However, existing splicing-TWASs test the association of individual excised introns in breast tissue only and thus have limited power to detect susceptibility genes. In this study, we performed a multi-tissue joint splicing-TWAS that integrated splicing-TWAS signals of multiple excised introns in each gene across 11 tissues that are potentially relevant to breast cancer risk. We utilized summary statistics from a meta-analysis that combined genome-wide association study (GWAS) results of 424,650 women of European ancestry. Splicing-level prediction models were trained in GTEx (v.8) data. We identified 240 genes by the multi-tissue joint splicing-TWAS at the Bonferroni-corrected significance level; in the tissue-specific splicing-TWAS that combined TWAS signals of excised introns in genes in breast tissue only, we identified nine additional significant genes. Of these 249 genes, 88 genes in 62 loci have not been reported by previous TWASs, and 17 genes in seven loci are at least 1 Mb away from published GWAS index variants. By comparing the results of our splicing-TWASs with previous gene-expression-based TWASs that used the same summary statistics and expression prediction models trained in the same reference panel, we found 110 genes in 70 loci that were identified only by the splicing-TWASs. Our results showed that for many genes, expression quantitative trait loci (eQTL) did not show a significant impact on breast cancer risk, whereas splicing quantitative trait loci (sQTL) showed a strong impact through intron excision events.
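
As a rough illustration of the Bonferroni-corrected significance level mentioned in this abstract, the sketch below divides a family-wise error rate by the number of tests and keeps the genes that pass. The gene names, p-values, and the default `alpha` of 0.05 are hypothetical placeholders, not values taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_genes(pvalues: dict, alpha: float = 0.05) -> list:
    """Return gene names whose p-value falls below the Bonferroni threshold.

    The number of tests is taken to be the number of genes supplied.
    """
    threshold = bonferroni_threshold(alpha, len(pvalues))
    return sorted(gene for gene, p in pvalues.items() if p < threshold)


if __name__ == "__main__":
    # Hypothetical gene-level p-values, for illustration only.
    example = {"GENE_A": 1e-9, "GENE_B": 0.01, "GENE_C": 2e-4, "GENE_D": 0.5}
    print(significant_genes(example))
```

With four tests the threshold is 0.05 / 4 = 0.0125, so only `GENE_D` fails in this toy input.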

2.
Am J Hum Genet ; 110(6): 950-962, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37164006

ABSTRACT

Genome-wide association studies (GWASs) have identified more than 200 genomic loci for breast cancer risk, but specific causal genes in most of these loci have not been identified. In fact, transcriptome-wide association studies (TWASs) of breast cancer performed using gene expression prediction models trained in breast tissue have yet to clearly identify most target genes. To identify candidate genes, we performed a GWAS analysis in a breast cancer dataset from UK Biobank (UKB) and combined the results with the GWAS results of the Breast Cancer Association Consortium (BCAC) by a meta-analysis. Using the summary statistics from the meta-analysis, we performed a joint TWAS analysis that combined TWAS signals from multiple tissues. We used expression prediction models trained in 11 tissues that are potentially relevant to breast cancer from the Genotype-Tissue Expression (GTEx) data. In the GWAS analysis, we identified eight loci distinct from those reported previously. In the TWAS analysis, we identified 309 genes at 108 genomic loci to be significantly associated with breast cancer at the Bonferroni threshold. Of these, 17 genes were located in eight regions that were at least 1 Mb away from published GWAS hits. The remaining TWAS-significant genes were located in 100 known genomic loci from previous GWASs of breast cancer. We found that 21 genes located in known GWAS loci remained statistically significant after conditioning on previous GWAS index variants. Our study provides insights into breast cancer genetics through mapping candidate target genes in a large proportion of known GWAS loci and discovering multiple new loci.


Subjects
Breast Neoplasms; Transcriptome; Humans; Female; Transcriptome/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study; Breast Neoplasms/genetics; Quantitative Trait Loci/genetics; Polymorphism, Single Nucleotide/genetics
3.
HGG Adv ; 2(3)2021 Jul 08.
Article in English | MEDLINE | ID: mdl-34317694

ABSTRACT

Familial, sequencing, and genome-wide association studies (GWASs) and genetic correlation analyses have progressively unraveled the shared or pleiotropic germline genetics of breast and ovarian cancer. In this study, we aimed to leverage this shared germline genetics to improve the power of transcriptome-wide association studies (TWASs) to identify candidate breast cancer and ovarian cancer susceptibility genes. We built gene expression prediction models using the PrediXcan method in 681 breast and 295 ovarian tumors from The Cancer Genome Atlas and 211 breast and 99 ovarian normal tissue samples from the Genotype-Tissue Expression project and integrated these with GWAS meta-analysis data from the Breast Cancer Association Consortium (122,977 cases/105,974 controls) and the Ovarian Cancer Association Consortium (22,406 cases/40,941 controls). The integration was achieved through application of a pleiotropy-guided conditional/conjunction false discovery rate (FDR) approach in the setting of a TWAS. This identified 14 candidate breast cancer susceptibility genes spanning 11 genomic regions and 8 candidate ovarian cancer susceptibility genes spanning 5 genomic regions at conjunction FDR < 0.05 that were >1 Mb away from known breast and/or ovarian cancer susceptibility loci. We also identified 38 candidate breast cancer susceptibility genes and 17 candidate ovarian cancer susceptibility genes at conjunction FDR < 0.05 at known breast and/or ovarian susceptibility loci. The 22 genes identified by our cross-cancer analysis represent promising candidates that further elucidate the role of the transcriptome in mediating germline breast and ovarian cancer risk.
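
The conjunction FDR filter used in this abstract can be sketched as follows, assuming the two conditional FDR values per gene have already been estimated elsewhere. The sketch relies on the commonly used definition of the conjunction FDR as the larger of the two conditional FDRs; that definition, the function names, and the input layout are assumptions made for illustration and are not taken from the paper.

```python
def conjunction_fdr(cond_fdr_a_given_b: float, cond_fdr_b_given_a: float) -> float:
    """Conjunction FDR, assumed here to be the maximum of the two conditional FDRs."""
    return max(cond_fdr_a_given_b, cond_fdr_b_given_a)


def shared_candidates(cond_fdrs: dict, threshold: float = 0.05) -> list:
    """Return genes whose conjunction FDR is below the threshold.

    cond_fdrs maps a gene name to a pair:
    (FDR for trait A conditional on trait B, FDR for trait B conditional on trait A).
    """
    return sorted(
        gene
        for gene, (a_given_b, b_given_a) in cond_fdrs.items()
        if conjunction_fdr(a_given_b, b_given_a) < threshold
    )


if __name__ == "__main__":
    # Hypothetical conditional FDR values, for illustration only.
    example = {"GENE_A": (0.01, 0.03), "GENE_B": (0.01, 0.20)}
    print(shared_candidates(example))
```

Taking the maximum means a gene is retained only when the evidence is strong for both traits, which is why `GENE_B` is dropped in the toy input.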

4.
Cell ; 184(10): 2633-2648.e19, 2021 05 13.
Article in English | MEDLINE | ID: mdl-33864768

ABSTRACT

Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.


Subjects
Disease/genetics; Multifactorial Inheritance/genetics; Population/genetics; RNA, Long Noncoding/genetics; Transcriptome; Coronary Artery Disease/genetics; Diabetes Mellitus, Type 1/genetics; Diabetes Mellitus, Type 2/genetics; Gene Expression Profiling; Genetic Variation; Humans; Inflammatory Bowel Diseases/genetics; Organ Specificity/genetics; Quantitative Trait Loci
5.
Nat Commun ; 12(1): 1424, 2021 03 03.
Article in English | MEDLINE | ID: mdl-33658504

ABSTRACT

Genetic studies of the transcriptome help bridge the gap between genetic variation and phenotypes. To maximize the potential of such studies, efficient methods to identify expression quantitative trait loci (eQTLs) and perform fine-mapping and genetic prediction of gene expression traits are needed. Current methods that leverage both total read counts and allele-specific expression to identify eQTLs are generally computationally intractable for large transcriptomic studies. Here, we describe a unified framework that addresses these needs and is scalable to thousands of samples. Using simulations and data from GTEx, we demonstrate its calibration and performance. For example, mixQTL shows a power gain equivalent to a 29% increase in sample size for genes with sufficient allele-specific read coverage. To showcase the potential of mixQTL, we apply it to 49 GTEx tissues and find 20% additional eQTLs (FDR < 0.05, per tissue) that are significantly more enriched among trait-associated variants and candidate cis-regulatory elements compared with the standard approach.


Subjects
Alleles; Chromosome Mapping/methods; Quantitative Trait Loci; Databases, Genetic; Genome-Wide Association Study; Human Genome Project; Humans; Models, Genetic; Models, Statistical; Regulatory Sequences, Nucleic Acid
6.
Genome Biol ; 22(1): 49, 2021 01 26.
Article in English | MEDLINE | ID: mdl-33499903

ABSTRACT

The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.


Subjects
Gene Expression; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/methods; Genotype; Genes; Humans; Multifactorial Inheritance; Transcriptome
7.
Genet Epidemiol ; 2020 Sep 10.
Article in English | MEDLINE | ID: mdl-32964524

ABSTRACT

The integration of transcriptomic studies and genome-wide association studies (GWAS) via imputed expression has seen extensive application in recent years, enabling the functional characterization and causal gene prioritization of GWAS loci. However, the techniques for imputing transcriptomic traits from DNA variation remain underdeveloped. Furthermore, associations found when linking eQTL studies to complex traits through methods like PrediXcan can lead to false positives due to linkage disequilibrium between distinct causal variants. Therefore, the best prediction performance models may not necessarily lead to more reliable causal gene discovery. With the goal of improving discoveries without increasing false positives, we develop and compare multiple transcriptomic imputation approaches using the most recent GTEx release of expression and splicing data on 17,382 RNA-sequencing samples from 948 post-mortem donors in 54 tissues. We find that informing prediction models with posterior causal probability from fine-mapping (dap-g) and borrowing information across tissues (mashr) can lead to better performance in terms of number and proportion of significant associations that are colocalized and the proportion of silver standard genes identified as indicated by precision-recall and receiver operating characteristic curves. All prediction models are made publicly available at predictdb.org.

8.
Science ; 369(6509)2020 09 11.
Article in English | MEDLINE | ID: mdl-32913072

ABSTRACT

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.


Subjects
Gene Expression Regulation; Gene Expression; Sex Characteristics; Chromosomes, Human, X/genetics; Disease/genetics; Epigenesis, Genetic; Female; Genetic Variation; Genome-Wide Association Study; Humans; Male; Organ Specificity; Promoter Regions, Genetic; Quantitative Trait Loci; Sex Factors
9.
Science ; 369(6509)2020 09 11.
Article in English | MEDLINE | ID: mdl-32913075

ABSTRACT

The Genotype-Tissue Expression (GTEx) project has identified expression and splicing quantitative trait loci in cis (QTLs) for the majority of genes across a wide range of human tissues. However, the functional characterization of these QTLs has been limited by the heterogeneous cellular composition of GTEx tissue samples. We mapped interactions between computational estimates of cell type abundance and genotype to identify cell type-interaction QTLs for seven cell types and show that cell type-interaction expression QTLs (eQTLs) provide finer resolution to tissue specificity than bulk tissue cis-eQTLs. Analyses of genetic associations with 87 complex traits show a contribution from cell type-interaction QTLs and enable the discovery of hundreds of previously unidentified colocalized loci that are masked in bulk tissue.


Subjects
Gene Expression Regulation; Quantitative Trait Loci; Transcriptome; Cells/metabolism; Humans; Organ Specificity; RNA, Long Noncoding/genetics
10.
Science ; 369(6509)2020 09 11.
Article in English | MEDLINE | ID: mdl-32913073

ABSTRACT

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.


Subjects
Genetic Variation; Genome, Human; Multifactorial Inheritance; Transcriptome; Humans; Organ Specificity
11.
Genome Biol ; 21(1): 235, 2020 09 11.
Article in English | MEDLINE | ID: mdl-32912314

ABSTRACT

Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues from the Genotype-Tissue Expression (GTEx) project. The learned factors reflect tissues with known biological similarity and identify transcription factors that may mediate tissue-specific effects. sn-spMF, available at https://github.com/heyuan7676/ts_eQTLs , can be applied to learn biologically interpretable patterns of eQTL tissue-specificity and generate testable mechanistic hypotheses.


Subjects
Gene Expression Regulation; Models, Genetic; Quantitative Trait Loci; Transcription Factors/metabolism; Humans
12.
Genome Biol ; 21(1): 233, 2020 09 11.
Article in English | MEDLINE | ID: mdl-32912333

ABSTRACT

BACKGROUND: Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization. RESULTS: Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in seven tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe 31 loci (0.02%) where a significant colocalization is called only with one eQTL ancestry adjustment method. Notably, both adjustments produce similar numbers of significant colocalizations within each of two different colocalization methods, COLOC and FINEMAP. Finally, we identify a small subset of eQTL-associated variants highly correlated with local ancestry, providing a resource to enhance functional follow-up. CONCLUSIONS: We provide a local ancestry map for admixed individuals in the GTEx v8 release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of the results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.


Subjects
Genome, Human; Genome-Wide Association Study; Quantitative Trait Loci; Racial Groups/genetics; Gene Expression; Genotype; Humans
13.
Genet Epidemiol ; 43(6): 596-608, 2019 09.
Article in English | MEDLINE | ID: mdl-30950127

ABSTRACT

Regulation of gene expression is an important mechanism through which genetic variation can affect complex traits. A substantial portion of gene expression variation can be explained by both local (cis) and distal (trans) genetic variation. Much progress has been made in uncovering cis-acting expression quantitative trait loci (cis-eQTL), but trans-eQTL have been more difficult to identify and replicate. Here we take advantage of our ability to predict the cis component of gene expression coupled with gene mapping methods such as PrediXcan to identify high confidence candidate trans-acting genes and their targets. That is, we correlate the cis component of gene expression with observed expression of genes in different chromosomes. Leveraging the shared cis-acting regulation across tissues, we combine the evidence of association across all available Genotype-Tissue Expression Project tissues and find 2,356 trans-acting/target gene pairs with high mappability scores. Reassuringly, trans-acting genes are enriched in transcription and nucleic acid binding pathways and target genes are enriched in known transcription factor binding sites. Interestingly, trans-acting genes are more significantly associated with selected complex traits and diseases than target or background genes, consistent with percolating trans effects. Our scripts and summary statistics are publicly available for future studies of trans-acting gene regulation.
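
The core step described in this abstract, correlating the predicted cis component of one gene with the observed expression of genes on other chromosomes, might look roughly like the sketch below. The data layout, the function names, and the simple correlation cutoff `min_abs_r` are hypothetical simplifications; the study itself reports formal association tests combined across tissues, which this sketch does not attempt to reproduce.

```python
import math


def pearson(x: list, y: list) -> float:
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def candidate_trans_pairs(predicted_cis: dict, observed: dict, min_abs_r: float = 0.5) -> list:
    """List (regulator, target, r) for gene pairs on different chromosomes.

    Both inputs map gene name -> (chromosome, per-sample values), with
    samples in the same order. min_abs_r is an illustrative cutoff only.
    """
    pairs = []
    for regulator, (reg_chrom, predicted) in predicted_cis.items():
        for target, (tgt_chrom, expression) in observed.items():
            if reg_chrom == tgt_chrom:
                continue  # only consider targets on other chromosomes
            r = pearson(predicted, expression)
            if abs(r) >= min_abs_r:
                pairs.append((regulator, target, r))
    return pairs
```

Restricting the search to different chromosomes is what makes a detected pair a trans candidate rather than a local effect.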


Subjects
Cardiovascular Diseases/genetics; Gene Expression Regulation; Genetic Association Studies; Multifactorial Inheritance; Quantitative Trait Loci; Trans-Activators/genetics; Transcription, Genetic; Chromosome Mapping; Genome, Human; Humans; Transcriptome
14.
Nat Genet ; 51(4): 592-599, 2019 04.
Article in English | MEDLINE | ID: mdl-30926968

ABSTRACT

Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene-trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn's disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.


Subjects
Genetic Predisposition to Disease/genetics; Transcriptome/genetics; Crohn Disease/genetics; Genetic Variation/genetics; Genome-Wide Association Study/methods; Humans; Lipoproteins, LDL/genetics; Quantitative Trait Loci/genetics; Schizophrenia/genetics
15.
PLoS Genet ; 15(1): e1007889, 2019 01.
Article in English | MEDLINE | ID: mdl-30668570

ABSTRACT

Integration of genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is needed to improve our understanding of the biological mechanisms underlying GWAS hits, and our ability to identify therapeutic targets. Gene-level association methods such as PrediXcan can prioritize candidate targets. However, limited eQTL sample sizes and absence of relevant developmental and disease context restrict our ability to detect associations. Here we propose an efficient statistical method (MultiXcan) that leverages the substantial sharing of eQTLs across tissues and contexts to improve our ability to identify potential target genes. MultiXcan integrates evidence across multiple panels using multivariate regression, which naturally takes into account the correlation structure. We apply our method to simulated and real traits from the UK Biobank and show that, in realistic settings, we can detect a larger set of significantly associated genes than using each panel separately. To improve applicability, we developed a summary result-based extension called S-MultiXcan, which we show yields highly concordant results with the individual level version when LD is well matched. Our multivariate model-based approach allowed us to use the individual level results as a gold standard to calibrate and develop a robust implementation of the summary-based extension. Results from our analysis as well as software and necessary resources to apply our method are publicly available.


Subjects
Genome-Wide Association Study/statistics & numerical data; Quantitative Trait Loci/genetics; Transcriptome/genetics; Gene Expression/genetics; Humans; Polymorphism, Single Nucleotide/genetics; Software/statistics & numerical data
16.
Genome Biol ; 19(1): 130, 2018 09 11.
Article in English | MEDLINE | ID: mdl-30205839

ABSTRACT

Expression quantitative trait loci (eQTLs) identified using tumor gene expression data could affect gene expression in cancer cells, tumor-associated normal cells, or both. Here, we have demonstrated a method to identify eQTLs affecting expression in cancer cells by modeling the statistical interaction between genotype and tumor purity. Only one third of breast cancer risk variants, identified as eQTLs from a conventional analysis, could be confidently attributed to cancer cells. The remaining variants could affect cells of the tumor microenvironment, such as immune cells and fibroblasts. Deconvolution of tumor eQTLs will help determine how inherited polymorphisms influence cancer risk, development, and treatment response.
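
One generic way to write a genotype-by-purity interaction model is an ordinary least-squares fit of expression on genotype, purity, and their product. The sketch below is a minimal version under that assumption; it is not necessarily the parameterization used by the authors, and the function names and the toy data are hypothetical.

```python
def solve_linear_system(a: list, b: list) -> list:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_interaction_model(genotype: list, purity: list, expression: list) -> list:
    """Least-squares fit of
    expression = b0 + b1*genotype + b2*purity + b3*genotype*purity.

    Returns [b0, b1, b2, b3]; b3 is the interaction coefficient.
    """
    rows = [[1.0, g, p, g * p] for g, p in zip(genotype, purity)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Noise-free toy data generated from known coefficients (illustration only).
    genotype = [0, 1, 2, 0, 1, 2, 0, 1]
    purity = [0.2, 0.4, 0.6, 0.8, 0.3, 0.9, 0.5, 0.7]
    expression = [1 + 0.5 * g + 2 * p + 1.5 * g * p for g, p in zip(genotype, purity)]
    print(fit_interaction_model(genotype, purity, expression))
```

In a model of this shape, a non-zero interaction coefficient indicates that the genotype effect on expression changes with tumor purity, which is the kind of signal that lets an eQTL be attributed to cancer cells or to other cells in the sample.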


Subjects
Gene Expression; Models, Statistical; Neoplasms/genetics; Quantitative Trait Loci; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Carcinogenesis/genetics; Computer Simulation; Female; Fibroblasts/metabolism; Genetic Variation; Genome-Wide Association Study; Genotype; Humans; Neoplasms/metabolism; Tumor Microenvironment
17.
Nat Commun ; 9(1): 1825, 2018 05 08.
Article in English | MEDLINE | ID: mdl-29739930

ABSTRACT

Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.
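
The abstract does not spell out the summary-data expression. A commonly quoted form of the S-PrediXcan approximation is Z_g ≈ Σ_l w_l (σ_l / σ_g) z_l, where w_l are the prediction weights, z_l the GWAS z-scores, σ_l the SNP standard deviations, and σ_g² = wᵀΓw with Γ the SNP covariance in a reference panel. The sketch below implements that form as an assumption; the function name and inputs are hypothetical, and this is not the authors' software.

```python
import math


def spredixcan_zscore(weights: list, gwas_z: list, snp_cov: list) -> float:
    """Approximate gene-level z-score from summary statistics.

    weights  : prediction-model weight for each SNP
    gwas_z   : GWAS z-score (beta / se) for each SNP, same order
    snp_cov  : SNP covariance matrix from a reference panel (list of lists)
    """
    n = len(weights)
    if len(gwas_z) != n or len(snp_cov) != n:
        raise ValueError("inputs must describe the same set of SNPs")
    # Variance of predicted expression: w' * Gamma * w
    sigma_g_sq = sum(
        weights[i] * snp_cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if sigma_g_sq <= 0:
        raise ValueError("predicted expression has no variance")
    # SNP standard deviations come from the diagonal of the covariance matrix.
    numerator = sum(
        weights[i] * math.sqrt(snp_cov[i][i]) * gwas_z[i] for i in range(n)
    )
    return numerator / math.sqrt(sigma_g_sq)
```

For a single-SNP model the expression reduces to the GWAS z-score of that SNP (up to the sign of the weight), which is a convenient sanity check.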


Subjects
Chromosome Mapping/methods; Gene Expression; Genetic Variation; Genome-Wide Association Study/statistics & numerical data; Models, Genetic; Computer Simulation; Humans; Meta-Analysis as Topic; Organ Specificity; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci
18.
Nat Genet ; 50(1): 151-158, 2018 01.
Article in English | MEDLINE | ID: mdl-29229983

ABSTRACT

The excision of introns from pre-mRNA is an essential step in mRNA processing. We developed LeafCutter to study sample and population variation in intron splicing. LeafCutter identifies variable splicing events from short-read RNA-seq data and finds events of high complexity. Our approach obviates the need for transcript annotations and circumvents the challenges in estimating relative isoform or exon usage in complex splicing events. LeafCutter can be used both to detect differential splicing between sample groups and to map splicing quantitative trait loci (sQTLs). Compared with contemporary methods, our approach identified 1.4-2.1 times more sQTLs, many of which helped us ascribe molecular effects to disease-associated variants. Transcriptome-wide associations between LeafCutter intron quantifications and 40 complex traits increased the number of associated disease genes at a 5% false discovery rate by an average of 2.1-fold compared with that detected through the use of gene expression levels alone. LeafCutter is fast, scalable, easy to use, and available online.
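
Intron-level quantification of the kind described here is usually expressed as the share of split reads supporting each intron within a cluster of overlapping introns. The sketch below computes such ratios from raw counts; the function name, the intron labels, and the counts are hypothetical and do not reflect LeafCutter's actual interface or file formats.

```python
def intron_excision_ratios(cluster_counts: dict) -> dict:
    """Convert split-read counts for the introns of one cluster into ratios.

    cluster_counts maps an intron identifier to its supporting read count.
    Each ratio is that intron's count divided by the cluster total; a cluster
    with no reads yields 0.0 for every intron.
    """
    total = sum(cluster_counts.values())
    if total == 0:
        return {intron: 0.0 for intron in cluster_counts}
    return {intron: count / total for intron, count in cluster_counts.items()}


if __name__ == "__main__":
    # Hypothetical cluster with two alternative introns.
    print(intron_excision_ratios({"chr1:100-500": 30, "chr1:100-800": 10}))
```

Because the ratios are computed within a cluster, they capture relative intron usage without requiring transcript annotations or isoform-level estimates.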


Subjects
Alternative Splicing; Sequence Analysis, RNA/methods; Software; Animals; Disease/genetics; Gene Expression Profiling; Genetic Variation; Introns; Molecular Sequence Annotation; Quantitative Trait Loci