Results 1 - 20 of 34
1.
Am J Hum Genet ; 111(6): 1100-1113, 2024 06 06.
Article in English | MEDLINE | ID: mdl-38733992

ABSTRACT

Splicing-based transcriptome-wide association studies (splicing-TWASs) of breast cancer have the potential to identify susceptibility genes. However, existing splicing-TWASs test the association of individual excised introns in breast tissue only and thus have limited power to detect susceptibility genes. In this study, we performed a multi-tissue joint splicing-TWAS that integrated splicing-TWAS signals of multiple excised introns in each gene across 11 tissues that are potentially relevant to breast cancer risk. We utilized summary statistics from a meta-analysis that combined genome-wide association study (GWAS) results of 424,650 women of European ancestry. Splicing-level prediction models were trained in GTEx (v.8) data. We identified 240 genes by the multi-tissue joint splicing-TWAS at the Bonferroni-corrected significance level; in the tissue-specific splicing-TWAS that combined TWAS signals of excised introns in genes in breast tissue only, we identified nine additional significant genes. Of these 249 genes, 88 genes in 62 loci have not been reported by previous TWASs, and 17 genes in seven loci are at least 1 Mb away from published GWAS index variants. By comparing the results of our splicing-TWASs with previous gene-expression-based TWASs that used the same summary statistics and expression prediction models trained in the same reference panel, we found 110 genes in 70 loci that were identified only by the splicing-TWASs. Our results showed that for many genes, expression quantitative trait loci (eQTL) did not show a significant impact on breast cancer risk, whereas splicing quantitative trait loci (sQTL) showed a strong impact through intron excision events.


Subjects
Breast Neoplasms , Genetic Predisposition to Disease , Genome-Wide Association Study , RNA Splicing , Transcriptome , Humans , Breast Neoplasms/genetics , Female , RNA Splicing/genetics , Introns/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , Gene Expression Profiling
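The Bonferroni-corrected significance level mentioned in the abstract above can be illustrated with a minimal sketch. The gene names and p-values below are hypothetical placeholders, not results from the study, and the function names are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_genes(p_values: dict[str, float], alpha: float = 0.05) -> list[str]:
    """Return the gene names whose p-value is below the Bonferroni cutoff."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return sorted(gene for gene, p in p_values.items() if p < cutoff)


# Hypothetical gene-level p-values, for illustration only.
example = {"GENE_A": 1e-9, "GENE_B": 0.02, "GENE_C": 0.004, "GENE_D": 0.6}

if __name__ == "__main__":
    print(significant_genes(example))
```

With four tests the cutoff is 0.05 / 4 = 0.0125, so only GENE_A and GENE_C pass in this toy input.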
2.
Am J Hum Genet ; 108(2): 240-256, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33434493

ABSTRACT

A transcriptome-wide association study (TWAS) integrates data from genome-wide association studies and gene expression mapping studies for investigating the gene regulatory mechanisms underlying diseases. Existing TWAS methods are primarily univariate in nature, focusing on analyzing one outcome trait at a time. However, many complex traits are correlated with each other and share a common genetic basis. Consequently, analyzing multiple traits jointly through multivariate analysis can potentially improve the power of TWASs. Here, we develop a method, moPMR-Egger (multiple outcome probabilistic Mendelian randomization with Egger assumption), for analyzing multiple outcome traits in TWAS applications. moPMR-Egger examines one gene at a time, relies on its cis-SNPs that are in potential linkage disequilibrium with each other to serve as instrumental variables, and tests its causal effects on multiple traits jointly. A key feature of moPMR-Egger is its ability to test and control for potential horizontal pleiotropic effects from instruments, thus maximizing power while minimizing false associations for TWASs. In simulations, moPMR-Egger provides calibrated type I error control for both causal effects testing and horizontal pleiotropic effects testing and is more powerful than existing univariate TWAS approaches in detecting causal associations. We apply moPMR-Egger to analyze 11 traits from 5 trait categories in the UK Biobank. In the analysis, moPMR-Egger identified 13.15% more gene associations than univariate approaches across trait categories and revealed distinct regulatory mechanisms underlying systolic and diastolic blood pressures.


Subjects
Genetic Association Studies , Multifactorial Inheritance , Transcriptome , Blood Pressure/genetics , Computer Simulation , Genetic Pleiotropy , Humans , Linkage Disequilibrium , Mendelian Randomization Analysis , Genetic Models , Multivariate Analysis , Phenotype , Single Nucleotide Polymorphism
3.
Am J Hum Genet ; 108(9): 1765-1779, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34450030

ABSTRACT

An important goal of clinical genomics is to be able to estimate the risk of adverse disease outcomes. Between 5% and 10% of individuals with ulcerative colitis (UC) require colectomy within 5 years of diagnosis, but polygenic risk scores (PRSs) utilizing findings from genome-wide association studies (GWASs) are unable to provide meaningful prediction of this adverse status. By contrast, in Crohn disease, gene expression profiling of GWAS-significant genes does provide some stratification of risk of progression to complicated disease in the form of a transcriptional risk score (TRS). Here, we demonstrate that a measured TRS based on bulk rectal gene expression in the PROTECT inception cohort study has a positive predictive value approaching 50% for colectomy. Single-cell profiling demonstrates that the genes are active in multiple diverse cell types from both the epithelial and immune compartments. Expression quantitative trait locus (QTL) analysis identifies genes with differential effects at baseline and week 52 follow-up, but for the most part, differential expression associated with colectomy risk is independent of local genetic regulation. Nevertheless, a predicted polygenic transcriptional risk score (PPTRS) derived by summation of transcriptome-wide association study (TWAS) effects identifies UC-affected individuals at 5-fold elevated risk of colectomy with data from the UK Biobank population cohort studies, independently replicated in an NIDDK-IBDGC dataset. Prediction of gene expression from relatively small transcriptome datasets can thus be used in conjunction with TWASs for stratification of risk of disease complications.


Subjects
Colectomy/statistics & numerical data , Ulcerative Colitis/surgery , Crohn Disease/surgery , Quantitative Trait Loci , Transcriptome , Biological Specimen Banks , Cohort Studies , Ulcerative Colitis/complications , Ulcerative Colitis/diagnosis , Ulcerative Colitis/genetics , Colon/metabolism , Colon/pathology , Colon/surgery , Crohn Disease/complications , Crohn Disease/diagnosis , Crohn Disease/genetics , Datasets as Topic , Disease Progression , Gene Expression Profiling , Genome-Wide Association Study , Humans , Multifactorial Inheritance , Prognosis , Risk Assessment , United Kingdom
4.
Int J Mol Sci ; 25(11)2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38892420

ABSTRACT

Genome-wide association studies (GWAS) significantly enhance our ability to identify trait-associated genomic variants by considering the host genome. Moreover, the hologenome refers to the host organism's collective genetic material and its associated microbiome. In this study, we utilized the hologenome framework, called Hologenome-wide association studies (HWAS), to dissect the architecture of complex traits, including milk yield, methane emissions, rumen physiology in cattle, and gut microbial composition in pigs. We employed four statistical models: (1) GWAS, (2) Microbial GWAS (M-GWAS), (3) HWAS-CG (hologenome interaction estimated using COvariance between Random Effects Genome-based restricted maximum likelihood (CORE-GREML)), and (4) HWAS-H (hologenome interaction estimated using the Hadamard product method). We applied Bonferroni correction to interpret the significant associations in the complex traits. The GWAS and M-GWAS detected one and sixteen significant SNPs for milk yield traits, respectively, whereas the HWAS-CG and HWAS-H each identified eight SNPs. Moreover, HWAS-CG revealed four, and the remaining models identified three SNPs each for methane emissions traits. The GWAS and HWAS-CG detected one and three SNPs for rumen physiology traits, respectively. For the pigs' gut microbial composition traits, the GWAS, M-GWAS, HWAS-CG, and HWAS-H identified 14, 16, 13, and 12 SNPs, respectively. We further explored these associations through SNP annotation and by analyzing biological processes and functional pathways. Additionally, we integrated our GWA results with expression quantitative trait locus (eQTL) data using transcriptome-wide association studies (TWAS) and summary-based Mendelian randomization (SMR) methods for a more comprehensive understanding of SNP-trait associations. Our study revealed hologenomic variability in agriculturally important traits, enhancing our understanding of host-microbiome interactions.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Quantitative Trait Loci , Animals , Cattle/genetics , Swine/genetics , Gastrointestinal Microbiome/genetics , Rumen/microbiology , Rumen/metabolism , Phenotype , Methane/metabolism , Milk/metabolism , Genome
5.
Am J Hum Genet ; 106(2): 188-201, 2020 02 06.
Article in English | MEDLINE | ID: mdl-31978332

ABSTRACT

There is particular interest in transcriptome-wide association studies (TWAS), gene-level tests based on multi-SNP predictive models of gene expression, for identifying causal genes at loci associated with complex traits. However, interpretation of TWAS associations may be complicated by divergent effects of model SNPs on phenotype and gene expression. We developed an iterative modeling scheme for obtaining multi-SNP models of gene expression and applied this framework to generate expression models for 43 human tissues from the Genotype-Tissue Expression (GTEx) Project. We characterized the performance of single- and multi-SNP models for identifying causal genes in GWAS data for 46 circulating metabolites. We show that: (A) multi-SNP models captured more variation in expression than did the top cis-eQTL (median 2-fold improvement); (B) predicted expression based on multi-SNP models was associated (false discovery rate < 0.01) with metabolite levels for 826 unique gene-metabolite pairs, but, after stepwise conditional analyses, 90% were dominated by a single eQTL SNP; (C) among the 35% of associations where a SNP in the expression model was a significant cis-eQTL and metabolomic-QTL (met-QTL), 92% demonstrated colocalization between these signals, but interpretation was often complicated by incomplete overlap of QTLs in multi-SNP models; and (D) using a "truth" set of causal genes at 61 met-QTLs, the sensitivity was high (67%), but the positive predictive value was low, as only 8% of TWAS associations (19% when restricted to colocalized associations at met-QTLs) involved true causal genes. These results guide the interpretation of TWAS and highlight the need for corroborative data to provide confident assignment of causality.


Subjects
Gene Expression Regulation , Genetic Predisposition to Disease , Metabolome , Genetic Models , Single Nucleotide Polymorphism , Quantitative Trait Loci , Transcriptome , Genome-Wide Association Study , Humans , Phenotype
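The contrast in the abstract above between high sensitivity and low positive predictive value can be made concrete with a short sketch. The counts are hypothetical and chosen only to show how the two measures can diverge; they are not the study's data, and the study computed its two measures over different sets (causal genes at met-QTLs versus TWAS associations).

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly causal genes that the test recovers."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no causal genes in the truth set")
    return true_pos / total


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of reported associations that involve a truly causal gene."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no associations were reported")
    return true_pos / total


# Hypothetical confusion counts, for illustration only.
tp, fn, fp = 8, 4, 92

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")
    print(f"PPV         = {positive_predictive_value(tp, fp):.2f}")
```

In this toy case most causal genes are found (8 of 12), yet only 8 of 100 reported associations are causal, which is why corroborating evidence such as colocalization matters.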
6.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33200776

ABSTRACT

The power of genotype-phenotype association mapping studies increases greatly when contributions from multiple variants in a focal region are meaningfully aggregated. Currently, there are two popular categories of variant aggregation methods. Transcriptome-wide association studies (TWAS) represent a set of emerging methods that select variants based on their effect on gene expression, providing pretrained linear combinations of variants for downstream association mapping. In contrast, kernel methods such as the sequence kernel association test (SKAT) model genotypic and phenotypic variance using various kernel functions that capture genetic similarity between subjects, allowing nonlinear effects to be included. From the perspective of machine learning, these two methods cover two complementary aspects of feature engineering: feature selection/pruning and feature aggregation. Thus far, no thorough comparison has been made between these categories, and no methods exist that incorporate the advantages of both TWAS- and kernel-based methods. In this work, we developed a novel method called kernel-based TWAS (kTWAS) that applies TWAS-like feature selection to a SKAT-like kernel association test, combining the strengths of both approaches. Through extensive simulations, we demonstrate that kTWAS has higher power than TWAS and multiple SKAT-based protocols, and we identify novel disease-associated genes in Wellcome Trust Case Control Consortium genotyping array data and MSSNG (Autism) sequence data. The source code for kTWAS and our simulations is available in our GitHub repository (https://github.com/theLongLab/kTWAS).


Subjects
Computer Simulation , Genetic Association Studies , Genetic Variation , Genetic Models , Software , Transcriptome , Genome-Wide Association Study , Genotype , Humans
7.
Genet Epidemiol ; 45(3): 324-337, 2021 04.
Article in English | MEDLINE | ID: mdl-33369784

ABSTRACT

A transcriptome-wide association study (TWAS) attempts to identify disease-associated genes by imputing gene expression into a genome-wide association study (GWAS) using an expression quantitative trait loci (eQTL) data set and then testing for associations with a trait of interest. Regulatory processes may be shared across related tissues, and one natural extension of TWAS is harnessing cross-tissue correlation in gene expression to improve prediction accuracy. Here, we studied multi-tissue extensions of lasso regression and random forests (RF), joint lasso and RF-MTL (multi-task learning RF), respectively. We found that, on our chosen eQTL data set, multi-tissue methods were generally more accurate than their single-tissue counterparts, with RF-MTL performing the best. Simulations showed that these benefits generally translated into more associated genes identified, although they also highlighted that joint lasso had a tendency to erroneously identify genes in one tissue if there existed an eQTL signal for that gene in another. Applying the four methods to a type 1 diabetes GWAS, we found that multi-tissue methods found more unique associated genes for most of the tissues considered. We conclude that multi-tissue methods are competitive and, for some cell types, superior to single-tissue approaches and hold much promise for TWAS studies.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Genetic Models , Phenotype , Quantitative Trait Loci
8.
Am J Hum Genet ; 105(2): 258-266, 2019 08 01.
Article in English | MEDLINE | ID: mdl-31230719

ABSTRACT

The transcriptome-wide association studies (TWASs) that test for association between the study trait and the imputed gene expression levels from cis-acting expression quantitative trait loci (cis-eQTL) genotypes have successfully enhanced the discovery of genetic risk loci for complex traits. By using the gene expression imputation models fitted from reference datasets that have both genetic and transcriptomic data, TWASs facilitate gene-based tests with GWAS data while accounting for the reference transcriptomic data. The existing TWAS tools like PrediXcan and FUSION use parametric imputation models that have limitations for modeling the complex genetic architecture of transcriptomic data. Therefore, to improve on this, we employ a nonparametric Bayesian method that was originally proposed for genetic prediction of complex traits, which assumes a data-driven nonparametric prior for cis-eQTL effect sizes. The nonparametric Bayesian method is flexible and general because it includes both of the parametric imputation models used by PrediXcan and FUSION as special cases. Our simulation studies showed that the nonparametric Bayesian model improved both imputation R² for transcriptomic data and the TWAS power over PrediXcan when ≥1% of cis-SNPs co-regulate gene expression and gene expression heritability is ≤0.2. In real applications, the nonparametric Bayesian method fitted transcriptomic imputation models for 57.8% more genes than PrediXcan, thus improving the power of follow-up TWASs. We implement both parametric PrediXcan and nonparametric Bayesian methods in a convenient software tool "TIGAR" (Transcriptome-Integrated Genetic Association Resource), which imputes transcriptomic data and performs subsequent TWASs using individual-level or summary-level GWAS data.


Subjects
Aging/genetics , Bayes Theorem , Chromosome Mapping/methods , Dementia/genetics , Multifactorial Inheritance/genetics , Single Nucleotide Polymorphism , Transcriptome , Gene Expression Profiling , Genome-Wide Association Study , Humans , Phenotype , Prospective Studies , Quantitative Trait Loci , Software
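The abstract above mentions TWASs run on summary-level GWAS data. One widely used form of such a test is a weighted sum of per-SNP GWAS z-scores, scaled by the linkage disequilibrium (LD) among the SNPs. The sketch below illustrates that general form only; whether TIGAR computes exactly this statistic is not stated in the abstract, and the function name and inputs are assumptions for illustration.

```python
import numpy as np


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score: (w . z) / sqrt(w' R w).

    weights : per-SNP expression prediction weights (standardized genotypes assumed)
    gwas_z  : per-SNP GWAS z-scores for the same SNPs, in the same order
    ld      : SNP-by-SNP LD correlation matrix R from a reference panel
    """
    w = np.asarray(weights, dtype=float)
    z = np.asarray(gwas_z, dtype=float)
    r = np.asarray(ld, dtype=float)
    if r.shape != (w.size, w.size) or z.size != w.size:
        raise ValueError("weights, z-scores and LD matrix must describe the same SNPs")
    variance = w @ r @ w
    if variance <= 0:
        raise ValueError("predicted expression has zero variance")
    return float(w @ z / np.sqrt(variance))


if __name__ == "__main__":
    # Two hypothetical SNPs in moderate LD.
    print(twas_z([1.0, 1.0], [2.0, 2.0], [[1.0, 0.5], [0.5, 1.0]]))
```

The denominator accounts for correlation among SNPs: ignoring LD here would overstate the evidence when the weighted SNPs tag the same signal.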
9.
Am J Epidemiol ; 190(6): 1148-1158, 2021 06 01.
Article in English | MEDLINE | ID: mdl-33404048

ABSTRACT

Previous research has demonstrated the usefulness of hierarchical modeling for incorporating a flexible array of prior information in genetic association studies. When this prior information consists of estimates from association analyses of single-nucleotide polymorphisms (SNP)-intermediate or SNP-gene expression, a hierarchical model is equivalent to a 2-stage instrumental or transcriptome-wide association study (TWAS) analysis, respectively. We propose to extend our previous approach for the joint analysis of marginal summary statistics to incorporate prior information via a hierarchical model (hJAM). In this framework, the use of appropriate estimates as prior information yields an analysis similar to Mendelian randomization (MR) and TWAS approaches. hJAM is applicable to multiple correlated SNPs and intermediates to yield conditional estimates for the intermediates on the outcome, thus providing advantages over alternative approaches. We investigated the performance of hJAM in comparison with existing MR and TWAS approaches and demonstrated that hJAM yields an unbiased estimate, maintains correct type-I error, and has increased power across extensive simulations. We applied hJAM to 2 examples: estimating the causal effects of body mass index (GIANT Consortium) and type 2 diabetes (DIAGRAM data set, GERA Cohort, and UK Biobank) on myocardial infarction (UK Biobank) and estimating the causal effects of the expressions of the genes for nuclear casein kinase and cyclin dependent kinase substrate 1 and peptidase M20 domain containing 1 on the risk of prostate cancer (PRACTICAL and GTEx).


Subjects
Statistical Data Interpretation , Gene Expression Profiling/methods , Mendelian Randomization Analysis/methods , Genetic Models , Amidohydrolases/analysis , Bias , Body Mass Index , Type 2 Diabetes Mellitus/genetics , Female , Genome-Wide Association Study , Humans , Male , Myocardial Infarction/genetics , Nuclear Proteins/analysis , Phosphoproteins/analysis , Single Nucleotide Polymorphism , Prostatic Neoplasms/genetics
10.
Heart Vessels ; 34(11): 1882-1888, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31065785

ABSTRACT

Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia and is characterized by extensive structural, contractile and electrophysiological remodeling. The genetic basis of AF remains elusive. A transcriptome-wide association study (TWAS) was conducted with the FUSION tool using gene expression weights of 7 tissues combined with a large-scale genome-wide association study (GWAS) dataset of AF, comprising a total of 8180 AF cases and 28,612 controls. Significant genes identified by TWAS were then subjected to gene ontology (GO) and pathway enrichment analysis. The genome-wide mRNA gene expression profiling of AF was compared with the results of TWAS to detect common genes shared by TWAS and mRNA expression profiling of AF. TWAS detected a group of candidate genes with P_TWAS values < 0.05 across the seven tissues for AF, such as CMAH (P_TWAS = 3.15 × 10^-25 for whole blood), INCENP (P_TWAS = 1.77 × 10^-22 for artery aorta), and CMAHP (P_TWAS = 4.57 × 10^-20 for artery aorta). Pathway enrichment analysis identified multiple candidate pathways, such as protein K48-linked ubiquitination (P value = 0.0124), positive regulation of leukocyte chemotaxis (P value = 0.0046) and fatty acid degradation (P value = 0.0295). Further comparing the GO results of TWAS and mRNA expression profiling, 2 common GO terms were identified: actin binding (P_TWAS = 0.0446, P_mRNA = 7.00 × 10^-4) and extracellular matrix (P_TWAS = 0.0037, P_mRNA = 3.00 × 10^-6). We detected multiple novel candidate genes, GO terms and pathways for AF, providing novel clues for understanding the genetic mechanism of AF.


Subjects
Atrial Fibrillation/genetics , Gene Expression Profiling/methods , Gene Expression Regulation , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Messenger RNA/genetics , Transcriptome/genetics , Atrial Fibrillation/metabolism , Follow-Up Studies , Humans , Prospective Studies , Messenger RNA/biosynthesis
11.
Cell Biosci ; 14(1): 29, 2024 Feb 25.
Article in English | MEDLINE | ID: mdl-38403629

ABSTRACT

Crohn's disease (CD) is regarded as a lifelong progressive disease affecting all segments of the intestinal tract and multiple organs. Based on genome-wide association studies (GWAS) and gene expression data, transcriptome-wide association studies (TWAS) can help identify susceptibility genes associated with pathogenesis and disease behavior. In this review, we give an overview of seven reported TWASs of CD, summarize their study designs, and discuss the key methods and steps used in TWAS that affect the prioritization of susceptibility genes. This article summarizes the screening of tissue-specific susceptibility genes for CD and discusses the reported potential pathological mechanisms of overlapping susceptibility genes related to CD in a given tissue type. By performing GO pathway enrichment analysis for susceptibility genes, we observed that ileal lipid-related metabolism and colonic extracellular vesicles may be involved in the pathogenesis of CD. We further point out the low reproducibility of TWASs of CD and discuss the reasons for it and strategies for addressing it. In the future, TWASs should be designed around large-scale, unified cohorts, unified analysis pipelines, and fully classified databases of expression trait loci.

12.
Curr Protoc ; 4(2): e981, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38314955

ABSTRACT

Transcriptome-wide association study (TWAS) methodologies aim to identify genetic effects on phenotypes through the mediation of gene transcription. In TWAS, in silico models of gene expression are trained as functions of genetic variants and then applied to genome-wide association study (GWAS) data. This post-GWAS analysis identifies gene-trait associations with high interpretability, enabling follow-up functional genomics studies and the development of genetics-anchored resources. We provide an overview of commonly used TWAS approaches, their advantages and limitations, and some widely used applications. © 2024 Wiley Periodicals LLC.


Subjects
Genome-Wide Association Study , Transcriptome , Transcriptome/genetics , Genome-Wide Association Study/methods , Quantitative Trait Loci , Computer Simulation , Phenotype
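The two stages described in the abstract above (train an expression model on genetic variants, then apply it to GWAS data) can be sketched on simulated data. This is a minimal illustration under stated assumptions: ridge regression stands in for the prediction model (published tools use other models, such as elastic net or Bayesian methods), and all effect sizes, sample sizes, and function names are hypothetical.

```python
import numpy as np


def fit_ridge(genotypes, expression, penalty=1.0):
    """Stage 1: learn SNP weights that predict expression (closed-form ridge)."""
    x = genotypes - genotypes.mean(axis=0)
    y = expression - expression.mean()
    n_snps = x.shape[1]
    return np.linalg.solve(x.T @ x + penalty * np.eye(n_snps), x.T @ y)


def association_t(predicted, trait):
    """Stage 2: t-statistic for the correlation of predicted expression with the trait."""
    r = np.corrcoef(predicted, trait)[0, 1]
    n = len(trait)
    return float(r * np.sqrt((n - 2) / (1.0 - r ** 2)))


def run_demo(seed=0, n_ref=300, n_gwas=5000, n_snps=20):
    """Simulate a reference panel and a GWAS cohort, then run both stages."""
    rng = np.random.default_rng(seed)
    true_w = np.zeros(n_snps)
    true_w[:3] = [0.6, -0.5, 0.4]  # hypothetical cis-eQTL effects

    # Reference panel: genotypes (0/1/2 allele counts) and measured expression.
    g_ref = rng.binomial(2, 0.3, size=(n_ref, n_snps)).astype(float)
    expr_ref = g_ref @ true_w + rng.normal(size=n_ref)

    # GWAS cohort: genotypes and trait only; expression is never observed.
    g_gwas = rng.binomial(2, 0.3, size=(n_gwas, n_snps)).astype(float)
    hidden_expr = g_gwas @ true_w + rng.normal(size=n_gwas)
    trait = 0.5 * hidden_expr + rng.normal(size=n_gwas)

    weights = fit_ridge(g_ref, expr_ref)
    predicted = (g_gwas - g_gwas.mean(axis=0)) @ weights
    return association_t(predicted, trait)


if __name__ == "__main__":
    print(f"TWAS t-statistic: {run_demo():.2f}")
```

The point of the design is that expression is measured only in the small reference panel; the large GWAS cohort contributes genotypes and the trait, and the association is tested on genetically predicted expression.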
13.
bioRxiv ; 2023 May 20.
Article in English | MEDLINE | ID: mdl-36798214

ABSTRACT

Transcriptome prediction models built with data from European-descent individuals are less accurate when applied to different populations because of differences in linkage disequilibrium patterns and allele frequencies. We hypothesized that methods that leverage shared regulatory effects across different conditions, in this case, across different populations, may improve cross-population transcriptome prediction. To test this hypothesis, we made transcriptome prediction models for use in transcriptome-wide association studies (TWAS) using different methods (Elastic Net, Joint-Tissue Imputation (JTI), Matrix eQTL, Multivariate Adaptive Shrinkage in R (MASHR), and Transcriptome-Integrated Genetic Association Resource (TIGAR)) and tested their out-of-sample transcriptome prediction accuracy in population-matched and cross-population scenarios. Additionally, to evaluate model applicability in TWAS, we integrated publicly available multi-ethnic genome-wide association study (GWAS) summary statistics from the Population Architecture using Genomics and Epidemiology Study (PAGE) and Pan-UK Biobank with our developed transcriptome prediction models. In regard to transcriptome prediction accuracy, MASHR models performed better or the same as other methods in both population-matched and cross-population transcriptome predictions. Furthermore, in multi-ethnic TWAS, MASHR models yielded more discoveries that replicate in both PAGE and PanUKBB across all methods analyzed, including loci previously mapped in GWAS and new loci previously not found in GWAS. Overall, our study demonstrates the importance of using methods that benefit from different populations' effect size estimates in order to improve TWAS for multi-ethnic or underrepresented populations.

14.
bioRxiv ; 2023 Mar 31.
Article in English | MEDLINE | ID: mdl-37034675

ABSTRACT

Reproducibility is a cornerstone of scientific progress. In epigenome- and transcriptome-wide association studies (E/TWAS), failure to reproduce may be the result of false discoveries. Whereas multiple methods exist to control false discoveries due to sampling error, minimizing false discoveries due to outliers and other data artefacts remains challenging. We propose a robust E/TWAS approach that outperforms alternative methods to improve reproducibility such as split-half replication. Furthermore, robust E/TWAS results in only a minor loss of power if there are no outliers and can, in the presence of outliers (likely a more realistic scenario), even be more powerful than regular E/TWAS.

15.
HGG Adv ; 4(4): 100223, 2023 10 12.
Article in English | MEDLINE | ID: mdl-37576186

ABSTRACT

Accurate imputation of tissue-specific gene expression can be a powerful tool for understanding the biological mechanisms underlying human complex traits. Existing imputation methods can be grouped into two categories according to the types of predictors used. The first category uses genotype data, while the second category uses whole-blood expression data. Both data types can be easily collected from blood, avoiding invasive tissue biopsies. In this study, we attempted to build an optimal predictive model for imputing tissue-specific gene expression by combining the genotype and whole-blood expression data. We first evaluated the imputation performance of each standalone model (using genotype data [GEN model] and using whole-blood expression data [WBE model]) using their respective data types across 47 human tissues. The WBE model outperformed the GEN model in most tissues by a large margin. Then, we developed several combined models that leverage both types of predictors to further improve imputation performance. We tried various strategies, including utilizing a merged dataset of the two data types (MERGED models) and integrating the imputation outcomes of the two standalone models (inverse variance-weighted [IVW] models). We found that one of the MERGED models noticeably outperformed the standalone models. This model involved a fixed ratio between the two regularization penalty factors for the two predictor types so that the contribution of the whole-blood transcriptome is upweighted compared with the genotype. Our study suggests that one can improve the imputation of tissue-specific gene expression by combining the genotype and whole-blood expression, but the improvement can be largely dependent on the combination strategy chosen.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Phenotype , Genome-Wide Association Study/methods , Quantitative Trait Loci , Single Nucleotide Polymorphism , Genotype
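The inverse variance-weighted (IVW) strategy named in the abstract above combines two predictions by giving more weight to the one with the smaller error variance. The sketch below shows the generic IVW formula only. How the study estimated the variances is not stated in the abstract, and the formula for the combined variance assumes the two models' errors are independent.

```python
import numpy as np


def ivw_combine(estimates, variances):
    """Combine estimates with weights proportional to 1 / variance.

    Returns the combined estimate and its variance. The variance formula
    assumes the errors of the input estimates are independent.
    """
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    if est.shape != var.shape:
        raise ValueError("estimates and variances must have the same shape")
    if np.any(var <= 0):
        raise ValueError("variances must be positive")
    w = 1.0 / var
    combined = float(np.sum(w * est) / np.sum(w))
    combined_var = float(1.0 / np.sum(w))
    return combined, combined_var


if __name__ == "__main__":
    # Hypothetical imputed expression values from a genotype-based model
    # (noisier) and a whole-blood-expression-based model (less noisy).
    print(ivw_combine([3.0, 1.0], [3.0, 1.0]))
```

Because the weights depend only on the variances, a much noisier model contributes little, which is one reason a combined prediction can end up close to the better standalone model.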
16.
HGG Adv ; 4(4): 100216, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37869564

ABSTRACT

Transcriptome prediction models built with data from European-descent individuals are less accurate when applied to different populations because of differences in linkage disequilibrium patterns and allele frequencies. We hypothesized that methods that leverage shared regulatory effects across different conditions, in this case, across different populations, may improve cross-population transcriptome prediction. To test this hypothesis, we made transcriptome prediction models for use in transcriptome-wide association studies (TWASs) using different methods (elastic net, joint-tissue imputation [JTI], matrix expression quantitative trait loci [Matrix eQTL], multivariate adaptive shrinkage in R [MASHR], and transcriptome-integrated genetic association resource [TIGAR]) and tested their out-of-sample transcriptome prediction accuracy in population-matched and cross-population scenarios. Additionally, to evaluate model applicability in TWASs, we integrated publicly available multiethnic genome-wide association study (GWAS) summary statistics from the Population Architecture using Genomics and Epidemiology (PAGE) study and Pan-ancestry genetic analysis of the UK Biobank (PanUKBB) with our developed transcriptome prediction models. In regard to transcriptome prediction accuracy, MASHR models performed better or the same as other methods in both population-matched and cross-population transcriptome predictions. Furthermore, in multiethnic TWASs, MASHR models yielded more discoveries that replicate in both PAGE and PanUKBB across all methods analyzed, including loci previously mapped in GWASs and loci previously not found in GWASs. Overall, our study demonstrates the importance of using methods that benefit from different populations' effect size estimates in order to improve TWASs for multiethnic or underrepresented populations.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Quantitative Trait Loci/genetics , Gene Frequency , Linkage Disequilibrium
17.
Mol Neurobiol ; 60(9): 5055-5066, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37246165

ABSTRACT

Epilepsy is a severe neurological condition affecting 50-65 million individuals worldwide that can lead to brain damage. Nevertheless, the etiology of epilepsy remains poorly understood. Meta-analyses of genome-wide association studies involving 15,212 epilepsy cases and 29,677 controls of the ILAE Consortium cohort were used to conduct transcriptome-wide association studies (TWAS) and protein-wide association studies (PWAS). Furthermore, a protein-protein interaction (PPI) network was generated using the STRING database, and significant epilepsy-susceptible genes were verified using chip data. Chemical-related gene set enrichment analysis (CGSEA) was performed to determine novel drug targets for epilepsy. The TWAS identified 21,170 genes, of which 58 were significant (TWAS FDR < 0.05) in ten brain regions, and 16 differentially expressed genes were verified based on mRNA expression profiles. The PWAS identified 2249 genes, of which 2 were significant (PWAS FDR < 0.05). Through chemical-gene set enrichment analysis, 287 environmental chemicals associated with epilepsy were identified. We identified five significant genes (WIPF1, IQSEC1, JAM2, ICAM3, and ZNF143) that had causal relationships with epilepsy. CGSEA identified 159 chemicals that were significantly correlated with epilepsy (P_CGSEA < 0.05), such as pentobarbital, ketone bodies, and polychlorinated biphenyl. In summary, we performed TWAS, PWAS (for genetic factors), and CGSEA (for environmental factors) analyses and identified several epilepsy-associated genes and chemicals. The results of this study will contribute to our understanding of genetic and environmental factors for epilepsy and may predict novel drug targets.


Subjects
Epilepsy , Transcriptome , Humans , Transcriptome/genetics , Gene Expression Profiling/methods , Genome-Wide Association Study/methods , Brain , Epilepsy/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Cytoskeletal Proteins/genetics , Intracellular Signaling Peptides and Proteins/genetics , Trans-Activators/genetics
18.
Front Plant Sci ; 13: 905842, 2022.
Article in English | MEDLINE | ID: mdl-35958208

ABSTRACT

Ionomics, the study of the composition of mineral nutrients and trace elements in organisms that represent the inorganic component of cells and tissues, has been widely used to unravel the molecular mechanisms regulating the elemental composition of plants. However, the genetic factors of rice subspecies in the interaction between arsenic and functional ions have not yet been explained. Here, the correlation between As and eight essential ions in a rice core collection was analyzed, taking into account growing conditions and genetic factors. The results demonstrated that the correlation between As and essential ions was affected by both genetic factors and growing conditions, but that the genetic contribution was slightly larger, with the heritability of arsenic content at 53%. In particular, the cluster coefficient of japonica (0.428) was larger than that of indica (0.414) in the co-expression network analysis for 23 arsenic genes, and the distance between genes involved in As induction and detoxification was greater in japonica than in indica. These findings provide evidence that japonica populations could accumulate more As than indica populations. In addition, the cis-eQTLs of AIR2 (arsenic-induced RING finger protein) were isolated through transcriptome-wide association studies, and it was confirmed that AIR2 expression levels of indica were lower than those of japonica. This was consistent with the functional haplotype results for the genome sequence of AIR2, and finally, eight rice varieties with low AIR2 expression and arsenic content were selected. In addition, As-related QTLs were identified on chromosomes 5 and 6 under flooded and intermittently flooded conditions through genome-scale profiling. Taken together, these results might assist in developing markers and breeding plans to reduce toxic element content and in breeding high-quality rice varieties in the future.

19.
Genes (Basel) ; 13(8)2022 07 27.
Article in English | MEDLINE | ID: mdl-35893077

ABSTRACT

Although previous genome-wide association studies (GWASs) on post-traumatic stress disorder (PTSD) have identified multiple risk loci, how these loci confer risk of PTSD remains unclear. Through the FUSION pipeline, we integrated each of two human brain proteome reference datasets (ROS/MAP and Banner) with the PTSD GWAS dataset to conduct a proteome-wide association study (PWAS) analysis. Then two transcriptome reference weights (Rnaseq and Splicing) were applied to a transcriptome-wide association study (TWAS) analysis. Finally, the PWAS and TWAS results were investigated through brain imaging analysis. In the PWAS analysis, 8 and 13 candidate genes were identified in the ROS/MAP and Banner reference weight groups, respectively. Examples included ADK (pPWAS-ROS/MAP = 3.00 × 10⁻⁵) and C3orf18 (pPWAS-Banner = 7.07 × 10⁻³¹). Moreover, the TWAS also detected multiple candidate genes associated with PTSD in two different reference weight groups, including RIMS2 (pTWAS-Splicing = 3.84 × 10⁻²), CHMP1A (pTWAS-Rnaseq = 5.09 × 10⁻⁴), and SIRT5 (pTWAS-Splicing = 4.81 × 10⁻³). Further comparison of the PWAS and TWAS results in different populations detected the overlapping genes: MADD (pPWAS-Banner = 4.90 × 10⁻², pTWAS-Splicing = 1.23 × 10⁻²) in the total population and GLO1 (pPWAS-Banner = 4.89 × 10⁻³, pTWAS-Rnaseq = 1.41 × 10⁻³) in females. Brain imaging analysis revealed several different brain imaging phenotypes associated with the MADD and GLO1 genes. Our study identified multiple candidate genes associated with PTSD at the proteome and transcriptome levels, which may provide new clues to the pathogenesis of PTSD.


Subjects
Genome-Wide Association Study , Stress Disorders, Post-Traumatic , Brain/metabolism , Female , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Proteome/genetics , Proteome/metabolism , RNA, Messenger/genetics , Reactive Oxygen Species , Stress Disorders, Post-Traumatic/genetics
20.
Genetics ; 222(4)2022 11 30.
Article in English | MEDLINE | ID: mdl-36227056

ABSTRACT

Transcriptome-wide association studies aim to integrate genome-wide association studies and expression quantitative trait loci mapping studies for exploring the gene regulatory mechanisms underlying diseases. Existing transcriptome-wide association study methods primarily focus on one gene at a time. However, complex diseases seldom result from the abnormality of a single gene, but rather from a biological network involving multiple genes. In addition, binary or ordinal categorical phenotypes are commonly encountered in biomedicine. We develop a proportional odds logistic model for network regression in transcriptome-wide association study, Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study, to detect the association between a network and a binary or ordinal categorical phenotype. Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study relies on a 2-stage transcriptome-wide association study framework. It first adopts the distribution-robust nonparametric Dirichlet process regression model in the expression quantitative trait loci study to obtain the SNP effect estimate on each gene within the network. Then, Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study uses pointwise mutual information to represent the general relationship among the network nodes of predicted gene expression in the genome-wide association study, followed by the association analysis with all nodes and edges involved in the proportional odds logistic model. A key feature of Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study is its ability to simultaneously identify the disease-related network nodes or edges. With extensive realistic simulations, including those under various between-node correlation patterns, we show that Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study can provide calibrated type I error control and yield higher power than other existing methods. We finally apply Proportional Odds LOgistic model for NEtwork regression in Transcriptome-wide association study to analyze bipolar and major depression status and blood pressure from UK Biobank to illustrate its benefits in real data analysis.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Genome-Wide Association Study/methods , Quantitative Trait Loci , Phenotype , Regression Analysis , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease