1.
Cell ; 179(3): 750-771.e22, 2019 Oct 17.
Article in English | MEDLINE | ID: mdl-31626773

ABSTRACT

Tissue-specific regulatory regions harbor substantial genetic risk for disease. Because brain development is a critical epoch for neuropsychiatric disease susceptibility, we characterized the genetic control of the transcriptome in 201 mid-gestational human brains, identifying 7,962 expression quantitative trait loci (eQTL) and 4,635 spliceQTL (sQTL), including several thousand prenatal-specific regulatory regions. We show that significant genetic liability for neuropsychiatric disease lies within prenatal eQTL and sQTL. Integration of eQTL and sQTL with genome-wide association studies (GWAS) via transcriptome-wide association identified dozens of novel candidate risk genes, highlighting shared and stage-specific mechanisms in schizophrenia (SCZ). Gene network analysis revealed that SCZ and autism spectrum disorder (ASD) affect distinct developmental gene co-expression modules. Yet, in each disorder, common and rare genetic variation converges within modules, which in ASD implicates superficial cortical neurons. More broadly, these data, available as a web browser and our analyses, demonstrate the genetic mechanisms by which developmental events have a widespread influence on adult anatomical and behavioral phenotypes.


Subjects
Autism Spectrum Disorder/genetics , Quantitative Trait Loci/genetics , Schizophrenia/genetics , Transcriptome/genetics , Autism Spectrum Disorder/metabolism , Autism Spectrum Disorder/pathology , Brain/growth & development , Brain/metabolism , Female , Fetus/metabolism , Gene Expression Regulation, Developmental , Genetic Predisposition to Disease , Genome-Wide Association Study , Gestational Age , Humans , Male , Neurons/metabolism , Polymorphism, Single Nucleotide/genetics , RNA Splicing/genetics , Schizophrenia/metabolism , Schizophrenia/pathology
2.
Am J Hum Genet ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38925119

ABSTRACT

Recent studies have highlighted the essential role of RNA splicing, a key mechanism of alternative RNA processing, in establishing connections between genetic variations and disease. Genetic loci influencing RNA splicing variations show considerable influence on complex traits, possibly surpassing those affecting total gene expression. Dysregulated RNA splicing has emerged as a major potential contributor to neurological and psychiatric disorders, likely due to the exceptionally high prevalence of alternatively spliced genes in the human brain. Nevertheless, establishing direct associations between genetically altered splicing and complex traits has remained an enduring challenge. We introduce Spliced-Transcriptome-Wide Associations (SpliTWAS) to integrate alternative splicing information with genome-wide association studies to pinpoint genes linked to traits through exon splicing events. We applied SpliTWAS to two schizophrenia (SCZ) RNA-sequencing datasets, BrainGVEX and CommonMind, revealing 137 and 88 trait-associated exons (in 84 and 67 genes), respectively. Enriched biological functions in the associated gene sets converged on neuronal function and development, immune cell activation, and cellular transport, which are highly relevant to SCZ. SpliTWAS variants impacted RNA-binding protein binding sites, revealing potential disruption of RNA-protein interactions affecting splicing. We extended the probabilistic fine-mapping method FOCUS to the exon level, identifying 36 genes and 48 exons as putatively causal for SCZ. We highlight VPS45 and APOPT1, where splicing of specific exons was associated with disease risk, eluding detection by conventional gene expression analysis. Collectively, this study supports the substantial role of alternative splicing in shaping the genetic basis of SCZ, providing a valuable approach for future investigations in this area.

3.
Am J Hum Genet ; 111(6): 1084-1099, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38723630

ABSTRACT

Transcriptome-wide association studies (TWASs) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods performed to date have focused on the regulatory effects of risk-associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWASs of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole-genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk with breast cancer phenotypes and 8 with ovarian phenotypes. These loci include established risk genes and several novel candidate risk loci, such as ACAP3, whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents an in-depth look into the role of trans eQTLs in the complex molecular mechanisms underlying these diseases.


Subjects
Breast Neoplasms , Genetic Predisposition to Disease , Genome-Wide Association Study , Ovarian Neoplasms , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Humans , Female , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Bayes Theorem , Transcriptome , Gene Expression Regulation, Neoplastic
4.
Am J Hum Genet ; 111(3): 445-455, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38320554

ABSTRACT

Regulation of transcription and translation are mechanisms through which genetic variants affect complex traits. Expression quantitative trait locus (eQTL) studies have been more successful at identifying cis-eQTL (within 1 Mb of the transcription start site) than trans-eQTL. Here, we tested the cis component of gene expression for association with observed plasma protein levels to identify cis- and trans-acting genes that regulate protein levels. We used transcriptome prediction models from 49 Genotype-Tissue Expression (GTEx) Project tissues to predict the cis component of gene expression and tested the predicted expression of every gene in every tissue for association with the observed abundance of 3,622 plasma proteins measured in 3,301 individuals from the INTERVAL study. We tested significant results for replication in 971 individuals from the Trans-omics for Precision Medicine (TOPMed) Multi-Ethnic Study of Atherosclerosis (MESA). We found 1,168 and 1,210 cis- and trans-acting associations that replicated in TOPMed (FDR < 0.05) with a median expected true positive rate (π1) across tissues of 0.806 and 0.390, respectively. The target proteins of trans-acting genes were enriched for transcription factor binding sites and autoimmune diseases in the GWAS catalog. Furthermore, we found a higher correlation between predicted expression and protein levels of the same underlying gene (R = 0.17) than observed expression (R = 0.10, p = 7.50 × 10⁻¹¹). This indicates the cis-acting genetically regulated (heritable) component of gene expression is more consistent across tissues than total observed expression (genetics + environment) and is useful in uncovering the function of SNPs associated with complex traits.


Subjects
Proteome , Transcriptome , Humans , Transcriptome/genetics , Proteome/genetics , Multifactorial Inheritance , Quantitative Trait Loci/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics
5.
Am J Hum Genet ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38821058

ABSTRACT

Both trio and population designs are popular study designs for identifying risk genetic variants in genome-wide association studies (GWASs). The trio design, as a family-based design, is robust to confounding due to population structure, whereas the population design is often more powerful due to larger sample sizes. Here, we propose KnockoffHybrid, a knockoff-based statistical method for hybrid analysis of both the trio and population designs. KnockoffHybrid provides a unified framework that brings together the advantages of both designs and produces powerful hybrid analysis while controlling the false discovery rate (FDR) in the presence of linkage disequilibrium and population structure. Furthermore, KnockoffHybrid has the flexibility to leverage different types of summary statistics for hybrid analyses, including expression quantitative trait loci (eQTL) and GWAS summary statistics. We demonstrate in simulations that KnockoffHybrid offers power gains over non-hybrid methods for the trio and population designs with the same number of cases while controlling the FDR with complex correlation among variants and population structure among subjects. In hybrid analyses of three trio cohorts for autism spectrum disorders (ASDs) from the Autism Speaks MSSNG, Autism Sequencing Consortium, and Autism Genome Project with GWAS summary statistics from the iPSYCH project and eQTL summary statistics from the MetaBrain project, KnockoffHybrid outperforms conventional methods by replicating several known risk genes for ASDs and identifying additional associations with variants in other genes, including the PRAME family genes involved in axon guidance and which may act as common targets for human speech/language evolution and related disorders.

6.
Hum Mol Genet ; 33(7): 624-635, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38129112

ABSTRACT

Transcriptome-wide association studies (TWAS) integrate gene expression prediction models and genome-wide association studies (GWAS) to identify gene-trait associations. The power of TWAS is determined by the sample size of GWAS and the accuracy of the expression prediction model. Here, we present a new method, the Summary-level Unified Method for Modeling Integrated Transcriptome using Functional Annotations (SUMMIT-FA), which improves gene expression prediction accuracy by leveraging functional annotation resources and a large expression quantitative trait loci (eQTL) summary-level dataset. We build gene expression prediction models in whole blood using SUMMIT-FA with the comprehensive functional database MACIE and eQTL summary-level data from the eQTLGen consortium. We apply these models to GWAS for 24 complex traits and show that SUMMIT-FA identifies significantly more gene-trait associations and improves predictive power for identifying "silver standard" genes compared to several benchmark methods. We further conduct a simulation study to demonstrate the effectiveness of SUMMIT-FA.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Computer Simulation , Quantitative Trait Loci/genetics , Phenotype , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
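
The abstract above describes TWAS as the combination of an expression prediction model with GWAS results. A minimal sketch of that combination in its commonly used summary-statistic form, z = w'z_GWAS / sqrt(w'Rw), is given below. It is not SUMMIT-FA itself: the function name `twas_z`, the SNP weights, the GWAS z-scores, and the LD matrix are all hypothetical toy values chosen for illustration.

```python
import numpy as np

def twas_z(weights, gwas_z, ld):
    """Gene-level TWAS z-score from SNP weights, GWAS z-scores, and an LD matrix.

    weights: per-SNP weights of a (hypothetical) expression prediction model
    gwas_z:  per-SNP GWAS z-scores for the trait
    ld:      SNP-by-SNP correlation (LD) matrix from a reference panel
    """
    weights = np.asarray(weights, dtype=float)
    gwas_z = np.asarray(gwas_z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    numerator = weights @ gwas_z          # weighted sum of SNP-level evidence
    variance = weights @ ld @ weights     # variance of predicted expression
    if variance <= 0:
        raise ValueError("predicted expression has zero variance")
    return numerator / np.sqrt(variance)

# Toy inputs (assumed, not taken from any study): three uncorrelated SNPs.
weights = [0.5, -0.2, 0.1]
gwas_z = [3.0, -1.0, 0.5]
ld = np.eye(3)
z = twas_z(weights, gwas_z, ld)
print(f"TWAS z-score: {z:.3f}")
```

Under this formulation, better prediction weights concentrate the GWAS signal into the numerator, which is one way to read the abstract's statement that TWAS power depends on both GWAS sample size and prediction accuracy.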
7.
Hum Mol Genet ; 33(4): 333-341, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37903058

ABSTRACT

Transcriptome-wide association studies (TWAS) have identified many putative susceptibility genes for colorectal cancer (CRC) risk. However, susceptibility miRNAs, critical dysregulators of gene expression, remain unexplored. We genotyped DNA samples from 313 East Asian patients with CRC and performed small RNA sequencing in their normal colon tissues distant from tumors to build genetic models for predicting miRNA expression. We applied these models and data from genome-wide association studies (GWAS) including 23,942 cases and 217,267 controls of East Asian ancestry to investigate associations of predicted miRNA expression with CRC risk. We conducted perturbation experiments that separately promoted and inhibited miRNA expression, followed by in vitro assays in both SW480 and HCT116 cells. At a Bonferroni-corrected threshold of P < 4.5 × 10⁻⁴, we identified two putative susceptibility miRNAs, miR-1307-5p and miR-192-3p, located in regions more than 500 kb away from any GWAS-identified risk variants in CRC. We observed that a high predicted expression of miR-1307-5p was associated with increased CRC risk, while a low predicted expression of miR-192-3p was associated with increased CRC risk. Our experimental results further provide strong evidence of their roles in susceptibility by showing that miR-1307-5p and miR-192-3p play a regulatory role, respectively, in promoting and inhibiting CRC cell proliferation, migration, and invasion, which was consistently observed in both SW480 and HCT116 cells. Our study provides additional insights into the biological mechanisms underlying CRC development.


Subjects
Colorectal Neoplasms , MicroRNAs , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Transcriptome/genetics , Genome-Wide Association Study , Colorectal Neoplasms/metabolism , HCT116 Cells , Gene Expression Regulation, Neoplastic/genetics , Cell Proliferation/genetics
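
The abstract above reports a Bonferroni-corrected significance threshold. As a small worked example of how such a threshold arises, the sketch below divides a family-wise error rate by the number of tests. The abstract does not state how many miRNAs were tested; the value 111 is an assumption, chosen only because 0.05 / 111 is approximately 4.5 × 10⁻⁴, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

# Assumed values: alpha = 0.05 is conventional; n_tests = 111 is NOT from the abstract.
n_tests = 111
threshold = bonferroni_threshold(0.05, n_tests)
print(f"Bonferroni threshold for {n_tests} tests: {threshold:.2e}")
```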
8.
Am J Hum Genet ; 110(1): 44-57, 2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36608684

ABSTRACT

Integrative genetic association methods have shown great promise in post-GWAS (genome-wide association study) analyses, in which one of the most challenging tasks is identifying putative causal genes and uncovering molecular mechanisms of complex traits. Recent studies suggest that prevailing computational approaches, including transcriptome-wide association studies (TWASs) and colocalization analysis, are individually imperfect, but their joint usage can yield robust and powerful inference results. This paper presents INTACT, a computational framework to integrate probabilistic evidence from these distinct types of analyses and implicate putative causal genes. This procedure is flexible and can work with a wide range of existing integrative analysis approaches. It has the unique ability to quantify the uncertainty of implicated genes, enabling rigorous control of false-positive discoveries. Taking advantage of this highly desirable feature, we further propose an efficient algorithm, INTACT-GSE, for gene set enrichment analysis based on the integrated probabilistic evidence. We examine the proposed computational methods and illustrate their improved performance over the existing approaches through simulation studies. We apply the proposed methods to analyze the multi-tissue eQTL data from the GTEx project and eight large-scale complex- and molecular-trait GWAS datasets from multiple consortia and the UK Biobank. Overall, we find that the proposed methods markedly improve the existing putative gene implication methods and are particularly advantageous in evaluating and identifying key gene sets and biological pathways underlying complex traits.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Quantitative Trait Loci/genetics , Computer Simulation , Polymorphism, Single Nucleotide/genetics , Genetic Predisposition to Disease
9.
Genet Epidemiol ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38887957

ABSTRACT

Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we build on the flexibility of our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), to develop a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants, situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effect for high-dimensional risk factors on an outcome by incorporating estimates from association analyses of single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristics curve (AUC), a lower mean-squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS for prostate cancer with more than 140,000 men and high dimensional publicly available summary data for metabolites and transcriptomes.

10.
Plant J ; 119(2): 844-860, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38812347

ABSTRACT

Transcriptome-wide association studies (TWAS) can provide single gene resolution for candidate genes in plants, complementing genome-wide association studies (GWAS), but efforts in plants have been met with, at best, mixed success. We generated expression data from 693 maize genotypes, measured in a common field experiment, sampled over a 2-h period to minimize diurnal and environmental effects, using full-length RNA-seq to maximize the accurate estimation of transcript abundance. TWAS could identify roughly 10 times as many genes likely to play a role in flowering time regulation as GWAS conducted using data from the same experiment. TWAS using mature leaf tissue identified true-positive flowering time genes known to act in the shoot apical meristem, and trait data from a new environment enabled the identification of additional flowering time genes without the need for new expression data. eQTL analysis of TWAS-tagged genes identified at least one additional known maize flowering time gene through trans-eQTL interactions. Collectively, these results suggest the gene expression resource described here can link genes to functions across different plant phenotypes expressed in a range of tissues and scored in different experiments.


Subjects
Flowers , Gene Expression Regulation, Plant , Genome-Wide Association Study , Quantitative Trait Loci , Transcriptome , Zea mays , Zea mays/genetics , Zea mays/physiology , Flowers/genetics , Flowers/physiology , Quantitative Trait Loci/genetics , Genotype , Phenotype , Genes, Plant/genetics , Plant Leaves/genetics , Plant Leaves/physiology , Plant Leaves/metabolism , Gene Expression Profiling
11.
Am J Hum Genet ; 109(4): 669-679, 2022 Apr 07.
Article in English | MEDLINE | ID: mdl-35263625

ABSTRACT

One mechanism by which genetic factors influence complex traits and diseases is altering gene expression. Direct measurement of gene expression in relevant tissues is rarely tenable; however, genetically regulated gene expression (GReX) can be estimated using prediction models derived from large multi-omic datasets. These approaches have led to the discovery of many gene-trait associations, but whether models derived from predominantly European ancestry (EA) reference panels can map novel associations in ancestrally diverse populations remains unclear. We applied PrediXcan to impute GReX in 51,520 ancestrally diverse Population Architecture using Genomics and Epidemiology (PAGE) participants (35% African American, 45% Hispanic/Latino, 10% Asian, and 7% Hawaiian) across 25 key cardiometabolic traits and relevant tissues to identify 102 novel associations. We then compared associations in PAGE to those in a random subset of 50,000 White British participants from UK Biobank (UKBB50k) for height and body mass index (BMI). We identified 517 associations across 47 tissues in PAGE but not UKBB50k, demonstrating the importance of diverse samples in identifying trait-associated GReX. We observed that variants used in PrediXcan models were either more or less differentiated across continental-level populations than matched-control variants depending on the specific population reflecting sampling bias. Additionally, variants from identified genes specific to either PAGE or UKBB50k analyses were more ancestrally differentiated than those in genes detected in both analyses, underlining the value of population-specific discoveries. This suggests that while EA-derived transcriptome imputation models can identify new associations in non-EA populations, models derived from closely matched reference panels may yield further insights. Our findings call for more diversity in reference datasets of tissue-specific gene expression.


Subjects
Cardiovascular Diseases , Genome-Wide Association Study , Genetic Predisposition to Disease , Humans , Life Style , Polymorphism, Single Nucleotide , Transcriptome
12.
Am J Hum Genet ; 109(8): 1388-1404, 2022 Aug 04.
Article in English | MEDLINE | ID: mdl-35931050

ABSTRACT

Transcriptome-wide association studies (TWASs) are a powerful approach to identify genes whose expression is associated with complex disease risk. However, non-causal genes can exhibit association signals due to confounding by linkage disequilibrium (LD) patterns and eQTL pleiotropy at genomic risk regions, which necessitates fine-mapping of TWAS signals. Here, we present MA-FOCUS, a multi-ancestry framework for the improved identification of genes underlying traits of interest. We demonstrate that by leveraging differences in ancestry-specific patterns of LD and eQTL signals, MA-FOCUS consistently outperforms single-ancestry fine-mapping approaches with equivalent total sample sizes across multiple metrics. We perform TWASs for 15 blood traits using genome-wide summary statistics (average n_EA = 511k, n_AA = 13k) and lymphoblastoid cell line eQTL data from cohorts of primarily European and African continental ancestries. We recapitulate evidence demonstrating shared genetic architectures for eQTL and blood traits between the two ancestry groups and observe that gene-level effects correlate 20% more strongly across ancestries than SNP-level effects. Lastly, we perform fine-mapping using MA-FOCUS and find evidence that genes at TWAS risk regions are more likely to be shared across ancestries than they are to be ancestry specific. Using multiple lines of evidence to validate our findings, we find that gene sets produced by MA-FOCUS are more enriched in hematopoietic categories than alternative approaches (p = 2.36 × 10⁻¹⁵). Our work demonstrates that including and appropriately accounting for genetic diversity can drive more profound insights into the genetic architecture of complex traits.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Transcriptome/genetics
13.
Am J Hum Genet ; 109(5): 825-837, 2022 May 05.
Article in English | MEDLINE | ID: mdl-35523146

ABSTRACT

Transcriptome-wide association studies and colocalization analysis are popular computational approaches for integrating genetic-association data from molecular and complex traits. They show the unique ability to go beyond variant-level genetic-association evidence and implicate critical functional units, e.g., genes, in disease etiology. However, in practice, when the two approaches are applied to the same molecular and complex-trait data, the inference results can be markedly different. This paper systematically investigates the inferential reproducibility between the two approaches through theoretical derivation, numerical experiments, and analyses of four complex trait GWAS and GTEx eQTL data. We identify two classes of inconsistent inference results. We find that the first class of inconsistent results (i.e., genes with strong colocalization but weak transcriptome-wide association study [TWAS] signals) might suggest an interesting biological phenomenon, i.e., horizontal pleiotropy; thus, the two approaches are truly complementary. The inconsistency in the second class (i.e., genes with weak colocalization but strong TWAS signals) can be understood and effectively reconciled. To this end, we propose a computational approach for locus-level colocalization analysis. We demonstrate that the joint TWAS and locus-level colocalization analysis improves specificity and sensitivity for implicating biologically relevant genes.


Subjects
Genome-Wide Association Study , Transcriptome , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Reproducibility of Results , Transcriptome/genetics
14.
Am J Hum Genet ; 109(3): 393-404, 2022 Mar 03.
Article in English | MEDLINE | ID: mdl-35108496

ABSTRACT

Identifying gene sets that are associated to disease can provide valuable biological knowledge, but a fundamental challenge of gene set analyses of GWAS data is linking disease-associated SNPs to genes. Transcriptome-wide association studies (TWASs) detect associations between the genetically predicted expression of a gene and disease risk, thus implicating candidate disease genes. However, causal disease genes at TWAS-associated loci generally remain unknown due to gene co-regulation, which leads to correlations across genes in predicted expression. We developed a method, gene co-regulation score (GCSC) regression, to identify gene sets that are enriched for disease heritability explained by predicted expression. GCSC regresses TWAS chi-square statistics on gene co-regulation scores reflecting correlations in predicted gene expression; a gene set is enriched for heritability if genes with high co-regulation to the set have higher TWAS chi-square statistics than genes with low co-regulation to the set, beyond what is expected based on co-regulation to all genes. We verified via simulations that GCSC is well calibrated and well powered. We applied GCSC to gene expression data from GTEx (48 tissues) and GWAS summary statistics for 43 independent diseases and complex traits analyzing a broad set of biological pathways and specifically expressed gene sets. We identified many enriched sets, recapitulating known biology. For Alzheimer disease, we detected evidence of an immune basis, and specifically a role for antigen presentation, in analyses of both biological pathways and specifically expressed gene sets. Our results highlight the advantages of leveraging gene co-regulation within the TWAS framework to identify enriched gene sets.


Subjects
Genome-Wide Association Study , Quantitative Trait Loci , Genetic Predisposition to Disease , Humans , Multifactorial Inheritance , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Transcriptome
15.
Am J Hum Genet ; 109(5): 783-801, 2022 May 05.
Article in English | MEDLINE | ID: mdl-35334221

ABSTRACT

Integrative analysis of genome-wide association studies (GWASs) and gene expression studies in the form of a transcriptome-wide association study (TWAS) has the potential to better elucidate the molecular mechanisms underlying disease etiology. Here we present a method, METRO, that can leverage gene expression data collected from multiple genetic ancestries to enhance TWASs. METRO incorporates expression prediction models constructed in different genetic ancestries through a likelihood-based inference framework, producing calibrated p values with substantially improved TWAS power. We illustrate the benefits of METRO in both simulations and applications to seven complex traits and diseases obtained from four GWASs. These GWASs include two of primarily European ancestry (n = 188,577 and 339,226) and two of primarily African ancestry (n = 42,752 and 23,827). In the real data applications, we leverage gene expression data measured on 1,032 African Americans and 801 European Americans from the Genetic Epidemiology Network of Arteriopathy (GENOA) study to identify a substantially larger number of gene-trait associations as compared to existing TWAS approaches. The benefits of METRO are most prominent in applications to GWASs of African ancestry where the sample size is much smaller than GWASs of European ancestry and where a more powerful TWAS method is crucial. Among the identified associations are high-density lipoprotein-associated genes including PLTP and PPARG that are critical for maintaining lipid homeostasis and the type II diabetes-associated gene MAPT that supports microtubule-associated protein tau as a key component underlying impaired insulin secretion.


Subjects
Diabetes Mellitus, Type 2 , Genome-Wide Association Study , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Likelihood Functions , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Transcriptome/genetics
16.
Biostatistics ; 25(2): 468-485, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-36610078

ABSTRACT

Transcriptome-wide association studies (TWAS) have been increasingly applied to identify (putative) causal genes for complex traits and diseases. TWAS can be regarded as a two-sample two-stage least squares method for instrumental variable (IV) regression for causal inference. The standard TWAS (called TWAS-L) only considers a linear relationship between a gene's expression and a trait in stage 2, which may lose statistical power when not true. Recently, an extension of TWAS (called TWAS-LQ) considers both the linear and quadratic effects of a gene on a trait, which however is not flexible enough due to its parametric nature and may be low powered for nonquadratic nonlinear effects. On the other hand, a deep learning (DL) approach, called DeepIV, has been proposed to nonparametrically model a nonlinear effect in IV regression. However, it is both slow and unstable due to the ill-posed inverse problem of solving an integral equation with Monte Carlo approximations. Furthermore, in the original DeepIV approach, statistical inference, that is, hypothesis testing, was not studied. Here, we propose a novel DL approach, called DeLIVR, to overcome the major drawbacks of DeepIV, by estimating a related but different target function and including a hypothesis testing framework. We show through simulations that DeLIVR was both faster and more stable than DeepIV. We applied both parametric and DL approaches to the GTEx and UK Biobank data, showcasing that DeLIVR detected additional 8 and 7 genes nonlinearly associated with high-density lipoprotein (HDL) cholesterol and low-density lipoprotein (LDL) cholesterol, respectively, all of which would be missed by TWAS-L, TWAS-LQ, and DeepIV; these genes include BUD13 associated with HDL, SLC44A2 and GMIP with LDL, all supported by previous studies.


Subjects
Deep Learning , Transcriptome , Humans , Quantitative Trait Loci , Phenotype , Genome-Wide Association Study/methods , Cholesterol , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide
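
The abstract above characterizes standard TWAS (TWAS-L) as a two-sample two-stage least squares procedure with a linear stage 2. A minimal sketch of that idea on simulated data follows, assuming ordinary least squares in both stages. Real TWAS pipelines typically fit penalized models in stage 1, and nothing here reproduces DeLIVR, DeepIV, or TWAS-LQ. All sample sizes, effect sizes, and the names `fit_stage1` and `fit_stage2` are hypothetical.

```python
import numpy as np

def fit_stage1(genotypes, expression):
    """Stage 1 (reference sample): least-squares SNP weights for predicting expression."""
    X = np.column_stack([np.ones(len(genotypes)), genotypes])
    coef, *_ = np.linalg.lstsq(X, expression, rcond=None)
    return coef  # intercept followed by one weight per SNP

def fit_stage2(genotypes, trait, stage1_coef):
    """Stage 2 (GWAS sample): regress the trait on genetically predicted expression."""
    X1 = np.column_stack([np.ones(len(genotypes)), genotypes])
    predicted = X1 @ stage1_coef
    X2 = np.column_stack([np.ones(len(predicted)), predicted])
    coef, *_ = np.linalg.lstsq(X2, trait, rcond=None)
    return coef[1]  # slope: estimated linear effect of expression on the trait

# Simulated data; every number below is an assumption for illustration only.
rng = np.random.default_rng(0)
n_ref, n_gwas = 500, 5000
true_w = np.array([0.4, -0.3, 0.2, 0.0, 0.1])
true_effect = 0.5

# Sample 1: genotypes and measured expression (the eQTL reference panel).
g_ref = rng.binomial(2, 0.3, size=(n_ref, true_w.size)).astype(float)
expr_ref = g_ref @ true_w + rng.normal(size=n_ref)

# Sample 2: genotypes and trait only; expression is unobserved in practice.
g_gwas = rng.binomial(2, 0.3, size=(n_gwas, true_w.size)).astype(float)
expr_gwas = g_gwas @ true_w + rng.normal(size=n_gwas)
trait = true_effect * expr_gwas + rng.normal(size=n_gwas)

stage1_coef = fit_stage1(g_ref, expr_ref)
effect_hat = fit_stage2(g_gwas, trait, stage1_coef)
print(f"estimated effect of expression on trait: {effect_hat:.3f}")
```

Because stage 2 fits only a slope, a gene whose effect on the trait is purely nonlinear could go undetected in this sketch, which is the limitation the abstract attributes to TWAS-L.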
17.
J Allergy Clin Immunol ; 153(1): 122-131, 2024 01.
Article in English | MEDLINE | ID: mdl-37742934

ABSTRACT

BACKGROUND: Little is known about nasal epithelial gene expression and total IgE in youth. OBJECTIVE: We aimed to identify genes whose nasal epithelial expression differs by total IgE in youth, and group them into modules that could be mapped to airway epithelial cell types. METHODS: We conducted a transcriptome-wide association study of total IgE in 469 Puerto Ricans aged 9 to 20 years who participated in the Epigenetic Variation and Childhood Asthma in Puerto Ricans study, separately in all subjects and in those with asthma. We then attempted to replicate top findings for each analysis using data from 3 cohorts. Genes with a Benjamini-Hochberg-adjusted P value of less than .05 in the Epigenetic Variation and Childhood Asthma in Puerto Ricans study and a P value of less than .05 in the same direction of association in 1 or more replication cohorts were considered differentially expressed genes (DEGs). DEGs for total IgE in subjects with asthma were further dissected into gene modules using coexpression analysis, and such modules were mapped to specific cell types in airway epithelia using public single-cell RNA-sequencing data. RESULTS: A higher number of DEGs for total IgE were identified in subjects with asthma (n = 1179 DEGs) than in all subjects (n = 631 DEGs). In subjects with asthma, DEGs were mapped to 11 gene modules. The top module for positive correlation with total IgE was mapped to myoepithelial and mucus secretory cells in lower airway epithelia and was regulated by IL-4, IL-5, IL-13, and IL-33. Within this module, hub genes included CDH26, FETUB, NTRK2, CCBL1, CST1, and CST2. Furthermore, an enrichment analysis showed overrepresentation of genes in signaling pathways for synaptogenesis, IL-13, and ferroptosis, supporting interactions between interleukin- and acetylcholine-induced responses. CONCLUSIONS: Our findings for nasal epithelial gene expression support neuroimmune coregulation of total IgE in youth with asthma.
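The DEG definition in the METHODS above combines a Benjamini-Hochberg adjustment in the discovery cohort with a same-direction nominal replication. A minimal sketch of that rule follows; the function names `bh_adjust` and `is_deg` and the `(p, effect)` tuple layout are hypothetical, and the study's actual software is not stated in the abstract.

```python
import numpy as np


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest P value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest P value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted


def is_deg(discovery_adj_p, discovery_effect, replications, alpha=0.05):
    """Apply the DEG rule described in the abstract.

    replications: iterable of (p_value, effect) pairs, one per replication
    cohort (an assumed layout). A gene qualifies if its adjusted discovery
    P value is below alpha and at least one cohort has P < alpha with an
    effect of the same sign.
    """
    if discovery_adj_p >= alpha:
        return False
    return any(
        p < alpha and np.sign(effect) == np.sign(discovery_effect)
        for p, effect in replications
    )
```

For example, `bh_adjust([0.01, 0.04, 0.03, 0.005])` gives adjusted values of 0.02, 0.04, 0.04, and 0.02.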


Subjects
Asthma; Interleukin-13; Child; Humans; Adolescent; Interleukin-13/genetics; Nose; Transcriptome; Immunoglobulin E
18.
BMC Genomics ; 25(1): 445, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38711039

ABSTRACT

BACKGROUND: Characterization of regulatory variants (e.g., gene expression quantitative trait loci, eQTL; gene splicing QTL, sQTL) is crucial for biologically interpreting molecular mechanisms underlying loci associated with complex traits. However, regulatory variants in dairy cattle, particularly in specific biological contexts (e.g., distinct lactation stages), remain largely unknown. In this study, we explored regulatory variants in whole blood samples collected during early to mid-lactation (22-150 days after calving) of 101 Holstein cows and analyzed them to decipher the regulatory mechanisms underlying complex traits in dairy cattle. RESULTS: We identified 14,303 genes and 227,705 intron clusters expressed in the white blood cells of 101 cattle. The average heritability of gene expression and intron excision ratio explained by cis-SNPs is 0.28 ± 0.13 and 0.25 ± 0.13, respectively. We identified 23,485 SNP-gene expression pairs and 18,166 SNP-intron cluster pairs in dairy cattle during early to mid-lactation. Compared with the 2,380,457 cis-eQTLs reported to be present in blood in the Cattle Genotype-Tissue Expression atlas (CattleGTEx), only 6,114 cis-eQTLs (P < 0.05) were detected in the present study. By conducting colocalization analysis between cis-e/sQTL and the results of genome-wide association studies (GWAS) from four traits, we identified a cis-e/sQTL (rs109421300) of the DGAT1 gene that might be a key marker in early to mid-lactation for milk yield, fat yield, protein yield, and somatic cell score (PP4 > 0.6). Finally, transcriptome-wide association studies (TWAS) revealed certain genes (e.g., FAM83H and TBC1D17) whose expression in white blood cells was significantly (P < 0.05) associated with complex traits. CONCLUSIONS: This study investigated the genetic regulation of gene expression and alternative splicing in dairy cows during early to mid-lactation and provided new insights into the regulatory mechanisms underlying complex traits of economic importance.


Subjects
Lactation; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Animals; Cattle/genetics; Lactation/genetics; Female; RNA Splicing; Genome-Wide Association Study; Gene Expression Profiling; Introns; Transcriptome
19.
Breast Cancer Res ; 26(1): 51, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38515142

ABSTRACT

BACKGROUND: Although several transcriptome-wide association studies (TWASs) have been performed to identify genes associated with overall breast cancer (BC) risk, only a few TWASs have explored the differences between estrogen receptor-positive (ER+) and estrogen receptor-negative (ER-) breast cancer. Additionally, these studies were based on gene expression prediction models trained primarily in breast tissue, and they did not account for alternative splicing of genes. METHODS: In this study, we utilized two approaches to perform multi-tissue TWASs of breast cancer by ER subtype: (1) an expression-based TWAS that combined TWAS signals for each gene across multiple tissues and (2) a splicing-based TWAS that combined TWAS signals of all excised introns for each gene across tissues. To perform these TWASs, we utilized summary statistics for ER+ BC from the Breast Cancer Association Consortium (BCAC) and for ER- BC from a meta-analysis of BCAC and the Consortium of Investigators of Modifiers of BRCA1 and BRCA2 (CIMBA). RESULTS: In total, we identified 230 genes in 86 loci that were associated with ER+ BC and 66 genes in 29 loci that were associated with ER- BC at a Bonferroni threshold of significance. Of these genes, 2 genes associated with ER+ BC at the 1q21.1 locus were located at least 1 Mb from published GWAS hits. For several well-studied tumor suppressor genes, such as TP53 and CHEK2, which have historically been thought to impact BC risk through rare, penetrant mutations, we discovered that common variants that modulate gene expression may additionally contribute to ER+ or ER- etiology. CONCLUSIONS: Our study comprehensively examined how differences in common variation contribute to molecular differences between ER+ and ER- BC and introduced a novel, splicing-based framework that can be used in future TWASs.
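The abstract does not state how per-tissue TWAS signals were combined for each gene, so the sketch below is only one plausible illustration. It uses an equal-weight Cauchy combination of P values (a swapped-in technique, not confirmed as the authors' method) together with a Bonferroni threshold over the number of genes tested. The function names are hypothetical.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold controlling family-wise error at alpha."""
    return alpha / n_tests


def cauchy_combine(pvals):
    """Equal-weight Cauchy combination of P values (each strictly in (0, 1)).

    Each P value is mapped to a standard Cauchy variate, the variates are
    averaged, and the average is mapped back to a P value.
    """
    stats = [math.tan((0.5 - p) * math.pi) for p in pvals]
    mean_stat = sum(stats) / len(stats)
    return 0.5 - math.atan(mean_stat) / math.pi


def gene_is_significant(tissue_pvals, n_genes, alpha=0.05):
    """Combine one gene's per-tissue P values and compare to the threshold."""
    return cauchy_combine(tissue_pvals) < bonferroni_threshold(n_genes, alpha)
```

With this rule, a gene with strong evidence in a single tissue can still pass the threshold, because the Cauchy combination is dominated by the smallest P values.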


Subjects
Breast Neoplasms; Humans; Female; Breast Neoplasms/pathology; Transcriptome; Genetic Predisposition to Disease; Estrogens; Receptors, Estrogen/genetics; Receptors, Estrogen/metabolism; Genome-Wide Association Study; Polymorphism, Single Nucleotide
20.
J Transl Med ; 22(1): 392, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685026

ABSTRACT

BACKGROUND: Epidemiological evidence indicates a close correlation between long-term exposure to air pollutants and autoimmune diseases, but the causality remains unknown. METHODS: Two-sample Mendelian randomization (TSMR) was used to investigate the role of PM10, PM2.5, NO2, and NOX (N = 423,796-456,380) in 15 autoimmune diseases (N = 14,890-314,995) using data from large European GWASs including UKB, FINNGEN, IMSGC, and IPSCSG. Multivariable Mendelian randomization (MVMR) was conducted to investigate the direct effect of each air pollutant and the mediating role of common factors, including body mass index (BMI), alcohol consumption, smoking status, and household income. Transcriptome-wide association studies (TWAS), two-step MR, and colocalization analyses were performed to explore underlying mechanisms between air pollution and autoimmune diseases. RESULTS: In TSMR, after correction for multiple testing, hypothyroidism was causally associated with higher exposure to NO2 [odds ratio (OR): 1.37, p = 9.08 × 10⁻⁴] and NOX [OR: 1.34, p = 2.86 × 10⁻³], ulcerative colitis (UC) was causally associated with higher exposure to NOX [OR: 2.24, p = 1.23 × 10⁻²] and PM2.5 [OR: 2.60, p = 5.96 × 10⁻³], rheumatoid arthritis was causally associated with higher exposure to NOX [OR: 1.72, p = 1.50 × 10⁻²], systemic lupus erythematosus was causally associated with higher exposure to NOX [OR: 4.92, p = 6.89 × 10⁻³], and celiac disease was causally associated with lower exposure to NOX [OR: 0.14, p = 6.74 × 10⁻⁴] and PM2.5 [OR: 0.17, p = 3.18 × 10⁻³]. The risk-increasing effect of PM2.5 on UC remained significant in MVMR analyses after adjusting for other air pollutants. MVMR revealed several common mediators between air pollutants and autoimmune diseases. Transcriptional analysis identified specific gene transcripts and pathways interconnecting air pollutants and autoimmune diseases. Two-step MR revealed that POR, HSPA1B, and BRD2 might mediate the effect of air pollutants on autoimmune diseases. POR pQTL (rs59882870, PPH4 = 1.00) strongly colocalized with autoimmune diseases. CONCLUSION: This research underscores the necessity of rigorous air pollutant surveillance within public health studies to curb the prevalence of autoimmune diseases.
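Odds ratios like those reported above are typically obtained in two-sample MR by combining per-variant effect estimates. The abstract does not say which estimator was used, so the following is a sketch of the common fixed-effect inverse-variance weighted (IVW) estimator, offered as an assumption. The function name `ivw_estimate` and the example inputs are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: per-variant effects on the exposure (e.g., an air pollutant)
    beta_outcome:  per-variant log-odds effects on the outcome (disease)
    se_outcome:    standard errors of the outcome effects
    Returns (causal log-odds estimate, its standard error, odds ratio).
    """
    num = sum(bx * by / se**2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome))
    beta = num / den
    se = math.sqrt(1.0 / den)
    return beta, se, math.exp(beta)
```

The estimate is a weighted average of the per-variant Wald ratios (outcome effect divided by exposure effect), so variants with more precise outcome effects and stronger exposure effects contribute more.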


Subjects
Air Pollutants; Autoimmune Diseases; Genome-Wide Association Study; Humans; Autoimmune Diseases/genetics; Air Pollutants/adverse effects; Mendelian Randomization Analysis; Genetic Predisposition to Disease; Particulate Matter/adverse effects