1.
Cell ; 179(3): 750-771.e22, 2019 10 17.
Article in English | MEDLINE | ID: mdl-31626773

ABSTRACT

Tissue-specific regulatory regions harbor substantial genetic risk for disease. Because brain development is a critical epoch for neuropsychiatric disease susceptibility, we characterized the genetic control of the transcriptome in 201 mid-gestational human brains, identifying 7,962 expression quantitative trait loci (eQTL) and 4,635 spliceQTL (sQTL), including several thousand prenatal-specific regulatory regions. We show that significant genetic liability for neuropsychiatric disease lies within prenatal eQTL and sQTL. Integration of eQTL and sQTL with genome-wide association studies (GWAS) via transcriptome-wide association identified dozens of novel candidate risk genes, highlighting shared and stage-specific mechanisms in schizophrenia (SCZ). Gene network analysis revealed that SCZ and autism spectrum disorder (ASD) affect distinct developmental gene co-expression modules. Yet, in each disorder, common and rare genetic variation converges within modules, which in ASD implicates superficial cortical neurons. More broadly, these data, available as a web browser and our analyses, demonstrate the genetic mechanisms by which developmental events have a widespread influence on adult anatomical and behavioral phenotypes.


Subject(s)
Autism Spectrum Disorder/genetics; Quantitative Trait Loci/genetics; Schizophrenia/genetics; Transcriptome/genetics; Autism Spectrum Disorder/metabolism; Autism Spectrum Disorder/pathology; Brain/growth & development; Brain/metabolism; Female; Fetus/metabolism; Gene Expression Regulation, Developmental; Genetic Predisposition to Disease; Genome-Wide Association Study; Gestational Age; Humans; Male; Neurons/metabolism; Polymorphism, Single Nucleotide/genetics; RNA Splicing/genetics; Schizophrenia/metabolism; Schizophrenia/pathology
2.
Am J Hum Genet ; 111(8): 1782-1795, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39053457

ABSTRACT

In Mendelian randomization, two single SNP-trait correlation-based methods have been developed to infer the causal direction between an exposure (e.g., a gene) and an outcome (e.g., a trait), called MR Steiger's method and its recent extension called Causal Direction-Ratio (CD-Ratio). Here we propose an approach based on R², the coefficient of determination, to combine information from multiple (possibly correlated) SNPs to simultaneously infer the presence and direction of a causal relationship between an exposure and an outcome. Our proposed method generalizes Steiger's method from using a single SNP to multiple SNPs as IVs. It is especially useful in transcriptome-wide association studies (TWASs) (and similar applications) with typically small sample sizes for gene expression (or another molecular trait) data, providing a more flexible and powerful approach to inferring causal directions. It can be applied to GWAS summary data with a reference panel. We also discuss the influence of invalid IVs and introduce a new approach called R2S to select and remove invalid IVs (if any) to enhance the robustness. We compared the performance of the proposed method with existing methods in simulations to demonstrate its advantages. We applied the methods to identify causal genes for high/low-density lipoprotein cholesterol (HDL/LDL) using the individual-level GTEx gene expression data and UK Biobank GWAS data. The proposed method was able to confirm some well-known causal genes while identifying some novel ones. Additionally, we illustrated an application of the proposed method to GWAS summary data to infer causal relationships between HDL/LDL and stroke/coronary artery disease (CAD).


Subject(s)
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Transcriptome; Humans; Genome-Wide Association Study/methods; Transcriptome/genetics; Mendelian Randomization Analysis/methods; Models, Genetic; Cholesterol, LDL/genetics; Cholesterol, LDL/blood; Phenotype
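
The R²-based comparison described in the abstract of record 2 can be illustrated with a small simulation. The sketch below is not the authors' method: it performs no formal test, uses individual-level rather than summary data, and ignores invalid IVs. It only shows the underlying Steiger-style idea that SNPs acting through the exposure explain more variance in the exposure than in the outcome. The function names, effect sizes, and simulated data are assumptions made for illustration.

```python
import numpy as np

def r_squared(G, y):
    """Coefficient of determination from an OLS fit of y on the SNP columns of G."""
    G1 = np.column_stack([np.ones(len(y)), G])      # add an intercept column
    coef, *_ = np.linalg.lstsq(G1, y, rcond=None)
    resid = y - G1 @ coef
    return 1.0 - resid.var() / y.var()

def infer_direction(G, x, y):
    """Compare variance explained by the SNPs in x and in y (no significance test)."""
    r2_x, r2_y = r_squared(G, x), r_squared(G, y)
    direction = "x->y" if r2_x > r2_y else "y->x"
    return direction, r2_x, r2_y

def simulate(n=5000, n_snps=10, seed=0):
    """Hypothetical data: SNPs affect exposure x, and x affects outcome y."""
    rng = np.random.default_rng(seed)
    G = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
    x = G @ np.full(n_snps, 0.3) + rng.normal(size=n)
    y = 0.5 * x + rng.normal(size=n)
    return G, x, y

G, x, y = simulate()
direction, r2_x, r2_y = infer_direction(G, x, y)
print(direction, round(r2_x, 3), round(r2_y, 3))
```

Because the SNP effect on the outcome is diluted by passing through the exposure, the R² for the exposure is the larger of the two in this simulated setting.
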
3.
Am J Hum Genet ; 111(9): 1864-1876, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39137781

ABSTRACT

We performed a series of integrative analyses including transcriptome-wide association studies (TWASs) and proteome-wide association studies (PWASs) of renal cell carcinoma (RCC) to nominate and prioritize molecular targets for laboratory investigation. On the basis of a genome-wide association study (GWAS) of 29,020 affected individuals and 835,670 control individuals and prediction models trained in transcriptomic reference models, our TWAS across four kidney transcriptomes (GTEx kidney cortex, kidney tubules, TCGA-KIRC [The Cancer Genome Atlas kidney renal clear-cell carcinoma], and TCGA-KIRP [TCGA kidney renal papillary cell carcinoma]) identified 38 gene associations (false-discovery rate <5%) in at least two of four transcriptomic panels and identified 12 genes that were independent of GWAS susceptibility regions. Analyses combining TWAS associations across 48 tissues from GTEx identified associations that were replicable in tumor transcriptomes for 23 additional genes. Analyses by the two major histologic types (clear-cell RCC and papillary RCC) revealed subtype-specific associations, although at least three gene associations were common to both subtypes. PWAS identified 13 associated proteins, all mapping to GWAS-significant loci. TWAS-identified genes were enriched for active enhancer or promoter regions in RCC tumors and hypoxia-inducible factor binding sites in relevant cell lines. Using gene expression correlation, common cancers (breast and prostate) and RCC risk factors (e.g., hypertension and BMI) display genetic contributions shared with RCC. Our work identifies potential molecular targets for RCC susceptibility for downstream functional investigation.


Subject(s)
Carcinoma, Renal Cell; Genome-Wide Association Study; Kidney Neoplasms; Proteome; Transcriptome; Carcinoma, Renal Cell/genetics; Humans; Kidney Neoplasms/genetics; Proteome/genetics; Genetic Predisposition to Disease; Gene Expression Regulation, Neoplastic; Polymorphism, Single Nucleotide; Gene Expression Profiling
4.
Am J Hum Genet ; 111(8): 1573-1587, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-38925119

ABSTRACT

Recent studies have highlighted the essential role of RNA splicing, a key mechanism of alternative RNA processing, in establishing connections between genetic variations and disease. Genetic loci influencing RNA splicing variations show considerable influence on complex traits, possibly surpassing those affecting total gene expression. Dysregulated RNA splicing has emerged as a major potential contributor to neurological and psychiatric disorders, likely due to the exceptionally high prevalence of alternatively spliced genes in the human brain. Nevertheless, establishing direct associations between genetically altered splicing and complex traits has remained an enduring challenge. We introduce Spliced-Transcriptome-Wide Associations (SpliTWAS) to integrate alternative splicing information with genome-wide association studies to pinpoint genes linked to traits through exon splicing events. We applied SpliTWAS to two schizophrenia (SCZ) RNA-sequencing datasets, BrainGVEX and CommonMind, revealing 137 and 88 trait-associated exons (in 84 and 67 genes), respectively. Enriched biological functions in the associated gene sets converged on neuronal function and development, immune cell activation, and cellular transport, which are highly relevant to SCZ. SpliTWAS variants impacted RNA-binding protein binding sites, revealing potential disruption of RNA-protein interactions affecting splicing. We extended the probabilistic fine-mapping method FOCUS to the exon level, identifying 36 genes and 48 exons as putatively causal for SCZ. We highlight VPS45 and APOPT1, where splicing of specific exons was associated with disease risk, eluding detection by conventional gene expression analysis. Collectively, this study supports the substantial role of alternative splicing in shaping the genetic basis of SCZ, providing a valuable approach for future investigations in this area.


Subject(s)
Alternative Splicing; Exons; Genome-Wide Association Study; Schizophrenia; Transcriptome; Humans; Schizophrenia/genetics; Alternative Splicing/genetics; Exons/genetics; Genetic Predisposition to Disease; RNA Splicing/genetics; Polymorphism, Single Nucleotide
5.
Am J Hum Genet ; 111(3): 445-455, 2024 03 07.
Article in English | MEDLINE | ID: mdl-38320554

ABSTRACT

Regulation of transcription and translation are mechanisms through which genetic variants affect complex traits. Expression quantitative trait locus (eQTL) studies have been more successful at identifying cis-eQTL (within 1 Mb of the transcription start site) than trans-eQTL. Here, we tested the cis component of gene expression for association with observed plasma protein levels to identify cis- and trans-acting genes that regulate protein levels. We used transcriptome prediction models from 49 Genotype-Tissue Expression (GTEx) Project tissues to predict the cis component of gene expression and tested the predicted expression of every gene in every tissue for association with the observed abundance of 3,622 plasma proteins measured in 3,301 individuals from the INTERVAL study. We tested significant results for replication in 971 individuals from the Trans-omics for Precision Medicine (TOPMed) Multi-Ethnic Study of Atherosclerosis (MESA). We found 1,168 and 1,210 cis- and trans-acting associations that replicated in TOPMed (FDR < 0.05) with a median expected true positive rate (π1) across tissues of 0.806 and 0.390, respectively. The target proteins of trans-acting genes were enriched for transcription factor binding sites and autoimmune diseases in the GWAS catalog. Furthermore, we found a higher correlation between predicted expression and protein levels of the same underlying gene (R = 0.17) than observed expression (R = 0.10, p = 7.50 × 10⁻¹¹). This indicates the cis-acting genetically regulated (heritable) component of gene expression is more consistent across tissues than total observed expression (genetics + environment) and is useful in uncovering the function of SNPs associated with complex traits.


Subject(s)
Proteome; Transcriptome; Humans; Transcriptome/genetics; Proteome/genetics; Multifactorial Inheritance; Quantitative Trait Loci/genetics; Genome-Wide Association Study; Polymorphism, Single Nucleotide/genetics
6.
Am J Hum Genet ; 111(6): 1084-1099, 2024 06 06.
Article in English | MEDLINE | ID: mdl-38723630

ABSTRACT

Transcriptome-wide association studies (TWASs) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods performed to date have focused on the regulatory effects of risk-associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWASs of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole-genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk of breast cancer phenotypes and 8 with risk of ovarian phenotypes. These loci include established risk genes and several novel candidate risk loci, such as ACAP3, whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents an in-depth look into the role of trans-eQTLs in the complex molecular mechanisms underlying these diseases.


Subject(s)
Breast Neoplasms; Genetic Predisposition to Disease; Genome-Wide Association Study; Ovarian Neoplasms; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Humans; Female; Ovarian Neoplasms/genetics; Ovarian Neoplasms/pathology; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Bayes Theorem; Transcriptome; Gene Expression Regulation, Neoplastic
7.
Am J Hum Genet ; 111(7): 1448-1461, 2024 07 11.
Article in English | MEDLINE | ID: mdl-38821058

ABSTRACT

Both trio and population designs are popular study designs for identifying risk genetic variants in genome-wide association studies (GWASs). The trio design, as a family-based design, is robust to confounding due to population structure, whereas the population design is often more powerful due to larger sample sizes. Here, we propose KnockoffHybrid, a knockoff-based statistical method for hybrid analysis of both the trio and population designs. KnockoffHybrid provides a unified framework that brings together the advantages of both designs and produces powerful hybrid analysis while controlling the false discovery rate (FDR) in the presence of linkage disequilibrium and population structure. Furthermore, KnockoffHybrid has the flexibility to leverage different types of summary statistics for hybrid analyses, including expression quantitative trait loci (eQTL) and GWAS summary statistics. We demonstrate in simulations that KnockoffHybrid offers power gains over non-hybrid methods for the trio and population designs with the same number of cases while controlling the FDR with complex correlation among variants and population structure among subjects. In hybrid analyses of three trio cohorts for autism spectrum disorders (ASDs) from the Autism Speaks MSSNG, Autism Sequencing Consortium, and Autism Genome Project with GWAS summary statistics from the iPSYCH project and eQTL summary statistics from the MetaBrain project, KnockoffHybrid outperforms conventional methods by replicating several known risk genes for ASDs and identifying additional associations with variants in other genes, including the PRAME family genes involved in axon guidance and which may act as common targets for human speech/language evolution and related disorders.


Subject(s)
Autism Spectrum Disorder; Genome-Wide Association Study; Linkage Disequilibrium; Quantitative Trait Loci; Genome-Wide Association Study/methods; Humans; Autism Spectrum Disorder/genetics; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide; Computer Simulation; Models, Genetic
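
Record 7 relies on knockoff-based FDR control. As a rough illustration of how a knockoff procedure turns feature statistics into a selection, the sketch below implements the generic knockoff+ threshold rule (Barber and Candès), not KnockoffHybrid itself. How the statistics W are built from trio and population data is specific to the paper and is not reproduced here. The function names and the example values of W are hypothetical.

```python
import numpy as np

def knockoff_plus_threshold(W, q=0.1):
    """Smallest t such that (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    W_j is large and positive when the original variant looks more important
    than its knockoff copy; the sign of a null W_j is equally likely to be
    positive or negative, which is what the ratio exploits to estimate the FDP.
    """
    W = np.asarray(W, dtype=float)
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold reaches the target level: select nothing

def select(W, q=0.1):
    """Indices of variants whose statistic clears the knockoff+ threshold."""
    t = knockoff_plus_threshold(W, q)
    return np.flatnonzero(np.asarray(W, dtype=float) >= t)

# Hypothetical statistics: ten strong signals followed by four near-zero nulls.
W = [5.0, 4.2, 3.9, 3.5, 3.1, 2.8, 2.5, 2.2, 2.0, 1.8, -0.4, 0.3, -0.2, 0.1]
threshold = knockoff_plus_threshold(W, q=0.15)
selected = select(W, q=0.15)
print(threshold, selected)
```

In this toy input the rule skips the small thresholds, where negative statistics are still common relative to positive ones, and settles on the first threshold at which the estimated false discovery proportion falls below the target.
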
8.
Hum Mol Genet ; 33(7): 624-635, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38129112

ABSTRACT

Transcriptome-wide association studies (TWAS) integrate gene expression prediction models and genome-wide association studies (GWAS) to identify gene-trait associations. The power of TWAS is determined by the sample size of GWAS and the accuracy of the expression prediction model. Here, we present a new method, the Summary-level Unified Method for Modeling Integrated Transcriptome using Functional Annotations (SUMMIT-FA), which improves gene expression prediction accuracy by leveraging functional annotation resources and a large expression quantitative trait loci (eQTL) summary-level dataset. We build gene expression prediction models in whole blood using SUMMIT-FA with the comprehensive functional database MACIE and eQTL summary-level data from the eQTLGen consortium. We apply these models to GWAS for 24 complex traits and show that SUMMIT-FA identifies significantly more gene-trait associations and improves predictive power for identifying "silver standard" genes compared to several benchmark methods. We further conduct a simulation study to demonstrate the effectiveness of SUMMIT-FA.


Subject(s)
Genome-Wide Association Study; Transcriptome; Humans; Transcriptome/genetics; Genome-Wide Association Study/methods; Computer Simulation; Quantitative Trait Loci/genetics; Phenotype; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease
9.
Hum Mol Genet ; 33(4): 333-341, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37903058

ABSTRACT

Transcriptome-wide association studies (TWAS) have identified many putative susceptibility genes for colorectal cancer (CRC) risk. However, susceptibility miRNAs, critical dysregulators of gene expression, remain unexplored. We genotyped DNA samples from 313 East Asian CRC patients and performed small RNA sequencing in their normal colon tissues distant from tumors to build genetic models for predicting miRNA expression. We applied these models and data from genome-wide association studies (GWAS) including 23 942 cases and 217 267 controls of East Asian ancestry to investigate associations of predicted miRNA expression with CRC risk. Perturbation experiments that separately promoted and inhibited miRNA expression, and further in vitro assays in both SW480 and HCT116 cells, were conducted. At a Bonferroni-corrected threshold of P < 4.5 × 10⁻⁴, we identified two putative susceptibility miRNAs, miR-1307-5p and miR-192-3p, located in regions more than 500 kb away from any GWAS-identified risk variants in CRC. We observed that a high predicted expression of miR-1307-5p was associated with increased CRC risk, while a low predicted expression of miR-192-3p was associated with increased CRC risk. Our experimental results further provide strong evidence of their susceptibility roles by showing that miR-1307-5p and miR-192-3p play a regulatory role, respectively, in promoting and inhibiting CRC cell proliferation, migration, and invasion, which was consistently observed in both SW480 and HCT116 cells. Our study provides additional insights into the biological mechanisms underlying CRC development.


Subject(s)
Colorectal Neoplasms; MicroRNAs; Humans; MicroRNAs/genetics; MicroRNAs/metabolism; Transcriptome/genetics; Genome-Wide Association Study; Colorectal Neoplasms/metabolism; HCT116 Cells; Gene Expression Regulation, Neoplastic/genetics; Cell Proliferation/genetics
10.
Am J Hum Genet ; 110(1): 44-57, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36608684

ABSTRACT

Integrative genetic association methods have shown great promise in post-GWAS (genome-wide association study) analyses, in which one of the most challenging tasks is identifying putative causal genes and uncovering molecular mechanisms of complex traits. Recent studies suggest that prevailing computational approaches, including transcriptome-wide association studies (TWASs) and colocalization analysis, are individually imperfect, but their joint usage can yield robust and powerful inference results. This paper presents INTACT, a computational framework to integrate probabilistic evidence from these distinct types of analyses and implicate putative causal genes. This procedure is flexible and can work with a wide range of existing integrative analysis approaches. It has the unique ability to quantify the uncertainty of implicated genes, enabling rigorous control of false-positive discoveries. Taking advantage of this highly desirable feature, we further propose an efficient algorithm, INTACT-GSE, for gene set enrichment analysis based on the integrated probabilistic evidence. We examine the proposed computational methods and illustrate their improved performance over the existing approaches through simulation studies. We apply the proposed methods to analyze the multi-tissue eQTL data from the GTEx project and eight large-scale complex- and molecular-trait GWAS datasets from multiple consortia and the UK Biobank. Overall, we find that the proposed methods markedly improve the existing putative gene implication methods and are particularly advantageous in evaluating and identifying key gene sets and biological pathways underlying complex traits.


Subject(s)
Genome-Wide Association Study; Transcriptome; Humans; Transcriptome/genetics; Genome-Wide Association Study/methods; Multifactorial Inheritance/genetics; Quantitative Trait Loci/genetics; Computer Simulation; Polymorphism, Single Nucleotide/genetics; Genetic Predisposition to Disease
11.
Genet Epidemiol ; 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39344923

ABSTRACT

Transcriptome-wide association studies (TWAS) aim to uncover genotype-phenotype relationships through a two-stage procedure: predicting gene expression from genotypes using an expression quantitative trait locus (eQTL) data set, then testing the predicted expression for trait associations. Accurate gene expression prediction in stage 1 is crucial, as it directly impacts the power to identify associations in stage 2. Currently, the first stage of such studies is primarily conducted using linear models like elastic net regression, which fail to capture the nonlinear relationships inherent in biological systems. Deep learning methods have the potential to model such nonlinear effects, but have yet to demonstrably outperform linear methods at this task. To address this gap, we propose a new deep learning architecture to predict gene expression from genotypic variation across individuals. Our method utilizes a learnable input scaling layer in conjunction with a convolutional encoder to capture nonlinear effects and higher-order interactions without compromising on interpretability. We further augment this approach to allow for parameter sharing across multiple networks, enabling us to utilize prior information for individual variants in the form of functional annotations. Evaluations on real-world genomic data show that our method consistently outperforms elastic net regression across a large set of heritable genes. Furthermore, our model statistically significantly improved predictive performance by leveraging functional annotations, whereas elastic net regression failed to show equivalent gains when using the same information, suggesting that our method can capture nonlinear functional information beyond the capability of linear models.
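
The two-stage procedure summarized in record 11 can be sketched on simulated data. This is only a minimal linear baseline under stated assumptions: stage 1 uses closed-form ridge regression as a stand-in for elastic net (not the deep learning model the abstract proposes), stage 2 is a simple OLS slope test, and all names, sample sizes, and effect sizes are hypothetical.

```python
import numpy as np

def fit_ridge(G, expr, lam=1.0):
    """Stage 1: ridge weights predicting expression from centered genotypes."""
    Gc = G - G.mean(axis=0)
    yc = expr - expr.mean()
    p = G.shape[1]
    return np.linalg.solve(Gc.T @ Gc + lam * np.eye(p), Gc.T @ yc)

def association_z(pred_expr, trait):
    """Stage 2: z-score for the OLS slope of the trait on predicted expression."""
    x = pred_expr - pred_expr.mean()
    y = trait - trait.mean()
    beta = (x @ y) / (x @ x)
    resid = y - beta * x
    se = np.sqrt(resid @ resid / (len(y) - 2) / (x @ x))
    return beta / se

rng = np.random.default_rng(1)
n_ref, n_gwas, p = 500, 5000, 20
true_w = np.zeros(p)
true_w[:3] = [0.5, -0.4, 0.3]            # three causal cis-SNPs (assumed)

# Reference (eQTL) panel: genotypes and measured expression.
G_ref = rng.binomial(2, 0.3, size=(n_ref, p)).astype(float)
expr_ref = G_ref @ true_w + rng.normal(size=n_ref)

# GWAS cohort: genotypes and trait only; expression is never observed here.
G_gwas = rng.binomial(2, 0.3, size=(n_gwas, p)).astype(float)
expr_gwas = G_gwas @ true_w + rng.normal(size=n_gwas)
trait = 0.3 * expr_gwas + rng.normal(size=n_gwas)

weights = fit_ridge(G_ref, expr_ref)
pred = (G_gwas - G_gwas.mean(axis=0)) @ weights
z = association_z(pred, trait)
print(round(float(z), 2))
```

The sketch makes the dependence between the stages concrete: noisier stage 1 weights attenuate the correlation between predicted and true expression, which lowers the stage 2 test statistic.
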

12.
Genet Epidemiol ; 48(7): 291-309, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38887957

ABSTRACT

Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we leverage the flexibility of our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), to a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants-situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effect for high-dimensional risk factors on an outcome by incorporating estimates from association analyses of single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristics curve (AUC), a lower mean-squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS for prostate cancer with more than 140,000 men and high dimensional publicly available summary data for metabolites and transcriptomes.


Subject(s)
Polymorphism, Single Nucleotide; Prostatic Neoplasms; Humans; Prostatic Neoplasms/genetics; Male; Genome-Wide Association Study/methods; Models, Statistical; Mendelian Randomization Analysis; ROC Curve; Computer Simulation
13.
Plant J ; 119(2): 844-860, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38812347

ABSTRACT

Transcriptome-wide association studies (TWAS) can provide single gene resolution for candidate genes in plants, complementing genome-wide association studies (GWAS), but efforts in plants have been met with, at best, mixed success. We generated expression data from 693 maize genotypes, measured in a common field experiment, sampled over a 2-h period to minimize diurnal and environmental effects, using full-length RNA-seq to maximize the accurate estimation of transcript abundance. TWAS could identify roughly 10 times as many genes likely to play a role in flowering time regulation as GWAS conducted using data from the same experiment. TWAS using mature leaf tissue identified true-positive flowering time genes known to act in the shoot apical meristem, and trait data from a new environment enabled the identification of additional flowering time genes without the need for new expression data. eQTL analysis of TWAS-tagged genes identified at least one additional known maize flowering time gene through trans-eQTL interactions. Collectively, these results suggest the gene expression resource described here can link genes to functions across different plant phenotypes expressed in a range of tissues and scored in different experiments.


Subject(s)
Flowers; Gene Expression Regulation, Plant; Genome-Wide Association Study; Quantitative Trait Loci; Transcriptome; Zea mays; Zea mays/genetics; Zea mays/physiology; Flowers/genetics; Flowers/physiology; Quantitative Trait Loci/genetics; Genotype; Phenotype; Genes, Plant/genetics; Plant Leaves/genetics; Plant Leaves/physiology; Plant Leaves/metabolism; Gene Expression Profiling
14.
Am J Hum Genet ; 109(8): 1388-1404, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35931050

ABSTRACT

Transcriptome-wide association studies (TWASs) are a powerful approach to identify genes whose expression is associated with complex disease risk. However, non-causal genes can exhibit association signals due to confounding by linkage disequilibrium (LD) patterns and eQTL pleiotropy at genomic risk regions, which necessitates fine-mapping of TWAS signals. Here, we present MA-FOCUS, a multi-ancestry framework for the improved identification of genes underlying traits of interest. We demonstrate that by leveraging differences in ancestry-specific patterns of LD and eQTL signals, MA-FOCUS consistently outperforms single-ancestry fine-mapping approaches with equivalent total sample sizes across multiple metrics. We perform TWASs for 15 blood traits using genome-wide summary statistics (average n_EA = 511k, n_AA = 13k) and lymphoblastoid cell line eQTL data from cohorts of primarily European and African continental ancestries. We recapitulate evidence demonstrating shared genetic architectures for eQTL and blood traits between the two ancestry groups and observe that gene-level effects correlate 20% more strongly across ancestries than SNP-level effects. Lastly, we perform fine-mapping using MA-FOCUS and find evidence that genes at TWAS risk regions are more likely to be shared across ancestries than they are to be ancestry specific. Using multiple lines of evidence to validate our findings, we find that gene sets produced by MA-FOCUS are more enriched in hematopoietic categories than alternative approaches (p = 2.36 × 10⁻¹⁵). Our work demonstrates that including and appropriately accounting for genetic diversity can drive more profound insights into the genetic architecture of complex traits.


Subject(s)
Genome-Wide Association Study; Transcriptome; Humans; Linkage Disequilibrium; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide/genetics; Transcriptome/genetics
15.
Am J Hum Genet ; 109(4): 669-679, 2022 04 07.
Article in English | MEDLINE | ID: mdl-35263625

ABSTRACT

One mechanism by which genetic factors influence complex traits and diseases is altering gene expression. Direct measurement of gene expression in relevant tissues is rarely tenable; however, genetically regulated gene expression (GReX) can be estimated using prediction models derived from large multi-omic datasets. These approaches have led to the discovery of many gene-trait associations, but whether models derived from predominantly European ancestry (EA) reference panels can map novel associations in ancestrally diverse populations remains unclear. We applied PrediXcan to impute GReX in 51,520 ancestrally diverse Population Architecture using Genomics and Epidemiology (PAGE) participants (35% African American, 45% Hispanic/Latino, 10% Asian, and 7% Hawaiian) across 25 key cardiometabolic traits and relevant tissues to identify 102 novel associations. We then compared associations in PAGE to those in a random subset of 50,000 White British participants from UK Biobank (UKBB50k) for height and body mass index (BMI). We identified 517 associations across 47 tissues in PAGE but not UKBB50k, demonstrating the importance of diverse samples in identifying trait-associated GReX. We observed that variants used in PrediXcan models were either more or less differentiated across continental-level populations than matched-control variants depending on the specific population reflecting sampling bias. Additionally, variants from identified genes specific to either PAGE or UKBB50k analyses were more ancestrally differentiated than those in genes detected in both analyses, underlining the value of population-specific discoveries. This suggests that while EA-derived transcriptome imputation models can identify new associations in non-EA populations, models derived from closely matched reference panels may yield further insights. Our findings call for more diversity in reference datasets of tissue-specific gene expression.


Subject(s)
Cardiovascular Diseases; Genome-Wide Association Study; Genetic Predisposition to Disease; Humans; Life Style; Polymorphism, Single Nucleotide; Transcriptome
16.
Am J Hum Genet ; 109(5): 825-837, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35523146

ABSTRACT

Transcriptome-wide association studies and colocalization analysis are popular computational approaches for integrating genetic-association data from molecular and complex traits. They show the unique ability to go beyond variant-level genetic-association evidence and implicate critical functional units, e.g., genes, in disease etiology. However, in practice, when the two approaches are applied to the same molecular and complex-trait data, the inference results can be markedly different. This paper systematically investigates the inferential reproducibility between the two approaches through theoretical derivation, numerical experiments, and analyses of four complex trait GWAS and GTEx eQTL data. We identify two classes of inconsistent inference results. We find that the first class of inconsistent results (i.e., genes with strong colocalization but weak transcriptome-wide association study [TWAS] signals) might suggest an interesting biological phenomenon, i.e., horizontal pleiotropy; thus, the two approaches are truly complementary. The inconsistency in the second class (i.e., genes with weak colocalization but strong TWAS signals) can be understood and effectively reconciled. To this end, we propose a computational approach for locus-level colocalization analysis. We demonstrate that the joint TWAS and locus-level colocalization analysis improves specificity and sensitivity for implicating biologically relevant genes.


Subject(s)
Genome-Wide Association Study , Transcriptome , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , Reproducibility of Results , Transcriptome/genetics
17.
Am J Hum Genet ; 109(3): 393-404, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35108496

ABSTRACT

Identifying gene sets that are associated with disease can provide valuable biological knowledge, but a fundamental challenge of gene set analyses of GWAS data is linking disease-associated SNPs to genes. Transcriptome-wide association studies (TWASs) detect associations between the genetically predicted expression of a gene and disease risk, thus implicating candidate disease genes. However, causal disease genes at TWAS-associated loci generally remain unknown due to gene co-regulation, which leads to correlations across genes in predicted expression. We developed a method, gene co-regulation score (GCSC) regression, to identify gene sets that are enriched for disease heritability explained by predicted expression. GCSC regresses TWAS chi-square statistics on gene co-regulation scores reflecting correlations in predicted gene expression; a gene set is enriched for heritability if genes with high co-regulation to the set have higher TWAS chi-square statistics than genes with low co-regulation to the set, beyond what is expected based on co-regulation to all genes. We verified via simulations that GCSC is well calibrated and well powered. We applied GCSC to gene expression data from GTEx (48 tissues) and GWAS summary statistics for 43 independent diseases and complex traits, analyzing a broad set of biological pathways and specifically expressed gene sets. We identified many enriched sets, recapitulating known biology. For Alzheimer disease, we detected evidence of an immune basis, and specifically a role for antigen presentation, in analyses of both biological pathways and specifically expressed gene sets. Our results highlight the advantages of leveraging gene co-regulation within the TWAS framework to identify enriched gene sets.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Genetic Predisposition to Disease , Humans , Multifactorial Inheritance , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , Transcriptome
18.
Am J Hum Genet ; 109(5): 783-801, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35334221

ABSTRACT

Integrative analysis of genome-wide association studies (GWASs) and gene expression studies in the form of a transcriptome-wide association study (TWAS) has the potential to better elucidate the molecular mechanisms underlying disease etiology. Here we present a method, METRO, that can leverage gene expression data collected from multiple genetic ancestries to enhance TWASs. METRO incorporates expression prediction models constructed in different genetic ancestries through a likelihood-based inference framework, producing calibrated p values with substantially improved TWAS power. We illustrate the benefits of METRO in both simulations and applications to seven complex traits and diseases obtained from four GWASs. These GWASs include two of primarily European ancestry (n = 188,577 and 339,226) and two of primarily African ancestry (n = 42,752 and 23,827). In the real data applications, we leverage gene expression data measured on 1,032 African Americans and 801 European Americans from the Genetic Epidemiology Network of Arteriopathy (GENOA) study to identify a substantially larger number of gene-trait associations as compared to existing TWAS approaches. The benefits of METRO are most prominent in applications to GWASs of African ancestry where the sample size is much smaller than GWASs of European ancestry and where a more powerful TWAS method is crucial. Among the identified associations are high-density lipoprotein-associated genes including PLTP and PPARG that are critical for maintaining lipid homeostasis and the type II diabetes-associated gene MAPT that supports microtubule-associated protein tau as a key component underlying impaired insulin secretion.


Subject(s)
Type 2 Diabetes Mellitus , Genome-Wide Association Study , Type 2 Diabetes Mellitus/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Likelihood Functions , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , Transcriptome/genetics
19.
Biostatistics ; 25(2): 468-485, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-36610078

ABSTRACT

Transcriptome-wide association studies (TWAS) have been increasingly applied to identify (putative) causal genes for complex traits and diseases. TWAS can be regarded as a two-sample two-stage least squares method for instrumental variable (IV) regression for causal inference. The standard TWAS (called TWAS-L) only considers a linear relationship between a gene's expression and a trait in stage 2, which may lose statistical power when the relationship is not linear. Recently, an extension of TWAS (called TWAS-LQ) considers both the linear and quadratic effects of a gene on a trait, which, however, is not flexible enough due to its parametric nature and may be low powered for nonquadratic nonlinear effects. On the other hand, a deep learning (DL) approach, called DeepIV, has been proposed to nonparametrically model a nonlinear effect in IV regression. However, it is both slow and unstable due to the ill-posed inverse problem of solving an integral equation with Monte Carlo approximations. Furthermore, in the original DeepIV approach, statistical inference, that is, hypothesis testing, was not studied. Here, we propose a novel DL approach, called DeLIVR, to overcome the major drawbacks of DeepIV, by estimating a related but different target function and including a hypothesis testing framework. We show through simulations that DeLIVR was both faster and more stable than DeepIV. We applied both parametric and DL approaches to the GTEx and UK Biobank data, showcasing that DeLIVR detected an additional 8 and 7 genes nonlinearly associated with high-density lipoprotein (HDL) cholesterol and low-density lipoprotein (LDL) cholesterol, respectively, all of which would be missed by TWAS-L, TWAS-LQ, and DeepIV; these genes include BUD13 associated with HDL, SLC44A2 and GMIP with LDL, all supported by previous studies.


Subject(s)
Deep Learning , Transcriptome , Humans , Quantitative Trait Loci , Phenotype , Genome-Wide Association Study/methods , Cholesterol , Genetic Predisposition to Disease , Single Nucleotide Polymorphism
20.
J Allergy Clin Immunol ; 153(1): 122-131, 2024 01.
Article in English | MEDLINE | ID: mdl-37742934

ABSTRACT
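
The abstract of record 19 describes the standard linear TWAS as a two-sample two-stage least squares procedure. The following is only a minimal sketch of that idea on simulated data, not the authors' implementation: the function name `twas_two_stage`, the sample sizes, the SNP weights, and the true effect of 0.8 are all assumptions made for illustration. Stage 1 learns SNP weights for expression in a reference panel; stage 2 regresses the trait on the predicted expression in a separate GWAS sample.

```python
import numpy as np


def twas_two_stage(G_ref, expr_ref, G_gwas, trait):
    """Two-sample two-stage least squares sketch (linear TWAS).

    Stage 1: regress expression on genotypes in the reference panel.
    Stage 2: regress the trait on predicted expression in the GWAS sample.
    Returns (stage-1 coefficients, stage-2 effect estimate, z statistic).
    """
    # Stage 1: ordinary least squares with an intercept column.
    X1 = np.column_stack([np.ones(len(G_ref)), G_ref])
    w, *_ = np.linalg.lstsq(X1, expr_ref, rcond=None)

    # Predict genetically regulated expression in the GWAS sample.
    X_gwas = np.column_stack([np.ones(len(G_gwas)), G_gwas])
    expr_hat = X_gwas @ w

    # Stage 2: regress the trait on the predicted expression.
    X2 = np.column_stack([np.ones(len(trait)), expr_hat])
    beta, *_ = np.linalg.lstsq(X2, trait, rcond=None)

    # Naive standard error; it ignores stage-1 estimation uncertainty.
    resid = trait - X2 @ beta
    sigma2 = resid @ resid / (len(trait) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X2.T @ X2)[1, 1])
    return w, beta[1], beta[1] / se


# Simulated data (all values hypothetical).
rng = np.random.default_rng(0)
n_ref, n_gwas, n_snp = 500, 5000, 5
true_w = np.array([0.5, -0.3, 0.4, 0.0, 0.2])
true_effect = 0.8

G_ref = rng.binomial(2, 0.3, size=(n_ref, n_snp)).astype(float)
G_gwas = rng.binomial(2, 0.3, size=(n_gwas, n_snp)).astype(float)
expr_ref = G_ref @ true_w + rng.normal(0.0, 1.0, n_ref)
# Expression in the GWAS sample is simulated but never observed by the method.
expr_gwas = G_gwas @ true_w + rng.normal(0.0, 1.0, n_gwas)
trait = true_effect * expr_gwas + rng.normal(0.0, 1.0, n_gwas)

weights, effect_hat, z = twas_two_stage(G_ref, expr_ref, G_gwas, trait)
```

Because stage 2 here is a straight line in predicted expression, this sketch corresponds only to the linear case (TWAS-L in the abstract); a nonlinear gene-trait relationship would need a different stage-2 model.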

BACKGROUND: Little is known about nasal epithelial gene expression and total IgE in youth. OBJECTIVE: We aimed to identify genes whose nasal epithelial expression differs by total IgE in youth, and group them into modules that could be mapped to airway epithelial cell types. METHODS: We conducted a transcriptome-wide association study of total IgE in 469 Puerto Ricans aged 9 to 20 years who participated in the Epigenetic Variation and Childhood Asthma in Puerto Ricans study, separately in all subjects and in those with asthma. We then attempted to replicate top findings for each analysis using data from 3 cohorts. Genes with a Benjamini-Hochberg-adjusted P value of less than .05 in the Epigenetic Variation and Childhood Asthma in Puerto Ricans study and a P value of less than .05 in the same direction of association in 1 or more replication cohorts were considered differentially expressed genes (DEGs). DEGs for total IgE in subjects with asthma were further dissected into gene modules using coexpression analysis, and such modules were mapped to specific cell types in airway epithelia using public single-cell RNA-sequencing data. RESULTS: A higher number of DEGs for total IgE were identified in subjects with asthma (n = 1179 DEGs) than in all subjects (n = 631 DEGs). In subjects with asthma, DEGs were mapped to 11 gene modules. The top module for positive correlation with total IgE was mapped to myoepithelial and mucus secretory cells in lower airway epithelia and was regulated by IL-4, IL-5, IL-13, and IL-33. Within this module, hub genes included CDH26, FETUB, NTRK2, CCBL1, CST1, and CST2. Furthermore, an enrichment analysis showed overrepresentation of genes in signaling pathways for synaptogenesis, IL-13, and ferroptosis, supporting interactions between interleukin- and acetylcholine-induced responses. CONCLUSIONS: Our findings for nasal epithelial gene expression support neuroimmune coregulation of total IgE in youth with asthma.


Subject(s)
Asthma , Interleukin-13 , Child , Humans , Adolescent , Interleukin-13/genetics , Nose , Transcriptome , Immunoglobulin E
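
The DEG definition in the abstract of record 20 combines a Benjamini-Hochberg-adjusted P value below .05 in the discovery cohort with a nominal P value below .05, in the same direction, in at least one replication cohort. The sketch below shows one way such a rule could be coded; it is an illustration under assumptions, not the study's pipeline. The function names `benjamini_hochberg` and `call_degs`, the array layout (genes in rows, replication cohorts in columns), and the example numbers are all hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by m / rank.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity by taking a running minimum from the largest rank down.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def call_degs(disc_p, disc_effect, rep_p, rep_effect, alpha=0.05):
    """Flag genes passing discovery FDR and replicating in >= 1 cohort.

    disc_p, disc_effect: one value per gene from the discovery cohort.
    rep_p, rep_effect: arrays of shape (genes, cohorts) from replication.
    """
    disc_ok = benjamini_hochberg(disc_p) < alpha
    # Same direction: the replication effect sign matches the discovery sign.
    same_dir = np.sign(rep_effect) == np.sign(np.asarray(disc_effect))[:, None]
    rep_ok = ((np.asarray(rep_p) < alpha) & same_dir).any(axis=1)
    return disc_ok & rep_ok


# Hypothetical example: 3 genes, 3 replication cohorts.
disc_p = [0.001, 0.002, 0.5]
disc_effect = [1.0, -1.0, 1.0]
rep_p = [[0.01, 0.2, 0.3], [0.01, 0.02, 0.9], [0.001, 0.001, 0.001]]
rep_effect = [[0.5, -0.1, 0.2], [0.4, 0.3, -0.2], [1.0, 1.0, 1.0]]

is_deg = call_degs(disc_p, disc_effect, rep_p, rep_effect)
```

In this toy input only the first gene is flagged: the second replicates in the opposite direction, and the third fails the discovery threshold after adjustment.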