1.
Am J Hum Genet ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38925120

ABSTRACT

Regulation of gene expression is a vital component of neurological homeostasis. Cataloging the consequences of endogenous gene expression on the physical structure and connectivity of the brain offers a means of unifying trait-associated genetic variation with trait-associated neurological features. We perform tissue-specific transcriptome-wide association studies (TWASs) on over 3,400 neuroimaging phenotypes in the UK Biobank (N = 33,224) using our joint-tissue imputation (JTI)-TWAS method. We identify highly significant associations between predicted expression for 7,192 genes and a wide variety of measures of the brain derived from magnetic resonance imaging (MRI). Our approach generates reproducible results in internal and external replication datasets. Genetically determined expression alone is sufficient for high-fidelity reconstruction of brain structure and organization. We demonstrate complementary benefits of cross-tissue and single-tissue analyses toward an integrated neurobiology and provide evidence that gene expression outside the central nervous system provides unique insights into brain health. As an application, we provide evidence suggesting that the genetically regulated expression of schizophrenia risk genes causally affects over 73% of neurological phenotypes that are altered in individuals with schizophrenia (as identified by neuroimaging studies). Imaging features associated with neuropsychiatric traits can provide valuable insights into underlying pathophysiology. By linking neuroimaging-derived phenotypes with expression levels of specific genes, this resource represents a powerful gene prioritization schema that can improve our understanding of brain function, development, and disease. The use of multiple different cortical and subcortical atlases in the resource facilitates direct integration of these data with findings from a diverse range of clinical neuroimaging studies.

2.
Am J Hum Genet ; 111(3): 562-583, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38367620

ABSTRACT

Genetic variants are involved in the orchestration of alternative polyadenylation (APA) events, while the role of DNA methylation in regulating APA remains unclear. We generated a comprehensive atlas of APA quantitative trait methylation sites (apaQTMs) across 21 different types of cancer (1,612 to 60,219 acting in cis and 4,448 to 142,349 in trans). Potential causal apaQTMs in non-cancer samples were also identified. Mechanistically, we observed a strong enrichment of cis-apaQTMs near polyadenylation sites (PASs) and both cis- and trans-apaQTMs in proximity to transcription factor (TF) binding regions. Through the integration of ChIP-signals and RNA-seq data from cell lines, we have identified several regulators of APA events, acting either directly or indirectly, implicating novel functions of some important genes, such as TCF7L2, which is known for its involvement in type 2 diabetes and cancers. Furthermore, we have identified a vast number of QTMs that share the same putative causal CpG sites with five different cancer types, underscoring the roles of QTMs, including apaQTMs, in the process of tumorigenesis. DNA methylation is extensively involved in the regulation of APA events in human cancers. In an attempt to elucidate the potential underlying molecular mechanisms of APA by DNA methylation, our study paves the way for subsequent experimental validations into the intricate biological functions of DNA methylation in APA regulation and the pathogenesis of human cancers. To present a comprehensive catalog of apaQTM patterns, we introduce the Pancan-apaQTM database, available at https://pancan-apaqtm-zju.shinyapps.io/pancanaQTM/.


Subject(s)
Diabetes Mellitus, Type 2 , Neoplasms , Humans , Polyadenylation/genetics , Diabetes Mellitus, Type 2/genetics , Neoplasms/genetics , Neoplasms/pathology , Gene Expression Regulation , DNA Methylation/genetics , 3' Untranslated Regions
3.
Nature ; 599(7883): 136-140, 2021 11.
Article in English | MEDLINE | ID: mdl-34707288

ABSTRACT

Glutathione (GSH) is a small-molecule thiol that is abundant in all eukaryotes and has key roles in oxidative metabolism. Mitochondria, as the major site of oxidative reactions, must maintain sufficient levels of GSH to perform protective and biosynthetic functions. GSH is synthesized exclusively in the cytosol, yet the molecular machinery involved in mitochondrial GSH import remains unknown. Here, using organellar proteomics and metabolomics approaches, we identify SLC25A39, a mitochondrial membrane carrier of unknown function, as a regulator of GSH transport into mitochondria. Loss of SLC25A39 reduces mitochondrial GSH import and abundance without affecting cellular GSH levels. Cells lacking both SLC25A39 and its paralogue SLC25A40 exhibit defects in the activity and stability of proteins containing iron-sulfur clusters. We find that mitochondrial GSH import is necessary for cell proliferation in vitro and red blood cell development in mice. Heterologous expression of an engineered bifunctional bacterial GSH biosynthetic enzyme (GshF) in mitochondria enables mitochondrial GSH production and ameliorates the metabolic and proliferative defects caused by its depletion. Finally, GSH availability negatively regulates SLC25A39 protein abundance, coupling redox homeostasis to mitochondrial GSH import in mammalian cells. Our work identifies SLC25A39 as an essential and regulated component of the mitochondrial GSH-import machinery.


Subject(s)
Glutathione/metabolism , Mitochondria/metabolism , Mitochondrial Membrane Transport Proteins/metabolism , Animals , Biological Transport , Cell Proliferation , Cells, Cultured , Erythropoiesis , Glutathione/deficiency , Homeostasis , Humans , Iron-Sulfur Proteins/metabolism , Mice , Mitochondrial Membrane Transport Proteins/genetics , Oxidation-Reduction , Proteome , Proteomics
4.
Hum Mol Genet ; 31(18): 3191-3205, 2022 09 10.
Article in English | MEDLINE | ID: mdl-35157052

ABSTRACT

Type 2 diabetes is a complex, systemic disease affected by both genetic and environmental factors. Previous research has identified genetic variants associated with type 2 diabetes risk; however, gene regulatory changes underlying progression to metabolic dysfunction are still largely unknown. We investigated RNA expression changes that occur during diabetes progression using a two-stage approach. In our discovery stage, we compared changes in gene expression using two longitudinally collected blood samples from subjects whose fasting blood glucose transitioned to a level consistent with type 2 diabetes diagnosis between the time points against those who did not with a novel analytical network approach. Our network methodology identified 17 networks, one of which was significantly associated with transition status. This 822-gene network harbors many genes novel to the type 2 diabetes literature but is also significantly enriched for genes previously associated with type 2 diabetes. In the validation stage, we queried associations of genetically determined expression with diabetes-related traits in a large biobank with linked electronic health records. We observed a significant enrichment of genes in our identified network whose genetically determined expression is associated with type 2 diabetes and other metabolic traits and validated 31 genes that are not near previously reported type 2 diabetes loci. Finally, we provide additional functional support, which suggests that the genes in this network are regulated by enhancers that operate in human pancreatic islet cells. We present an innovative and systematic approach that identified and validated key gene expression changes associated with type 2 diabetes transition status and demonstrated their translational relevance in a large clinical resource.


Subject(s)
Diabetes Mellitus, Type 2 , Blood Glucose/genetics , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Gene Expression , Gene Expression Profiling , Gene Regulatory Networks/genetics , Genetic Association Studies , Humans , RNA
5.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36413071

ABSTRACT

SUMMARY: Genomic data are often processed in batches and analyzed together to save time. However, it is challenging to combine multiple large VCFs and properly handle imputation quality and missing variants due to the limitations of available tools. To address these concerns, we developed IMMerge, a Python-based tool that takes advantage of multiprocessing to reduce running time. For the first time in a publicly available tool, imputation quality scores are correctly combined with Fisher's z transformation. AVAILABILITY AND IMPLEMENTATION: IMMerge is an open-source project under MIT license. Source code and user manual are available at https://github.com/belowlab/IMMerge.


Subject(s)
Genome , Genomics , Software
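
As a rough illustration of the Fisher's z combination mentioned in the IMMerge abstract above, the sketch below merges per-batch imputation quality scores for a single variant. It is not taken from IMMerge: the function name `combine_r2_fisher_z`, the use of the square root of r² as the correlation, and the (n - 3) weights are assumptions, chosen because they are the conventional way to average correlations on the z scale.

```python
import math


def combine_r2_fisher_z(r2_values, sample_sizes):
    """Combine per-batch imputation r^2 values for one variant.

    Hypothetical helper (not IMMerge's API). Each r^2 is converted to a
    correlation r, mapped to z = atanh(r), averaged with weights (n - 3),
    then mapped back with tanh and squared.
    """
    if not r2_values or len(r2_values) != len(sample_sizes):
        raise ValueError("need one sample size per r^2 value")

    eps = 1e-12  # keeps atanh finite when a batch reports r^2 == 1
    weighted_z = 0.0
    total_weight = 0.0
    for r2, n in zip(r2_values, sample_sizes):
        if not 0.0 <= r2 <= 1.0:
            raise ValueError("r^2 must lie in [0, 1]")
        if n <= 3:
            raise ValueError("each batch needs more than 3 samples")
        r = min(math.sqrt(r2), 1.0 - eps)
        weight = n - 3  # assumed weight: inverse variance of Fisher's z
        weighted_z += weight * math.atanh(r)
        total_weight += weight

    r_combined = math.tanh(weighted_z / total_weight)
    return r_combined ** 2
```

Averaging on the z scale rather than averaging r² directly keeps the combined value inside the range of the inputs while giving larger batches more influence; whether IMMerge uses exactly these weights should be checked against its user manual.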
6.
BMC Genomics ; 24(1): 75, 2023 Feb 16.
Article in English | MEDLINE | ID: mdl-36797672

ABSTRACT

BACKGROUND: Exfoliation syndrome (XFS) is an age-related systemic disorder characterized by excessive production and progressive accumulation of abnormal extracellular material, with pathognomonic ocular manifestations. It is the most common cause of secondary glaucoma, resulting in widespread global blindness. The largest global meta-analysis of XFS in 123,457 multi-ethnic individuals from 24 countries identified seven loci, with the strongest association signal in the chr15q22-25 region near LOXL1. Expression analyses have so far correlated coding and a few non-coding variants in the region with LOXL1 expression levels, but the functional effects of these variants are unclear. We hypothesize that analysis of the contribution of the genetically determined component of gene expression to XFS risk can provide a powerful method to elucidate potential roles of additional genes and clarify the biology that underlies XFS. RESULTS: Transcriptome-wide association studies (TWAS) were performed using PrediXcan models trained in 48 GTEx tissues, leveraging results from the multi-ethnic and European ancestry GWAS. To eliminate the possibility of false-positive results due to linkage disequilibrium (LD) contamination, we i) performed PrediXcan analysis in reduced models removing variants in LD with LOXL1 missense variants associated with XFS, and variants in LOXL1 models in both multi-ethnic and European ancestry individuals, ii) conducted conditional analysis of the significant signals in European ancestry individuals, iii) filtered signals based on correlated gene expression, LD and shared eQTLs, and iv) conducted expression validation analysis in human iris tissues. We observed twenty-eight genes in the chr15q22-25 region that showed statistically significant associations, which were whittled down to ten genes after statistical validations. In experimental analysis, mRNA transcript levels for ARID3B, CD276, LOXL1, NEO1, SCAMP2, and UBL7 were significantly decreased in iris tissues from XFS patients compared to control samples. TWAS genes for XFS were significantly enriched for genes associated with inflammatory conditions. We also observed a higher incidence of XFS comorbidity with inflammatory and connective tissue diseases. CONCLUSION: Our results implicate a role for connective tissues and inflammation pathways in the etiology of XFS. Targeting the inflammatory pathway may be a potential therapeutic option to reduce progression in XFS.


Subject(s)
Exfoliation Syndrome , Humans , Exfoliation Syndrome/genetics , Exfoliation Syndrome/complications , Exfoliation Syndrome/metabolism , Amino Acid Oxidoreductases/genetics , RNA, Messenger , Mutation, Missense , Gene Expression , Polymorphism, Single Nucleotide , DNA-Binding Proteins/genetics , B7 Antigens/genetics
7.
Hum Mol Genet ; 31(2): 289-299, 2021 12 27.
Article in English | MEDLINE | ID: mdl-34387340

ABSTRACT

Alzheimer's disease (AD) adversely affects the health, quality of life and independence of patients. There is a critical need to identify novel blood gene biomarkers for AD risk assessment. We performed a transcriptome-wide association study to identify biomarker candidates for AD risk. We leveraged two sets of gene expression prediction models of blood developed using different reference panels and modeling strategies. By applying the prediction models to a meta-GWAS including 71 880 (proxy) cases and 383 378 (proxy) controls, we identified significant associations of genetically determined expression of 108 genes in blood with AD risk. Of these, 15 genes were differentially expressed between AD patients and controls with concordant directions in measured expression data. With evidence from the analyses based on both genetic instruments and directly measured expression levels, this study identifies 15 genes with strong support as biomarkers in blood for AD risk, which may enhance AD risk assessment and mechanism-focused studies.


Subject(s)
Alzheimer Disease , Alzheimer Disease/genetics , Genetic Markers , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide/genetics , Quality of Life , Transcriptome/genetics
8.
Int J Cancer ; 150(1): 80-90, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34520569

ABSTRACT

A large proportion of heritability for prostate cancer risk remains unknown. Transcriptome-wide association study combined with validation comparing overall levels will help to identify candidate genes potentially playing a role in prostate cancer development. Using data from the Genotype-Tissue Expression Project, we built genetic models to predict normal prostate tissue gene expression using the statistical framework PrediXcan, a modified version of the unified test for molecular signatures and Joint-Tissue Imputation. We applied these prediction models to the genetic data of 79 194 prostate cancer cases and 61 112 controls to investigate the associations of genetically determined gene expression with prostate cancer risk. Focusing on associated genes, we compared their expression in prostate tumor vs normal prostate tissue, compared methylation of CpG sites located at these loci in prostate tumor vs normal tissue, and assessed the correlations between the differentiated genes' expression and the methylation of corresponding CpG sites, by analyzing The Cancer Genome Atlas (TCGA) data. We identified 573 genes showing an association with prostate cancer risk at a false discovery rate (FDR) ≤ 0.05, including 451 novel genes and 122 previously reported genes. Of the 573 genes, 152 showed differential expression in prostate tumor vs normal tissue samples. At loci of 57 genes, 151 CpG sites showed differential methylation in prostate tumor vs normal tissue samples. Of these, 20 CpG sites were correlated with expression of 11 corresponding genes. In this TWAS, we identified novel candidate susceptibility genes for prostate cancer risk, providing new insights into prostate cancer genetics and biology.


Subject(s)
Biomarkers, Tumor/genetics , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Prostatic Neoplasms/pathology , Transcriptome , Case-Control Studies , DNA Methylation , Follow-Up Studies , Genome-Wide Association Study , Humans , Male , Prognosis , Prostatic Neoplasms/epidemiology , Prostatic Neoplasms/genetics , Quantitative Trait Loci , United States/epidemiology
9.
Pharmacogenet Genomics ; 32(4): 144-151, 2022 06 01.
Article in English | MEDLINE | ID: mdl-35383711

ABSTRACT

OBJECTIVE: Prostacyclin infusion for pulmonary arterial hypertension (PAH) is an effective therapy with varied dosing requirements and clinical response. The major aim of this study was to determine new biologically based predictors of prostacyclin treatment response heterogeneity. METHODS: Ninety-eight patients with hemodynamically defined PAH at two academic medical centers volunteered for registry studies. A stable dose of treprostinil was the quantitative phenotype for the genome-wide association study (GWAS). Candidate genes with the largest effect sizes and strongest statistical associations were further characterized with in silico and in vitro assays to confirm mechanistic hypotheses. The clinical significance of these candidate predictors was assessed for mechanistically consistent physiologic effects in an independent cohort of patients. RESULTS: GWAS identified three loci for association with P < 10⁻⁶. All three loci had clinically significant effect sizes. Specific single-nucleotide polymorphisms (SNPs) at two of the loci, rs11078738 in phosphoribosylformylglycinamidine synthase and rs10023113 in CAMK2D, encoded sequence changes with clear predicted consequences. Production of the primary mediator of prostacyclin-induced vasodilation, cyclic AMP, was reduced in human cell lines by the missense variant rs11078738 (p.L621P). Located in the promoter of CAMK2D, the allele of rs10023113 associated with a higher treprostinil dose has higher ventricular transcription of CAMK2δ. At initial diagnostic catheterization in a separate cohort of patients, the same allele of rs10023113 was associated with elevated right mean atrial and ventricular diastolic pressures. CONCLUSIONS: The quantitative phenotype of stable treprostinil dose identified two gene loci associated with pharmacodynamic response and right ventricular function in PAH worth further investigation.


Subject(s)
Hypertension, Pulmonary , Pulmonary Arterial Hypertension , Antihypertensive Agents , Epoprostenol/analogs & derivatives , Epoprostenol/therapeutic use , Familial Primary Pulmonary Hypertension/drug therapy , Genome-Wide Association Study , Humans , Hypertension, Pulmonary/diagnosis , Hypertension, Pulmonary/drug therapy , Hypertension, Pulmonary/genetics
10.
Am J Hum Genet ; 104(6): 1097-1115, 2019 06 06.
Article in English | MEDLINE | ID: mdl-31104770

ABSTRACT

Understanding the nature of the genetic regulation of gene expression promises to advance our understanding of the genetic basis of disease. However, the methodological impact of the use of local ancestry on high-dimensional omics analyses, including, most prominently, expression quantitative trait loci (eQTL) mapping and trait heritability estimation, in admixed populations remains critically underexplored. Here, we develop a statistical framework that characterizes the relationships among the determinants of the genetic architecture of an important class of molecular traits. We provide a computationally efficient approach to local ancestry analysis in eQTL mapping while increasing control of type I and type II error over traditional approaches. Applying our method to National Institute of General Medical Sciences (NIGMS) and Genotype-Tissue Expression (GTEx) datasets, we show that the use of local ancestry can improve eQTL mapping in admixed and multiethnic populations, respectively. We estimate the trait variance explained by ancestry by using local admixture relatedness between individuals. By using simulations of diverse genetic architectures and degrees of confounding, we show improved accuracy in estimating heritability when accounting for local ancestry similarity. Furthermore, we characterize the sparse versus polygenic components of gene expression in admixed individuals. Our study has important methodological implications for genetic analysis of omics traits across a range of genomic contexts, from a single variant to a prioritized region to the entire genome. Our findings highlight the importance of using local ancestry to better characterize the heritability of complex traits and to more accurately map genetic associations.


Subject(s)
Ethnicity/genetics , Gene Expression Regulation , Genetics, Population , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Humans , Linkage Disequilibrium , Models, Genetic , Phenotype
11.
Am J Hum Genet ; 104(3): 503-519, 2019 03 07.
Article in English | MEDLINE | ID: mdl-30827500

ABSTRACT

Although the use of model systems for studying the mechanism of mutations that have a large effect is common, we highlight here the ways that zebrafish-model-system studies of a gene, GRIK5, that contributes to the polygenic liability to develop eye diseases have helped to illuminate a mechanism that implicates vascular biology in eye disease. A gene-expression prediction derived from a reference transcriptome panel applied to BioVU, a large electronic health record (EHR)-linked biobank at Vanderbilt University Medical Center, implicated reduced GRIK5 expression in diverse eye diseases. We tested the function of GRIK5 by depletion of its ortholog in zebrafish, and we observed reduced blood vessel numbers and integrity in the eye and increased vascular permeability. Analyses of EHRs in >2.6 million Vanderbilt subjects revealed significant comorbidity of eye and vascular diseases (relative risks 2-15); this comorbidity was confirmed in 150 million individuals from a large insurance claims dataset. Subsequent studies in >60,000 genotyped BioVU participants confirmed the association of reduced genetically predicted expression of GRIK5 with comorbid vascular and eye diseases. Our studies pioneer an approach that allows a rapid iteration of the discovery of gene-phenotype relationships to the primary genetic mechanism contributing to the pathophysiology of human disease. Our findings also add dimension to the understanding of the biology driven by glutamate receptors such as GRIK5 (also referred to as GLUK5 in protein form) and to mechanisms contributing to human eye diseases.


Subject(s)
Biological Specimen Banks , Electronic Health Records , Embryo, Nonmammalian/pathology , Eye Diseases/pathology , Gene Expression Regulation , Receptors, Kainic Acid/genetics , Vascular Diseases/pathology , Animals , Embryo, Nonmammalian/metabolism , Eye Diseases/genetics , Eye Diseases/metabolism , Genotype , Humans , Phenomics , Phenotype , Receptors, Kainic Acid/metabolism , Vascular Diseases/genetics , Vascular Diseases/metabolism , Zebrafish
12.
Bioinformatics ; 37(16): 2245-2249, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-33624746

ABSTRACT

MOTIVATION: Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with human traits and diseases, but the exact causal genes are largely unknown. Common genetic risk variants are enriched in non-protein-coding regions of the genome and often affect gene expression (expression quantitative trait loci, eQTL) in a tissue-specific manner. To address this challenge, we developed a methodological framework, E-MAGMA, which converts genome-wide association summary statistics into gene-level statistics by assigning risk variants to their putative genes based on tissue-specific eQTL information. RESULTS: We compared E-MAGMA to three eQTL informed gene-based approaches using simulated phenotype data. Phenotypes were simulated based on eQTL reference data using GCTA for all genes with at least one eQTL at chromosome 1. We performed 10 simulations per gene. The eQTL-h2 (i.e. the proportion of variation explained by the eQTLs) was set at 1%, 2% and 5%. We found E-MAGMA outperforms other gene-based approaches across a range of simulated parameters (e.g. the number of identified causal genes). When applied to genome-wide association summary statistics for five neuropsychiatric disorders, E-MAGMA identified more putative candidate causal genes compared to other eQTL-based approaches. By integrating tissue-specific eQTL information, these results show E-MAGMA will help to identify novel candidate causal genes from genome-wide association summary statistics and thereby improve the understanding of the biological basis of complex disorders. AVAILABILITY AND IMPLEMENTATION: A tutorial and input files are made available in a github repository: https://github.com/eskederks/eMAGMA-tutorial. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

13.
Nature ; 536(7614): 41-47, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27398621

ABSTRACT

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Alleles , DNA Mutational Analysis , Europe/ethnology , Exome , Genome-Wide Association Study , Genotyping Techniques , Humans , Sample Size
14.
PLoS Genet ; 15(7): e1008245, 2019 07.
Article in English | MEDLINE | ID: mdl-31306407

ABSTRACT

Major depression is a common and severe psychiatric disorder with a highly polygenic genetic architecture. Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with major depression, but the exact causal genes and biological mechanisms are largely unknown. Tissue-specific network approaches may identify molecular mechanisms underlying major depression and provide a biological substrate for integrative analyses. We provide a framework for the identification of individual risk genes and gene co-expression networks using genome-wide association summary statistics and gene expression information across multiple human brain tissues and whole blood. We developed a novel gene-based method called eMAGMA that leverages tissue-specific eQTL information to identify 99 biologically plausible risk genes associated with major depression, of which 58 are novel. Among these novel associations is Complement Factor 4A (C4A), recently implicated in schizophrenia through its role in synaptic pruning during postnatal development. Major depression risk genes were enriched in gene co-expression modules in multiple brain tissues and the implicated gene modules contained genes involved in synaptic signalling, neuronal development, and cell transport pathways. Modules enriched with major depression signals were strongly preserved across brain tissues, but were weakly preserved in whole blood, highlighting the importance of using disease-relevant tissues in genetic studies of psychiatric traits. We identified tissue-specific genes and gene co-expression networks associated with major depression. Our novel analytical framework can be used to gain fundamental insights into the functioning of the nervous system in major depression and other brain-related traits.


Subject(s)
Depressive Disorder, Major/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , Genome-Wide Association Study/methods , Brain Chemistry , Complement C4a/genetics , Gene Expression Regulation , Humans , Organ Specificity , Quantitative Trait Loci , Sequence Analysis, RNA
15.
Hum Mol Genet ; 28(7): 1212-1224, 2019 04 01.
Article in English | MEDLINE | ID: mdl-30624610

ABSTRACT

Interpretation of genetic association results is difficult because signals often lack biological context. To generate hypotheses of the functional genetic etiology of complex cardiometabolic traits, we estimated the genetically determined component of gene expression from common variants using PrediXcan (1) and determined genes with differential predicted expression by trait. PrediXcan imputes tissue-specific expression levels from genetic variation using variant-level effect on gene expression in transcriptome data. To explore the value of imputed genetically regulated gene expression (GReX) models across different ancestral populations, we evaluated imputed expression levels for predictive accuracy genome-wide in RNA sequence data in samples drawn from European-ancestry and African-ancestry populations and identified substantial predictive power using European-derived models in a non-European target population. We then tested the association of GReX on 15 cardiometabolic traits including blood lipid levels, body mass index, height, blood pressure, fasting glucose and insulin, RR interval, fibrinogen level, factor VII level and white blood cell and platelet counts in 15 755 individuals across three ancestry groups, resulting in 20 novel gene-phenotype associations reaching experiment-wide significance across ancestries. In addition, we identified 18 significant novel gene-phenotype associations in our ancestry-specific analyses. Top associations were assessed for additional support via query of S-PrediXcan (2) results derived from publicly available genome-wide association studies summary data. Collectively, these findings illustrate the utility of transcriptome-based imputation models for discovery of cardiometabolic effect genes in a diverse dataset.


Subject(s)
Forecasting/methods , Metabolome/genetics , Metabolome/physiology , Adult , Aged , Blood Pressure , Body Mass Index , Chromosome Mapping/methods , Ethnicity/genetics , Female , Genetic Association Studies/methods , Genome-Wide Association Study/methods , Humans , Male , Middle Aged , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Transcriptome/genetics , White People/genetics
16.
Am J Med Genet B Neuropsychiatr Genet ; 186(3): 162-172, 2021 04.
Article in English | MEDLINE | ID: mdl-33369091

ABSTRACT

Genome-wide association studies have identified multiple genetic risk factors underlying susceptibility to substance use; however, the functional genes and biological mechanisms remain poorly understood. The discovery and characterization of risk genes can be facilitated by the integration of genome-wide association data and gene expression data across biologically relevant tissues and/or cell types to identify genes whose expression is altered by DNA sequence variation (expression quantitative trait loci; eQTLs). The integration of gene expression data can be extended to the study of genetic co-expression, under the biologically valid assumption that genes form co-expression networks to influence the manifestation of a disease or trait. Here, we integrate genome-wide association data with gene expression data from 13 brain tissues to identify candidate risk genes for 8 substance use phenotypes. We then test for the enrichment of candidate risk genes within tissue-specific gene co-expression networks to identify modules (or groups) of functionally related genes whose dysregulation is associated with variation in substance use. We identified eight gene modules in brain that were enriched with gene-based association signals for substance use phenotypes. For example, a single module of 40 co-expressed genes was enriched with gene-based associations for drinks per week and biological pathways involved in GABA synthesis, release, reuptake and degradation. Our study demonstrates the utility of eQTL and gene co-expression analysis to uncover novel biological mechanisms for substance use traits.


Subject(s)
Gene Regulatory Networks , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Substance-Related Disorders/genetics , Gene Expression Profiling , Genetic Testing , Genome-Wide Association Study , Humans , Phenotype , Substance-Related Disorders/pathology
17.
Genet Med ; 22(7): 1191-1200, 2020 07.
Article in English | MEDLINE | ID: mdl-32296164

ABSTRACT

PURPOSE: The increasing use of electronic health records (EHRs) and biobanks offers unique opportunities to study Mendelian diseases. We described a novel approach to summarize clinical manifestations from patient EHRs into phenotypic evidence for cystic fibrosis (CF) with potential to alert unrecognized patients of the disease. METHODS: We estimated genetically predicted expression (GReX) of cystic fibrosis transmembrane conductance regulator (CFTR) and tested for association with clinical diagnoses in the Vanderbilt University biobank (N = 9142 persons of European descent with 71 cases of CF). The top associated EHR phenotypes were assessed in combination as a phenotype risk score (PheRS) for discriminating CF case status in an additional 2.8 million patients from Vanderbilt University Medical Center (VUMC) and 125,305 adult patients including 25,314 CF cases from MarketScan, an independent external cohort. RESULTS: GReX of CFTR was associated with EHR phenotypes consistent with CF. PheRS constructed using the EHR phenotypes and weights discovered by the genetic associations improved discriminative power for CF over the initially proposed PheRS in both VUMC and MarketScan. CONCLUSION: Our study demonstrates the power of EHRs for clinical description of CF and the benefits of using a genetics-informed weighting scheme in construction of a phenotype risk score. This research may find broad applications for phenomic studies of Mendelian disease genes.


Subject(s)
Cystic Fibrosis , Adult , Cystic Fibrosis/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Electronic Health Records , Humans , Mutation , Phenotype
18.
Nature ; 507(7492): 371-5, 2014 Mar 20.
Article in English | MEDLINE | ID: mdl-24646999

ABSTRACT

Genome-wide association studies (GWAS) have reproducibly associated variants within introns of FTO with increased risk for obesity and type 2 diabetes (T2D). Although the molecular mechanisms linking these noncoding variants with obesity are not immediately obvious, subsequent studies in mice demonstrated that FTO expression levels influence body mass and composition phenotypes. However, no direct connection between the obesity-associated variants and FTO expression or function has been made. Here we show that the obesity-associated noncoding sequences within FTO are functionally connected, at megabase distances, with the homeobox gene IRX3. The obesity-associated FTO region directly interacts with the promoters of IRX3 as well as FTO in the human, mouse and zebrafish genomes. Furthermore, long-range enhancers within this region recapitulate aspects of IRX3 expression, suggesting that the obesity-associated interval belongs to the regulatory landscape of IRX3. Consistent with this, obesity-associated single nucleotide polymorphisms are associated with expression of IRX3, but not FTO, in human brains. A direct link between IRX3 expression and regulation of body mass and composition is demonstrated by a reduction in body weight of 25 to 30% in Irx3-deficient mice, primarily through the loss of fat mass and increase in basal metabolic rate with browning of white adipose tissue. Finally, hypothalamic expression of a dominant-negative form of Irx3 reproduces the metabolic phenotypes of Irx3-deficient mice. Our data suggest that IRX3 is a functional long-range target of obesity-associated variants within FTO and represents a novel determinant of body mass and composition.


Subject(s)
Homeodomain Proteins/genetics , Introns/genetics , Mixed Function Oxygenases/genetics , Obesity/genetics , Oxo-Acid-Lyases/genetics , Proteins/genetics , Transcription Factors/genetics , Adipose Tissue/metabolism , Alpha-Ketoglutarate-Dependent Dioxygenase FTO , Animals , Basal Metabolism/genetics , Body Mass Index , Body Weight/genetics , Brain/metabolism , Diabetes Mellitus, Type 2/genetics , Diet , Genes, Dominant/genetics , Homeodomain Proteins/metabolism , Humans , Hypothalamus/metabolism , Male , Mice , Phenotype , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , Thinness/genetics , Transcription Factors/deficiency , Transcription Factors/metabolism , Zebrafish/embryology , Zebrafish/genetics
19.
Genet Epidemiol ; 42(1): 49-63, 2018 02.
Article in English | MEDLINE | ID: mdl-29114909

ABSTRACT

BACKGROUND: Epistasis and gene-environment interactions are known to contribute significantly to variation of complex phenotypes in model organisms. However, their identification in human association studies remains challenging for myriad reasons. In the case of epistatic interactions, the large number of potential interacting sets of genes presents computational, multiple hypothesis correction, and other statistical power issues. In the case of gene-environment interactions, the lack of consistently measured environmental covariates in most disease studies precludes searching for interactions and creates difficulties for replicating studies. RESULTS: In this work, we develop a new statistical approach to address these issues that leverages genetic ancestry, defined as the proportion of ancestry derived from each ancestral population (e.g., the fraction of European/African ancestry in African Americans), in admixed populations. We applied our method to gene expression and methylation data from African American and Latino admixed individuals, respectively, identifying nine interactions that were significant at P < 5 × 10⁻⁸. We show that two of the interactions in methylation data replicate, and the remaining six are significantly enriched for low P-values (P < 1.8 × 10⁻⁶). CONCLUSION: We show that genetic ancestry can be a useful proxy for unknown and unmeasured covariates in the search for interaction effects. These results have important implications for our understanding of the genetic architecture of complex traits.


Subject(s)
Black People/genetics , Black or African American/genetics , Epistasis, Genetic/genetics , Gene-Environment Interaction , Hispanic or Latino/genetics , Models, Genetic , White People/genetics , DNA Methylation , Humans , Phenotype
20.
Am J Hum Genet ; 98(4): 697-708, 2016 Apr 07.
Article in English | MEDLINE | ID: mdl-27040689

ABSTRACT

Gene expression and its regulation can vary substantially across tissue types. In order to generate knowledge about gene expression in human tissues, the Genotype-Tissue Expression (GTEx) program has collected transcriptome data in a wide variety of tissue types from post-mortem donors. However, many tissue types are difficult to access and are not collected in every GTEx individual. Furthermore, in non-GTEx studies, the accessibility of certain tissue types greatly limits the feasibility and scale of studies of multi-tissue expression. In this work, we developed multi-tissue imputation methods to impute gene expression in uncollected or inaccessible tissues. Via simulation studies, we showed that the proposed methods outperform existing imputation methods in multi-tissue expression imputation and that incorporating imputed expression data can improve power to detect phenotype-expression correlations. By analyzing data from nine selected tissue types in the GTEx pilot project, we demonstrated that harnessing expression quantitative trait loci (eQTLs) and tissue-tissue expression-level correlations can aid imputation of transcriptome data from uncollected GTEx tissues. More importantly, we showed that by using GTEx data as a reference, one can impute expression levels in inaccessible tissues in non-GTEx expression studies.


Subject(s)
Gene Expression Regulation , Genotype , Quantitative Trait Loci , Transcriptome , Humans , Phenotype , Pilot Projects , Reproducibility of Results