Results 1 - 20 of 225
1.
Nature ; 628(8006): 130-138, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38448586

ABSTRACT

Genome-wide association analyses using high-throughput metabolomics platforms have led to novel insights into the biology of human metabolism [1-7]. This detailed knowledge of the genetic determinants of systemic metabolism has been pivotal for uncovering how genetic pathways influence biological mechanisms and complex diseases [8-11]. Here we present a genome-wide association study for 233 circulating metabolic traits quantified by nuclear magnetic resonance spectroscopy in up to 136,016 participants from 33 cohorts. We identify more than 400 independent loci and assign probable causal genes at two-thirds of these using manual curation of plausible biological candidates. We highlight the importance of sample and participant characteristics that can have significant effects on genetic associations. We use detailed metabolic profiling of lipoprotein- and lipid-associated variants to better characterize how known lipid loci and novel loci affect lipoprotein metabolism at a granular level. We demonstrate the translational utility of comprehensively phenotyped molecular data, characterizing the metabolic associations of intrahepatic cholestasis of pregnancy. Finally, we observe substantial genetic pleiotropy for multiple metabolic pathways and illustrate the importance of careful instrument selection in Mendelian randomization analysis, revealing a putative causal relationship between acetone and hypertension. Our publicly available results provide a foundational resource for the community to examine the role of metabolism across diverse diseases.


Subject(s)
Biomarkers , Genome-Wide Association Study , Metabolomics , Female , Humans , Pregnancy , Acetone/blood , Acetone/metabolism , Biomarkers/blood , Biomarkers/metabolism , Cholestasis, Intrahepatic/blood , Cholestasis, Intrahepatic/genetics , Cholestasis, Intrahepatic/metabolism , Cohort Studies , Genome-Wide Association Study/methods , Hypertension/blood , Hypertension/genetics , Hypertension/metabolism , Lipoproteins/genetics , Lipoproteins/metabolism , Magnetic Resonance Spectroscopy , Mendelian Randomization Analysis , Metabolic Networks and Pathways/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Pregnancy Complications/blood , Pregnancy Complications/genetics , Pregnancy Complications/metabolism
2.
Am J Hum Genet ; 111(3): 428, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38458165

ABSTRACT

This article is based on the address given by the author at the 2023 meeting of the American Society of Human Genetics (ASHG) in Washington, D.C. The video of the original address can be found at the ASHG website.


Subject(s)
Awards and Prizes , Genetics, Medical , United States , Humans , Leadership
3.
Nature ; 590(7845): 290-299, 2021 02.
Article in English | MEDLINE | ID: mdl-33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics , National Heart, Lung, and Blood Institute (U.S.) , Precision Medicine , Cytochrome P-450 CYP2D6/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation , Loss of Function Mutation , Mutagenesis , Phenotype , Polymorphism, Single Nucleotide , Population Density , Precision Medicine/standards , Quality Control , Sample Size , United States , Whole Genome Sequencing/standards
4.
Nature ; 582(7813): 577-581, 2020 06.
Article in English | MEDLINE | ID: mdl-32499649

ABSTRACT

Many common illnesses, for reasons that have not been identified, differentially affect men and women. For instance, the autoimmune diseases systemic lupus erythematosus (SLE) and Sjögren's syndrome affect nine times more women than men [1], whereas schizophrenia affects men with greater frequency and severity relative to women [2]. All three illnesses have their strongest common genetic associations in the major histocompatibility complex (MHC) locus, an association that in SLE and Sjögren's syndrome has long been thought to arise from alleles of the human leukocyte antigen (HLA) genes at that locus [3-6]. Here we show that variation of the complement component 4 (C4) genes C4A and C4B, which are also at the MHC locus and have been linked to increased risk for schizophrenia [7], generates 7-fold variation in risk for SLE and 16-fold variation in risk for Sjögren's syndrome among individuals with common C4 genotypes, with C4A protecting more strongly than C4B in both illnesses. The same alleles that increase risk for schizophrenia greatly reduce risk for SLE and Sjögren's syndrome. In all three illnesses, C4 alleles act more strongly in men than in women: common combinations of C4A and C4B generated 14-fold variation in risk for SLE, 31-fold variation in risk for Sjögren's syndrome, and 1.7-fold variation in schizophrenia risk among men (versus 6-fold, 15-fold and 1.26-fold variation in risk among women, respectively). At a protein level, both C4 and its effector C3 were present at higher levels in cerebrospinal fluid and plasma [8,9] in men than in women among adults aged between 20 and 50 years, corresponding to the ages of differential disease vulnerability. Sex differences in complement protein levels may help to explain the more potent effects of C4 alleles in men, women's greater risk of SLE and Sjögren's syndrome and men's greater vulnerability to schizophrenia. These results implicate the complement system as a source of sexual dimorphism in vulnerability to diverse illnesses.


Subject(s)
Complement C3/genetics , Complement C4/genetics , Lupus Erythematosus, Systemic/genetics , Sex Characteristics , Sjogren's Syndrome/genetics , Adult , Alleles , Complement C3/analysis , Complement C3/cerebrospinal fluid , Complement C4/analysis , Complement C4/cerebrospinal fluid , Female , Genetic Predisposition to Disease , HLA Antigens/genetics , Haplotypes , Humans , Lupus Erythematosus, Systemic/blood , Lupus Erythematosus, Systemic/cerebrospinal fluid , Major Histocompatibility Complex/genetics , Male , Middle Aged , Sjogren's Syndrome/blood , Sjogren's Syndrome/cerebrospinal fluid , Young Adult
5.
Nature ; 582(7811): 240-245, 2020 06.
Article in English | MEDLINE | ID: mdl-32499647

ABSTRACT

Meta-analyses of genome-wide association studies (GWAS) have identified more than 240 loci that are associated with type 2 diabetes (T2D) [1,2]; however, most of these loci have been identified in analyses of individuals with European ancestry. Here, to examine T2D risk in East Asian individuals, we carried out a meta-analysis of GWAS data from 77,418 individuals with T2D and 356,122 healthy control individuals. In the main analysis, we identified 301 distinct association signals at 183 loci, and across T2D association models with and without consideration of body mass index and sex, we identified 61 loci that are newly implicated in predisposition to T2D. Common variants associated with T2D in both East Asian and European populations exhibited strongly correlated effect sizes. Previously undescribed associations include signals in or near GDAP1, PTF1A, SIX3, ALDH2, a microRNA cluster, and genes that affect the differentiation of muscle and adipose cells [3]. At another locus, expression quantitative trait loci at two overlapping T2D signals affect two genes, NKX6-3 and ANK1, in different tissues [4-6]. Association studies in diverse populations identify additional loci and elucidate disease-associated genes, biology, and pathways.


Subject(s)
Asian People/genetics , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Aldehyde Dehydrogenase, Mitochondrial/genetics , Alleles , Ankyrins/genetics , Body Mass Index , Case-Control Studies , Europe/ethnology , Eye Proteins/genetics , Asia, Eastern/ethnology , Female , Genome-Wide Association Study , Homeodomain Proteins/genetics , Humans , Male , Nerve Tissue Proteins/genetics , RNA, Messenger/analysis , Transcription Factors/genetics , Transcription, Genetic , Homeobox Protein SIX3
6.
Am J Hum Genet ; 109(1): 66-80, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34995504

ABSTRACT

Alternate splicing events can create isoforms that alter gene function, and genetic variants associated with alternate gene isoforms may reveal molecular mechanisms of disease. We used subcutaneous adipose tissue of 426 Finnish men from the METSIM study and identified splice junction quantitative trait loci (sQTLs) for 6,077 splice junctions (FDR < 1%). In the same individuals, we detected expression QTLs (eQTLs) for 59,443 exons and 15,397 genes (FDR < 1%). We identified 595 genes with an sQTL and exon eQTL but no gene eQTL, which could indicate potential isoform differences. Of the significant sQTL signals, 2,114 (39.8%) included at least one proxy variant (linkage disequilibrium r² > 0.8) located within an intron spanned by the splice junction. We identified 203 sQTLs that colocalized with 141 genome-wide association study (GWAS) signals for cardiometabolic traits, including 25 signals for lipid traits, 24 signals for body mass index (BMI), and 12 signals for waist-hip ratio adjusted for BMI. Among all 141 GWAS signals colocalized with an sQTL, we detected 26 that also colocalized with an exon eQTL for an exon skipped by the sQTL splice junction. At a GWAS signal for high-density lipoprotein cholesterol colocalized with an NR1H3 sQTL splice junction, we show that the alternative splice product encodes an NR1H3 transcription factor that lacks a DNA binding domain and fails to activate transcription. Together, these results detect splicing events and candidate mechanisms that may contribute to gene function at GWAS loci.


Subject(s)
Alternative Splicing , Cardiometabolic Risk Factors , Gene Expression Regulation , Quantitative Trait Loci , Quantitative Trait, Heritable , Subcutaneous Fat/metabolism , Binding Sites , Cardiovascular Diseases/etiology , Cardiovascular Diseases/metabolism , Computational Biology/methods , Exons , Finland , Genes, Reporter , Genetic Association Studies , Genetic Predisposition to Disease , Genetics, Population , Genome-Wide Association Study/methods , High-Throughput Nucleotide Sequencing , Humans , Liver X Receptors/genetics , Male , Metabolic Syndrome/etiology , Metabolic Syndrome/metabolism , Molecular Sequence Annotation , Phenotype , Protein Isoforms/genetics , RNA Splice Sites , RNA-Binding Proteins
7.
Am J Hum Genet ; 109(9): 1653-1666, 2022 09 01.
Article in English | MEDLINE | ID: mdl-35981533

ABSTRACT

Understanding the genetic basis of human diseases and traits is dependent on the identification and accurate genotyping of genetic variants. Deep whole-genome sequencing (WGS), the gold standard technology for SNP and indel identification and genotyping, remains very expensive for most large studies. Here, we quantify the extent to which array genotyping followed by genotype imputation can approximate WGS in studies of individuals of African, Hispanic/Latino, and European ancestry in the US and of Finnish ancestry in Finland (a population isolate). For each study, we performed genotype imputation by using the genetic variants present on the Illumina Core, OmniExpress, MEGA, and Omni 2.5M arrays with the 1000G, HRC, and TOPMed imputation reference panels. Using the Omni 2.5M array and the TOPMed panel, ≥90% of bi-allelic single-nucleotide variants (SNVs) are well imputed (r² > 0.8) down to minor-allele frequencies (MAFs) of 0.14% in African, 0.11% in Hispanic/Latino, 0.35% in European, and 0.85% in Finnish ancestries. There was little difference in TOPMed-based imputation quality among the arrays with >700k variants. Individual-level imputation quality varied widely between and within the three US studies. Imputation quality also varied across genomic regions, producing regions where even common (MAF > 5%) variants were consistently not well imputed across ancestries. The extent to which array genotyping and imputation can approximate WGS therefore depends on reference panel, genotype array, sample ancestry, and genomic location. Imputation quality by variant or genomic region can be queried with our new tool, RsqBrowser, now deployed on the Michigan Imputation Server.
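
The "well imputed" threshold in the abstract above can be illustrated with a small sketch. It assumes that imputation quality is measured as the squared Pearson correlation between imputed dosages and sequenced genotypes; the abstract does not state the exact metric, and the function names and the six-sample data set are hypothetical.

```python
def squared_correlation(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and true genotypes."""
    n = len(true_genotypes)
    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_genotypes) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, true_genotypes))
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in true_genotypes)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either vector has no variation
    return sxy * sxy / (sxx * syy)


def minor_allele_frequency(genotypes):
    """Minor-allele frequency from genotypes coded as 0, 1 or 2 alternate alleles."""
    allele_frequency = sum(genotypes) / (2 * len(genotypes))
    return min(allele_frequency, 1 - allele_frequency)


# Hypothetical data for one variant in six samples (illustration only).
true_genotypes = [0, 1, 2, 0, 1, 0]
imputed_dosages = [0.1, 0.9, 1.8, 0.0, 1.2, 0.2]

r2 = squared_correlation(imputed_dosages, true_genotypes)
maf = minor_allele_frequency(true_genotypes)
well_imputed = r2 > 0.8  # threshold quoted in the abstract
print(f"MAF = {maf:.3f}, r2 = {r2:.3f}, well imputed: {well_imputed}")
```

In practice such a statistic would be computed per variant and then summarised within MAF bins, which is how a statement like "≥90% of variants are well imputed down to a given MAF" could be assembled.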


Subject(s)
High-Throughput Nucleotide Sequencing , Polymorphism, Single Nucleotide , Gene Frequency/genetics , Genome-Wide Association Study , Genotype , Humans , Polymorphism, Single Nucleotide/genetics , Whole Genome Sequencing
8.
Am J Hum Genet ; 109(10): 1727-1741, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36055244

ABSTRACT

Transcriptomics data have been integrated with genome-wide association studies (GWASs) to help understand disease/trait molecular mechanisms. The utility of metabolomics, integrated with transcriptomics and disease GWASs, to understand molecular mechanisms for metabolite levels or diseases has not been thoroughly evaluated. We performed probabilistic transcriptome-wide association and locus-level colocalization analyses to integrate transcriptomics results for 49 tissues in 706 individuals from the GTEx project, metabolomics results for 1,391 plasma metabolites in 6,136 Finnish men from the METSIM study, and GWAS results for 2,861 disease traits in 260,405 Finnish individuals from the FinnGen study. We found that genetic variants that regulate metabolite levels were more likely to influence gene expression and disease risk compared to the ones that do not. Integrating transcriptomics with metabolomics results prioritized 397 genes for 521 metabolites, including 496 previously identified gene-metabolite pairs with strong functional connections and suggested 33.3% of such gene-metabolite pairs shared the same causal variants with genetic associations of gene expression. Integrating transcriptomics and metabolomics individually with FinnGen GWAS results identified 1,597 genes for 790 disease traits. Integrating transcriptomics and metabolomics jointly with FinnGen GWAS results helped pinpoint metabolic pathways from genes to diseases. We identified putative causal effects of UGT1A1/UGT1A4 expression on gallbladder disorders through regulating plasma (E,E)-bilirubin levels, of SLC22A5 expression on nasal polyps and plasma carnitine levels through distinct pathways, and of LIPC expression on age-related macular degeneration through glycerophospholipid metabolic pathways. Our study highlights the power of integrating multiple sets of molecular traits and GWAS results to deepen understanding of disease pathophysiology.


Subject(s)
Genome-Wide Association Study , Transcriptome , Bilirubin , Carnitine , Glycerophospholipids , Humans , Male , Metabolomics , Quantitative Trait Loci/genetics , Solute Carrier Family 22 Member 5/genetics , Transcriptome/genetics
10.
Nature ; 572(7769): 323-328, 2019 08.
Article in English | MEDLINE | ID: mdl-31367044

ABSTRACT

Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power.


Subject(s)
Exome Sequencing , Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Alleles , Cholesterol, HDL/genetics , Cluster Analysis , Endpoint Determination , Finland , Geographic Mapping , Humans , Multifactorial Inheritance/genetics , Reproducibility of Results
11.
Genet Epidemiol ; 47(4): 303-313, 2023 06.
Article in English | MEDLINE | ID: mdl-36821788

ABSTRACT

Polygenic risk scores (PRS) quantify the genetic liability to disease and are calculated using an individual's genotype profile and disease-specific genome-wide association study (GWAS) summary statistics. Type 1 (T1D) and type 2 (T2D) diabetes both are determined in part by genetic loci. Correctly differentiating between types of diabetes is crucial for accurate diagnosis and treatment. PRS have the potential to address possible misclassification of T1D and T2D. Here we evaluated PRS models for T1D and T2D in European genetic ancestry participants from the UK Biobank (UKB) and then in the Michigan Genomics Initiative (MGI). Specifically, we investigated the utility of T1D and T2D PRS to discriminate between T1D, T2D, and controls in unrelated UKB individuals of European ancestry. We derived PRS models using external non-UKB GWAS. The T1D PRS model with the best discrimination between T1D cases and controls (area under the receiver operator curve [AUC] = 0.805) also yielded the best discrimination of T1D from T2D cases in the UKB (AUC = 0.792) and separation in MGI (AUC = 0.686). In contrast, the best T2D model did not discriminate between T1D and T2D cases (AUC = 0.527). Our analysis suggests that a T1D PRS model based on independent single nucleotide polymorphisms may help differentiate between T1D, T2D, and controls in individuals of European genetic ancestry.
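
As a rough illustration of the two quantities this abstract relies on, the sketch below computes a PRS as a weighted sum of effect-allele dosages and an AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. The weights, genotypes, and function names are hypothetical and are not taken from the study's models.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-variant weights (e.g. log odds ratios from GWAS summary
# statistics) and genotypes; values are for illustration only.
weights = [0.5, -0.25, 1.0]
case_genotypes = [[2, 0, 1], [1, 0, 2]]
control_genotypes = [[0, 2, 0], [1, 1, 0], [2, 0, 1]]

case_scores = [polygenic_risk_score(g, weights) for g in case_genotypes]
control_scores = [polygenic_risk_score(g, weights) for g in control_genotypes]
discrimination = auc(case_scores, control_scores)
print(case_scores, control_scores, round(discrimination, 3))
```

An AUC of 0.5 corresponds to no discrimination, which is why the reported value of 0.527 for the T2D model is read as a failure to separate T1D from T2D cases.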


Subject(s)
Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 1/genetics , Genome-Wide Association Study , Genetic Predisposition to Disease , Models, Genetic , Risk Factors , Multifactorial Inheritance/genetics
12.
Am J Hum Genet ; 108(4): 669-681, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33730541

ABSTRACT

Tests of association between a phenotype and a set of genes in a biological pathway can provide insights into the genetic architecture of complex phenotypes beyond those obtained from single-variant or single-gene association analysis. However, most existing gene set tests have limited power to detect gene set-phenotype association when a small fraction of the genes are associated with the phenotype and cannot identify the potentially "active" genes that might drive a gene set-based association. To address these issues, we have developed Gene set analysis Association Using Sparse Signals (GAUSS), a method for gene set association analysis that requires only GWAS summary statistics. For each significantly associated gene set, GAUSS identifies the subset of genes that have the maximal evidence of association and can best account for the gene set association. Using pre-computed correlation structure among test statistics from a reference panel, our p value calculation is substantially faster than other permutation- or simulation-based approaches. In simulations with varying proportions of causal genes, we find that GAUSS effectively controls type 1 error rate and has greater power than several existing methods, particularly when a small proportion of genes account for the gene set signal. Using GAUSS, we analyzed UK Biobank GWAS summary statistics for 10,679 gene sets and 1,403 binary phenotypes. We found that GAUSS is scalable and identified 13,466 phenotype and gene set association pairs. Within these gene sets, we identify an average of 17.2 (max = 405) genes that underlie these gene set associations.


Subject(s)
Biological Specimen Banks , Data Interpretation, Statistical , Databases, Genetic , Datasets as Topic , Genome-Wide Association Study/methods , Phenotype , ATP-Binding Cassette Transporters/genetics , Computer Simulation , Gene Expression/genetics , Humans , Research Design , Time Factors , United Kingdom , Web Browser
13.
Am J Hum Genet ; 108(4): 583-596, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33798444

ABSTRACT

The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10⁻⁵⁴) and is also associated with increased levels of total cholesterol (p = 1.22 × 10⁻²⁸) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10⁻²¹) and alanine (p = 6.14 × 10⁻¹²) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10⁻¹⁰), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10⁻³⁵). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.


Subject(s)
Cardiovascular Diseases/genetics , Genomic Structural Variation/genetics , Alleles , Cholesterol/blood , DNA Copy Number Variations/genetics , Female , Finland , Genome, Human/genetics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Male , Mitochondrial Proteins/genetics , Promoter Regions, Genetic/genetics , Pyruvate Dehydrogenase (Lipoamide)-Phosphatase/genetics , Pyruvic Acid/metabolism , Serum Albumin, Human/genetics
14.
Diabetologia ; 66(8): 1472-1480, 2023 08.
Article in English | MEDLINE | ID: mdl-37280435

ABSTRACT

AIMS/HYPOTHESIS: Determining how high BMI at different time points influences the risk of developing type 2 diabetes and affects insulin secretion and insulin sensitivity is critical. METHODS: By estimating childhood BMI in 441,761 individuals in the UK Biobank, we identified which genetic variants had larger effects on adulthood BMI than on childhood BMI, and vice versa. All genome-wide significant genetic variants were then used to separate the independent genetic effects of high childhood BMI from those of high adulthood BMI on the risk of type 2 diabetes and insulin-related phenotypes using Mendelian randomisation. We performed two-sample MR using external studies of type 2 diabetes, and oral and intravenous measures of insulin secretion and sensitivity. RESULTS: We found that a childhood BMI that was one standard deviation (1.97 kg/m²) higher than the mean, corrected for the independent genetic liability to adulthood BMI, was associated with a protective effect for seven measures of insulin sensitivity and secretion, including increased insulin sensitivity index (β=0.15; 95% CI 0.067, 0.225; p=2.79×10⁻⁴) and reduced fasting glucose levels (β=-0.053; 95% CI -0.089, -0.017; p=4.31×10⁻³). However, there was little to no evidence of a direct protective effect on type 2 diabetes (OR 0.94; 95% CI 0.85, 1.04; p=0.228) independently of genetic liability to adulthood BMI. CONCLUSIONS/INTERPRETATION: Our results provide evidence of the protective effect of higher childhood BMI on insulin secretion and sensitivity, which are crucial intermediate diabetes traits. However, we stress that our results should not currently lead to any change in public health or clinical practice, given the uncertainty regarding the biological pathway of these effects and the limitations of this type of study.
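
For readers unfamiliar with two-sample MR, a minimal sketch of the standard fixed-effect inverse-variance weighted (IVW) estimator is given here. It is a simplified univariable illustration and should not be read as the study's analysis, which separates childhood from adulthood effects; the summary statistics and function names are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each variant contributes a ratio beta_outcome / beta_exposure, weighted by
    beta_exposure**2 / se_outcome**2.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def confidence_interval_95(estimate, standard_error):
    """Normal-approximation 95% confidence interval."""
    half_width = 1.96 * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical per-variant effects on the exposure (BMI) and on the outcome,
# with outcome standard errors; values are for illustration only.
beta_bmi = [0.10, 0.20, 0.30]
beta_trait = [0.05, 0.10, 0.15]
se_trait = [0.01, 0.01, 0.01]

estimate, standard_error = ivw_estimate(beta_bmi, beta_trait, se_trait)
low, high = confidence_interval_95(estimate, standard_error)
print(f"IVW estimate = {estimate:.3f} (95% CI {low:.3f}, {high:.3f})")
```

Each reported effect in the abstract pairs a point estimate with a 95% CI in this same form; an interval for an odds ratio that spans 1, as for type 2 diabetes, indicates little evidence of an effect.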


Subject(s)
Diabetes Mellitus, Type 2 , Insulin Resistance , Humans , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Insulin Resistance/genetics , Body Mass Index , Phenotype , Insulin/genetics , Mendelian Randomization Analysis , Genome-Wide Association Study , Polymorphism, Single Nucleotide
15.
Genome Res ; 30(2): 185-194, 2020 02.
Article in English | MEDLINE | ID: mdl-31980570

ABSTRACT

Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each individual in early stage of sequence analysis is impractical or even impossible for large-scale sequencing centers that simultaneously process samples from multiple studies across diverse populations. On the other hand, incorrectly specified allele frequencies may result in substantial bias in estimated contamination rates. For example, we observed that existing methods often fail to identify 10% contaminated samples at a typical 3% contamination exclusion threshold when genetic ancestry is misspecified. Such an incomplete screening of contaminated samples substantially inflates the estimated rate of genotyping errors even in deeply sequenced genomes and exomes. We propose a robust statistical method that accurately estimates DNA contamination and is agnostic to genetic ancestry of the intended or contaminating sample. Our method integrates the estimation of genetic ancestry and DNA contamination in a unified likelihood framework by leveraging individual-specific allele frequencies projected from reference genotypes onto principal component coordinates. Our method can also be used for estimating genetic ancestries, similar to LASER or TRACE, but simultaneously accounting for potential contamination. We demonstrate that our method robustly estimates contamination rates and genetic ancestries across populations and contamination scenarios. We further demonstrate that, in the presence of contamination, genetic ancestry inference can be substantially biased with existing methods that ignore contamination, while our method corrects for such biases.


Subject(s)
DNA Contamination , DNA/genetics , Genotype , Genotyping Techniques/standards , Alleles , Exome/genetics , Gene Frequency/genetics , Genetics, Population , Humans , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA
16.
Bioinformatics ; 38(2): 559-561, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34459872

ABSTRACT

SUMMARY: Expression quantitative trait loci (eQTLs) characterize the associations between genetic variation and gene expression to provide insights into tissue-specific gene regulation. Interactive visualization of tissue-specific eQTLs or splice QTLs (sQTLs) can facilitate our understanding of functional variants relevant to disease-related traits. However, combining the multi-dimensional nature of eQTLs/sQTLs into a concise and informative visualization is challenging. Existing QTL visualization tools provide useful ways to summarize the unprecedented scale of transcriptomic data but are not necessarily tailored to answer questions about the functional interpretations of trait-associated variants or other variants of interest. We developed FIVEx, an interactive eQTL/sQTL browser with an intuitive interface tailored to the functional interpretation of associated variants. It features the ability to navigate seamlessly between different data views while providing relevant tissue- and locus-specific information to offer users a better understanding of population-scale multi-tissue transcriptomic profiles. Our implementation of the FIVEx browser on the EBI eQTL catalogue, encompassing 16 publicly available RNA-seq studies, provides important insights for understanding potential tissue-specific regulatory mechanisms underlying trait-associated signals. AVAILABILITY AND IMPLEMENTATION: A FIVEx instance visualizing EBI eQTL catalogue data can be found at https://fivex.sph.umich.edu. Its source code is open source under an MIT license at https://github.com/statgen/fivex. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Genome-Wide Association Study/methods , Gene Expression Profiling/methods , Software , Transcriptome
17.
PLoS Genet ; 16(12): e1009060, 2020 12.
Article in English | MEDLINE | ID: mdl-33320851

ABSTRACT

Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early gene-based test applications often focused on rare coding variants; a more recent wave of gene-based methods, e.g. TWAS, use eQTLs to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and an analysis of 128 traits from the UK Biobank, and find that incorporating heterogeneous annotations in gene-based association analysis increases power and performance identifying causal genes.


Subject(s)
Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Algorithms , Genome-Wide Association Study/standards , Humans , Molecular Sequence Annotation/standards , Polymorphism, Genetic , Quantitative Trait Loci , Reproducibility of Results
18.
PLoS Genet ; 16(9): e1009019, 2020 09.
Article in English | MEDLINE | ID: mdl-32915782

ABSTRACT

Loci identified in genome-wide association studies (GWAS) can include multiple distinct association signals. We sought to identify the molecular basis of multiple association signals for adiponectin, a hormone involved in glucose regulation secreted almost exclusively from adipose tissue, identified in the Metabolic Syndrome in Men (METSIM) study. With GWAS data for 9,262 men, four loci were significantly associated with adiponectin: ADIPOQ, CDH13, IRS1, and PBRM1. We performed stepwise conditional analyses to identify distinct association signals, a subset of which are also nearly independent (lead variant pairwise r² < 0.01). Two loci exhibited allelic heterogeneity, ADIPOQ and CDH13. Of seven association signals at the ADIPOQ locus, two signals colocalized with adipose tissue expression quantitative trait loci (eQTLs) for three transcripts: trait-increasing alleles at one signal were associated with increased ADIPOQ and LINC02043, while trait-increasing alleles at the other signal were associated with decreased ADIPOQ-AS1. In reporter assays, adiponectin-increasing alleles at two signals showed corresponding directions of effect on transcriptional activity. Putative mechanisms for the seven ADIPOQ signals include a missense variant (ADIPOQ G90S), a splice variant, a promoter variant, and four enhancer variants. Of two association signals at the CDH13 locus, the first signal consisted of promoter variants, including the lead adipose tissue eQTL variant for CDH13, while a second signal included a distal intron 1 enhancer variant that showed ~2-fold allelic differences in transcriptional reporter activity. Fine-mapping and experimental validation demonstrated that multiple, distinct association signals at these loci can influence multiple transcripts through multiple molecular mechanisms.


Subject(s)
Adiponectin/genetics , Adiponectin/metabolism , Adipose Tissue/metabolism , Alleles , Cadherins/genetics , Cadherins/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Gene Frequency/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Insulin Receptor Substrate Proteins/genetics , Insulin Receptor Substrate Proteins/metabolism , Male , Metabolic Syndrome/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Regulatory Sequences, Nucleic Acid , Transcription Factors/genetics , Transcription Factors/metabolism
19.
Am J Hum Genet ; 105(4): 773-787, 2019 10 03.
Article in English | MEDLINE | ID: mdl-31564431

ABSTRACT

Genome-wide association studies (GWASs) have identified thousands of genetic loci associated with cardiometabolic traits including type 2 diabetes (T2D), lipid levels, body fat distribution, and adiposity, although most causal genes remain unknown. We used subcutaneous adipose tissue RNA-seq data from 434 Finnish men from the METSIM study to identify 9,687 primary and 2,785 secondary cis-expression quantitative trait loci (eQTL; <1 Mb from TSS, FDR < 1%). Compared to primary eQTL signals, secondary eQTL signals were located further from transcription start sites, had smaller effect sizes, and were less enriched in adipose tissue regulatory elements. Among 2,843 cardiometabolic GWAS signals, 262 colocalized by LD and conditional analysis with 318 transcripts as primary and conditionally distinct secondary cis-eQTLs, including some across ancestries. Of cardiometabolic traits examined for adipose tissue eQTL colocalizations, waist-hip ratio (WHR) and circulating lipid traits had the highest percentage of colocalized eQTLs (15% and 14%, respectively). Among alleles associated with increased cardiometabolic GWAS risk, approximately half (53%) were associated with decreased gene expression level. Mediation analyses of colocalized genes and cardiometabolic traits within the 434 individuals provided further evidence that gene expression influences variant-trait associations. These results identify hundreds of candidate genes that may act in adipose tissue to influence cardiometabolic traits.


Subject(s)
Adipose Tissue/metabolism , Diabetes Mellitus, Type 2/genetics , Gene Expression , Obesity/genetics , Alleles , Body Mass Index , Finland , Genome-Wide Association Study , Humans , Male , Quantitative Trait Loci , Waist-Hip Ratio
20.
Am J Hum Genet ; 105(1): 15-28, 2019 07 03.
Article in English | MEDLINE | ID: mdl-31178129

ABSTRACT

Circulating levels of adiponectin, an adipocyte-secreted protein associated with cardiovascular and metabolic risk, are highly heritable. To gain insights into the biology that regulates adiponectin levels, we performed an exome array meta-analysis of 265,780 genetic variants in 67,739 individuals of European, Hispanic, African American, and East Asian ancestry. We identified 20 loci associated with adiponectin, including 11 that had been reported previously (p < 2 × 10⁻⁷). Comparison of exome array variants to regional linkage disequilibrium (LD) patterns and prior genome-wide association study (GWAS) results detected candidate variants (r² > 0.60) spanning as much as 900 kb. To identify potential genes and mechanisms through which the previously unreported association signals act to affect adiponectin levels, we assessed cross-trait associations, expression quantitative trait loci in subcutaneous adipose, and biological pathways of nearby genes. Eight of the nine loci were also associated (p < 1 × 10⁻⁴) with at least one obesity or lipid trait. Candidate genes include PRKAR2A, PTH1R, and HDAC9, which have been suggested to play roles in adipocyte differentiation or bone marrow adipose tissue. Taken together, these findings provide further insights into the processes that influence circulating adiponectin levels.


Subject(s)
Adiponectin/genetics , Adipose Tissue/pathology , Exome/genetics , Genetic Predisposition to Disease , Lipids/analysis , Obesity/etiology , Polymorphism, Single Nucleotide , Adipose Tissue/metabolism , Adolescent , Adult , Black or African American/genetics , Aged , Aged, 80 and over , Female , Hispanic or Latino/genetics , Humans , Male , Middle Aged , Obesity/pathology , Phenotype , Quantitative Trait Loci , White People/genetics , Young Adult