Search | VHL Regional Portal

1.

Imputation Server PGS: an automated approach to calculate polygenic risk scores on imputation servers.

Forer, Lukas; Taliun, Daniel; LeFaive, Jonathon; Smith, Albert V; Boughton, Andrew P; Coassin, Stefan; Lamina, Claudia; Kronenberg, Florian; Fuchsberger, Christian; Schönherr, Sebastian.

Nucleic Acids Res ; 2024 May 06.

Article in English | MEDLINE | ID: mdl-38709879

ABSTRACT

Polygenic scores (PGS) enable the prediction of genetic predisposition for a wide range of traits and diseases by calculating the weighted sum of allele dosages for genetic variants associated with the trait or disease in question. Present approaches for calculating PGS from genotypes are often inefficient and labor-intensive, limiting transferability into clinical applications. Here, we present 'Imputation Server PGS', an extension of the Michigan Imputation Server designed to automate a standardized calculation of polygenic scores based on imputed genotypes. This extends the widely used Michigan Imputation Server with new functionality, bringing the simplicity and efficiency of modern imputation to the PGS field. The service currently supports over 4489 published polygenic scores from publicly available repositories and provides extensive quality control, including ancestry estimation to report population stratification. An interactive report empowers users to screen and compare thousands of scores in a fast and intuitive way. Imputation Server PGS provides a user-friendly web service, facilitating the application of polygenic scores to a wide range of genetic studies and is freely available at https://imputationserver.sph.umich.edu.
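The weighted sum of allele dosages described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the Imputation Server PGS implementation: the function name `polygenic_score`, the variant identifiers, and the weights and dosages are all hypothetical, and it assumes that variants missing from the genotype data are simply skipped.

```python
from typing import Mapping, Tuple


def polygenic_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> Tuple[float, int]:
    """Return (score, number of variants used).

    dosages: imputed effect-allele dosage per variant, each in [0, 2].
    weights: per-variant effect weight from a published score file.
    Variants with a weight but no dosage are skipped (an assumption
    made for this sketch).
    """
    score = 0.0
    used = 0
    for variant, weight in weights.items():
        dosage = dosages.get(variant)
        if dosage is None:
            continue  # variant absent from the imputed genotypes
        score += weight * dosage
        used += 1
    return score, used


# Hypothetical score file with three variants and one sample's dosages.
weights = {"rs1": 0.12, "rs2": -0.05, "rs3": 0.30}
dosages = {"rs1": 1.98, "rs2": 0.0, "rs4": 1.0}

score, used = polygenic_score(dosages, weights)
print(f"score={score:.4f} from {used} of {len(weights)} variants")
```

Returning the number of variants actually used alongside the score makes it possible to report coverage, which matters when comparing scores across samples imputed against different reference panels.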

2.

Genetic drivers of heterogeneity in type 2 diabetes pathophysiology.

Suzuki, Ken; Hatzikotoulas, Konstantinos; Southam, Lorraine; Taylor, Henry J; Yin, Xianyong; Lorenz, Kim M; Mandla, Ravi; Huerta-Chagoya, Alicia; Melloni, Giorgio E M; Kanoni, Stavroula; Rayner, Nigel W; Bocher, Ozvan; Arruda, Ana Luiza; Sonehara, Kyuto; Namba, Shinichi; Lee, Simon S K; Preuss, Michael H; Petty, Lauren E; Schroeder, Philip; Vanderwerff, Brett; Kals, Mart; Bragg, Fiona; Lin, Kuang; Guo, Xiuqing; Zhang, Weihua; Yao, Jie; Kim, Young Jin; Graff, Mariaelisa; Takeuchi, Fumihiko; Nano, Jana; Lamri, Amel; Nakatochi, Masahiro; Moon, Sanghoon; Scott, Robert A; Cook, James P; Lee, Jung-Jin; Pan, Ian; Taliun, Daniel; Parra, Esteban J; Chai, Jin-Fang; Bielak, Lawrence F; Tabara, Yasuharu; Hai, Yang; Thorleifsson, Gudmar; Grarup, Niels; Sofer, Tamar; Wuttke, Matthias; Sarnowski, Chloé; Gieger, Christian; Nousome, Darryl.

Nature ; 627(8003): 347-357, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38374256

ABSTRACT

Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes [1,2] and molecular mechanisms that are often specific to cell type [3,4]. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores [5] in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.

Subject(s)

Diabetes Mellitus, Type 2 , Disease Progression , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Adipocytes/metabolism , Chromatin/genetics , Chromatin/metabolism , Coronary Artery Disease/complications , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Diabetes Mellitus, Type 2/physiopathology , Diabetic Nephropathies/complications , Diabetic Nephropathies/genetics , Endothelial Cells/metabolism , Enteroendocrine Cells , Epigenomics , Genetic Predisposition to Disease/genetics , Islets of Langerhans/metabolism , Multifactorial Inheritance/genetics , Peripheral Arterial Disease/complications , Peripheral Arterial Disease/genetics , Single-Cell Analysis

3.

A cost-effective sequencing method for genetic studies combining high-depth whole exome and low-depth whole genome.

Bhérer, Claude; Eveleigh, Robert; Trajanoska, Katerina; St-Cyr, Janick; Paccard, Antoine; Nadukkalam Ravindran, Praveen; Caron, Elizabeth; Bader Asbah, Nimara; McClelland, Peyton; Wei, Clare; Baumgartner, Iris; Schindewolf, Marc; Döring, Yvonne; Perley, Danielle; Lefebvre, François; Lepage, Pierre; Bourgey, Mathieu; Bourque, Guillaume; Ragoussis, Jiannis; Mooser, Vincent; Taliun, Daniel.

NPJ Genom Med ; 9(1): 8, 2024 Feb 07.

Article in English | MEDLINE | ID: mdl-38326393

ABSTRACT

Whole genome sequencing (WGS) at high-depth (30X) allows the accurate discovery of variants in the coding and non-coding DNA regions and helps elucidate the genetic underpinnings of human health and diseases. Yet, due to the prohibitive cost of high-depth WGS, most large-scale genetic association studies use genotyping arrays or high-depth whole exome sequencing (WES). Here we propose a cost-effective method which we call "Whole Exome Genome Sequencing" (WEGS), that combines low-depth WGS and high-depth WES with up to 8 samples pooled and sequenced simultaneously (multiplexed). We experimentally assess the performance of WEGS with four different depth of coverage and sample multiplexing configurations. We show that the optimal WEGS configurations are 1.7-2.0 times cheaper than standard WES (no-plexing), 1.8-2.1 times cheaper than high-depth WGS, reach similar recall and precision rates in detecting coding variants as WES, and capture more population-specific variants in the rest of the genome that are difficult to recover when using genotype imputation methods. We apply WEGS to 862 patients with peripheral artery disease and show that it directly assesses more known disease-associated variants than a typical genotyping array and thousands of non-imputable variants per disease-associated locus.

4.

HLA allele-calling using multi-ancestry whole-exome sequencing from the UK Biobank identifies 129 novel associations in 11 autoimmune diseases.

Butler-Laporte, Guillaume; Farjoun, Joseph; Nakanishi, Tomoko; Lu, Tianyuan; Abner, Erik; Chen, Yiheng; Hultström, Michael; Metspalu, Andres; Milani, Lili; Mägi, Reedik; Nelis, Mari; Hudjashov, Georgi; Yoshiji, Satoshi; Ilboudo, Yann; Liang, Kevin Y H; Su, Chen-Yang; Willet, Julian D S; Esko, Tõnu; Zhou, Sirui; Forgetta, Vincenzo; Taliun, Daniel; Richards, J Brent.

Commun Biol ; 6(1): 1113, 2023 11 03.

Article in English | MEDLINE | ID: mdl-37923823

ABSTRACT

The human leukocyte antigen (HLA) region on chromosome 6 is strongly associated with many immune-mediated and infection-related diseases. Due to its highly polymorphic nature and complex linkage disequilibrium patterns, traditional genetic association studies of single nucleotide polymorphisms do not perform well in this region. Instead, the field has adopted the assessment of the association of HLA alleles (i.e., entire HLA gene haplotypes) with disease. Often based on genotyping arrays, these association studies impute HLA alleles, decreasing accuracy and thus statistical power for rare alleles and in non-European ancestries. Here, we use whole-exome sequencing (WES) from 454,824 UK Biobank (UKB) participants to directly call HLA alleles using the HLA-HD algorithm. We show this method is more accurate than imputing HLA alleles and harness the improved statistical power to identify 360 associations for 11 auto-immune phenotypes (at least 129 likely novel), leading to better insights into the specific coding polymorphisms that underlie these diseases. We show that HLA alleles with synonymous variants, often overlooked in HLA studies, can significantly influence these phenotypes. Lastly, we show that HLA sequencing may improve polygenic risk scores accuracy across ancestries. These findings allow better characterization of the role of the HLA region in human disease.

Subject(s)

Autoimmune Diseases , Biological Specimen Banks , Humans , Alleles , Exome Sequencing , Genetic Predisposition to Disease , Autoimmune Diseases/genetics , HLA Antigens/genetics , Histocompatibility Antigens Class I/genetics , Histocompatibility Antigens Class II , Polymorphism, Single Nucleotide , United Kingdom

5.

From target discovery to clinical drug development with human genetics.

Trajanoska, Katerina; Bhérer, Claude; Taliun, Daniel; Zhou, Sirui; Richards, J Brent; Mooser, Vincent.

Nature ; 620(7975): 737-745, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37612393

ABSTRACT

The substantial investments in human genetics and genomics made over the past three decades were anticipated to result in many innovative therapies. Here we investigate the extent to which these expectations have been met, excluding cancer treatments. In our search, we identified 40 germline genetic observations that led directly to new targets and subsequently to novel approved therapies for 36 rare and 4 common conditions. The median time between genetic target discovery and drug approval was 25 years. Most of the genetically driven therapies for rare diseases compensate for disease-causing loss-of-function mutations. The therapies approved for common conditions are all inhibitors designed to pharmacologically mimic the natural, disease-protective effects of rare loss-of-function variants. Large biobank-based genetic studies have the power to identify and validate a large number of new drug targets. Genetics can also assist in the clinical development phase of drugs, for example by selecting individuals who are most likely to respond to investigational therapies. This approach to drug development requires investments into large, diverse cohorts of deeply phenotyped individuals with appropriate consent for genetically assisted trials. A robust framework that facilitates responsible, sustainable benefit sharing will be required to capture the full potential of human genetics and genomics and bring effective and safe innovative therapies to patients quickly.

Subject(s)

Drug Development , Human Genetics , Molecular Targeted Therapy , Humans , Drug Approval/statistics & numerical data , Drug Development/statistics & numerical data , Therapies, Investigational/statistics & numerical data , Molecular Targeted Therapy/methods , Molecular Targeted Therapy/statistics & numerical data , Rare Diseases/genetics , Rare Diseases/therapy , Germ-Line Mutation , Time Factors

6.

Multi-ancestry genome-wide study in >2.5 million individuals reveals heterogeneity in mechanistic pathways of type 2 diabetes and complications.

Suzuki, Ken; Hatzikotoulas, Konstantinos; Southam, Lorraine; Taylor, Henry J; Yin, Xianyong; Lorenz, Kim M; Mandla, Ravi; Huerta-Chagoya, Alicia; Rayner, Nigel W; Bocher, Ozvan; Arruda, Ana Luiza de S V; Sonehara, Kyuto; Namba, Shinichi; Lee, Simon S K; Preuss, Michael H; Petty, Lauren E; Schroeder, Philip; Vanderwerff, Brett; Kals, Mart; Bragg, Fiona; Lin, Kuang; Guo, Xiuqing; Zhang, Weihua; Yao, Jie; Kim, Young Jin; Graff, Mariaelisa; Takeuchi, Fumihiko; Nano, Jana; Lamri, Amel; Nakatochi, Masahiro; Moon, Sanghoon; Scott, Robert A; Cook, James P; Lee, Jung-Jin; Pan, Ian; Taliun, Daniel; Parra, Esteban J; Chai, Jin-Fang; Bielak, Lawrence F; Tabara, Yasuharu; Hai, Yang; Thorleifsson, Gudmar; Grarup, Niels; Sofer, Tamar; Wuttke, Matthias; Sarnowski, Chloé; Gieger, Christian; Nousome, Darryl; Trompet, Stella; Kwak, Soo-Heon.

medRxiv ; 2023 Mar 31.

Article in English | MEDLINE | ID: mdl-37034649

ABSTRACT

Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes. To characterise the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study (GWAS) data from 2,535,601 individuals (39.7% non-European ancestry), including 428,452 T2D cases. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are previously unreported. We define eight non-overlapping clusters of T2D signals characterised by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial, and enteroendocrine cells. We build cluster-specific partitioned genetic risk scores (GRS) in an additional 137,559 individuals of diverse ancestry, including 10,159 T2D cases, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned GRS are more strongly associated with coronary artery disease and end-stage diabetic nephropathy than an overall T2D GRS across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings demonstrate the value of integrating multi-ancestry GWAS with single-cell epigenomics to disentangle the aetiological heterogeneity driving the development and progression of T2D, which may offer a route to optimise global access to genetically-informed diabetes care.

7.

The Type 2 Diabetes Knowledge Portal: An open access genetic resource dedicated to type 2 diabetes and related traits.

Costanzo, Maria C; von Grotthuss, Marcin; Massung, Jeffrey; Jang, Dongkeun; Caulkins, Lizz; Koesterer, Ryan; Gilbert, Clint; Welch, Ryan P; Kudtarkar, Parul; Hoang, Quy; Boughton, Andrew P; Singh, Preeti; Sun, Ying; Duby, Marc; Moriondo, Annie; Nguyen, Trang; Smadbeck, Patrick; Alexander, Benjamin R; Brandes, MacKenzie; Carmichael, Mary; Dornbos, Peter; Green, Todd; Huellas-Bruskiewicz, Kenneth C; Ji, Yue; Kluge, Alexandria; McMahon, Aoife C; Mercader, Josep M; Ruebenacker, Oliver; Sengupta, Sebanti; Spalding, Dylan; Taliun, Daniel; Smith, Philip; Thomas, Melissa K; Akolkar, Beena; Brosnan, M Julia; Cherkas, Andriy; Chu, Audrey Y; Fauman, Eric B; Fox, Caroline S; Kamphaus, Tania Nayak; Miller, Melissa R; Nguyen, Lynette; Parsa, Afshin; Reilly, Dermot F; Ruetten, Hartmut; Wholley, David; Zaghloul, Norann A; Abecasis, Gonçalo R; Altshuler, David; Keane, Thomas M.

Cell Metab ; 35(4): 695-710.e6, 2023 04 04.

Article in English | MEDLINE | ID: mdl-36963395

ABSTRACT

Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.

Subject(s)

Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Access to Information , Prospective Studies , Genomics/methods , Phenotype

8.

Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation.

Mahajan, Anubha; Spracklen, Cassandra N; Zhang, Weihua; Ng, Maggie C Y; Petty, Lauren E; Kitajima, Hidetoshi; Yu, Grace Z; Rüeger, Sina; Speidel, Leo; Kim, Young Jin; Horikoshi, Momoko; Mercader, Josep M; Taliun, Daniel; Moon, Sanghoon; Kwak, Soo-Heon; Robertson, Neil R; Rayner, Nigel W; Loh, Marie; Kim, Bong-Jo; Chiou, Joshua; Miguel-Escalada, Irene; Della Briotta Parolo, Pietro; Lin, Kuang; Bragg, Fiona; Preuss, Michael H; Takeuchi, Fumihiko; Nano, Jana; Guo, Xiuqing; Lamri, Amel; Nakatochi, Masahiro; Scott, Robert A; Lee, Jung-Jin; Huerta-Chagoya, Alicia; Graff, Mariaelisa; Chai, Jin-Fang; Parra, Esteban J; Yao, Jie; Bielak, Lawrence F; Tabara, Yasuharu; Hai, Yang; Steinthorsdottir, Valgerdur; Cook, James P; Kals, Mart; Grarup, Niels; Schmidt, Ellen M; Pan, Ian; Sofer, Tamar; Wuttke, Matthias; Sarnowski, Chloe; Gieger, Christian.

Nat Genet ; 54(5): 560-572, 2022 05.

Article in English | MEDLINE | ID: mdl-35551307

ABSTRACT

We assembled an ancestrally diverse collection of genome-wide association studies (GWAS) of type 2 diabetes (T2D) in 180,834 affected individuals and 1,159,055 controls (48.9% non-European descent) through the Diabetes Meta-Analysis of Trans-Ethnic association studies (DIAMANTE) Consortium. Multi-ancestry GWAS meta-analysis identified 237 loci attaining stringent genome-wide significance (P < 5 × 10⁻⁹), which were delineated to 338 distinct association signals. Fine-mapping of these signals was enhanced by the increased sample size and expanded population diversity of the multi-ancestry meta-analysis, which localized 54.4% of T2D associations to a single variant with >50% posterior probability. This improved fine-mapping enabled systematic assessment of candidate causal genes and molecular mechanisms through which T2D associations are mediated, laying the foundations for functional investigations. Multi-ancestry genetic risk scores enhanced transferability of T2D prediction across diverse populations. Our study provides a step toward more effective clinical translation of T2D GWAS to improve global health for all, irrespective of genetic background.

Subject(s)

Diabetes Mellitus, Type 2 , Genome-Wide Association Study , Diabetes Mellitus, Type 2/epidemiology , Ethnicity , Genetic Predisposition to Disease , Humans , Polymorphism, Single Nucleotide/genetics , Risk Factors

9.

LocusZoom.js: interactive and embeddable visualization of genetic association study results.

Boughton, Andrew P; Welch, Ryan P; Flickinger, Matthew; VandeHaar, Peter; Taliun, Daniel; Abecasis, Gonçalo R; Boehnke, Michael.

Bioinformatics ; 37(18): 3017-3018, 2021 09 29.

Article in English | MEDLINE | ID: mdl-33734315

ABSTRACT

SUMMARY: LocusZoom.js is a JavaScript library for creating interactive web-based visualizations of genetic association study results. It can display one or more traits in the context of relevant biological data (such as gene models and other genomic annotation), and allows interactive refinement of analysis models (by selecting linkage disequilibrium reference panels, identifying sets of likely causal variants, or comparisons to the GWAS catalog). It can be embedded in web pages to enable data sharing and exploration. Views can be customized and extended to display other data types such as phenome-wide association study (PheWAS) results, chromatin co-accessibility, or eQTL measurements. A new web upload service harmonizes datasets, adds annotations, and makes it easy to explore user-provided result sets. AVAILABILITY AND IMPLEMENTATION: LocusZoom.js is open-source software under a permissive MIT license. Code and documentation are available at: https://github.com/statgen/locuszoom/. Installable packages for all versions are also distributed via NPM. Additional features are provided as standalone libraries to promote reuse. Use with your own GWAS results at https://my.locuszoom.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genomics , Software , Genome , Genetic Association Studies , Documentation

10.

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Taliun, Daniel; Harris, Daniel N; Kessler, Michael D; Carlson, Jedidiah; Szpiech, Zachary A; Torres, Raul; Taliun, Sarah A Gagliano; Corvelo, André; Gogarten, Stephanie M; Kang, Hyun Min; Pitsillides, Achilleas N; LeFaive, Jonathon; Lee, Seung-Been; Tian, Xiaowen; Browning, Brian L; Das, Sayantan; Emde, Anne-Katrin; Clarke, Wayne E; Loesch, Douglas P; Shetty, Amol C; Blackwell, Thomas W; Smith, Albert V; Wong, Quenna; Liu, Xiaoming; Conomos, Matthew P; Bobo, Dean M; Aguet, François; Albert, Christine; Alonso, Alvaro; Ardlie, Kristin G; Arking, Dan E; Aslibekyan, Stella; Auer, Paul L; Barnard, John; Barr, R Graham; Barwick, Lucas; Becker, Lewis C; Beer, Rebecca L; Benjamin, Emelia J; Bielak, Lawrence F; Blangero, John; Boehnke, Michael; Bowden, Donald W; Brody, Jennifer A; Burchard, Esteban G; Cade, Brian E; Casella, James F; Chalazan, Brandon; Chasman, Daniel I; Chen, Yii-Der Ida.

Nature ; 590(7845): 290-299, 2021 02.

Article in English | MEDLINE | ID: mdl-33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

Subject(s)

Genetic Variation/genetics , Genome, Human/genetics , Genomics , National Heart, Lung, and Blood Institute (U.S.) , Precision Medicine , Cytochrome P-450 CYP2D6/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation , Loss of Function Mutation , Mutagenesis , Phenotype , Polymorphism, Single Nucleotide , Population Density , Precision Medicine/standards , Quality Control , Sample Size , United States , Whole Genome Sequencing/standards

11.

Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks.

Fritsche, Lars G; Patil, Snehal; Beesley, Lauren J; VandeHaar, Peter; Salvatore, Maxwell; Ma, Ying; Peng, Robert B; Taliun, Daniel; Zhou, Xiang; Mukherjee, Bhramar.

Am J Hum Genet ; 107(5): 815-836, 2020 11 05.

Article in English | MEDLINE | ID: mdl-32991828

ABSTRACT

To facilitate scientific collaboration on polygenic risk scores (PRSs) research, we created an extensive PRS online repository for 35 common cancer traits integrating freely available genome-wide association studies (GWASs) summary statistics from three sources: published GWASs, the NHGRI-EBI GWAS Catalog, and UK Biobank-based GWASs. Our framework condenses these summary statistics into PRSs using various approaches such as linkage disequilibrium pruning/p value thresholding (fixed or data-adaptively optimized thresholds) and penalized, genome-wide effect size weighting. We evaluated the PRSs in two biobanks: the Michigan Genomics Initiative (MGI), a longitudinal biorepository effort at Michigan Medicine, and the population-based UK Biobank (UKB). For each PRS construct, we provide measures on predictive performance and discrimination. Besides PRS evaluation, the Cancer-PRSweb platform features construct downloads and phenome-wide PRS association study results (PRS-PheWAS) for predictive PRSs. We expect this integrated platform to accelerate PRS-related cancer research.

Subject(s)

Biological Specimen Banks/statistics & numerical data , Genetic Predisposition to Disease , Genome, Human , Genomics/methods , Multifactorial Inheritance , Neoplasms/genetics , Adult , Aged , Female , Genome-Wide Association Study , Humans , Internet , Linkage Disequilibrium , Male , Middle Aged , Neoplasms/classification , Neoplasms/diagnosis , Neoplasms/epidemiology , Phenotype , Quantitative Trait, Heritable , Risk Factors , United Kingdom/epidemiology , United States/epidemiology

12.

Exploring and visualizing large-scale genetic associations by using PheWeb.

Gagliano Taliun, Sarah A; VandeHaar, Peter; Boughton, Andrew P; Welch, Ryan P; Taliun, Daniel; Schmidt, Ellen M; Zhou, Wei; Nielsen, Jonas B; Willer, Cristen J; Lee, Seunggeun; Fritsche, Lars G; Boehnke, Michael; Abecasis, Gonçalo R.

Nat Genet ; 52(6): 550-552, 2020 06.

Article in English | MEDLINE | ID: mdl-32504056

Subject(s)

Data Visualization , Genome-Wide Association Study , Software , Urinary Bladder Neoplasms/genetics , Humans , Polymorphism, Single Nucleotide , User-Computer Interface

13.

De novo mutations across 1,465 diverse genomes reveal mutational insights and reductions in the Amish founder population.

Kessler, Michael D; Loesch, Douglas P; Perry, James A; Heard-Costa, Nancy L; Taliun, Daniel; Cade, Brian E; Wang, Heming; Daya, Michelle; Ziniti, John; Datta, Soma; Celedón, Juan C; Soto-Quiros, Manuel E; Avila, Lydiana; Weiss, Scott T; Barnes, Kathleen; Redline, Susan S; Vasan, Ramachandran S; Johnson, Andrew D; Mathias, Rasika A; Hernandez, Ryan; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo; Browning, Sharon R; Zöllner, Sebastian; O'Connell, Jeffrey R; Mitchell, Braxton D; O'Connor, Timothy D.

Proc Natl Acad Sci U S A ; 117(5): 2560-2569, 2020 02 04.

Article in English | MEDLINE | ID: mdl-31964835

ABSTRACT

De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.

Subject(s)

Amish/genetics , Genome, Human , Adult , Cohort Studies , DNA Mutational Analysis , Female , Genetics, Population , Heterozygote , Humans , Male , Mutation , Pedigree , Whole Genome Sequencing , Young Adult

14.

emeraLD: rapid linkage disequilibrium estimation with massive datasets.

Quick, Corbin; Fuchsberger, Christian; Taliun, Daniel; Abecasis, Gonçalo; Boehnke, Michael; Kang, Hyun Min.

Bioinformatics ; 35(1): 164-166, 2019 01 01.

Article in English | MEDLINE | ID: mdl-30204848

ABSTRACT

Summary: Estimating linkage disequilibrium (LD) is essential for a wide range of summary statistics-based association methods for genome-wide association studies. Large genetic datasets, e.g. the TOPMed WGS project and UK Biobank, enable more accurate and comprehensive LD estimates, but increase the computational burden of LD estimation. Here, we describe emeraLD (Efficient Methods for Estimation and Random Access of LD), a computational tool that leverages sparsity and haplotype structure to estimate LD up to 2 orders of magnitude faster than current tools. Availability and implementation: emeraLD is implemented in C++, and is open source under GPLv3. Source code and documentation are freely available at http://github.com/statgen/emeraLD. Supplementary information: Supplementary data are available at Bioinformatics online.
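The role of sparsity mentioned in this abstract can be illustrated with a small sketch. This is not emeraLD's algorithm or code (which the abstract says is implemented in C++); it only shows the standard haplotype-based r² statistic computed from sets of carrier indices, so that the work scales with the number of minor-allele carriers rather than with the total number of haplotypes. The function name `ld_r2` and the example data are hypothetical.

```python
import math
from typing import Set


def ld_r2(carriers_a: Set[int], carriers_b: Set[int], n_haplotypes: int) -> float:
    """Squared correlation (r^2) between two biallelic variants.

    carriers_a / carriers_b: indices of haplotypes carrying the minor
    allele at each variant (a sparse representation).
    Returns NaN if either variant is monomorphic.
    """
    p_a = len(carriers_a) / n_haplotypes
    p_b = len(carriers_b) / n_haplotypes
    # Joint frequency needs only the intersection of the two carrier sets.
    p_ab = len(carriers_a & carriers_b) / n_haplotypes
    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan
    return d * d / denom


# Hypothetical example: 8 haplotypes, two variants sharing two carriers.
r2 = ld_r2({0, 1, 2}, {0, 1, 5}, n_haplotypes=8)
print(f"r2 = {r2:.4f}")
```

For rare variants the carrier sets are small, so the set intersection is cheap even when the sample contains many thousands of haplotypes.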

Subject(s)

Genome-Wide Association Study , Linkage Disequilibrium , Software , Computational Biology , Haplotypes

15.

Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps.

Mahajan, Anubha; Taliun, Daniel; Thurner, Matthias; Robertson, Neil R; Torres, Jason M; Rayner, N William; Payne, Anthony J; Steinthorsdottir, Valgerdur; Scott, Robert A; Grarup, Niels; Cook, James P; Schmidt, Ellen M; Wuttke, Matthias; Sarnowski, Chloé; Mägi, Reedik; Nano, Jana; Gieger, Christian; Trompet, Stella; Lecoeur, Cécile; Preuss, Michael H; Prins, Bram Peter; Guo, Xiuqing; Bielak, Lawrence F; Below, Jennifer E; Bowden, Donald W; Chambers, John Campbell; Kim, Young Jin; Ng, Maggie C Y; Petty, Lauren E; Sim, Xueling; Zhang, Weihua; Bennett, Amanda J; Bork-Jensen, Jette; Brummett, Chad M; Canouil, Mickaël; Eckardt, Kai-Uwe; Fischer, Krista; Kardia, Sharon L R; Kronenberg, Florian; Läll, Kristi; Liu, Ching-Ti; Locke, Adam E; Luan, Jian'an; Ntalla, Ioanna; Nylander, Vibe; Schönherr, Sebastian; Schurmann, Claudia; Yengo, Loïc; Bottinger, Erwin P; Brandslund, Ivan.

Nat Genet ; 50(11): 1505-1513, 2018 11.

Article in English | MEDLINE | ID: mdl-30297969

ABSTRACT

We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency <5%, 14 with estimated allelic odds ratio >2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence).

Subject(s)

Chromosome Mapping/methods , Diabetes Mellitus, Type 2/genetics , Epigenesis, Genetic , Genome, Human/genetics , Islets of Langerhans/metabolism , Polymorphism, Single Nucleotide , Body Mass Index , Case-Control Studies , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/pathology , Female , Gene Frequency , Genetic Loci/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , High-Throughput Screening Assays/methods , Humans , Islets of Langerhans/pathology , Linkage Disequilibrium , Male , Meta-Analysis as Topic , Sex Factors , White People/genetics

16.

Imputation-Aware Tag SNP Selection To Improve Power for Large-Scale, Multi-ethnic Association Studies.

Wojcik, Genevieve L; Fuchsberger, Christian; Taliun, Daniel; Welch, Ryan; Martin, Alicia R; Shringarpure, Suyash; Carlson, Christopher S; Abecasis, Goncalo; Kang, Hyun Min; Boehnke, Michael; Bustamante, Carlos D; Gignoux, Christopher R; Kenny, Eimear E.

G3 (Bethesda) ; 8(10): 3255-3267, 2018 10 03.

Article in English | MEDLINE | ID: mdl-30131328

ABSTRACT

The emergence of very large cohorts in genomic research has facilitated a focus on genotype-imputation strategies to power rare variant association. These strategies have benefited from improvements in imputation methods and association tests; however, little attention has been paid to ways in which array design can increase rare variant association power. Therefore, we developed a novel framework to select tag SNPs using the reference panel of 26 populations from Phase 3 of the 1000 Genomes Project. We evaluate tag SNP performance via mean imputed r² at untyped sites using leave-one-out internal validation and standard imputation methods, rather than pairwise linkage disequilibrium. Moving beyond pairwise metrics allows us to account for haplotype diversity across the genome to improve imputation accuracy and demonstrates population-specific biases from pairwise estimates. We also examine array design strategies that contrast multi-ethnic cohorts vs. single populations, and show that a boost in performance for the former can be obtained by prioritizing tag SNPs that contribute information across multiple populations simultaneously. Using our framework, we demonstrate increased imputation accuracy for rare variants (frequency < 1%) by 0.5-3.1% for an array of one million sites and 0.7-7.1% for an array of 500,000 sites, depending on the population. Finally, we show how recent explosive growth in non-African populations means tag SNPs capture on average 30% fewer other variants than in African populations. The unified framework presented here will enable investigators to make informed decisions for the design of new arrays, and help empower the next phase of rare variant association for global health.

Subject(s)

Ethnicity/genetics , Genetic Association Studies , Genetics, Population , Polymorphism, Single Nucleotide , Selection, Genetic , Computational Biology/methods , Databases, Nucleic Acid , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Models, Genetic , Reproducibility of Results

17.

Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes.

Mahajan, Anubha; Wessel, Jennifer; Willems, Sara M; Zhao, Wei; Robertson, Neil R; Chu, Audrey Y; Gan, Wei; Kitajima, Hidetoshi; Taliun, Daniel; Rayner, N William; Guo, Xiuqing; Lu, Yingchang; Li, Man; Jensen, Richard A; Hu, Yao; Huo, Shaofeng; Lohman, Kurt K; Zhang, Weihua; Cook, James P; Prins, Bram Peter; Flannick, Jason; Grarup, Niels; Trubetskoy, Vassily Vladimirovich; Kravic, Jasmina; Kim, Young Jin; Rybin, Denis V; Yaghootkar, Hanieh; Müller-Nurasyid, Martina; Meidtner, Karina; Li-Gao, Ruifang; Varga, Tibor V; Marten, Jonathan; Li, Jin; Smith, Albert Vernon; An, Ping; Ligthart, Symen; Gustafsson, Stefan; Malerba, Giovanni; Demirkan, Ayse; Tajes, Juan Fernandez; Steinthorsdottir, Valgerdur; Wuttke, Matthias; Lecoeur, Cécile; Preuss, Michael; Bielak, Lawrence F; Graff, Marielisa; Highland, Heather M; Justice, Anne E; Liu, Dajiang J; Marouli, Eirini.

Nat Genet ; 50(4): 559-571, 2018 04.

Article in English | MEDLINE | ID: mdl-29632382

ABSTRACT

We aggregated coding variant data for 81,412 type 2 diabetes cases and 370,832 controls of diverse ancestry, identifying 40 coding variant association signals (P < 2.2 × 10⁻⁷); of these, 16 map outside known risk-associated loci. We make two important observations. First, only five of these signals are driven by low-frequency variants: even for these, effect sizes are modest (odds ratio ≤1.29). Second, when we used large-scale genome-wide association data to fine-map the associated variants in their regional context, accounting for the global enrichment of complex trait associations in coding sequence, compelling evidence for coding variant causality was obtained for only 16 signals. At 13 others, the associated coding variants clearly represent 'false leads' with potential to generate erroneous mechanistic inference. Coding variant associations offer a direct route to biological insight for complex diseases and identification of validated therapeutic targets; however, appropriate mechanistic inference requires careful specification of their causal contribution to disease predisposition.

Subject(s)

Diabetes Mellitus, Type 2/genetics , Alleles , Chromosome Mapping/statistics & numerical data , Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/physiopathology , Female , Genetic Predisposition to Disease , Genetic Variation , Genome-Wide Association Study/statistics & numerical data , Humans , Male , White People/genetics , Exome Sequencing/statistics & numerical data

18.

Corrigendum: 1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function.

Gorski, Mathias; Most, Peter J van der; Teumer, Alexander; Chu, Audrey Y; Li, Man; Mijatovic, Vladan; Nolte, Ilja M; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C; Curhan, Gary C; Adamo, Adamo Pio d'; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J; Harris, Tamara B; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G; Homuth, Georg.

Sci Rep ; 7: 46835, 2017 05 26.

Article in English | MEDLINE | ID: mdl-28548086

ABSTRACT

This corrects the article DOI: 10.1038/srep45040.

19.

1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function.

Gorski, Mathias; van der Most, Peter J; Teumer, Alexander; Chu, Audrey Y; Li, Man; Mijatovic, Vladan; Nolte, Ilja M; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C; Curhan, Gary C; d'Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J; Harris, Tamara B; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G; Homuth, Georg.

Sci Rep ; 7: 45040, 2017 04 28.

Article in English | MEDLINE | ID: mdl-28452372

ABSTRACT

HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10⁻⁸ previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples.

Subject(s)

Computational Biology/methods , Genetic Loci , Kidney/physiology , Gene Frequency , Genome, Human , Genome-Wide Association Study , Genotyping Techniques , Humans , Polymorphism, Single Nucleotide

20.

LASER server: ancestry tracing with genotypes or sequence reads.

Taliun, Daniel; Chothani, Sonia P; Schönherr, Sebastian; Forer, Lukas; Boehnke, Michael; Abecasis, Gonçalo R; Wang, Chaolong.

Bioinformatics ; 33(13): 2056-2058, 2017 Jul 01.

Article in English | MEDLINE | ID: mdl-28200055

ABSTRACT

SUMMARY: To enable direct comparison of ancestry background in different studies, we developed LASER to estimate individual ancestry by placing either sequenced or genotyped samples in a common ancestry space, regardless of the sequencing strategy or genotyping array used to characterize each sample. Here we describe the LASER server to facilitate application of the method to a wide range of genetic studies. The server provides genetic ancestry estimation for different geographic regions and user-friendly interactive visualization of the results. AVAILABILITY AND IMPLEMENTATION: The LASER server is freely accessible at http://laser.sph.umich.edu/. CONTACT: dtaliun@umich.edu or wangcl@gis.a-star.edu.sg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genetic Variation , Phylogeography/methods , Population Groups/genetics , Sequence Analysis, DNA/methods , Software , Humans

