Search | VHL CLAP/WR-PAHO/WHO

1.

Genetic drivers of heterogeneity in type 2 diabetes pathophysiology.

Suzuki, Ken; Hatzikotoulas, Konstantinos; Southam, Lorraine; Taylor, Henry J; Yin, Xianyong; Lorenz, Kim M; Mandla, Ravi; Huerta-Chagoya, Alicia; Melloni, Giorgio E M; Kanoni, Stavroula; Rayner, Nigel W; Bocher, Ozvan; Arruda, Ana Luiza; Sonehara, Kyuto; Namba, Shinichi; Lee, Simon S K; Preuss, Michael H; Petty, Lauren E; Schroeder, Philip; Vanderwerff, Brett; Kals, Mart; Bragg, Fiona; Lin, Kuang; Guo, Xiuqing; Zhang, Weihua; Yao, Jie; Kim, Young Jin; Graff, Mariaelisa; Takeuchi, Fumihiko; Nano, Jana; Lamri, Amel; Nakatochi, Masahiro; Moon, Sanghoon; Scott, Robert A; Cook, James P; Lee, Jung-Jin; Pan, Ian; Taliun, Daniel; Parra, Esteban J; Chai, Jin-Fang; Bielak, Lawrence F; Tabara, Yasuharu; Hai, Yang; Thorleifsson, Gudmar; Grarup, Niels; Sofer, Tamar; Wuttke, Matthias; Sarnowski, Chloé; Gieger, Christian; Nousome, Darryl.

Nature ; 627(8003): 347-357, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38374256

ABSTRACT

Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes [1,2] and molecular mechanisms that are often specific to cell type [3,4]. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores [5] in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.

Subject(s)

Diabetes Mellitus, Type 2 , Disease Progression , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Adipocytes/metabolism , Chromatin/genetics , Chromatin/metabolism , Coronary Artery Disease/complications , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Diabetes Mellitus, Type 2/physiopathology , Diabetic Nephropathies/complications , Diabetic Nephropathies/genetics , Endothelial Cells/metabolism , Enteroendocrine Cells , Epigenomics , Genetic Predisposition to Disease/genetics , Islets of Langerhans/metabolism , Multifactorial Inheritance/genetics , Peripheral Arterial Disease/complications , Peripheral Arterial Disease/genetics , Single-Cell Analysis

2.

From target discovery to clinical drug development with human genetics.

Trajanoska, Katerina; Bhérer, Claude; Taliun, Daniel; Zhou, Sirui; Richards, J Brent; Mooser, Vincent.

Nature ; 620(7975): 737-745, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37612393

ABSTRACT

The substantial investments in human genetics and genomics made over the past three decades were anticipated to result in many innovative therapies. Here we investigate the extent to which these expectations have been met, excluding cancer treatments. In our search, we identified 40 germline genetic observations that led directly to new targets and subsequently to novel approved therapies for 36 rare and 4 common conditions. The median time between genetic target discovery and drug approval was 25 years. Most of the genetically driven therapies for rare diseases compensate for disease-causing loss-of-function mutations. The therapies approved for common conditions are all inhibitors designed to pharmacologically mimic the natural, disease-protective effects of rare loss-of-function variants. Large biobank-based genetic studies have the power to identify and validate a large number of new drug targets. Genetics can also assist in the clinical development phase of drugs, for example by selecting individuals who are most likely to respond to investigational therapies. This approach to drug development requires investments into large, diverse cohorts of deeply phenotyped individuals with appropriate consent for genetically assisted trials. A robust framework that facilitates responsible, sustainable benefit sharing will be required to capture the full potential of human genetics and genomics and bring effective and safe innovative therapies to patients quickly.

Subject(s)

Drug Development , Human Genetics , Molecular Targeted Therapy , Humans , Drug Approval/statistics & numerical data , Drug Development/statistics & numerical data , Therapies, Investigational/statistics & numerical data , Molecular Targeted Therapy/methods , Molecular Targeted Therapy/statistics & numerical data , Rare Diseases/genetics , Rare Diseases/therapy , Germ-Line Mutation , Time Factors

3.

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Taliun, Daniel; Harris, Daniel N; Kessler, Michael D; Carlson, Jedidiah; Szpiech, Zachary A; Torres, Raul; Taliun, Sarah A Gagliano; Corvelo, André; Gogarten, Stephanie M; Kang, Hyun Min; Pitsillides, Achilleas N; LeFaive, Jonathon; Lee, Seung-Been; Tian, Xiaowen; Browning, Brian L; Das, Sayantan; Emde, Anne-Katrin; Clarke, Wayne E; Loesch, Douglas P; Shetty, Amol C; Blackwell, Thomas W; Smith, Albert V; Wong, Quenna; Liu, Xiaoming; Conomos, Matthew P; Bobo, Dean M; Aguet, François; Albert, Christine; Alonso, Alvaro; Ardlie, Kristin G; Arking, Dan E; Aslibekyan, Stella; Auer, Paul L; Barnard, John; Barr, R Graham; Barwick, Lucas; Becker, Lewis C; Beer, Rebecca L; Benjamin, Emelia J; Bielak, Lawrence F; Blangero, John; Boehnke, Michael; Bowden, Donald W; Brody, Jennifer A; Burchard, Esteban G; Cade, Brian E; Casella, James F; Chalazan, Brandon; Chasman, Daniel I; Chen, Yii-Der Ida.

Nature ; 590(7845): 290-299, 2021 02.

Article in English | MEDLINE | ID: mdl-33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

Subject(s)

Genetic Variation/genetics , Genome, Human/genetics , Genomics , National Heart, Lung, and Blood Institute (U.S.) , Precision Medicine , Cytochrome P-450 CYP2D6/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation , Loss of Function Mutation , Mutagenesis , Phenotype , Polymorphism, Single Nucleotide , Population Density , Precision Medicine/standards , Quality Control , Sample Size , United States , Whole Genome Sequencing/standards

4.

Imputation Server PGS: an automated approach to calculate polygenic risk scores on imputation servers.

Forer, Lukas; Taliun, Daniel; LeFaive, Jonathon; Smith, Albert V; Boughton, Andrew P; Coassin, Stefan; Lamina, Claudia; Kronenberg, Florian; Fuchsberger, Christian; Schönherr, Sebastian.

Nucleic Acids Res ; 52(W1): W70-W77, 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38709879

ABSTRACT

Polygenic scores (PGS) enable the prediction of genetic predisposition for a wide range of traits and diseases by calculating the weighted sum of allele dosages for genetic variants associated with the trait or disease in question. Present approaches for calculating PGS from genotypes are often inefficient and labor-intensive, limiting transferability into clinical applications. Here, we present 'Imputation Server PGS', an extension of the Michigan Imputation Server designed to automate a standardized calculation of polygenic scores based on imputed genotypes. This extends the widely used Michigan Imputation Server with new functionality, bringing the simplicity and efficiency of modern imputation to the PGS field. The service currently supports over 4489 published polygenic scores from publicly available repositories and provides extensive quality control, including ancestry estimation to report population stratification. An interactive report empowers users to screen and compare thousands of scores in a fast and intuitive way. Imputation Server PGS provides a user-friendly web service, facilitating the application of polygenic scores to a wide range of genetic studies and is freely available at https://imputationserver.sph.umich.edu.
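The weighted sum of allele dosages described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the Imputation Server PGS implementation: the function name `polygenic_score`, the variant IDs, the weights and the dosages are all hypothetical, and variants missing from the genotype data are simply skipped here, which is one possible convention among several.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return the weighted sum of effect-allele dosages.

    Only variants present in both the score file (weights) and the
    genotype data (dosages) contribute; others are skipped.
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Hypothetical score weights and imputed dosages (0..2), for illustration only.
weights = {"rs0001": 0.5, "rs0002": -0.25, "rs0003": 0.125}
dosages = {"rs0001": 2.0, "rs0002": 1.0}  # rs0003 is absent from the genotypes

score = polygenic_score(dosages, weights)  # 0.5 * 2.0 + (-0.25) * 1.0
```

In practice the share of score variants actually found in the genotype data would be worth reporting next to the score, since a low overlap makes scores hard to compare across datasets.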

Subject(s)

Genetic Predisposition to Disease , Multifactorial Inheritance , Software , Multifactorial Inheritance/genetics , Humans , Internet , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Genotype , Alleles , Genetic Risk Score

5.

Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks.

Fritsche, Lars G; Patil, Snehal; Beesley, Lauren J; VandeHaar, Peter; Salvatore, Maxwell; Ma, Ying; Peng, Robert B; Taliun, Daniel; Zhou, Xiang; Mukherjee, Bhramar.

Am J Hum Genet ; 107(5): 815-836, 2020 11 05.

Article in English | MEDLINE | ID: mdl-32991828

ABSTRACT

To facilitate scientific collaboration on polygenic risk scores (PRSs) research, we created an extensive PRS online repository for 35 common cancer traits integrating freely available genome-wide association studies (GWASs) summary statistics from three sources: published GWASs, the NHGRI-EBI GWAS Catalog, and UK Biobank-based GWASs. Our framework condenses these summary statistics into PRSs using various approaches such as linkage disequilibrium pruning/p value thresholding (fixed or data-adaptively optimized thresholds) and penalized, genome-wide effect size weighting. We evaluated the PRSs in two biobanks: the Michigan Genomics Initiative (MGI), a longitudinal biorepository effort at Michigan Medicine, and the population-based UK Biobank (UKB). For each PRS construct, we provide measures on predictive performance and discrimination. Besides PRS evaluation, the Cancer-PRSweb platform features construct downloads and phenome-wide PRS association study results (PRS-PheWAS) for predictive PRSs. We expect this integrated platform to accelerate PRS-related cancer research.

Subject(s)

Biological Specimen Banks/statistics & numerical data , Genetic Predisposition to Disease , Genome, Human , Genomics/methods , Multifactorial Inheritance , Neoplasms/genetics , Adult , Aged , Female , Genome-Wide Association Study , Humans , Internet , Linkage Disequilibrium , Male , Middle Aged , Neoplasms/classification , Neoplasms/diagnosis , Neoplasms/epidemiology , Phenotype , Quantitative Trait, Heritable , Risk Factors , United Kingdom/epidemiology , United States/epidemiology

6.

De novo mutations across 1,465 diverse genomes reveal mutational insights and reductions in the Amish founder population.

Kessler, Michael D; Loesch, Douglas P; Perry, James A; Heard-Costa, Nancy L; Taliun, Daniel; Cade, Brian E; Wang, Heming; Daya, Michelle; Ziniti, John; Datta, Soma; Celedón, Juan C; Soto-Quiros, Manuel E; Avila, Lydiana; Weiss, Scott T; Barnes, Kathleen; Redline, Susan S; Vasan, Ramachandran S; Johnson, Andrew D; Mathias, Rasika A; Hernandez, Ryan; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo; Browning, Sharon R; Zöllner, Sebastian; O'Connell, Jeffrey R; Mitchell, Braxton D; O'Connor, Timothy D.

Proc Natl Acad Sci U S A ; 117(5): 2560-2569, 2020 02 04.

Article in English | MEDLINE | ID: mdl-31964835

ABSTRACT

De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.

Subject(s)

Amish/genetics , Genome, Human , Adult , Cohort Studies , DNA Mutational Analysis , Female , Genetics, Population , Heterozygote , Humans , Male , Mutation , Pedigree , Whole Genome Sequencing , Young Adult

7.

LocusZoom.js: interactive and embeddable visualization of genetic association study results.

Boughton, Andrew P; Welch, Ryan P; Flickinger, Matthew; VandeHaar, Peter; Taliun, Daniel; Abecasis, Gonçalo R; Boehnke, Michael.

Bioinformatics ; 37(18): 3017-3018, 2021 09 29.

Article in English | MEDLINE | ID: mdl-33734315

ABSTRACT

SUMMARY: LocusZoom.js is a JavaScript library for creating interactive web-based visualizations of genetic association study results. It can display one or more traits in the context of relevant biological data (such as gene models and other genomic annotation), and allows interactive refinement of analysis models (by selecting linkage disequilibrium reference panels, identifying sets of likely causal variants, or comparisons to the GWAS catalog). It can be embedded in web pages to enable data sharing and exploration. Views can be customized and extended to display other data types such as phenome-wide association study (PheWAS) results, chromatin co-accessibility, or eQTL measurements. A new web upload service harmonizes datasets, adds annotations, and makes it easy to explore user-provided result sets. AVAILABILITY AND IMPLEMENTATION: LocusZoom.js is open-source software under a permissive MIT license. Code and documentation are available at: https://github.com/statgen/locuszoom/. Installable packages for all versions are also distributed via NPM. Additional features are provided as standalone libraries to promote reuse. Use with your own GWAS results at https://my.locuszoom.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genomics , Software , Genome , Genetic Association Studies , Documentation

8.

emeraLD: rapid linkage disequilibrium estimation with massive datasets.

Quick, Corbin; Fuchsberger, Christian; Taliun, Daniel; Abecasis, Gonçalo; Boehnke, Michael; Kang, Hyun Min.

Bioinformatics ; 35(1): 164-166, 2019 01 01.

Article in English | MEDLINE | ID: mdl-30204848

ABSTRACT

Summary: Estimating linkage disequilibrium (LD) is essential for a wide range of summary statistics-based association methods for genome-wide association studies. Large genetic datasets, e.g. the TOPMed WGS project and UK Biobank, enable more accurate and comprehensive LD estimates, but increase the computational burden of LD estimation. Here, we describe emeraLD (Efficient Methods for Estimation and Random Access of LD), a computational tool that leverages sparsity and haplotype structure to estimate LD up to 2 orders of magnitude faster than current tools. Availability and implementation: emeraLD is implemented in C++, and is open source under GPLv3. Source code and documentation are freely available at http://github.com/statgen/emeraLD. Supplementary information: Supplementary data are available at Bioinformatics online.
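To illustrate why sparsity helps with LD estimation, the sketch below computes r² between two biallelic variants from the sets of haplotype indices that carry each minor allele, so the work scales with the number of carriers, not with the total number of haplotypes. It is written in Python for illustration and is not emeraLD's C++ code or its actual algorithm; the function name `r_squared` and the example carrier sets are hypothetical.

```python
from typing import AbstractSet


def r_squared(carriers_a: AbstractSet[int], carriers_b: AbstractSet[int], n_haplotypes: int) -> float:
    """r^2 between two biallelic variants.

    Each variant is stored sparsely as the set of haplotype indices that
    carry its minor allele, so only carriers are touched.
    """
    p_a = len(carriers_a) / n_haplotypes
    p_b = len(carriers_b) / n_haplotypes
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        return 0.0  # monomorphic variant: r^2 is undefined, reported as 0 here
    p_ab = len(carriers_a & carriers_b) / n_haplotypes  # joint carrier frequency
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical example with 8 haplotypes.
r_full = r_squared({0, 1}, {0, 1}, 8)      # identical carrier sets
r_disjoint = r_squared({0, 1}, {2, 3}, 8)  # no shared carriers
```

For rare variants the carrier sets are small, so the set intersection stays cheap even when the sample holds hundreds of thousands of haplotypes.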

Subject(s)

Genome-Wide Association Study , Linkage Disequilibrium , Software , Computational Biology , Haplotypes

9.

LASER server: ancestry tracing with genotypes or sequence reads.

Taliun, Daniel; Chothani, Sonia P; Schönherr, Sebastian; Forer, Lukas; Boehnke, Michael; Abecasis, Gonçalo R; Wang, Chaolong.

Bioinformatics ; 33(13): 2056-2058, 2017 Jul 01.

Article in English | MEDLINE | ID: mdl-28200055

ABSTRACT

SUMMARY: To enable direct comparison of ancestry background in different studies, we developed LASER to estimate individual ancestry by placing either sequenced or genotyped samples in a common ancestry space, regardless of the sequencing strategy or genotyping array used to characterize each sample. Here we describe the LASER server to facilitate application of the method to a wide range of genetic studies. The server provides genetic ancestry estimation for different geographic regions and user-friendly interactive visualization of the results. AVAILABILITY AND IMPLEMENTATION: The LASER server is freely accessible at http://laser.sph.umich.edu/. CONTACT: dtaliun@umich.edu or wangcl@gis.a-star.edu.sg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genetic Variation , Phylogeography/methods , Population Groups/genetics , Sequence Analysis, DNA/methods , Software , Humans

10.

FamAgg: an R package to evaluate familial aggregation of traits in large pedigrees.

Rainer, Johannes; Taliun, Daniel; D'Elia, Yuri; Pattaro, Cristian; Domingues, Francisco S; Weichenberger, Christian X.

Bioinformatics ; 32(10): 1583-5, 2016 05 15.

Article in English | MEDLINE | ID: mdl-26803158

ABSTRACT

UNLABELLED: Familial aggregation analysis is the first fundamental step to perform when assessing the extent of genetic background of a disease. However, there is a lack of software to analyze the familial clustering of complex phenotypes in very large pedigrees. Such pedigrees can be utilized to calculate measures that express trait aggregation on both the family and individual level, providing valuable directions in choosing families for detailed follow-up studies. We developed FamAgg, an open source R package that contains both established and novel methods to investigate familial aggregation of traits in large pedigrees. We demonstrate its use and interpretation by analyzing a publicly available cancer dataset with more than 20 000 participants distributed across approximately 400 families. AVAILABILITY AND IMPLEMENTATION: The FamAgg package is freely available at the Bioconductor repository, http://www.bioconductor.org/packages/FamAgg CONTACT: Christian.Weichenberger@eurac.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Software , Pedigree

11.

Meta-analysis of genome-wide association studies identifies two loci associated with circulating osteoprotegerin levels.

Kwan, Johnny S H; Hsu, Yi-Hsiang; Cheung, Ching-Lung; Dupuis, Josée; Saint-Pierre, Aude; Eriksson, Joel; Handelman, Samuel K; Aragaki, Aaron; Karasik, David; Pramstaller, Peter P; Kooperberg, Charles; Lacroix, Andrea Z; Larson, Martin G; Lau, Kam-Shing; Lorentzon, Mattias; Pichler, Irene; Sham, Pak C; Taliun, Daniel; Vandenput, Liesbeth; Kiel, Douglas P; Hicks, Andrew A; Jackson, Rebecca D; Ohlsson, Claes; Benjamin, Emelia J; Kung, Annie W C.

Hum Mol Genet ; 23(24): 6684-93, 2014 Dec 15.

Article in English | MEDLINE | ID: mdl-25080503

ABSTRACT

Osteoprotegerin (OPG) is involved in bone homeostasis and tumor cell survival. Circulating OPG levels are also important biomarkers of various clinical traits, such as cancers and atherosclerosis. OPG levels were measured in serum or in plasma. In a meta-analysis of genome-wide association studies in up to 10 336 individuals of European and Asian origin, we discovered that variants >100 kb upstream of the TNFRSF11B gene encoding OPG and another new locus on chromosome 17q11.2 were significantly associated with OPG variation. We also identified a suggestive locus on chromosome 14q21.2 associated with the trait. Moreover, we estimated that over half of the heritability of OPG levels could be explained by all variants examined in our study. Our findings provide further insight into the genetic regulation of circulating OPG levels.

Subject(s)

Chromosomes, Human, Pair 14/chemistry , Chromosomes, Human, Pair 17/chemistry , Genetic Loci , Osteoprotegerin/genetics , Polymorphism, Genetic , Quantitative Trait, Heritable , Asian People , Female , Genome, Human , Genome-Wide Association Study , Humans , Male , Osteoprotegerin/blood , White People

12.

Genome-wide association and functional follow-up reveals new loci for kidney function.

Pattaro, Cristian; Köttgen, Anna; Teumer, Alexander; Garnaas, Maija; Böger, Carsten A; Fuchsberger, Christian; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Chouraki, Vincent; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B.

PLoS Genet ; 8(3): e1002584, 2012.

Article in English | MEDLINE | ID: mdl-22479191

ABSTRACT

Chronic kidney disease (CKD) is an important public health problem with a genetic component. We performed genome-wide association studies in up to 130,600 European ancestry participants overall, and stratified for key CKD risk factors. We uncovered 6 new loci in association with estimated glomerular filtration rate (eGFR), the primary clinical measure of CKD, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. Morpholino knockdown of mpped2 and casp9 in zebrafish embryos revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. By providing new insights into genes that regulate renal function, these results could further our understanding of the pathogenesis of CKD.

Subject(s)

Genome-Wide Association Study , Glomerular Filtration Rate/genetics , Kidney Failure, Chronic/genetics , Kidney/physiopathology , Zebrafish/genetics , ATPases Associated with Diverse Cellular Activities , Black or African American/genetics , Aged , Animals , Caspase 9/genetics , Cyclin-Dependent Kinases/genetics , DEAD-box RNA Helicases/genetics , DNA Helicases/genetics , DNA-Binding Proteins , Female , Follow-Up Studies , Gene Knockdown Techniques , Humans , Kidney Failure, Chronic/pathology , Male , Middle Aged , Phosphoric Diester Hydrolases/genetics , White People/genetics

13.

Efficient haplotype block recognition of very long and dense genetic sequences.

Taliun, Daniel; Gamper, Johann; Pattaro, Cristian.

BMC Bioinformatics ; 15: 10, 2014 Jan 14.

Article in English | MEDLINE | ID: mdl-24423111

ABSTRACT

BACKGROUND: New sequencing technologies make it possible to scan very long and dense genetic sequences, yielding datasets of genetic markers that are an order of magnitude larger than previously available. Such genetic sequences are characterized by common alleles interspersed with multiple rarer alleles. This situation has renewed interest in the identification of haplotypes carrying the rare risk alleles. However, large-scale explorations of the linkage-disequilibrium (LD) pattern to identify haplotype blocks are not easy to perform, because traditional algorithms have at least Θ(n²) time and memory complexity. RESULTS: We derived three incremental optimizations of the widely used haplotype block recognition algorithm proposed by Gabriel et al. in 2002. Our most efficient solution, called MIG++, has only Θ(n) memory complexity and, on a genome-wide scale, it omits >80% of the calculations, which makes it an order of magnitude faster than the original algorithm. Unlike existing software, MIG++ analyzes the LD between SNPs at any distance, avoiding restrictions on the maximal block length. The haplotype block partition of the entire HapMap II CEPH dataset was obtained in 457 hours. By replacing the standard likelihood-based D' variance estimator with an approximated estimator, the runtime was further improved. While producing a coarser partition, the approximate method made it possible to obtain the full-genome haplotype block partition of the entire 1000 Genomes Project CEPH dataset in 44 hours, with no restrictions on allele frequency or long-range correlations. These experiments showed that LD-based haplotype blocks can span more than one million base-pairs in both HapMap II and 1000 Genomes datasets. An application to the North American Rheumatoid Arthritis Consortium (NARAC) dataset shows how MIG++ can support genome-wide haplotype association studies. CONCLUSIONS: MIG++ makes it possible to perform LD-based haplotype block recognition on genetic sequences of any length and density. In the new generation sequencing era, this can help identify haplotypes that carry rare variants of interest. The low computational requirements open the possibility to include the haplotype block structure into genome-wide association scans, downstream analyses, and visual interfaces for online genome browsers.
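The block recognition described in this abstract rests on pairwise D' between SNPs. As a rough illustration of that building block, the Python sketch below computes the point estimate |D'| for two biallelic markers from phased haplotypes. It is not the MIG++ algorithm and does not compute the confidence intervals or variance estimators the abstract refers to; the function name `d_prime` and the toy haplotypes are hypothetical.

```python
from typing import Sequence


def d_prime(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Point estimate of |D'| for two biallelic markers.

    hap_a and hap_b hold the 0/1 allele carried by each phased haplotype
    at marker A and marker B, in the same haplotype order.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b
    # D is normalized by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max if d_max > 0 else 0.0


# Toy phased haplotypes, for illustration only.
complete_ld = d_prime([1, 1, 0, 0], [1, 1, 0, 0])
no_ld = d_prime([1, 1, 0, 0], [1, 0, 1, 0])
```

Evaluating every pair of n markers this way takes on the order of n² calls, which is the quadratic cost that the abstract says traditional algorithms incur.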

Subject(s)

Computational Biology/methods , Haplotypes/genetics , Sequence Analysis, DNA/methods , Software , Algorithms , Arthritis, Rheumatoid/genetics , Gene Frequency , Genome/genetics , Humans , Linkage Disequilibrium/genetics , Models, Genetic

14.

Importance of different types of prior knowledge in selecting genome-wide findings for follow-up.

Minelli, Cosetta; De Grandi, Alessandro; Weichenberger, Christian X; Gögele, Martin; Modenese, Mirko; Attia, John; Barrett, Jennifer H; Boehnke, Michael; Borsani, Giuseppe; Casari, Giorgio; Fox, Caroline S; Freina, Thomas; Hicks, Andrew A; Marroni, Fabio; Parmigiani, Giovanni; Pastore, Andrea; Pattaro, Cristian; Pfeufer, Arne; Ruggeri, Fabrizio; Schwienbacher, Christine; Taliun, Daniel; Pramstaller, Peter P; Domingues, Francisco S; Thompson, John R.

Genet Epidemiol ; 37(2): 205-13, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23307621

ABSTRACT

Biological plausibility and other prior information could help select genome-wide association (GWA) findings for further follow-up, but there is no consensus on which types of knowledge should be considered or how to weight them. We used experts' opinions and empirical evidence to estimate the relative importance of 15 types of information at the single-nucleotide polymorphism (SNP) and gene levels. Opinions were elicited from 10 experts using a two-round Delphi survey. Empirical evidence was obtained by comparing the frequency of each type of characteristic in SNPs established as being associated with seven disease traits through GWA meta-analysis and independent replication, with the corresponding frequency in a randomly selected set of SNPs. SNP and gene characteristics were retrieved using a specially developed bioinformatics tool. Both the expert and the empirical evidence rated previous association in a meta-analysis or more than one study as conferring the highest relative probability of true association, whereas previous association in a single study ranked much lower. High relative probabilities were also observed for location in a functional protein domain, although location in a region evolutionarily conserved in vertebrates was ranked high by the data but not by the experts. Our empirical evidence did not support the importance attributed by the experts to whether the gene encodes a protein in a pathway or shows interactions relevant to the trait. Our findings provide insight into the selection and weighting of different types of knowledge in SNP or gene prioritization, and point to areas requiring further research.

Subject(s)

Follow-Up Studies , Genetic Research , Polymorphism, Single Nucleotide , Computational Biology/methods , Genome-Wide Association Study , Humans , Meta-Analysis as Topic , Probability

15.

SNP prioritization using a Bayesian probability of association.

Thompson, John R; Gögele, Martin; Weichenberger, Christian X; Modenese, Mirko; Attia, John; Barrett, Jennifer H; Boehnke, Michael; De Grandi, Alessandro; Domingues, Francisco S; Hicks, Andrew A; Marroni, Fabio; Pattaro, Cristian; Ruggeri, Fabrizio; Borsani, Giuseppe; Casari, Giorgio; Parmigiani, Giovanni; Pastore, Andrea; Pfeufer, Arne; Schwienbacher, Christine; Taliun, Daniel; Fox, Caroline S; Pramstaller, Peter P; Minelli, Cosetta.

Genet Epidemiol ; 37(2): 214-21, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23280596

ABSTRACT

Prioritization is the process whereby a set of possible candidate genes or SNPs is ranked so that the most promising can be taken forward into further studies. In a genome-wide association study, prioritization is usually based on the P-values alone, but researchers sometimes take account of external annotation information about the SNPs such as whether the SNP lies close to a good candidate gene. Using external information in this way is inherently subjective and is often not formalized, making the analysis difficult to reproduce. Building on previous work that has identified 14 important types of external information, we present an approximate Bayesian analysis that produces an estimate of the probability of association. The calculation combines four sources of information: the genome-wide data, SNP information derived from bioinformatics databases, empirical SNP weights, and the researchers' subjective prior opinions. The calculation is fast enough that it can be applied to millions of SNPs and, although it does rely on subjective judgments, those judgments are made explicit so that the final SNP selection can be reproduced. We show that the resulting probability of association is intuitively more appealing than the P-value because it is easier to interpret and it makes allowance for the power of the study. We illustrate the use of the probability of association for SNP prioritization by applying it to a meta-analysis of kidney function genome-wide association studies and demonstrate that SNP selection performs better using the probability of association compared with P-values alone.
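One common way to turn a GWAS effect estimate and a prior into a probability of association is a normal-approximation Bayes factor. The Python sketch below follows that general idea and should be read as an assumption about the flavour of calculation, not as the authors' exact method: the function name, the default `prior_sd` and the example numbers are hypothetical, and all external annotation is assumed to have been folded into the single `prior_prob` argument.

```python
import math


def posterior_prob_association(beta: float, se: float, prior_prob: float,
                               prior_sd: float = 0.2) -> float:
    """Posterior probability of association under a normal approximation.

    beta, se    estimated effect and its standard error from the GWAS
    prior_prob  prior probability that the SNP is truly associated (0 < p < 1)
    prior_sd    assumed SD of true effect sizes under association (hypothetical default)
    """
    v = se * se              # sampling variance of the estimate
    w = prior_sd * prior_sd  # prior variance of a true effect
    z2 = (beta / se) ** 2
    # Log Bayes factor for association (effect ~ N(0, w)) against no effect.
    log_bf = 0.5 * math.log(v / (v + w)) + 0.5 * z2 * w / (v + w)
    log_post_odds = log_bf + math.log(prior_prob / (1 - prior_prob))
    return 1.0 / (1.0 + math.exp(-log_post_odds))


# Hypothetical inputs: the same SNP evaluated under a sceptical and a favourable prior.
weak_prior = posterior_prob_association(beta=0.10, se=0.02, prior_prob=1e-5)
strong_prior = posterior_prob_association(beta=0.10, se=0.02, prior_prob=1e-3)
```

Because the prior enters as an explicit argument, the subjective judgment is recorded in the calculation, which is the reproducibility point the abstract makes.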

Subject(s)

Bayes Theorem , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Databases, Genetic , Humans , Kidney/physiology , Meta-Analysis as Topic , Models, Genetic , Probability
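
The abstract above describes combining genome-wide evidence with a prior to obtain a probability of association. As a rough illustration only, the sketch below uses a Wakefield-style approximate Bayes factor, which is a swapped-in, generic technique and not necessarily the calculation used in the paper. The function names, the prior effect-size standard deviation (`prior_sd`) and the example prior probabilities are all assumptions made for this example.

```python
import math


def log_approx_bayes_factor(beta, se, prior_sd=0.2):
    """Log of a Wakefield-style approximate Bayes factor (H1 versus H0).

    beta and se are the effect estimate and its standard error for one SNP;
    prior_sd is an assumed standard deviation of true effects under H1.
    """
    v = se ** 2            # sampling variance of the estimate
    w = prior_sd ** 2      # assumed prior variance of the effect
    r = w / (v + w)        # shrinkage factor
    z = beta / se
    return 0.5 * math.log(1.0 - r) + 0.5 * z * z * r


def probability_of_association(beta, se, prior_prob, prior_sd=0.2):
    """Posterior probability of association given a prior probability.

    prior_prob stands in for whatever external information and subjective
    opinion imply for this SNP (hypothetical input, strictly between 0 and 1).
    """
    log_odds = log_approx_bayes_factor(beta, se, prior_sd)
    log_odds += math.log(prior_prob / (1.0 - prior_prob))
    # Convert log odds to a probability without overflowing exp().
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Same z-score (hence same P-value) from a small and a large study.
    for se in (0.05, 0.01):
        beta = 4.0 * se
        print(se, probability_of_association(beta, se, prior_prob=1e-4))
```

Under these assumptions, two SNPs with the same P-value can receive different probabilities when their standard errors differ, which is one way a measure of this kind can reflect the power of the study; raising `prior_prob` for a SNP with supporting annotation raises its posterior probability.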

16.

Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function.

Chasman, Daniel I; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary F; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B.

Hum Mol Genet ; 21(24): 5329-43, 2012 Dec 15.

Article in English | MEDLINE | ID: mdl-22962313

ABSTRACT

In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10⁻⁹) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10⁻⁴ to 2.2 × 10⁻⁷. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.

Subject(s)

Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics , Amino Acid Transport Systems, Basic/genetics , Fusion Regulatory Protein 1, Heavy Chain/genetics , Genetic Predisposition to Disease/genetics , Glomerular Filtration Rate/genetics , Glomerular Filtration Rate/physiology , Humans , Inhibin-beta Subunits/genetics , Intracellular Signaling Peptides and Proteins/genetics , Low Density Lipoprotein Receptor-Related Protein-2/genetics , Membrane Proteins/genetics

17.

Common variants in Mendelian kidney disease genes and their association with renal function.

Parsa, Afshin; Fuchsberger, Christian; Köttgen, Anna; O'Seaghdha, Conall M; Pattaro, Cristian; de Andrade, Mariza; Chasman, Daniel I; Teumer, Alexander; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Kim, Young J; Taliun, Daniel; Li, Man; Feitosa, Mary; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; Glazer, Nicole; Isaacs, Aaron; Rao, Madhumathi; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Couraki, Vincent; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B.

J Am Soc Nephrol ; 24(12): 2105-17, 2013 Dec.

Article in English | MEDLINE | ID: mdl-24029420

ABSTRACT

Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research.

Subject(s)

Genetic Variation , Kidney/physiology , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide , Renal Insufficiency, Chronic/genetics , White People/genetics , Databases, Genetic , Gene Frequency , Genome-Wide Association Study , Humans , Phenotype

18.

A cost-effective sequencing method for genetic studies combining high-depth whole exome and low-depth whole genome.

Bhérer, Claude; Eveleigh, Robert; Trajanoska, Katerina; St-Cyr, Janick; Paccard, Antoine; Nadukkalam Ravindran, Praveen; Caron, Elizabeth; Bader Asbah, Nimara; McClelland, Peyton; Wei, Clare; Baumgartner, Iris; Schindewolf, Marc; Döring, Yvonne; Perley, Danielle; Lefebvre, François; Lepage, Pierre; Bourgey, Mathieu; Bourque, Guillaume; Ragoussis, Jiannis; Mooser, Vincent; Taliun, Daniel.

NPJ Genom Med ; 9(1): 8, 2024 Feb 07.

Article in English | MEDLINE | ID: mdl-38326393

ABSTRACT

Whole genome sequencing (WGS) at high depth (30X) allows the accurate discovery of variants in the coding and non-coding DNA regions and helps elucidate the genetic underpinnings of human health and diseases. Yet, due to the prohibitive cost of high-depth WGS, most large-scale genetic association studies use genotyping arrays or high-depth whole exome sequencing (WES). Here we propose a cost-effective method, which we call "Whole Exome Genome Sequencing" (WEGS), that combines low-depth WGS and high-depth WES with up to 8 samples pooled and sequenced simultaneously (multiplexed). We experimentally assess the performance of WEGS with four different depth-of-coverage and sample-multiplexing configurations. We show that the optimal WEGS configurations are 1.7-2.0 times cheaper than standard WES (no-plexing), 1.8-2.1 times cheaper than high-depth WGS, reach similar recall and precision rates in detecting coding variants as WES, and capture more population-specific variants in the rest of the genome that are difficult to recover when using genotype imputation methods. We apply WEGS to 862 patients with peripheral artery disease and show that it directly assesses more known disease-associated variants than a typical genotyping array and thousands of non-imputable variants per disease-associated locus.

19.

GWAtoolbox: an R package for fast quality control and handling of genome-wide association studies meta-analysis data.

Fuchsberger, Christian; Taliun, Daniel; Pramstaller, Peter P; Pattaro, Cristian.

Bioinformatics ; 28(3): 444-5, 2012 Feb 01.

Article in English | MEDLINE | ID: mdl-22155946

ABSTRACT

SUMMARY: The GWAtoolbox is an R package that standardizes and accelerates the handling of data from genome-wide association studies (GWAS), particularly in the context of large-scale GWAS meta-analyses. A key feature of GWAtoolbox is its ability to perform quality control (QC) of any number of files in a matter of minutes. The implemented workflow has been structured to check three particular data quality aspects: (i) data formatting, (ii) quality of the GWAS results and (iii) data consistency across studies. Output consists of an extensive list of quality statistics and plots which allow inspection of individual files and between-study comparison to identify systematic bias. AVAILABILITY: http://www.eurac.edu/GWAtoolbox CONTACT: cfuchsb@umich.edu; daniel.taliun@eurac.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genome-Wide Association Study , Software , HapMap Project , Humans , Meta-Analysis as Topic , Quality Control
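
The three quality-control aspects listed in the GWAtoolbox abstract (formatting, result quality, cross-study consistency) can be illustrated with a small stand-alone sketch. This is not the GWAtoolbox API, which is an R package; the Python function, the column names in `REQUIRED` and the report keys below are hypothetical and chosen only for this example.

```python
import statistics

# Hypothetical column names for a per-SNP summary-statistics row.
REQUIRED = ("snp", "beta", "se", "p")
# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def qc_study(rows):
    """Return simple QC counts and the genomic inflation factor for one study.

    rows is an iterable of dicts, one per SNP (an assumed input layout).
    """
    report = {"missing_fields": 0, "bad_p": 0, "bad_se": 0}
    chi2 = []
    for row in rows:
        # (i) formatting: every required field must be present.
        if any(row.get(key) is None for key in REQUIRED):
            report["missing_fields"] += 1
            continue
        # (ii) result quality: P-values in [0, 1], positive standard errors.
        if not 0.0 <= row["p"] <= 1.0:
            report["bad_p"] += 1
            continue
        if row["se"] <= 0.0:
            report["bad_se"] += 1
            continue
        chi2.append((row["beta"] / row["se"]) ** 2)
    # Genomic inflation factor: median test statistic over its null median.
    report["lambda_gc"] = (
        statistics.median(chi2) / CHI2_1DF_MEDIAN if chi2 else None
    )
    return report
```

For aspect (iii), one simple approach would be to run `qc_study` on each contributing file and compare the resulting `lambda_gc` values and error counts side by side, flagging studies that stand apart from the rest.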

20.

HLA allele-calling using multi-ancestry whole-exome sequencing from the UK Biobank identifies 129 novel associations in 11 autoimmune diseases.

Butler-Laporte, Guillaume; Farjoun, Joseph; Nakanishi, Tomoko; Lu, Tianyuan; Abner, Erik; Chen, Yiheng; Hultström, Michael; Metspalu, Andres; Milani, Lili; Mägi, Reedik; Nelis, Mari; Hudjashov, Georgi; Yoshiji, Satoshi; Ilboudo, Yann; Liang, Kevin Y H; Su, Chen-Yang; Willet, Julian D S; Esko, Tõnu; Zhou, Sirui; Forgetta, Vincenzo; Taliun, Daniel; Richards, J Brent.

Commun Biol ; 6(1): 1113, 2023 Nov 03.

Article in English | MEDLINE | ID: mdl-37923823

ABSTRACT

The human leukocyte antigen (HLA) region on chromosome 6 is strongly associated with many immune-mediated and infection-related diseases. Due to its highly polymorphic nature and complex linkage disequilibrium patterns, traditional genetic association studies of single nucleotide polymorphisms do not perform well in this region. Instead, the field has adopted the assessment of the association of HLA alleles (i.e., entire HLA gene haplotypes) with disease. Often based on genotyping arrays, these association studies impute HLA alleles, decreasing accuracy and thus statistical power for rare alleles and in non-European ancestries. Here, we use whole-exome sequencing (WES) from 454,824 UK Biobank (UKB) participants to directly call HLA alleles using the HLA-HD algorithm. We show this method is more accurate than imputing HLA alleles and harness the improved statistical power to identify 360 associations for 11 auto-immune phenotypes (at least 129 likely novel), leading to better insights into the specific coding polymorphisms that underlie these diseases. We show that HLA alleles with synonymous variants, often overlooked in HLA studies, can significantly influence these phenotypes. Lastly, we show that HLA sequencing may improve polygenic risk score accuracy across ancestries. These findings allow better characterization of the role of the HLA region in human disease.

Subject(s)

Autoimmune Diseases , Biological Specimen Banks , Humans , Alleles , Exome Sequencing , Genetic Predisposition to Disease , Autoimmune Diseases/genetics , HLA Antigens/genetics , Histocompatibility Antigens Class I/genetics , Histocompatibility Antigens Class II , Polymorphism, Single Nucleotide , United Kingdom
