Population-wide copy number variation calling using variant call format files from 6,898 individuals.
Png, Grace; Suveges, Daniel; Park, Young-Chan; Walter, Klaudia; Kundu, Kousik; Ntalla, Ioanna; Tsafantakis, Emmanouil; Karaleftheri, Maria; Dedoussis, George; Zeggini, Eleftheria; Gilly, Arthur.
Affiliation
  • Png G; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
  • Suveges D; Department of Medical Genetics, University of Cambridge, Cambridge, United Kingdom.
  • Park YC; Institute of Translational Genomics, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany.
  • Walter K; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
  • Kundu K; European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
  • Ntalla I; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
  • Tsafantakis E; Department of Medical Genetics, University of Cambridge, Cambridge, United Kingdom.
  • Karaleftheri M; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
  • Dedoussis G; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.
  • Zeggini E; William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom.
  • Gilly A; Anogia Medical Centre, Anogia, Greece.
Genet Epidemiol; 44(1): 79-89, 2020 Jan.
Article in En | MEDLINE | ID: mdl-31520489
ABSTRACT
Copy number variants (CNVs) play an important role in a number of human diseases, but the accurate calling of CNVs remains challenging. Most current approaches to CNV detection use raw read alignments, which are computationally intensive to process. We use a regression tree-based approach to call germline CNVs from whole-genome sequencing (WGS, >18x) variant call sets in 6,898 samples across four European cohorts, and describe a rich large variation landscape comprising 1,320 CNVs. Eighty-one percent of detected events have been previously reported in the Database of Genomic Variants. Twenty-three percent of high-quality deletions affect entire genes, and we recapitulate known events such as the GSTM1 and RHD gene deletions. We test for association between the detected deletions and 275 protein levels in 1,457 individuals to assess the potential clinical impact of the detected CNVs. We describe complex CNV patterns underlying an association with levels of the CCL3 protein (MAF = 0.15, p = 3.6 × 10⁻¹²) at the CCL3L3 locus, and a novel cis-association between a low-frequency NOMO1 deletion and NOMO1 protein levels (MAF = 0.02, p = 2.2 × 10⁻⁷). This study demonstrates that existing population-wide WGS call sets can be mined for germline CNVs with minimal computational overhead, delivering insight into a less well-studied, yet potentially impactful class of genetic variant.
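The abstract does not spell out the regression-tree procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each sample has already been reduced to a list of relative sequencing depths at consecutive variant sites (for example, per-site depth taken from a variant call file and divided by the sample's average depth; that preprocessing is an assumption). A one-feature regression tree then splits the positions recursively into piecewise-constant segments, and segments whose mean depth falls clearly below or above 1.0 are flagged. The function names and all thresholds are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def segment(depth, start=0, min_size=3, min_gain=0.05):
    """Split relative depths into piecewise-constant segments.

    This is a regression tree on a single feature (site index): each node
    picks the split that most reduces the squared error, and stops when the
    reduction is below min_gain. Returns (start, end_exclusive, mean_depth)
    tuples in genomic order.
    """
    n = len(depth)
    total = sse(depth)
    best_gain, best_split = 0.0, None
    for i in range(min_size, n - min_size + 1):
        gain = total - sse(depth[:i]) - sse(depth[i:])
        if gain > best_gain:
            best_gain, best_split = gain, i
    if best_split is None or best_gain < min_gain:
        return [(start, start + n, mean(depth))]
    left = segment(depth[:best_split], start, min_size, min_gain)
    right = segment(depth[best_split:], start + best_split, min_size, min_gain)
    return left + right


def call_cnvs(depth, low=0.75, high=1.25):
    """Label segments as deletions or duplications using illustrative cutoffs."""
    calls = []
    for seg_start, seg_end, seg_mean in segment(depth):
        if seg_mean < low:
            calls.append((seg_start, seg_end, "DEL"))
        elif seg_mean > high:
            calls.append((seg_start, seg_end, "DUP"))
    return calls


# Toy example: a heterozygous deletion halves the depth at sites 10 to 15.
toy_depth = [1.0] * 10 + [0.5] * 6 + [1.0] * 10
print(call_cnvs(toy_depth))  # [(10, 16, 'DEL')]
```

Working from per-site depth summaries rather than raw read alignments is what keeps this kind of approach computationally light; real data would need noise-aware thresholds and quality filtering that this sketch omits.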

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Genome, Human / DNA Copy Number Variations / Genetics, Population Limits: Humans Language: En Journal: Genet Epidemiol Journal subject: Epidemiology / Medical Genetics Year: 2020 Document type: Article Affiliation country: United Kingdom