A crowdsourced set of curated structural variants for the human genome.
Chapman, Lesley M; Spies, Noah; Pai, Patrick; Lim, Chun Shen; Carroll, Andrew; Narzisi, Giuseppe; Watson, Christopher M; Proukakis, Christos; Clarke, Wayne E; Nariai, Naoki; Dawson, Eric; Jones, Garan; Blankenberg, Daniel; Brueffer, Christian; Xiao, Chunlin; Kolora, Sree Rohit Raj; Alexander, Noah; Wolujewicz, Paul; Ahmed, Azza E; Smith, Graeme; Shehreen, Saadlee; Wenger, Aaron M; Salit, Marc; Zook, Justin M.
Affiliations
  • Chapman LM; Biosystems and Biomaterials Division, Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland, United States of America.
  • Spies N; Biosystems and Biomaterials Division, Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland, United States of America.
  • Pai P; The Joint Initiative for Metrology in Biology, Stanford University, Stanford, California, United States of America.
  • Lim CS; Departments of Genetics and Pathology, Stanford University, Stanford, California, United States of America.
  • Carroll A; University of Maryland - College Park, College Park, Maryland, United States of America.
  • Narzisi G; Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand.
  • Watson CM; DNAnexus Inc, Mountain View, California, United States of America.
  • Proukakis C; New York Genome Center, New York, New York, United States of America.
  • Clarke WE; School of Medicine, University of Leeds, Saint James's University Hospital, Leeds, Leeds, United Kingdom.
  • Nariai N; Yorkshire Regional Genetics Service, The Leeds Teaching Hospitals NHS Trust, Saint James's University Hospital, Leeds, United Kingdom.
  • Dawson E; University College London, Institute of Neurology, London, United Kingdom.
  • Jones G; New York Genome Center, New York, New York, United States of America.
  • Blankenberg D; Illumina, Inc. San Diego, California, United States of America.
  • Brueffer C; Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, Maryland, United States of America.
  • Xiao C; Department of Genetics, University of Cambridge, Cambridge, United Kingdom.
  • Kolora SRR; University of Exeter Medical School, Epidemiology and Public Health Group, Barrack Road, Exeter, Devon, United Kingdom.
  • Alexander N; Genomic Medicine Institute Lerner Research Institute Cleveland Clinic, Cleveland, Ohio, United States of America.
  • Wolujewicz P; Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, Lund, Sweden.
  • Ahmed AE; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America.
  • Smith G; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany.
  • Shehreen S; Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany.
  • Wenger AM; Molecular Evolution and Systematics of Animals, Institute of Biology, University of Leipzig, Leipzig, Germany.
  • Salit M; Molecular Biology Institute, University of California Los Angeles, Los Angeles, California, United States of America.
  • Zook JM; Weill Cornell, Belfer Research Building, New York, New York, United States of America.
PLoS Comput Biol; 16(6): e1007933, June 2020.
Article in English | MEDLINE | ID: mdl-32559231
ABSTRACT
A high-quality benchmark for small variants encompassing 88 to 90% of the reference genome has been developed for seven Genome in a Bottle (GIAB) reference samples. However, a reliable benchmark for large indels and structural variants (SVs) is more challenging to develop. In this study, we manually curated 1235 SVs, which can ultimately be used to evaluate SV callers or train machine learning models. We developed a crowdsourcing app, SVCurator, to help GIAB curators manually review large indels and SVs within the human genome and report their genotype and size accuracy. SVCurator displays images from short, long, and linked read sequencing data from the GIAB Ashkenazi Jewish Trio son [NIST RM 8391/HG002]. We asked curators to assign labels describing SV type (deletion or insertion), size accuracy, and genotype for 1235 putative insertions and deletions sampled from different size bins between 20 and 892,149 bp. 'Expert' curators were 93% concordant with each other, and 37 of the 61 curators had at least 78% concordance with a set of 'expert' curators. The curators were least concordant for complex SVs and for SVs that had inaccurate breakpoints or size predictions. After filtering events with low concordance among curators, we produced high-confidence labels for 935 events. The SVCurator crowdsourced labels were 94.5% concordant with the heuristic-based draft benchmark SV callset from GIAB. We found that curators can successfully evaluate putative SVs when given evidence from multiple sequencing technologies.
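
The concordance filtering described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the majority-vote rule, and the 0.8 threshold are all assumptions chosen for illustration, since the abstract does not state how concordance was computed or which cutoff was applied.

```python
from collections import Counter


def consensus_label(labels, min_concordance=0.8):
    """Return (label, concordance) for one event.

    `labels` holds one label per curator. The threshold is an assumed
    value, not one taken from the paper. Events whose most common label
    falls below the threshold return (None, concordance).
    """
    if not labels:
        return None, 0.0
    top_label, top_count = Counter(labels).most_common(1)[0]
    concordance = top_count / len(labels)
    if concordance < min_concordance:
        return None, concordance
    return top_label, concordance


def pairwise_concordance(a, b):
    """Fraction of events labelled by both curators on which they agree.

    `a` and `b` map event IDs to labels. Returns None when the two
    curators share no events.
    """
    shared = [event for event in a if event in b]
    if not shared:
        return None
    agree = sum(1 for event in shared if a[event] == b[event])
    return agree / len(shared)
```

Under these assumptions, an event labelled by four curators as three deletions and one insertion has a concordance of 0.75 and would be kept only if the threshold were set at or below that value.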

Full text: 1 | Database: MEDLINE | Main subject: Genome, Human / Genomic Structural Variation | Study type: Prognostic_studies | Limits: Humans | Language: En | Publication year: 2020 | Document type: Article