Parliament2: Accurate structural variant calling at scale.
Zarate, Samantha; Carroll, Andrew; Mahmoud, Medhat; Krasheninina, Olga; Jun, Goo; Salerno, William J; Schatz, Michael C; Boerwinkle, Eric; Gibbs, Richard A; Sedlazeck, Fritz J.
Affiliation
  • Zarate S; DNAnexus, 1975 W El Camino Real #204, Mountain View, CA 94040, USA.
  • Carroll A; Department of Computer Science, 3400 N. Charles St. Johns Hopkins University, Baltimore, MD 21218, USA.
  • Mahmoud M; DNAnexus, 1975 W El Camino Real #204, Mountain View, CA 94040, USA.
  • Krasheninina O; Human Genome Sequencing Center, One Baylor Plaza, Baylor College of Medicine, Houston, TX 77030, USA.
  • Jun G; Human Genome Sequencing Center, One Baylor Plaza, Baylor College of Medicine, Houston, TX 77030, USA.
  • Salerno WJ; Human Genetics Center, 1200 Pressler Street, University of Texas Health Science Center at Houston, Houston, TX 77040, USA.
  • Schatz MC; Human Genome Sequencing Center, One Baylor Plaza, Baylor College of Medicine, Houston, TX 77030, USA.
  • Boerwinkle E; Department of Computer Science, 3400 N. Charles St. Johns Hopkins University, Baltimore, MD 21218, USA.
  • Gibbs RA; Human Genome Sequencing Center, One Baylor Plaza, Baylor College of Medicine, Houston, TX 77030, USA.
  • Sedlazeck FJ; Human Genetics Center, 1200 Pressler Street, University of Texas Health Science Center at Houston, Houston, TX 77040, USA.
GigaScience; 9(12), 2020-12-21.
Article in English | MEDLINE | ID: mdl-33347570
BACKGROUND: Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. The current standard approach therefore relies on short paired-end reads, from which SVs remain challenging to detect, especially at the scale of hundreds to thousands of samples.

FINDINGS: We present Parliament2, a consensus SV framework that leverages multiple best-in-class methods to identify high-quality SVs from short-read DNA sequence data at scale. Parliament2 incorporates pre-installed SV callers that are optimized for efficient execution in parallel to reduce the overall runtime and costs. We demonstrate the accuracy of Parliament2 when applied to data from NovaSeq and HiSeq X platforms with the Genome in a Bottle (GIAB) SV call set across all size classes. The reported quality score per SV is calibrated across different SV types and size classes. Parliament2 has the highest F1 score (74.27%) measured against the independent gold standard from GIAB. We illustrate the compute performance by processing all 1000 Genomes samples (2,691 samples) in <1 day on GRCh38. Parliament2 improves the runtime performance of individual methods and is open source (https://github.com/slzarate/parliament2); a Docker image and a WDL implementation are also available.

CONCLUSION: Parliament2 provides a highly accurate single-sample SV call set from short-read DNA sequence data and enables cost-efficient application in cloud or cluster environments, processing thousands of samples.
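The abstract describes consensus SV calling and reports accuracy as an F1 score. The sketch below illustrates both ideas in Python under stated assumptions. It is not Parliament2's actual merging or benchmarking code. The `SV` record, the `same_event` matching rule, the 1000 bp breakpoint tolerance, and the `min_support` threshold are all hypothetical choices made for illustration.

```python
from collections import namedtuple

# Hypothetical minimal SV record: type (e.g. "DEL"), chromosome,
# start, end, and the name of the caller that reported it.
SV = namedtuple("SV", ["svtype", "chrom", "start", "end", "caller"])


def same_event(a, b, max_dist=1000):
    """Treat two calls as the same event if type and chromosome match and
    both breakpoints lie within max_dist bp (an assumed tolerance)."""
    return (
        a.svtype == b.svtype
        and a.chrom == b.chrom
        and abs(a.start - b.start) <= max_dist
        and abs(a.end - b.end) <= max_dist
    )


def consensus(calls, min_support=2, max_dist=1000):
    """Greedily cluster calls against each cluster's first member and keep
    clusters reported by at least min_support distinct callers."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start)):
        for cluster in clusters:
            if same_event(cluster[0], call, max_dist):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return [c[0] for c in clusters if len({x.caller for x in c}) >= min_support]


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, a deletion reported by two callers at nearly the same coordinates would be kept by `consensus`, while a duplication seen by a single caller would be dropped. `f1_score` takes true-positive, false-positive, and false-negative counts obtained by comparing a call set against a truth set such as the GIAB SV calls.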

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Software / Genomics Study type: Prognostic_studies Limit: Humans Language: En Journal: Gigascience Year: 2020 Document type: Article Country of affiliation: United States

...