Fast and accurate genomic analyses using genome graphs.
Nat Genet; 51(2): 354-362, 2019 02.
Article in En | MEDLINE | ID: mdl-30643257
ABSTRACT
The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5 h using a system with 36 CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.
Database: MEDLINE
Main subject: Genome, Human
Limits: Humans
Language: En
Journal: Nat Genet
Journal subject: MEDICAL GENETICS
Year: 2019
Type: Article
Affiliation country: United States