HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.
Nurk, Sergey; Walenz, Brian P; Rhie, Arang; Vollger, Mitchell R; Logsdon, Glennis A; Grothe, Robert; Miga, Karen H; Eichler, Evan E; Phillippy, Adam M; Koren, Sergey.
  • Nurk S; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Walenz BP; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Rhie A; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.
  • Vollger MR; Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA.
  • Logsdon GA; Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA.
  • Grothe R; Pacific Biosciences, Menlo Park, California 94025, USA.
  • Miga KH; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA.
  • Eichler EE; Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA.
  • Phillippy AM; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA.
  • Koren S; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.
Genome Res; 30(9): 1291-1305, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32801147
ABSTRACT
Complete and accurate genome assemblies form the basis of most downstream genomic analyses and are of critical importance. Recent genome assembly projects have relied on a combination of noisy long-read sequencing and accurate short-read sequencing, with the former offering greater assembly continuity and the latter providing higher consensus accuracy. The recently introduced Pacific Biosciences (PacBio) HiFi sequencing technology bridges this divide by delivering long reads (>10 kbp) with high per-base accuracy (>99.9%). Here we present HiCanu, a modification of the Canu assembler designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering. We benchmark HiCanu with a focus on the recovery of haplotype diversity, major histocompatibility complex (MHC) variants, satellite DNAs, and segmental duplications. For diploid human genomes sequenced to 30× HiFi coverage, HiCanu achieved superior accuracy and allele recovery compared to the current state of the art. On the effectively haploid CHM13 human cell line, HiCanu achieved an NG50 contig size of 77 Mbp with a per-base consensus accuracy of 99.999% (QV50), surpassing recent assemblies of high-coverage, ultralong Oxford Nanopore Technologies (ONT) reads in terms of both accuracy and continuity. This HiCanu assembly correctly resolves 337 out of 341 validation BACs sampled from known segmental duplications and provides the first preliminary assemblies of nine complete human centromeric regions. Although gaps and errors still remain within the most challenging regions of the genome, these results represent a significant advance toward the complete assembly of human genomes.
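The abstract leans on three quantities that are easy to state in code: homopolymer compression, NG50 contig size, and the Phred-scaled quality value (QV) that corresponds to a per-base accuracy. The following is a minimal Python sketch of the standard definitions, not HiCanu's implementation; the function names and toy inputs are hypothetical and chosen only for illustration.

```python
import math
from itertools import groupby


def homopolymer_compress(seq: str) -> str:
    """Collapse every run of identical bases to a single base.

    Illustrative only: this shows the general idea of homopolymer
    compression, not how HiCanu stores or processes reads.
    """
    return "".join(base for base, _run in groupby(seq))


def phred_qv(accuracy: float) -> float:
    """Convert a per-base accuracy in [0, 1) to a Phred-scaled QV."""
    return -10 * math.log10(1 - accuracy)


def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    Contigs are taken longest first; NG50 is the length of the contig
    at which the running total first reaches half of the genome size.
    Returns 0 if the contigs sum to less than half the genome size.
    """
    half = genome_size / 2
    total = 0
    for length in sorted(contig_lengths, reverse=True):
        total += length
        if total >= half:
            return length
    return 0


if __name__ == "__main__":
    print(homopolymer_compress("GATTTACAAA"))   # runs TTT and AAA collapse
    print(round(phred_qv(0.99999)))             # accuracy quoted in the abstract
    print(ng50([50, 30, 20, 10], genome_size=120))
```

Unlike N50, NG50 is normalized by the genome size rather than by the total assembly length, which makes values comparable across assemblies of the same genome.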
Full text: 1 | Database: MEDLINE | Main subject: Genetic Variation / Sequence Analysis, DNA / High-Throughput Nucleotide Sequencing | Type of study: Evaluation_studies | Limits: Animals / Humans | Language: En | Year: 2020 | Document type: Article