Gapless assembly of complete human and plant chromosomes using only nanopore sequencing.
Koren, Sergey; Bao, Zhigui; Guarracino, Andrea; Ou, Shujun; Goodwin, Sara; Jenike, Katharine M; Lucas, Julian; McNulty, Brandy; Park, Jimin; Rautiainen, Mikko; Rhie, Arang; Roelofs, Dick; Schneiders, Harrie; Vrijenhoek, Ilse; Nijbroek, Koen; Ware, Doreen; Schatz, Michael C; Garrison, Erik; Huang, Sanwen; McCombie, W Richard; Miga, Karen H; Wittenberg, Alexander H J; Phillippy, Adam M.
Affiliation
  • Koren S; Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Bao Z; Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Baden-Württemberg, Germany.
  • Guarracino A; Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
  • Ou S; Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA.
  • Goodwin S; Human Technopole, Milan, Italy.
  • Jenike KM; Ohio State University, Columbus, OH, USA.
  • Lucas J; Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
  • McNulty B; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
  • Park J; University of California Santa Cruz, Santa Cruz, CA, USA.
  • Rautiainen M; University of California Santa Cruz, Santa Cruz, CA, USA.
  • Rhie A; University of California Santa Cruz, Santa Cruz, CA, USA.
  • Roelofs D; Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Schneiders H; Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Vrijenhoek I; KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands.
  • Nijbroek K; KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands.
  • Ware D; KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands.
  • Schatz MC; KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands.
  • Garrison E; Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
  • Huang S; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
  • McCombie WR; Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA.
  • Miga KH; Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
  • Wittenberg AHJ; State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan, China.
  • Phillippy AM; Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
bioRxiv; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38529488
ABSTRACT
The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.
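The abstract equates a base accuracy of 99.999% with Q50. A minimal sketch of that conversion, assuming the standard Phred scale (Q = -10 log10 of the per-base error rate), is given below; the function names are hypothetical and chosen only for illustration, and nothing here is taken from the authors' own tooling.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (strictly between 0 and 1) to a Phred-scaled Q value."""
    error_rate = 1.0 - accuracy
    if not 0.0 < error_rate < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    # Phred definition: Q = -10 * log10(probability of error)
    return -10.0 * math.log10(error_rate)


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled Q value back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # 99.999% accuracy is one expected error per 100,000 bases, i.e. about Q50.
    print(round(accuracy_to_phred(0.99999), 3))
    print(phred_to_accuracy(50))
```

On this scale every additional 10 Q points corresponds to a tenfold lower error rate, so Q50 implies roughly one expected error per 100,000 bases.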

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: BioRxiv Publication year: 2024 Document type: Article Affiliation country: United States
