Highly accurate long-read HiFi sequencing data for five complex genomes.
Hon, Ting; Mars, Kristin; Young, Greg; Tsai, Yu-Chih; Karalius, Joseph W; Landolin, Jane M; Maurer, Nicholas; Kudrna, David; Hardigan, Michael A; Steiner, Cynthia C; Knapp, Steven J; Ware, Doreen; Shapiro, Beth; Peluso, Paul; Rank, David R.
Affiliation
  • Hon T; Pacific Biosciences of California Inc., 1305 O'Brien Dr., Menlo Park, CA, 94025, USA.
  • Mars K; Pacific Biosciences of California Inc., 1305 O'Brien Dr., Menlo Park, CA, 94025, USA.
  • Young G; Pacific Biosciences of California Inc., 1305 O'Brien Dr., Menlo Park, CA, 94025, USA.
  • Tsai YC; Pacific Biosciences of California Inc., 1305 O'Brien Dr., Menlo Park, CA, 94025, USA.
  • Karalius JW; Pacific Biosciences of California Inc., 1305 O'Brien Dr., Menlo Park, CA, 94025, USA.
  • Landolin JM; Ravel Biotechnology Inc., 953 Indiana St., San Francisco, CA, 94107, USA.
  • Maurer N; Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, 95064, USA.
  • Kudrna D; Arizona Genomics Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA.
  • Hardigan MA; Department of Plant Sciences, University of California, Davis, One Shields Ave, Davis, CA, 95616-8571, USA.
  • Steiner CC; Conservation Genetics, Beckman Center for Conservation Research, San Diego Zoo Global, 15600 San Pasqual Valley Road, Escondido, CA, 92027, USA.
  • Knapp SJ; Department of Plant Sciences, University of California, Davis, One Shields Ave, Davis, CA, 95616-8571, USA.
  • Ware D; Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
  • Shapiro B; USDA-ARS, Plant, Soil, and Nutrition Research Unit, Ithaca, NY, 14853, USA.
  • Peluso P; Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, 95064, USA.
  • Rank DR; Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA.
Sci Data; 7(1): 399, 2020 Nov 17.
Article in En | MEDLINE | ID: mdl-33203859
ABSTRACT
The PacBio® HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10-25 kb and accuracies greater than 99.5%. These accurate long reads can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. Currently, there is a need for sample datasets both to evaluate the benefits of these long accurate reads and to develop bioinformatic tools, including genome assemblers, variant callers, and haplotyping algorithms. We present deep-coverage HiFi datasets for five complex samples, including the two inbred model genomes Mus musculus and Zea mays, as well as two complex genomes, octoploid Fragaria × ananassa and the diploid anuran Rana muscosa. Additionally, we release sequence data from a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and explore complex genome structure and evolution. Data were generated on the PacBio Sequel II System.
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Limit: Animals Language: En Publication year: 2020 Document type: Article