ABSTRACT
Pacific Biosciences (PacBio) HiFi sequencing technology generates long reads (>10 kbp) with very high accuracy (<0.01% sequencing error). Although several de novo assembly tools are available for HiFi reads, there are no comprehensive studies evaluating these assemblers. We evaluated the performance of 11 de novo HiFi assemblers on (1) real data for three eukaryotic genomes; (2) 34 synthetic data sets with different ploidy, sequencing coverage levels, heterozygosity rates, and sequencing error rates; (3) one real metagenomic data set; and (4) five synthetic metagenomic data sets with different composition abundance and heterozygosity rates. The 11 assemblers were evaluated using the Quality Assessment Tool for Genome Assemblies (QUAST) and Benchmarking Universal Single-Copy Orthologs (BUSCO). We also used several additional criteria, namely, completion rate, single-copy completion rate, duplicated completion rate, average proportion of largest category, average distance difference, quality value, run-time, and memory utilization. Results show that hifiasm and hifiasm-meta should be the first choice for assembling eukaryotic genomes and metagenomes with HiFi data. We performed a comprehensive benchmarking study of commonly used assemblers on complex eukaryotic genomes and metagenomes. Our study will help the research community to choose the most appropriate assembler for their data and identify possible improvements in assembly algorithms.
Subjects
Metagenome; Software; Sequence Analysis, DNA/methods; Algorithms; Metagenomics/methods; High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Transgenic papaya is widely publicized for controlling papaya ringspot virus. However, the impact of particle bombardment on the genome remains unknown. The transgenic SunUp and its progenitor Sunset genomes were assembled into 351.5 and 350.3 Mb in nine chromosomes, respectively. We identified a 1.64 Mb insertion containing three transgenic insertions in SunUp chromosome 5, consisting of 52 nuclear-plastid, 21 nuclear-mitochondrial and 1 nuclear genomic fragments. A 591.9 kb fragment in chromosome 5 was translocated into the 1.64 Mb insertion. We assembled a gapless 9.8 Mb hermaphrodite-specific region of the Yh chromosome and its 6.0 Mb X counterpart. Resequencing 86 genomes revealed three distinct groups, validating their geographic origin and breeding history. We identified 147 selective sweeps and defined the essential role of zeta-carotene desaturase in carotenoid accumulation during domestication. Our findings elucidated the impact of particle bombardment and improved our understanding of sex chromosomes and domestication to expedite papaya improvement.