ABSTRACT
Advances in nanopore technology have made it possible to obtain gigabases of sequence data. Previously, nanopore sequencing was used mainly to analyze microbial samples. Here, we describe the generation of a comprehensive nanopore sequencing data set with a median read length of 11,979 bp for a self-compatible accession of the wild tomato species Solanum pennellii. We describe the assembly of its genome to a contig N50 of 2.5 Mb. The assembly pipeline comprised initial read correction with Canu and assembly with SMARTdenovo. The resulting raw nanopore-based de novo genome is structurally highly similar to that of the reference S. pennellii LA716 accession but had a high error rate and was rich in homopolymer deletions. After polishing the assembly with Illumina reads, we obtained an error rate of <0.02% when assessed against the same Illumina data. We obtained a gene completeness of 96.53%, slightly surpassing that of the reference S. pennellii. Taken together, our data indicate that such long-read sequencing data can be used to affordably sequence and assemble gigabase-sized plant genomes.
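For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are illustrative assumptions, not values or code from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which the cumulative sum of
    contig lengths, taken from longest to shortest, first reaches at
    least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Not reachable for non-negative lengths; kept as a safeguard.
    raise AssertionError("cumulative sum never reached half of total")


# Toy contig lengths (hypothetical; NOT taken from the S. pennellii assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

With the toy lengths, the total is 300, so the running sum reaches the halfway mark of 150 at the second-longest contig. A larger N50 therefore indicates that more of the assembly sits in long contigs, which is why it is reported alongside the error rate and gene completeness.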
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Nanopores , Solanum/genetics , Sequence Analysis, DNA
ABSTRACT
Functional gene clusters, containing two or more genes encoding different enzymes for the same pathway, are sometimes observed in plant genomes, most often when the genes specify the synthesis of specialized defensive metabolites. Here, we show that a cluster of genes in tomato (Solanum lycopersicum; Solanaceae) contains genes for terpene synthases (TPSs) that specify the synthesis of monoterpenes and diterpenes from cis-prenyl diphosphates, substrates that are synthesized by enzymes encoded by cis-prenyl transferase (CPT) genes also located within the same cluster. The monoterpene synthase genes in the cluster likely evolved from a diterpene synthase gene in the cluster by duplication and divergence. In the orthologous cluster in Solanum habrochaites, a new sesquiterpene synthase gene was created by duplication of a monoterpene synthase gene followed by a localized gene conversion event directed by a diterpene synthase gene. The TPS genes in the Solanum cluster encoding cis-prenyl diphosphate-utilizing enzymes are closely related to a tobacco (Nicotiana tabacum; Solanaceae) diterpene synthase gene encoding Z-abienol synthase (Nt-ABS). Nt-ABS uses the substrate copal-8-ol diphosphate, which is made from the all-trans geranylgeranyl diphosphate by copal-8-ol diphosphate synthase (Nt-CPS2). The Solanum gene cluster also contains an ortholog of Nt-CPS2, but it appears to encode a nonfunctional protein. Thus, the Solanum functional gene cluster evolved by duplication and divergence of TPS genes, together with alterations in substrate specificity to utilize cis-prenyl diphosphates and through the acquisition of CPT genes.