ABSTRACT
Single-cell genomics permits a new resolution in the examination of molecular and cellular dynamics, allowing global, parallel assessments of cell types and cellular behaviors through development and in response to environmental circumstances, such as interaction with water and the light-dark cycle of the Earth. Here, we leverage the smallest, and possibly most structurally reduced, plant, the semiaquatic Wolffia australiana, to understand dynamics of cell expression in these contexts at the whole-plant level. We examined single-cell-resolution RNA-sequencing data and found Wolffia cells divide into four principal clusters representing the above- and below-water-situated parenchyma and epidermis. Although these tissues share transcriptomic similarity with model plants, they display distinct adaptations that Wolffia has made for the aquatic environment. Within this broad classification, discrete subspecializations are evident, with select cells showing unique transcriptomic signatures associated with developmental maturation and specialized physiologies. Assessing this simplified biological system temporally at two key time-of-day (TOD) transitions, we identify additional TOD-responsive genes previously overlooked in whole-plant transcriptomic approaches and demonstrate that the core circadian clock machinery and its downstream responses can vary in cell-specific manners, even in this simplified system. Distinctions between cell types and their responses to submergence and/or TOD are driven by expression changes of unexpectedly few genes, characterizing Wolffia as a highly streamlined organism with the majority of genes dedicated to fundamental cellular processes. Wolffia provides a unique opportunity to apply reductionist biology to elucidate signaling functions at the organismal level, for which this work provides a powerful resource.
Subjects
Araceae , Plant Gene Expression Regulation , Transcriptome , Araceae/genetics , Araceae/metabolism , Single-Cell Analysis/methods , Gene Expression Profiling/methods
ABSTRACT
Powdery mildew (PM) in Cannabis sativa is most frequently caused by the biotrophic fungus Golovinomyces ambrosiae. Based on previously characterized variation in susceptibility to PM, biparental populations were developed by crossing the most resistant cultivar evaluated, 'FL 58', with a susceptible cultivar, 'TJ's CBD'. F1 progeny were evaluated and displayed a range of susceptibility, and two were self-pollinated to generate two F2 populations. In 2021, the F2 populations (n = 706) were inoculated with PM and surveyed for disease severity. In both F2 populations, 25% of the progeny were resistant, while the remaining 75% showed a range of susceptibility. The F2 populations, as well as selected F1 progeny and the parents, were genotyped with a single-nucleotide polymorphism array, and a consensus genetic map was produced. A major-effect quantitative trait locus on C. sativa chromosome 1 (Chr01) and other smaller-effect quantitative trait loci (QTL) on four other chromosomes were identified. The most associated marker on Chr01 was located near CsMLO1, a candidate susceptibility gene. Genomic DNA and cDNA sequencing of CsMLO1 revealed a 6.8-kb insertion in FL 58, relative to TJ's CBD, of which 846 bp are typically spliced into the mRNA transcript, encoding a premature stop codon. Molecular marker assays were developed using CsMLO1 sequences to distinguish PM-resistant and PM-susceptible genotypes. These data support the hypothesis that a mutated MLO susceptibility gene confers resistance to PM in C. sativa and provide new genetic resources to develop resistant cultivars. Copyright © 2024 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
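The 25% resistant to 75% susceptible split described in this abstract is consistent with the 1:3 ratio expected in an F2 population when resistance behaves as a single recessive trait. A minimal sketch of how such a ratio could be checked with a chi-square goodness-of-fit test follows. The counts used here (176 resistant, 530 susceptible) are hypothetical values chosen only to sum to the reported n = 706; the abstract does not give per-class counts, and the function name is illustrative.

```python
import math


def chi_square_1_to_3(resistant: int, susceptible: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test against a 1:3 resistant:susceptible ratio.

    Returns (chi-square statistic, p-value) for one degree of freedom.
    """
    total = resistant + susceptible
    expected_res = total * 0.25  # expected resistant count under 1:3
    expected_sus = total * 0.75  # expected susceptible count under 1:3
    chi2 = ((resistant - expected_res) ** 2 / expected_res
            + (susceptible - expected_sus) ** 2 / expected_sus)
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (NOT from the abstract), chosen to total 706.
chi2, p = chi_square_1_to_3(176, 530)
print(f"chi2 = {chi2:.4f}, p = {p:.3f}")
```

A large p-value means the observed counts do not depart detectably from the 1:3 expectation; it does not by itself establish single-gene inheritance.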
Subjects
Cannabis , Cannabis/genetics , Disease Resistance/genetics , Chromosome Mapping , Quantitative Trait Loci/genetics , Genotype , Plant Diseases/genetics , Plant Diseases/microbiology
ABSTRACT
Rootless plants in the genus Wolffia are some of the fastest growing known plants on Earth. Wolffia have a reduced body plan, primarily multiplying through a budding type of asexual reproduction. Here, we generated draft reference genomes for Wolffia australiana (Benth.) Hartog & Plas, which has the smallest genome size in the genus at 357 Mb and has a reduced set of predicted protein-coding genes at about 15,000. Comparison between multiple high-quality draft genome sequences from W. australiana clones confirmed loss of several hundred genes that are highly conserved among flowering plants, including genes involved in root developmental and light signaling pathways. Wolffia has also lost most of the conserved nucleotide-binding leucine-rich repeat (NLR) genes that are known to be involved in innate immunity, as well as those involved in terpene biosynthesis, while having a significant overrepresentation of genes in the sphingolipid pathways that may signify an alternative defense system. Diurnal expression analysis revealed that only 13% of Wolffia genes are expressed in a time-of-day (TOD) fashion, which is less than the typical ~40% found in several model plants under the same conditions. In contrast to the model plants Arabidopsis and rice, many of the pathways associated with multicellular and developmental processes are not under TOD control in W. australiana, where genes that cycle under the conditions tested predominantly have carbon processing and chloroplast-related functions. The Wolffia genome and TOD expression data set thus provide insight into the interplay between a streamlined plant body plan and optimized growth.
ABSTRACT
SUMMARY: Pangenomes are replacing single reference genomes as the definitive representation of DNA sequence within a species or clade. Pangenome analysis predominantly leverages graph-based methods that require computationally intensive multiple genome alignments, do not scale to highly complex eukaryotic genomes, limit their scope to identifying structural variants (SVs), or incur bias by relying on a reference genome. Here, we present PanKmer, a toolkit designed for reference-free analysis of pangenome datasets consisting of dozens to thousands of individual genomes. PanKmer decomposes a set of input genomes into a table of observed k-mers and their presence-absence values in each genome. These are stored in an efficient k-mer index data format that encodes SNPs, INDELs, and SVs. It also includes functions for downstream analysis of the k-mer index, such as calculating sequence similarity statistics between individuals at whole-genome or local scales. For example, k-mers can be "anchored" in any individual genome to quantify sequence variability or conservation at a specific locus. This facilitates workflows with various biological applications, e.g. identifying cases of hybridization between plant species. PanKmer provides researchers with a valuable and convenient means to explore the full scope of genetic variation in a population, without reference bias. AVAILABILITY AND IMPLEMENTATION: PanKmer is implemented as a Python package with components written in Rust, released under a BSD license. The source code is available from the Python Package Index (PyPI) at https://pypi.org/project/pankmer/ as well as Gitlab at https://gitlab.com/salk-tm/pankmer. Full documentation is available at https://salk-tm.gitlab.io/pankmer/.
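The core idea described above, decomposing genomes into k-mers and recording their presence or absence per genome, can be illustrated with a small pure-Python sketch. This is not PanKmer's API or index format; every function name below is hypothetical, the use of canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement) is an assumption made for the illustration, and a real pangenome would need the efficient on-disk index the abstract describes.

```python
# Translation table for reverse-complementing DNA.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def canonical(kmer: str) -> str:
    """Return the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_set(sequence: str, k: int) -> set[str]:
    """Collect canonical k-mers, skipping windows with non-ACGT characters."""
    sequence = sequence.upper()
    kmers = set()
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if set(window) <= _BASES:
            kmers.add(canonical(window))
    return kmers


def presence_absence_table(genomes: dict[str, str], k: int):
    """Map each observed k-mer to a tuple of booleans, one per genome."""
    names = list(genomes)
    sets = {name: kmer_set(genomes[name], k) for name in names}
    all_kmers = set().union(*sets.values())
    table = {kmer: tuple(kmer in sets[name] for name in names)
             for kmer in sorted(all_kmers)}
    return names, table


def jaccard(table: dict[str, tuple[bool, ...]], i: int, j: int) -> float:
    """Jaccard similarity between genomes i and j from the table."""
    shared = sum(1 for row in table.values() if row[i] and row[j])
    union = sum(1 for row in table.values() if row[i] or row[j])
    return shared / union if union else 0.0


# Toy sequences, invented for illustration only.
names, table = presence_absence_table({"g1": "ACGTAC", "g2": "ACGTTC"}, k=3)
print(names, table, jaccard(table, 0, 1))
```

Because the table is keyed by k-mer rather than by coordinates in any one genome, similarity statistics like the Jaccard index above can be computed without choosing a reference.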
Subjects
Genome , Software , Humans , Eukaryota , Documentation , DNA Sequence Analysis/methods
ABSTRACT
Recent increases in frequency and intensity of warm water anomalies and marine heatwaves have led to shifts in species ranges and assemblages. Genomic tools can be instrumental in detecting such shifts. In the early stages of a project assessing population genetic structure in Pacific Sardine (Sardinops sagax), we detected the presence of Japanese Sardine (Sardinops melanosticta) along the west coast of North America for the first time. We assembled a high quality, chromosome-scale reference genome of the Pacific Sardine and generated low coverage, whole genome sequence (lcWGS) data for 345 sardine collected in the California Current Large Marine Ecosystem (CCLME) in 2021 and 2022. Fifty individuals sampled in 2022 were identified as Japanese Sardine based on strong differentiation observed in lcWGS SNP and full mitogenome data. Although we detected a single case of mitochondrial introgression, we did not observe evidence for recent hybridization events. These findings change our understanding of Sardinops spp. distribution and dispersal in the Pacific and highlight the importance of long-term monitoring programs.
ABSTRACT
The aquatic Lemnaceae family, commonly called duckweed, comprises some of the smallest and fastest growing angiosperms known on Earth. Their tiny size, rapid growth by clonal propagation, and facile uptake of labeled compounds from the media were attractive features that made them a well-known model for plant biology from 1950 to 1990. Interest in duckweed has steadily regained momentum over the past decade, driven in part by the growing need to identify alternative plants from traditional agricultural crops that can help tackle urgent societal challenges, such as climate change and rapid population expansion. Propelled by rapid advances in genomic technologies, recent studies with duckweed again highlight the potential of these small plants to enable discoveries in diverse fields from ecology to chronobiology. Building on established community resources, duckweed is reemerging as a platform to study plant processes at the systems level and to translate knowledge gained for field deployment to address some of society's pressing needs. This review details the anatomy, development, physiology, and molecular characteristics of the Lemnaceae to introduce them to the broader plant research community. We highlight recent research enabled by Lemnaceae to demonstrate how these plants can be used for quantitative studies of complex processes and for revealing potentially novel strategies in plant defense and genome maintenance.
Subjects
Araceae/genetics , Plant Genome , Genomics
ABSTRACT
Over 15 families of aquatic plants are known to use a strategy of developmental switching upon environmental stress to produce dormant propagules called turions. However, few molecular details for turion biology have been elucidated due to the difficulties in isolating high-quality nucleic acids from this tissue. We successfully developed a new protocol to isolate high-quality transcripts and carried out RNA-seq analysis of mature turions from the Greater Duckweed Spirodela polyrhiza. Turion transcriptomes were compared with those of fronds, the actively growing leaf-like tissue. Bioinformatic analysis of high-confidence, differentially expressed transcripts between frond and mature turion tissues revealed major pathways related to stress tolerance, starch and lipid metabolism, and dormancy that are mobilized to reprogram frond meristems for turion differentiation. We identified the key genes that are likely to drive starch and lipid accumulation during turion formation, as well as those in pathways for starch and lipid utilization upon turion germination. Comparison of genome-wide cytosine methylation levels also revealed evidence for epigenetic changes in the formation of turion tissues. Similarities between turions and seeds provide evidence that key regulators for seed maturation and germination were retooled for their function in turion biology.
Subjects
Araceae , Germination , Germination/genetics , Araceae/genetics , Genomics , Starch/metabolism , Lipids , Plant Dormancy/genetics
ABSTRACT
The circadian clock is conserved at the level of both transcriptional networks and core genes in plants, ensuring that biological processes are phased to the correct time of day. In the model plant Arabidopsis (Arabidopsis thaliana), the core circadian SHAQKYF-type-MYB (sMYB) genes CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) and REVEILLE 4 (RVE4) show genetic linkage with PSEUDO-RESPONSE REGULATOR 9 (PRR9) and PRR7, respectively. Leveraging chromosome-resolved plant genomes and syntenic ortholog analysis enabled tracing this genetic linkage back to Amborella trichopoda, a sister lineage to the angiosperms, and identifying an additional evolutionarily conserved genetic linkage in light signaling genes. The LHY/CCA1-PRR5/9, RVE4/8-PRR3/7, and PIF3-PHYA genetic linkages emerged in the bryophyte lineage and progressively moved within several genes of each other across an array of angiosperm families representing distinct whole-genome duplication and fractionation events. Soybean (Glycine max) maintained all but two genetic linkages, and expression analysis revealed the PIF3-PHYA linkage overlapping with the E4 maturity group locus was the only pair to robustly cycle with an evening phase, in contrast to the sMYB-PRR morning and midday phase. While most monocots maintain the genetic linkages, they have been lost in the economically important grasses (Poaceae), such as maize (Zea mays), where the genes have been fractionated to separate chromosomes and presence/absence variation results in the segregation of PRR7 paralogs across heterotic groups. The environmental robustness model is put forward, suggesting that evolutionarily conserved genetic linkages ensure superior microhabitat pollinator synchrony, while wide-hybrids or unlinking the genes, as seen in the grasses, result in heterosis, adaptation, and colonization of new ecological niches.
Subjects
Arabidopsis Proteins , Arabidopsis , Circadian Clocks , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Circadian Clocks/genetics , Circadian Rhythm/physiology , Plant Gene Expression Regulation , Genetic Linkage , Humans , Transcription Factors/metabolism
ABSTRACT
The ability to trace every cell in some model organisms has led to the fundamental understanding of development and cellular function. However, in plants the complexity of cell number, organ size, and developmental time makes this a challenge even in the diminutive model plant Arabidopsis (Arabidopsis thaliana). Duckweed, basal nongrass aquatic monocots, provide an opportunity to follow every cell of an entire plant due to their small size, reduced body plan, and fast clonal growth habit. Here we present a chromosome-resolved genome for the highly invasive Lesser Duckweed (Lemna minuta) and generate a preliminary cell atlas leveraging low cell coverage single-nuclei sequencing. We resolved the 360 megabase genome into 21 chromosomes, revealing a core nonredundant gene set with only the ancient tau whole-genome duplication shared with all monocots, and paralog expansion as a result of tandem duplications related to phytoremediation. Leveraging SMARTseq2 single-nuclei sequencing, which provided higher gene coverage yet lower cell count, we profiled 269 nuclei covering 36.9% (8,457) of the L. minuta transcriptome. Since molecular validation was not possible in this nonmodel plant, we leveraged gene orthology with model organism single-cell expression datasets, gene ontology, and cell trajectory analysis to define putative cell types. We found that the tissue that we computationally defined as mesophyll expressed high levels of elemental transport genes consistent with this tissue playing a role in L. minuta wastewater detoxification. The L. minuta genome and preliminary cell map provide a paradigm to decipher developmental genes and pathways for an entire plant.
Subjects
Araceae/genetics , Introduced Species , Plant Dispersal/genetics , Transcriptome , Plant Genome
ABSTRACT
Utricularia and Genlisea are highly specialized carnivorous plants whose phylogenetic history has been poorly explored using phylogenomic methods. Additional sampling and genomic data are needed to advance our phylogenetic and taxonomic knowledge of this group of plants. Within a comparative framework, we present a characterization of plastome (PT) and mitochondrial (MT) genes of 26 Utricularia and six Genlisea species, with representatives of all subgenera and growth habits. All PT genomes maintain similar gene content, showing minor variation across the genes located between the PT junctions. One exception is a major variation related to different patterns in the presence and absence of ndh genes in the small single copy region, which appears to follow the phylogenetic history of the species rather than their lifestyle. All MT genomes exhibit similar gene content, with most differences related to lineage-specific pseudogenes. We find evidence for episodic positive diversifying selection in PT and for most of the Utricularia MT genes. This may be related to the current hypothesis that bladderworts' nuclear DNA is under constant ROS oxidative DNA damage and subject to unusual DNA repair mechanisms, or even low-fidelity polymerases that bypass lesions, which could also be affecting the organellar genomes. Finally, both PT and MT phylogenetic trees were well resolved and highly supported, providing a congruent phylogenomic hypothesis for the Utricularia and Genlisea clade given the study sampling.
Subjects
Lamiales , Magnoliopsida , Phylogeny , Magnoliopsida/genetics , Biological Evolution
ABSTRACT
Cassava (Manihot esculenta Crantz, 2n = 36) is a global food security crop. It has a highly heterozygous genome, high genetic load, and genotype-dependent asynchronous flowering. It is typically propagated by stem cuttings and any genetic variation between haplotypes, including large structural variations, is preserved by such clonal propagation. Traditional genome assembly approaches generate a collapsed haplotype representation of the genome. In highly heterozygous plants, this results in artifacts and an oversimplification of heterozygous regions. We used a combination of Pacific Biosciences (PacBio), Illumina, and Hi-C to resolve each haplotype of the genome of a farmer-preferred cassava line, TME7 (Oko-iyawo). PacBio reads were assembled using the FALCON suite. Phase switch errors were corrected using FALCON-Phase and Hi-C read data. The ultralong-range information from Hi-C sequencing was also used for scaffolding. Comparison of the two phases revealed >5000 large haplotype-specific structural variants affecting over 8 Mb, including insertions and deletions spanning thousands of base pairs. The potential of these variants to affect allele-specific expression was further explored. RNA-sequencing data from 11 different tissue types were mapped against the scaffolded haploid assembly and gene expression data are incorporated into our existing easy-to-use web-based interface to facilitate use by the broader plant science community. These two assemblies provide an excellent means to study the effects of heterozygosity, haplotype-specific structural variation, gene hemizygosity, and allele-specific gene expression contributing to important agricultural traits and further our understanding of the genetics and domestication of cassava.
Subjects
Plant Genome , Haplotypes , Manihot/genetics , Africa , DNA Transposable Elements , Diploidy , Plant Gene Expression Regulation , Genome Size , Heterozygote , Molecular Sequence Annotation , Synteny
ABSTRACT
ENHANCED DISEASE SUSCEPTIBILITY 1 (EDS1) mediates the induction of defense responses against pathogens in most angiosperms. However, it has recently been shown that a few species have lost EDS1. It is unknown how defense against disease unfolds and evolves in the absence of EDS1. We utilize duckweeds, a collection of aquatic species that lack EDS1, to investigate this question. We established duckweed-Pseudomonas pathosystems and used growth curves and microscopy to characterize pathogen-induced responses. Through comparative genomics and transcriptomics, we show that the copy number of infection-associated genes and the infection-induced transcriptional responses of duckweeds differ from other model species. Pathogen defense in duckweeds has evolved along different trajectories than in other plants, including genomic and transcriptional reprogramming. Specifically, the miAMP1 domain-containing proteins, which are absent in Arabidopsis, showed pathogen-responsive upregulation in duckweeds. Despite such divergence between Arabidopsis and duckweed species, we found conservation of upregulation of certain genes and the role of hormones in response to disease. Our work highlights the importance of expanding the pool of model species to study defense responses that have evolved in the plant kingdom independent of EDS1.
Subjects
Arabidopsis Proteins , Arabidopsis , Araceae , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Plant Gene Expression Regulation , Plant Diseases/microbiology , DNA-Binding Proteins/metabolism , Araceae/genetics
ABSTRACT
Plants with facultative crassulacean acid metabolism (CAM) maximize performance through utilizing C3 or C4 photosynthesis under ideal conditions while temporally switching to CAM under water stress (drought). While genome-scale analyses of constitutive CAM plants suggest that time of day networks are shifted, or phased to the evening compared to C3, little is known about how the shift from C3 to CAM networks is modulated in drought-induced CAM. Here we generate a draft genome for the drought-induced CAM-cycling species Sedum album. Through parallel sampling in well-watered (C3) and drought (CAM) conditions, we uncover a massive rewiring of time of day expression and a CAM and stress-specific network. The core circadian genes are expanded in S. album and under CAM induction, core clock genes either change phase or amplitude. While the core clock cis-elements are conserved in S. album, we uncover a set of novel CAM and stress-specific cis-elements consistent with our finding of rewired co-expression networks. We identified shared elements between constitutive CAM and CAM-cycling species and expression patterns unique to CAM-cycling S. album. Together these results demonstrate that drought-induced CAM-cycling photosynthesis evolved through the mobilization of a stress-specific, time of day network, and not solely the phasing of existing C3 networks. These results will inform efforts to engineer water use efficiency into crop plants for growth on marginal land.
Subjects
Physiological Adaptation/genetics , Photosynthesis/genetics , Plant Proteins/genetics , Sedum/genetics , Carbon/metabolism , Carbon Cycle/genetics , Carbon Dioxide/metabolism , Droughts , Plant Gene Expression Regulation , Plant Genome/genetics , Plant Proteins/metabolism , Sedum/metabolism , Water/chemistry
ABSTRACT
The bacterium Agrobacterium tumefaciens has been the workhorse in plant genome engineering. Customized replacement of native tumor-inducing (Ti) plasmid elements enabled insertion of a sequence of interest called Transfer-DNA (T-DNA) into any plant genome. Although these transfer mechanisms are well understood, detailed understanding of structure and epigenomic status of insertion events was limited by current technologies. Here we applied two single-molecule technologies and analyzed Arabidopsis thaliana lines from three widely used T-DNA insertion collections (SALK, SAIL and WISC). Optical maps for four randomly selected T-DNA lines revealed between one and seven insertions/rearrangements, and the length of individual insertions from 27 to 236 kilobases. De novo nanopore sequencing-based assemblies for two segregating lines partially resolved T-DNA structures and revealed multiple translocations and exchange of chromosome arm ends. For the current TAIR10 reference genome, nanopore contigs corrected 83% of non-centromeric misassemblies. The unprecedented contiguous nucleotide-level resolution enabled an in-depth study of the epigenome at T-DNA insertion sites. SALK_059379 line T-DNA insertions were enriched for 24nt small interfering RNAs (siRNA) and dense cytosine DNA methylation, resulting in transgene silencing via the RNA-directed DNA methylation pathway. In contrast, SAIL_232 line T-DNA insertions are predominantly targeted by 21/22nt siRNAs, with DNA methylation and silencing limited to a reporter, but not the resistance gene. Additionally, we profiled the H3K4me3, H3K27me3 and H2A.Z chromatin environments around T-DNA insertions using ChIP-seq in SALK_059379, SAIL_232 and five additional T-DNA lines. We discovered various effects ranging from complete loss of chromatin marks to the de novo incorporation of H2A.Z and trimethylation of H3K4 and H3K27 around the T-DNA integration sites.
This study provides new insights into the structural impact of inserting foreign fragments into plant genomes and demonstrates the utility of state-of-the-art long-range sequencing technologies to rapidly identify unanticipated genomic changes.
Subjects
DNA Methylation/genetics , Bacterial DNA/genetics , Plant DNA/genetics , Genetic Epigenesis/genetics , Agrobacterium tumefaciens/genetics , Arabidopsis/genetics , Chromosome Mapping , Plant Chromosomes/genetics , Plant Genome/genetics , Insertional Mutagenesis/genetics , Plant Tumor-Inducing Plasmids/genetics , Genetically Modified Plants/genetics , Genetically Modified Plants/growth & development , Genetic Transformation
ABSTRACT
Demand for cannabidiol (CBD), the predominant cannabinoid in hemp (Cannabis sativa), has favored cultivars producing unprecedented quantities of CBD. We investigated the ancestry of a new cultivar and cannabinoid synthase genes in relation to cannabinoid inheritance. A nanopore-based assembly anchored to a high-resolution linkage map provided a chromosome-resolved genome for CBDRx, a potent CBD-type cultivar. We measured cannabinoid synthase expression by cDNA sequencing and conducted a population genetic analysis of diverse Cannabis accessions. Quantitative trait locus mapping of cannabinoids in a hemp × marijuana segregating population was also performed. Cannabinoid synthase paralogs are arranged in tandem arrays embedded in long terminal repeat retrotransposons on chromosome 7. Although CBDRx is predominantly of marijuana ancestry, the genome has cannabidiolic acid synthase (CBDAS) introgressed from hemp and lacks a complete sequence for tetrahydrocannabinolic acid synthase (THCAS). Three additional genomes, including one with complete THCAS, confirmed this genomic structure. Only cannabidiolic acid synthase (CBDAS) was expressed in CBD-type Cannabis, while both CBDAS and THCAS were expressed in a cultivar with an intermediate tetrahydrocannabinol (THC) : CBD ratio. Although variation among cannabinoid synthase loci might affect the THC : CBD ratio, variability among cultivars in overall cannabinoid content (potency) was also associated with other chromosomes.
Subjects
Cannabidiol , Cannabinoids , Cannabis , Cannabis/genetics , Chromosome Mapping , Dronabinol
ABSTRACT
Duckweeds are a monophyletic group of rapidly reproducing aquatic monocots in the Lemnaceae family. Given their clonal, exponentially fast reproduction, a key question is whether genome structure is conserved across the species in the absence of meiotic recombination. Here, we studied the genome and proteome of Spirodela polyrhiza, or greater duckweed, which has the largest body plan yet the smallest genome size in the family (1C=150 Mb). Using Oxford Nanopore sequencing combined with Hi-C scaffolding, we generated a highly contiguous, chromosome-scale assembly of S. polyrhiza line Sp7498 (Sp7498_HiC). Both the Sp7498_HiC and Sp9509 genome assemblies reveal large chromosomal misorientations relative to a recent PacBio assembly of Sp7498, highlighting the need for orthogonal long-range scaffolding techniques such as Hi-C and BioNano optical mapping. Shotgun proteomics of Sp7498 verified the expression of ~2250 proteins and revealed a high abundance of proteins involved in photosynthesis and carbohydrate metabolism among other functions. In addition, a strong increase in chloroplast proteins was observed that correlated to chloroplast density. This Sp7498_HiC genome was generated cheaply and quickly with a single Oxford Nanopore MinION flow cell and one Hi-C library in a classroom setting. Combining these data with a mass spectrometry-generated proteome illustrates the utility of duckweed as a model for genomics- and proteomics-based education.
Subjects
Araceae , Chloroplast Proteins , Araceae/genetics , Plant Genome , Genomics , Proteomics
ABSTRACT
Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a 'near-complete' draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
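The N50 statistic quoted above (2.4 megabases across 625 contigs) is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy contig lengths are illustrative and are not taken from the Oropetium assembly.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of an assembly given its contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half the assembly size.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    half_total = sum(contig_lengths) / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half_total:
            return length
    # Unreachable for non-negative lengths, kept for completeness.
    return 0


# Toy lengths (arbitrary units), invented for illustration.
print(n50([80, 70, 50, 40, 30, 20]))
```

In this toy case the total is 290, so the threshold is 145; the two longest contigs (80 + 70 = 150) cross it, which makes 70 the N50.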
Subjects
Plant Genome/genetics , Poaceae/genetics , DNA Sequence Analysis/methods , Acclimatization/genetics , Contig Mapping , Dehydration , Desiccation , Droughts , Plant Genes/genetics , Genomics , Molecular Sequence Data
ABSTRACT
Plants are continuously exposed to diurnal fluctuations in light and temperature, and spontaneous changes in their physical or biotic environment. The circadian clock coordinates regulation of gene expression with a 24 h period, enabling the anticipation of these events. We used RNA sequencing to characterize the Brachypodium distachyon transcriptome under light and temperature cycles, as well as under constant conditions. Approximately 3% of the transcriptome was regulated by the circadian clock, a smaller proportion than reported in most other species. For most transcripts that were rhythmic under all conditions, including many known clock genes, the period of gene expression lengthened from 24 to 27 h in the absence of external cues. To functionally characterize the cyclic transcriptome in B. distachyon, we used Gene Ontology enrichment analysis, and found several terms significantly associated with peak expression at particular times of the day. Furthermore, we identified sequence motifs enriched in the promoters of similarly phased genes, some potentially associated with transcription factors. When considering the overlap in rhythmic gene expression and specific pathway behavior, thermocycles were the prevailing cue that controlled diurnal gene regulation. Taken together, our characterization of the rhythmic B. distachyon transcriptome represents a foundational resource with implications in other grass species.
Subjects
Brachypodium , Brachypodium/genetics , Circadian Rhythm/genetics , Cues , Gene Expression Regulation , Plant Gene Expression Regulation , Temperature
ABSTRACT
Kluyveromyces marxianus (K. marxianus) is an increasingly popular industrially relevant yeast. It is known to possess a highly efficient non-homologous end joining (NHEJ) pathway that promotes random integration of non-homologous DNA fragments into its genome. The nature of the integration events was traditionally analyzed by Southern blot hybridization. However, the precise DNA sequences at the insertion sites were not fully explored. We transformed a PCR product of the Saccharomyces cerevisiae URA3 gene (ScURA3) into a uracil-auxotrophic but otherwise wild-type K. marxianus strain and picked 24 stable Ura+ transformants for sequencing analysis. We took advantage of rapid advances in DNA sequencing technologies and developed a method using a combination of Illumina MiSeq and Oxford Nanopore sequencing. This approach enables us to uncover the gross chromosomal rearrangements (GCRs) that are associated with the ScURA3 random integration. Moreover, it will shed light on DNA repair mechanisms in eukaryotes, which could potentially provide insights for cancer research.
Subjects
Fungal Chromosomes , Kluyveromyces/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Chromosome Aberrations , DNA End-Joining Repair , Fungal DNA/genetics , Nanopore Sequencing/methods , Genetic Transformation
ABSTRACT
Duckweeds are the fastest growing angiosperms and have the potential to become a new generation of sustainable crops. Although a seed plant, Spirodela polyrhiza clones rarely flower and multiply mainly through vegetative propagation. Whole-genome sequencing using different approaches and clones yielded two reference maps: one for clone 9509, supported in its assembly by optical mapping of single DNA molecules, and one for clone 7498, supported by cytogenetic assignment of 96 fingerprinted bacterial artificial chromosomes (BACs) to its 20 chromosomes. However, these maps differ in the composition of several individual chromosome models. We validated both maps further to resolve these differences and addressed whether they could be due to chromosome rearrangements in different clones. For this purpose, we applied sequential multicolor fluorescence in situ hybridization (mcFISH) to seven S. polyrhiza clones, using 106 BACs that were mapped onto the 39 pseudomolecules for clone 7498. Furthermore, we integrated high-depth Oxford Nanopore (ON) sequence data for clone 9509 to validate and revise the previously assembled chromosome models. We found no major structural rearrangements between these seven clones, and identified seven chimeric pseudomolecules and Illumina assembly errors, respectively, in the previous maps. A new S. polyrhiza genome map with high contiguity was produced with the ON sequence data, and genome-wide synteny analysis supported the occurrence of two whole-genome duplication events during its evolution. This work generated a high-confidence genome map for S. polyrhiza at the chromosome scale, and illustrates the complementarity of independent approaches to produce whole-genome assemblies in the absence of a genetic map.