ABSTRACT
The genomic diversity underpinning high ecological and species diversity in the green algae (Chlorophyta) remains little known. Here, we aimed to track genome evolution in the Chlorophyta, focusing on loss and gain of homologous genes, and lineage-specific innovations of the core Chlorophyta. We generated a high-quality nuclear genome for pedinophyte YPF701, a sister lineage to others in the core Chlorophyta and incorporated this genome in a comparative analysis with 25 other genomes from diverse Viridiplantae taxa. The nuclear genome of pedinophyte YPF701 has an intermediate size and gene number between those of most prasinophytes and the remainder of the core Chlorophyta. Our results suggest positive selection for genome streamlining in the Pedinophyceae, independent from genome minimisation observed among prasinophyte lineages. Genome expansion was predicted along the branch leading to the UTC clade (classes Ulvophyceae, Trebouxiophyceae and Chlorophyceae) after divergence from their last common ancestor with pedinophytes, with genomic novelty implicated in a range of basic biological functions. Results emphasise multiple independent signals of genome minimisation within the Chlorophyta, as well as the genomic novelty arising before diversification in the UTC clade, which may underpin the success of this species-rich clade in a diversity of habitats.
Subjects
Chlorophyta, Cell Nucleus/genetics, Chlorophyta/genetics, Molecular Evolution, Genome, Genomics, Phylogeny
ABSTRACT
BACKGROUND: Dinoflagellates in the family Symbiodiniaceae are important photosynthetic symbionts in cnidarians (such as corals) and other coral reef organisms. Breakdown of the coral-dinoflagellate symbiosis due to environmental stress (i.e. coral bleaching) can lead to coral death and the potential collapse of reef ecosystems. However, evolution of Symbiodiniaceae genomes, and its implications for the coral, is little understood. Genome sequences of Symbiodiniaceae remain scarce due in part to their large genome sizes (1-5 Gbp) and idiosyncratic genome features. RESULTS: Here, we present de novo genome assemblies of seven members of the genus Symbiodinium, of which two are free-living, one is an opportunistic symbiont, and the remainder are mutualistic symbionts. Integrating other available data, we compare 15 dinoflagellate genomes revealing high sequence and structural divergence. Divergence among some Symbiodinium isolates is comparable to that among distinct genera of Symbiodiniaceae. We also recovered hundreds of gene families specific to each lineage, many of which encode unknown functions. An in-depth comparison between the genomes of the symbiotic Symbiodinium tridacnidorum (isolated from a coral) and the free-living Symbiodinium natans reveals a greater prevalence of transposable elements, genetic duplication, structural rearrangements, and pseudogenisation in the symbiotic species. CONCLUSIONS: Our results underscore the potential impact of lifestyle on lineage-specific gene-function innovation, genome divergence, and the diversification of Symbiodinium and Symbiodiniaceae. The divergent features we report, and their putative causes, may also apply to other microbial eukaryotes that have undergone symbiotic phases in their evolutionary history.
Subjects
Anthozoa, Dinoflagellida, Animals, Anthozoa/genetics, Coral Reefs, Dinoflagellida/genetics, Ecosystem, Genetic Variation, Genome/genetics
ABSTRACT
Comparative algal genomics often relies on predicted genes from de novo assembled genomes. However, the artifacts introduced by different gene-prediction approaches, and their impact on comparative genomic analysis remain poorly understood. Here, using available genome data from six dinoflagellate species in the Symbiodiniaceae, we identified methodological biases in the published genes that were predicted using different approaches and putative contaminant sequences in the published genome assemblies. We developed and applied a comprehensive customized workflow to predict genes from these genomes. The observed variation among predicted genes resulting from our workflow agreed with current understanding of phylogenetic relationships among these taxa, whereas the variation among the previously published genes was largely biased by the distinct approaches used in each instance. Importantly, these biases affect the inference of homologous gene families and synteny among genomes, thus impacting biological interpretation of these data. Our results demonstrate that a consistent gene-prediction approach is critical for comparative analysis of dinoflagellate genomes.
Subjects
Dinoflagellida, Genome, Phylogeny, Synteny
ABSTRACT
Dinoflagellates are a diverse group of phytoplankton, ranging from harmful bloom-forming microalgae to photosymbionts of coral reefs. Genome-scale data from dinoflagellates reveal atypical genomic features, extensive genomic divergence, and lineage-specific innovation of gene functions. Long non-coding RNAs (lncRNAs), known to regulate gene expression in eukaryotes, are largely unexplored in dinoflagellates. Here, using high-quality genome and transcriptome data, we identified 48,039 polyadenylated lncRNAs in three dinoflagellate species: the coral symbionts Cladocopium proliferum and Durusdinium trenchii, and the bloom-forming species, Prorocentrum cordatum. These lncRNAs have fewer introns and lower G+C content than protein-coding sequences; 37,768 (78.6%) are unique with respect to sequence similarity. We classified all lncRNAs based on conserved motifs (k-mers) into distinct clusters, following properties of protein-binding and/or subcellular localisation. Interestingly, 3708 (7.7%) lncRNAs are differentially expressed under heat stress, algal lifestyle, and/or growth phase, and share co-expression patterns with protein-coding genes. Based on inferred triplex interactions between lncRNA and putative promoter regions, we identified 19,460 putative gene targets for 3721 lncRNAs; 907 genes exhibit differential expression under heat stress. These results reveal, for the first time, the diversity of lncRNAs in dinoflagellates and how lncRNAs may regulate gene expression as a heat-stress response in these ecologically important microbes.
ABSTRACT
Dinoflagellates in the family Symbiodiniaceae are taxonomically diverse, predominantly symbiotic lineages that are well-known for their association with corals. The ancestor of these taxa is believed to have been free-living. The establishment of symbiosis (i.e. symbiogenesis) is hypothesized to have occurred multiple times during Symbiodiniaceae evolution, but its impact on genome evolution of these taxa is largely unknown. Among Symbiodiniaceae, the genus Effrenium is a free-living lineage that is phylogenetically positioned between two robustly supported groups of genera within which symbiotic taxa have emerged. The apparent lack of symbiogenesis in Effrenium suggests that the ancestral features of Symbiodiniaceae may have been retained in this lineage. Here, we present de novo assembled genomes (1.2-1.9 Gbp in size) and transcriptome data from three isolates of Effrenium voratum and conduct a comparative analysis that includes 16 Symbiodiniaceae taxa and the other dinoflagellates. Surprisingly, we find that genome reduction, which is often associated with a symbiotic lifestyle, predates the origin of Symbiodiniaceae. The free-living lifestyle distinguishes Effrenium from symbiotic Symbiodiniaceae vis-à-vis their longer introns, more-extensive mRNA editing, fewer (~30%) lineage-specific gene sets, and lower (~10%) level of pseudogenization. These results demonstrate how genome reduction and the adaptation to distinct lifestyles intersect to drive diversification and genome evolution of Symbiodiniaceae.
Subjects
Dinoflagellida, Phylogeny, Symbiosis, Dinoflagellida/genetics, Dinoflagellida/classification, Molecular Evolution, Transcriptome, Protozoan Genome
ABSTRACT
The algal endosymbiont Durusdinium trenchii enhances the resilience of coral reefs under thermal stress. D. trenchii can live freely or in endosymbiosis, and the analysis of genetic markers suggests that this species has undergone whole-genome duplication (WGD). However, the evolutionary mechanisms that underpin the thermotolerance of this species are largely unknown. Here, we present genome assemblies for two D. trenchii isolates, confirm WGD in these taxa, and examine how selection has shaped the duplicated genome regions using gene expression data. We assess how the free-living versus endosymbiotic lifestyles have contributed to the retention and divergence of duplicated genes, and how these processes have enhanced the thermotolerance of D. trenchii. Our combined results suggest that lifestyle is the driver of post-WGD evolution in D. trenchii, with the free-living phase being the most important, followed by endosymbiosis. Adaptations to both lifestyles likely enabled D. trenchii to provide enhanced thermal stress protection to the host coral.
Subjects
Anthozoa, Gene Duplication, Genome, Symbiosis, Thermotolerance, Symbiosis/genetics, Anthozoa/genetics, Anthozoa/physiology, Anthozoa/microbiology, Animals, Thermotolerance/genetics, Coral Reefs, Phylogeny
ABSTRACT
With socioeconomic development and advances in smart technology, the explosive growth in the number of vehicles has made traffic forecasting a daunting challenge, especially for smart cities. Recent methods exploit graph spatial-temporal characteristics, including constructing the shared patterns of traffic data and modeling the topological space of traffic data. However, existing methods fail to consider spatial position information and use only limited spatial neighborhood information. To tackle these limitations, we design a Graph Spatial-Temporal Position Recurrent Network (GSTPRN) architecture for traffic forecasting. We first construct a position graph convolution module based on self-attention and calculate the dependence strengths among the nodes to capture the spatial dependence relationship. Next, we develop approximate personalized propagation that extends the propagation range of spatial dimension information to obtain more spatial neighborhood information. Finally, we systematically integrate the position graph convolution, approximate personalized propagation and adaptive graph learning into a recurrent network (i.e. Gated Recurrent Units). Experimental evaluation on two benchmark traffic datasets demonstrates that GSTPRN is superior to the state-of-the-art methods.
Subjects
Benchmarking, Learning, Spatial Analysis
ABSTRACT
Dinoflagellates in the order Suessiales include the family Symbiodiniaceae, which have essential roles as photosymbionts in corals, and their cold-adapted sister group, Polarella glacialis. These diverse taxa exhibit extensive genomic divergence, although their genomes are relatively small (haploid size < 3 Gbp) when compared with most other free-living dinoflagellates. Different strains of Symbiodiniaceae form symbiosis with distinct hosts and exhibit different regimes of gene expression, but intraspecific whole-genome divergence is poorly understood. Focusing on three Symbiodiniaceae species (the free-living Effrenium voratum and the symbiotic Symbiodinium microadriaticum and Durusdinium trenchii) and the free-living outgroup P. glacialis, for which whole-genome data from multiple isolates are available, we assessed intraspecific genomic divergence with respect to sequence and structure. Our analysis, based on alignment and alignment-free methods, revealed a greater extent of intraspecific sequence divergence in Symbiodiniaceae than in P. glacialis. Our results underscore the role of gene duplication in generating functional innovation, with a greater prevalence of tandemly duplicated single-exon genes observed in the genomes of free-living species than in symbionts. These results demonstrate the remarkable intraspecific genomic divergence in dinoflagellates under the constraint of reduced genome sizes, shaped by genetic duplications and symbiogenesis events during the diversification of Symbiodiniaceae.
ABSTRACT
BACKGROUND: "Red tides" are harmful algal blooms caused by dinoflagellate microalgae that accumulate toxins lethal to other organisms, including humans via consumption of contaminated seafood. These algal blooms are driven by a combination of environmental factors including nutrient enrichment, particularly in warm waters, and are increasingly frequent. The molecular, regulatory, and evolutionary mechanisms that underlie the heat stress response in these harmful bloom-forming algal species remain little understood, due in part to the limited genomic resources from dinoflagellates, complicated by the large sizes of genomes, exhibiting features atypical of eukaryotes. RESULTS: We present the de novo assembled genome (~4.75 Gbp with 85,849 protein-coding genes), transcriptome, proteome, and metabolome from Prorocentrum cordatum, a globally abundant, bloom-forming dinoflagellate. Using axenic algal cultures, we study the molecular mechanisms that underpin the algal response to heat stress, which is relevant to current ocean warming trends. We present the first evidence of a complementary interplay between RNA editing and exon usage that regulates the expression and functional diversity of biomolecules, reflected by reduction in photosynthesis, central metabolism, and protein synthesis. These results reveal genomic signatures and post-transcriptional regulation for the first time in a pelagic dinoflagellate. CONCLUSIONS: Our multi-omics analyses uncover the molecular response to heat stress in an important bloom-forming algal species, which is driven by complex gene structures in a large, high-G+C genome, combined with multi-level transcriptional regulation. The dynamics and interplay of molecular regulatory mechanisms may explain in part how dinoflagellates diversified to become some of the most ecologically successful organisms on Earth.
Subjects
Dinoflagellida, Harmful Algal Bloom, Humans, Dinoflagellida/genetics, Multiomics, Genomics, Heat-Shock Response
ABSTRACT
Dinoflagellates of the family Symbiodiniaceae are predominantly essential symbionts of corals and other marine organisms. Recent research reveals extensive genome sequence divergence among Symbiodiniaceae taxa and high phylogenetic diversity hidden behind subtly different cell morphologies. Using an alignment-free phylogenetic approach based on sub-sequences of fixed length k (i.e. k-mers), we assessed the phylogenetic signal among whole-genome sequences from 16 Symbiodiniaceae taxa (including the genera of Symbiodinium, Breviolum, Cladocopium, Durusdinium and Fugacium) and two strains of Polarella glacialis as outgroup. Based on phylogenetic trees inferred from k-mers in distinct genomic regions (i.e. repeat-masked genome sequences, protein-coding sequences, introns and repeats) and in protein sequences, the phylogenetic signal associated with protein-coding DNA and the encoded amino acids is largely consistent with the Symbiodiniaceae phylogeny based on established markers, such as large subunit rRNA. The other genome sequences (introns and repeats) exhibit distinct phylogenetic signals, supporting the expected differential evolutionary pressure acting on these regions. Our analysis of conserved core k-mers revealed the prevalence of conserved k-mers (>95% core 23-mers among all 18 genomes) in annotated repeats and non-genic regions of the genomes. We observed 180 distinct repeat types that are significantly enriched in genomes of the symbiotic versus free-living Symbiodinium taxa, suggesting an enhanced activity of transposable elements linked to the symbiotic lifestyle. We provide evidence that representation of alignment-free phylogenies as dynamic networks enhances the ability to generate new hypotheses about genome evolution in Symbiodiniaceae. 
These results demonstrate the potential of alignment-free phylogenetic methods as a scalable approach for inferring comprehensive, unbiased whole-genome phylogenies of dinoflagellates and more broadly of microbial eukaryotes.
ABSTRACT
Dinoflagellates of the family Symbiodiniaceae are crucial photosymbionts in corals and other marine organisms. Of these, Cladocopium goreaui is one of the most dominant symbiont species in the Indo-Pacific. Here, we present an improved genome assembly of C. goreaui combining new long-read sequence data with previously generated short-read data. Incorporating new full-length transcripts to guide gene prediction, the C. goreaui genome (1.2 Gb) exhibits a high extent of completeness (82.4% based on BUSCO protein recovery) and better resolution of repetitive sequence regions; 45,322 gene models were predicted, and 327 putative, topologically associated domains of the chromosomes were identified. Comparison with other Symbiodiniaceae genomes revealed a prevalence of repeats and duplicated genes in C. goreaui, and lineage-specific genes indicating functional innovation. Incorporating 2,841,408 protein sequences from 96 taxonomically diverse eukaryotes and representative prokaryotes in a phylogenomic approach, we assessed the evolutionary history of C. goreaui genes. Of the 5246 phylogenetic trees inferred from homologous protein sets containing two or more phyla, 35-36% have putatively originated via horizontal gene transfer (HGT), predominantly (19-23%) via an ancestral Archaeplastida lineage implicated in the endosymbiotic origin of plastids: 10-11% are of green algal origin, including genes encoding photosynthetic functions. Our results demonstrate the utility of long-read sequence data in resolving structural features of a dinoflagellate genome, and highlight how genetic transfer has shaped genome evolution of a facultative symbiont, and more broadly of dinoflagellates.
ABSTRACT
Modern microbial taxonomy generally relies on the use of single marker genes or sets of concatenated genes to generate a framework for the delineation and classification of organisms at different taxonomic levels. However, given that DNA is the 'blueprint of life', and hence the ultimate arbiter of taxonomy, classification systems should attempt to use as much of the blueprint as possible to capture a comprehensive phylogenetic signal. Recent analysis of whole-genome sequences from coral reef symbionts (dinoflagellates of the family Symbiodiniaceae) and other microalgal groups has uncovered extensive divergence not recognised by current algal taxonomic approaches. In the era of 'sequence everything', we argue that whole-genome data are pivotal to guide informed taxonomic inference, particularly for microbial eukaryotes.
Subjects
Anthozoa, Dinoflagellida, Animals, Anthozoa/genetics, Coral Reefs, Dinoflagellida/genetics, Phylogeny, Symbiosis
ABSTRACT
Herbicides are commonly deployed as the front-line treatment to control infestations of weeds in native ecosystems and among crop plants in agriculture. However, the prevalence of herbicide resistance in many species is a major global challenge. The specificity and effectiveness of herbicides acting on diverse weed species are tightly linked to targeted proteins. The conservation and variance at these sites among different weed species remain largely unexplored. Using novel genome data in a genome-guided approach, 12 common herbicide-target genes and their coded proteins were identified from seven species of Weeds of National Significance in Australia: Alternanthera philoxeroides (alligator weed), Lycium ferocissimum (African boxthorn), Senecio madagascariensis (fireweed), Lantana camara (lantana), Parthenium hysterophorus (parthenium), Cryptostegia grandiflora (rubber vine), and Eichhornia crassipes (water hyacinth). Gene and protein sequences targeted by the acetolactate synthase (ALS) inhibitors and glyphosate were recovered. Compared to structurally resolved homologous proteins as reference, high sequence conservation was observed at the herbicide-target sites in the ALS (target for ALS inhibitors), and in 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase (target for glyphosate). Although the sequences are largely conserved in the seven phylogenetically diverse species, mutations observed in the ALS proteins of fireweed and parthenium suggest resistance of these weeds to ALS-inhibiting and other herbicides. These protein sites remain as attractive targets for the development of novel inhibitors and herbicides. This notion is reinforced by the results from the phylogenetic analysis of the 12 proteins, which reveal a largely consistent vertical inheritance in their evolutionary histories. 
These results demonstrate the utility of high-throughput genome sequencing to rapidly identify and characterize gene targets by computational methods, bypassing the experimental characterization of individual genes. Data generated from this study provide a useful reference for future investigations in herbicide discovery and development.
ABSTRACT
Ethanol production from sugarcane is a key renewable fuel industry in Brazil. Major drivers of this alcoholic fermentation are Saccharomyces cerevisiae strains that originally were contaminants to the system and yet prevail in the industrial process. Here we present newly sequenced genomes (using Illumina short-read and PacBio long-read data) of two monosporic isolates (H3 and H4) of the S. cerevisiae PE-2, a predominant bioethanol strain in Brazil. The assembled genomes of H3 and H4, together with 42 draft genomes of sugarcane-fermenting (fuel ethanol plus cachaça) strains, were compared against those of the reference S288C and diverse S. cerevisiae. All genomes of bioethanol yeasts have amplified SNO2(3)/SNZ2(3) gene clusters for vitamin B1/B6 biosynthesis, and display ubiquitous presence of a particular family of SAM-dependent methyltransferases, rare in S. cerevisiae. Widespread amplifications of quinone oxidoreductases YCR102C/YLR460C/YNL134C, and the structural or point variations among aquaporins and components of the iron homeostasis system, likely represent adaptations to industrial fermentation. Interestingly, a five-gene cluster (Region B) that is a known phylogenetic signature of European wine yeasts is pervasive among the bioethanol/cachaça strains. Combining genomes of H3, H4, and 195 yeast strains, we comprehensively assessed whole-genome phylogeny of these taxa using an alignment-free approach. The 197-genome phylogeny substantiates that bioethanol yeasts are monophyletic and closely related to the cachaça and wine strains. Our results support the hypothesis that biofuel-producing yeasts in Brazil may have been co-opted from a pool of yeasts that were pre-adapted to alcoholic fermentation of sugarcane for the distillation of cachaça spirit, which historically is a much older industry than the large-scale fuel ethanol production.
ABSTRACT
The green alga Ostreobium is an important coral holobiont member, playing key roles in skeletal decalcification and providing photosynthate to bleached corals that have lost their dinoflagellate endosymbionts. Ostreobium lives in the coral's skeleton, a low-light environment with variable pH and O2 availability. We present the Ostreobium nuclear genome and a metatranscriptomic analysis of healthy and bleached corals to improve our understanding of Ostreobium's adaptations to its extreme environment and its roles as a coral holobiont member. The Ostreobium genome has 10,663 predicted protein-coding genes and shows adaptations for life in low and variable light conditions and other stressors in the endolithic environment. This alga presents a rich repertoire of light-harvesting complex proteins but lacks many genes for photoprotection and photoreceptors. It also has a large arsenal of genes for oxidative stress response. An expansion of extracellular peptidases suggests that Ostreobium may supplement its energy needs by feeding on the organic skeletal matrix, and a diverse set of fermentation pathways allows it to live in the anoxic skeleton at night. Ostreobium depends on other holobiont members for vitamin B12, and our metatranscriptomes identify potential bacterial sources. Metatranscriptomes showed Ostreobium becoming a dominant agent of photosynthesis in bleached corals and provided evidence for variable responses among coral samples and different Ostreobium genotypes. Our work provides a comprehensive understanding of the adaptations of Ostreobium to its extreme environment and an important genomic resource to improve our comprehension of coral holobiont resilience, bleaching, and recovery.
Subjects
Biological Adaptation/genetics, Anthozoa, Chlorophyta/genetics, Genomics, Symbiosis, Animals
ABSTRACT
Dinoflagellates of the Symbiodiniaceae family encompass diverse symbionts that are critical to corals and other species living in coral reefs. It is well known that sexual reproduction enhances adaptive evolution in changing environments. Although genes related to meiotic functions have been reported in Symbiodiniaceae, cytological evidence of meiosis and fertilisation is yet to be observed in these taxa. Using transcriptome and genome data from 21 Symbiodiniaceae isolates, we studied genes that encode proteins associated with distinct stages of meiosis and syngamy. We report the absence of genes that encode main components of the synaptonemal complex (SC), a protein structure that mediates homologous chromosomal pairing and class I crossovers. This result suggests an independent loss of canonical SCs in the alveolates, a group that also includes the SC-lacking ciliates. We hypothesise that this loss was due in part to permanently condensed chromosomes and repeat-rich sequences in Symbiodiniaceae (and other dinoflagellates), which favoured the SC-independent class II crossover pathway. Our results reveal novel insights into evolution of the meiotic molecular machinery in the ecologically important Symbiodiniaceae and in other eukaryotes.