ABSTRACT
The democratization of sequencing technologies has fostered a leap in our knowledge of the diversity of marine phytoplanktonic microalgae, revealing many previously unknown species and lineages. The evolutionary history of the diversification of microalgae can be inferred from the analysis of their genome sequences. However, the link between a DNA sequence and the associated phenotype is notoriously difficult to assess, all the more so for marine phytoplanktonic microalgae, for which laboratory culture and, thus, biological experimentation are very tedious. Here, we assess the potential of a high-throughput untargeted metabolomic approach to explore the phenotypic-genotypic gap in 12 marine microalgae encompassing 1.2 billion years of evolution. We identified species- and lineage-specific metabolites. We also provide evidence of a very good correlation between the molecular divergence, inferred from the DNA sequences, and the metabolomic divergence, inferred from the complete metabolomic profiles. These results provide novel insights into the potential of chemotaxonomy in marine phytoplankton and support the hypothesis of a metabolomic clock, suggesting that DNA and metabolomic profiles co-evolve.
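The correlation between molecular and metabolomic divergence described above can be illustrated with a minimal sketch. It assumes two symmetric pairwise distance matrices over the same species; the matrices, the function names and the choice of a plain Pearson coefficient are illustrative assumptions, not the data or the method of the study.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the pairwise values above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical distances among four species (NOT data from the study).
molecular = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
]
metabolomic = [
    [0.0, 0.2, 0.5, 0.7],
    [0.2, 0.0, 0.4, 0.8],
    [0.5, 0.4, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]

r = pearson(upper_triangle(molecular), upper_triangle(metabolomic))
print(f"correlation between divergences: r = {r:.3f}")
```

Because pairwise distances share species and are therefore not independent, a significance test on such a coefficient would normally rely on a permutation scheme such as a Mantel test rather than on the standard p-value.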
Subject(s)
Biodiversity , Microalgae/metabolism , Evolution, Molecular , Genetic Speciation , High-Throughput Screening Assays , Metabolomics , Microalgae/genetics , Phylogeny , Species Specificity
ABSTRACT
Mutation is the ultimate source of genetic variation, and knowledge of mutation rates is fundamental for our understanding of all evolutionary processes. High-throughput sequencing of mutation accumulation lines has provided genome-wide spontaneous mutation rates in a dozen model species, but estimates from nonmodel organisms representing much of the diversity of life are very limited. Here, we report mutation rates in four haploid, marine, bacterial-sized photosynthetic eukaryotic algae: Bathycoccus prasinos, Ostreococcus tauri, Ostreococcus mediterraneus, and Micromonas pusilla. The spontaneous mutation rate varies between species from µ = 4.4 × 10^-10 to 9.8 × 10^-10 mutations per nucleotide per generation. Within genomes, there is a two-fold increase of the mutation rate in intergenic regions, consistent with an optimization of mismatch and transcription-coupled DNA repair in coding sequences. Additionally, we show that deviation from the equilibrium GC content increases the mutation rate by ~2% to ~12% because of a GC bias in coding sequences. More generally, the difference between the observed and equilibrium GC content of genomes explains some of the inter-specific variation in mutation rates.
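The two quantities discussed above, a per-nucleotide mutation rate and an equilibrium GC content, can be sketched as follows. The function names and all input values are hypothetical; the formulas are the textbook ones (mutations divided by sites times generations, and the two-state mutational equilibrium), not necessarily the exact estimators used in the study.

```python
def mutation_rate(n_mutations, callable_sites, total_generations):
    """Mutations per nucleotide per generation.

    total_generations is the number of generations summed over all
    mutation accumulation lines.
    """
    return n_mutations / (callable_sites * total_generations)


def equilibrium_gc(rate_at_to_gc, rate_gc_to_at):
    """GC content expected if base composition were set by mutation alone."""
    return rate_at_to_gc / (rate_at_to_gc + rate_gc_to_at)


# Hypothetical inputs (NOT values from the study).
mu = mutation_rate(n_mutations=50, callable_sites=12_500_000,
                   total_generations=8_000)
gc_eq = equilibrium_gc(rate_at_to_gc=1.0, rate_gc_to_at=3.0)

print(f"mu = {mu:.2e} mutations per nucleotide per generation")
print(f"equilibrium GC content = {gc_eq:.2f}")
```

Comparing `gc_eq` with the observed GC content of a genome gives the kind of deviation that the abstract relates to variation in mutation rate.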
Subject(s)
Chlorophyta/genetics , Photosynthesis/genetics , Base Composition/genetics , DNA, Intergenic/genetics , Eukaryota/genetics , Evolution, Molecular , Genetic Variation , Genome/genetics , Mutation , Mutation Rate
ABSTRACT
Micro-algae of the genus Ostreococcus and related species of the order Mamiellales are globally distributed in the photic zone of the world's oceans, where they contribute to the fixation of atmospheric carbon and the production of oxygen, besides providing a primary source of nutrition in the food web. Their tiny size, simple cells, ease of culture, compact genomes and susceptibility to the most abundant large DNA viruses in the sea render them attractive as models for integrative marine biology. In culture, spontaneous resistance to viruses occurs frequently. Here, we show that virus-producing resistant cell lines arise in many independent cell lines during lytic infections, but over two years, more and more of these lines stop producing viruses. We observed sweeping over-expression of all genes in more than half of chromosome 19 in resistant lines, and karyotypic analyses showed physical rearrangements of this chromosome. Chromosome 19 has an unusual genetic structure whose equivalent is found in all of the sequenced genomes in this ecologically important group of green algae.
Subject(s)
Chlorophyta/genetics , Chromosomes/immunology , Base Sequence , Chlorophyta/virology , Electrophoresis, Gel, Pulsed-Field , Microscopy, Electron, Transmission , Oligonucleotide Array Sequence Analysis , Phylogeny
ABSTRACT
We have investigated whether there is adaptive evolution in mitochondrial DNA, using an extensive data set containing over 500 animal species from a wide range of taxonomic groups. We apply a variety of McDonald-Kreitman style methods to the data. We find that the evolution of mitochondrial DNA is dominated by slightly deleterious mutations, a finding which is supported by a number of previous studies. However, when we control for the presence of deleterious mutations using a new method, we find that mitochondria undergo a significant amount of adaptive evolution, with an estimated 26% (95% confidence intervals: 5.7-45%) of nonsynonymous substitutions fixed by adaptive evolution. We further find some weak evidence that the rate of adaptive evolution is correlated to synonymous diversity. We interpret this as evidence that at least some adaptive evolution is limited by the supply of mutations.
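As a point of reference for the McDonald-Kreitman style methods mentioned above, the basic estimator of the proportion of adaptive nonsynonymous substitutions can be sketched as follows. This is the simple textbook form, which does not correct for slightly deleterious mutations, and it is not the new method the abstract refers to; the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the adaptive fraction.

    dn, ds: nonsynonymous and synonymous substitutions (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    alpha = 1 - (ds * pn) / (dn * ps)
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1 - (ds * pn) / (dn * ps)


# Hypothetical counts (NOT data from the study).
alpha = mk_alpha(dn=40, ds=100, pn=20, ps=100)
print(f"alpha = {alpha:.2f}")
```

Segregating slightly deleterious mutations inflate `pn` and bias this estimate downwards, which is why the abstract stresses controlling for them.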
Subject(s)
Adaptation, Biological/genetics , Evolution, Molecular , Mitochondria/genetics , Animals , DNA, Mitochondrial/genetics , Models, Genetic , Mutation Rate , Polymorphism, Genetic
ABSTRACT
BACKGROUND: Cost-effective next-generation sequencing technologies now enable the production of genomic datasets for many novel planktonic eukaryotes, representing an understudied reservoir of genetic diversity. O. tauri is the smallest free-living photosynthetic eukaryote known to date, a coccoid green alga that was first isolated in 1995 in a lagoon by the Mediterranean Sea. Its simple features, ease of culture and the sequencing of its 13 Mb haploid nuclear genome have promoted this microalga as a new model organism for cell biology. Here, we investigated the quality of genome assemblies of Illumina GAIIx 75 bp paired-end reads from Ostreococcus tauri, thereby also improving the existing assembly and showing the genome to be stably maintained in culture. RESULTS: The three assemblers used, ABySS, CLCBio and Velvet, produced 95% complete genomes in 1402 to 2080 scaffolds with a very low rate of misassembly. Reciprocally, these assemblies improved the original genome assembly by filling in 930 gaps. Combined with additional analysis of raw reads and a PCR sequencing effort, 1194 gaps were closed in total, adding up to 460 kb of sequence. Mapping of Illumina RNA-seq data onto this updated genome led to a twofold reduction in the proportion of multi-exon protein-coding genes, representing 19% of the total 7699 protein-coding genes. The comparison of the DNA extracted in 2001 and 2009 revealed the fixation of 8 single-nucleotide substitutions and 2 deletions during the approximately 6000 generations in the lab. The deletions either knocked out or truncated two predicted transmembrane proteins, including a glutamate-receptor-like gene. CONCLUSION: High-coverage (>80-fold) paired-end Illumina sequencing enables a high-quality, 95% complete genome assembly of a compact ~13 Mb haploid eukaryote. This genome sequence has remained stable for 6000 generations of lab culture.
Subject(s)
Chlorophyta/genetics , Genome, Plant , Genomics , Computational Biology , Evolution, Molecular , Genetic Variation , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Molecular Sequence Data
ABSTRACT
Unicellular green picophytoplankton from the Mamiellales order are pervasive in marine ecosystems and susceptible to infections by prasinoviruses, large double-stranded DNA viruses within the Nucleocytoviricota phylum. We developed a double-stranded DNA virus enrichment and shotgun sequencing method, and successfully assembled 80 prasinovirus genomes from 43 samples in the South China Sea. Our results provide the first direct estimate of the accuracy (94%) with which genome similarity correlates with host range. Strikingly, our analyses uncovered unexpected host-switching across diverse algal lineages, challenging the existing paradigms of host-virus co-speciation and revealing the dynamic nature of viral evolution. We also detected six instances of horizontal gene transfer between prasinoviruses and their hosts, including a novel alternative oxidase. Additionally, diversifying selection on a major capsid protein suggests an ongoing co-evolutionary arms race. These insights not only expand our understanding of prasinovirus genomic diversity but also highlight the intricate evolutionary mechanisms driving their ecological success and shaping broader virus-host interactions in marine environments.
ABSTRACT
Diatoms, a prominent group of phytoplankton, have a significant impact on both the oceanic food chain and carbon sequestration, thereby playing a crucial role in regulating the climate. These highly diverse organisms show a wide geographic distribution across various latitudes. In addition to their ecological significance, diatoms represent a vital source of bioactive compounds that are widely used in biotechnology applications. In the present study, we investigated the genetic and transcriptomic diversity of 17 accessions of the model diatom Phaeodactylum tricornutum, including those sampled a century ago as well as more recently collected accessions. The analysis of the data reveals a higher genetic diversity and the emergence of novel clades, indicating an increasing diversity within the P. tricornutum population structure compared to the previous study, and a persistent long-term balancing selection of genes in old and newly sampled accessions. However, the study did not establish a clear link between the year of sampling and genetic diversity, thereby rejecting the hypothesis of a loss of heterozygosity in cultured strains. Transcript analysis identified novel transcripts, including noncoding RNAs and other categories of small RNAs such as PiwiRNAs. Additionally, transcript analysis using differential expression as well as Weighted Gene Correlation Network Analysis has provided evidence that the suppression or downregulation of genes cannot be solely attributed to loss-of-function mutations. This implies that other contributing factors, such as epigenetic modifications, may play a crucial role in regulating gene expression. Our study provides novel genetic resources, which are now accessible through the platform PhaeoEpiview (https://PhaeoEpiView.univ-nantes.fr), that offer both ease of use and advanced tools to further investigate microalgae biology and ecology, consequently enriching our current understanding of these organisms.
ABSTRACT
With the advent of next-generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains.
Subject(s)
Databases, Genetic/standards , Databases, Genetic/trends , Eukaryota/genetics , Genomics , Chlorophyta/genetics , DNA Barcoding, Taxonomic , Diatoms/genetics , Genetic Variation , Genome, Plant/genetics
ABSTRACT
Although duplications have long been recognized as a fundamental process driving major evolutionary innovations, direct estimates of spontaneous chromosome duplication rates, leading to aneuploid karyotypes, are scarce. Here, from mutation accumulation (MA) experiments, we provide the first estimates of spontaneous chromosome duplication rates in six unicellular eukaryotic species, which range from 1 × 10^-4 to 1 × 10^-3 per genome per generation. Although this is ~5 to ~60 times less frequent than spontaneous point mutations per genome, chromosome duplication events can affect 1-7% of the total genome size. In duplicated chromosomes, mRNA levels reflected gene copy numbers, but the level of translation estimated by polysome profiling revealed that dosage compensation must be occurring. In particular, one duplicated chromosome showed a 2.1-fold increase of mRNA but translation rates were decreased to 0.7-fold. Altogether, our results support previous observations of chromosome-dependent dosage compensation effects, providing evidence that compensation occurs during translation. We hypothesize that an unknown posttranscriptional mechanism modulates the translation of hundreds of transcripts from genes located on duplicated regions in eukaryotes.
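The arithmetic behind a duplication rate and behind the dosage compensation figures above can be sketched as follows. The event counts are hypothetical, and treating net protein output as the product of the mRNA fold change and the translation fold change is a simplifying assumption of this sketch, not a claim made in the abstract.

```python
def duplication_rate(n_events, n_lines, generations_per_line):
    """Chromosome duplications per genome per generation in an MA experiment."""
    return n_events / (n_lines * generations_per_line)


def relative_protein_output(mrna_fold, translation_fold):
    """Net output relative to the unduplicated state.

    Assumes output scales with mRNA abundance times the translation
    rate per transcript (an assumption of this sketch).
    """
    return mrna_fold * translation_fold


# Hypothetical MA experiment (NOT data from the study).
rate = duplication_rate(n_events=2, n_lines=40, generations_per_line=250)

# Fold changes quoted in the abstract for one duplicated chromosome.
net = relative_protein_output(mrna_fold=2.1, translation_fold=0.7)

print(f"duplication rate = {rate:.1e} per genome per generation")
print(f"relative protein output = {net:.2f}")
```

Under that assumption, the lower translation rate offsets part of the extra mRNA, which is the pattern the abstract describes as dosage compensation during translation.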
Subject(s)
Chromosome Duplication , Genome , Humans , Gene Dosage , Chromosomes/genetics , RNA, Messenger/genetics , Gene Duplication
ABSTRACT
Phytoplankton-bacteria interactions rule over carbon fixation in the sunlit ocean, yet only a handful of phytoplankton-bacteria interactions have been experimentally characterized. In this study, we investigated the effect of three bacterial strains isolated from a long-term microcosm experiment with one Ostreococcus strain (Chlorophyta, Mamiellophyceae). We provided evidence that two Roseovarius strains (Alphaproteobacteria) had a beneficial effect on the long-term survival of the microalga, whereas one Winogradskyella strain (Flavobacteriia) led to the collapse of the microalga culture. Co-cultivation of the beneficial and the antagonistic strains also led to the loss of the microalga cells. Metagenomic analysis of the microcosm is consistent with vitamin B12 synthesis by the Roseovarius strains and unveiled two additional species affiliated to Balneola (Balneolia) and Muricauda (Flavobacteriia), which represent less than 4% of the reads, whereas Roseovarius and Winogradskyella recruit 57 and 39% of the reads, respectively. These results suggest that the low-frequency bacterial species may antagonize the algicidal effect of Winogradskyella in the microbiome of Ostreococcus tauri and thus stabilize the microalga persistence in the microcosm. Altogether, these results open novel perspectives into long-term stability of phytoplankton cultures.
ABSTRACT
Ostreococcus spp. are common worldwide oceanic picoeukaryotic pelagic algae. The complete genomes of three strains from different ecological niches revealed them to represent biologically distinct species despite their identical cellular morphologies (cryptic species). Their tiny genomes (13 Mb), with approximately 20 chromosomes, are colinear and densely packed with coding sequences, but no sexual life cycle has been described. Seventeen new strains of one of these species, Ostreococcus tauri, were isolated from 98 seawater samplings from the NW Mediterranean by filtering, culturing, cloning, and plating for single colonies and identification by sequencing their ribosomal 18S gene. In order to find the genetic markers for detection of polymorphisms and sexual recombination, we used an in silico approach to screen available genomic data. Intergenic regions of DNA likely to evolve neutrally were analyzed following polymerase chain reaction amplification of sequences using flanking primers from adjacent conserved coding sequences that were present as syntenic pairs in two different species of Ostreococcus. Analyses of such DNA regions from eight marker loci on two chromosomes from each strain revealed that the isolated O. tauri clones were haploid and that the overall level of polymorphism was approximately 0.01. Four different genetic tests for recombination showed that sexual exchanges must be inferred to account for the between-locus and between-chromosome marker combinations observed. However, our data suggest that sexual encounters are infrequent because we estimate the frequency of meioses/mitoses among the sampled strains to be 10^-6. Ostreococcus tauri and related species encode and express core genes for mitosis and meiosis, but their mechanisms of cell division and recombination, nevertheless, remain enigmatic because a classical eukaryotic spindle with 40 canonical microtubules would be much too large for the available approximately 0.9 µm^3 cellular volume.
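One classic way to infer recombination from marker combinations in haploid clones is the four-gamete test, sketched below. Whether it is among the four tests used in the study is not stated in the abstract, so it is shown only as an illustration, and the haplotypes are hypothetical.

```python
def four_gamete_test(haplotypes, locus_a, locus_b):
    """Return True if all four allele combinations occur at two biallelic loci.

    haplotypes: sequences of alleles, one per haploid clone.
    Under an infinite-sites model without recurrent mutation, seeing all
    four combinations implies at least one recombination event between
    the two loci.
    """
    combos = {(h[locus_a], h[locus_b]) for h in haplotypes}
    return len(combos) == 4


# Hypothetical two-locus haplotypes (NOT data from the study).
recombinant_sample = ["AT", "AC", "GT", "GC"]
clonal_sample = ["AT", "AT", "GC", "GC"]

print(four_gamete_test(recombinant_sample, 0, 1))
print(four_gamete_test(clonal_sample, 0, 1))
```

The test only detects the presence of recombination; estimating how rare it is, as the abstract does for meioses relative to mitoses, requires a population-genetic model.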
Subject(s)
Chlorophyta/genetics , DNA, Algal/genetics , Evolution, Molecular , Recombination, Genetic , Base Sequence , Chromosome Segregation , Computer Simulation , Genetic Markers , Genome , Molecular Sequence Data , Polymorphism, Genetic , RNA, Ribosomal, 18S/genetics , Sequence Alignment
ABSTRACT
Although marine picophytoplankton are at the base of the global food chain, accounting for half of the planetary primary production, they are outnumbered 10 to 1 by, and largely controlled by, hugely diverse populations of viruses. Eukaryotic microalgae form a ubiquitous and particularly dynamic fraction of such plankton, with environmental clone libraries from coastal regions sometimes being dominated by one or more of the three genera Bathycoccus, Micromonas, and Ostreococcus (class Prasinophyceae). The complete sequences of two double-stranded DNA (dsDNA) Bathycoccus virus genomes, one dsDNA Micromonas virus genome, and one new dsDNA Ostreococcus virus genome are described. Genome comparison of these giant viruses revealed a high degree of conservation, both for orthologous genes and for synteny, except for one 36-kb inversion in the Ostreococcus lucimarinus virus and two very large predicted proteins in Bathycoccus prasinos viruses. These viruses encode a gene repertoire of certain amino acid biosynthesis pathways never previously observed in viruses, which is likely to have been acquired by lateral gene transfer from their host or from bacteria. Pairwise comparisons of whole genomes using all coding sequences with homologous counterparts, either between viruses or between their corresponding hosts, revealed that the evolutionary divergences between viruses are lower than those between their hosts, suggesting either multiple recent host transfers or lower viral evolution rates.
Subject(s)
Biological Evolution , DNA Virus Infections/genetics , DNA Viruses/genetics , DNA Viruses/pathogenicity , Gene Transfer, Horizontal , Genome, Viral , Marine Biology , Microalgae/virology , DNA Virus Infections/virology , DNA, Viral/physiology , Genes, Viral/physiology , Genetic Variation , Phylogeny
ABSTRACT
Although sex is now accepted as a ubiquitous and ancestral feature of eukaryotes, direct observation of sex is still lacking in most unicellular eukaryotic lineages. Evidence of sex is frequently indirect and inferred from the identification of genes involved in meiosis from whole genome data and/or the detection of recombination signatures from genetic diversity in natural populations. In haploid unicellular eukaryotes, sex-related chromosomes are named mating-type (MTs) chromosomes and generally carry large genomic regions where recombination is suppressed. These regions have been characterized in Fungi and Chlorophyta and determine gamete compatibility and fusion. Two candidate MT+ and MT- alleles, spanning 450-650 kb, have recently been described in Ostreococcus tauri, a marine phytoplanktonic alga from the Mamiellophyceae class, an early diverging branch in the green lineage. Here, we investigate the architecture and evolution of these candidate MT+ and MT- alleles. We analyzed the phylogenetic profile and GC content of MT gene families in eight different genomes whose divergence has been previously estimated at up to 640 Myr, and found evidence that the divergence of the two MT alleles predates speciation in the Ostreococcus genus. Phylogenetic profiles of MT trans-specific polymorphisms in gametologs disclosed candidate MTs in two additional species, and possibly a third. These Mamiellales MT candidates are likely to be the oldest mating-type loci described to date, which makes them fascinating models to investigate the evolutionary mechanisms of haploid sex determination in eukaryotes.
Subject(s)
Chlorophyta , Sex Chromosomes , Chlorophyta/genetics , Evolution, Molecular , Genome , Genomics , Phylogeny , Sex Chromosomes/genetics
ABSTRACT
Ostreococcus tauri is a simple unicellular green alga representing an ecologically important group of phytoplankton in oceans worldwide. Modern molecular techniques must be developed in order to understand the mechanisms that permit adaptation of microalgae to their environment. We present for the first time in O. tauri a detailed characterization of individual genomic integration events of foreign DNA of plasmid origin after PEG-mediated transformation. Vector integration occurred randomly at a single locus in the genome and mainly as a single copy. Thus, we confirmed the utility of this technique for insertional mutagenesis. While the mechanism of double-stranded DNA repair in the O. tauri model remains to be elucidated, we clearly demonstrate by genome resequencing that the integration of the vector leads to frequent structural variations (deletions/insertions and duplications) and some chromosomal rearrangements in the genome at the insertion loci. Furthermore, we often observed variations in the vector sequence itself. From these observations, we speculate that a nonhomologous end-joining-like mechanism is employed during random insertion events, as described in plants and other freshwater algal models. PEG-mediated transformation is therefore a promising molecular biology tool, not only for functional genomic studies, but also for biotechnological research in this ecologically important marine alga.
Subject(s)
Chlorophyta/genetics , DNA Repair/genetics , Genome/genetics , Mutation/genetics , DNA Repair/physiology , Genomics , High-Throughput Nucleotide Sequencing/methods , Mutagenesis, Insertional/methods , Nanopores
ABSTRACT
Although interactions between microalgae and bacteria are observed in both the natural environment and the laboratory, the modalities of coexistence of bacteria inside microalgae phycospheres in laboratory cultures are mostly unknown. Here, we focused on well-controlled cultures of the model green picoalga Ostreococcus tauri and the most abundant member of its phycosphere, Marinobacter algicola. The prevalence of M. algicola in O. tauri cultures raises questions about how this bacterium maintains itself under laboratory conditions in the microalga culture. The results showed that M. algicola did not promote O. tauri growth in the absence of vitamin B12, while M. algicola depended on O. tauri to grow in synthetic medium, most likely to obtain organic carbon sources provided by the microalga. M. algicola grew on a range of lipids, including triacylglycerols that are known to be produced by O. tauri in culture during abiotic stress. Genomic screening revealed the absence of genes of two particular modes of quorum-sensing in Marinobacter genomes, which refutes the idea that these bacterial communication systems operate in this genus. To date, the 'opportunistic' behaviour of M. algicola in the laboratory is limited to several phytoplanktonic species including Chlorophyta such as O. tauri. This would indicate a preferential occurrence of M. algicola in association with these specific microalgae under optimum laboratory conditions.
ABSTRACT
Among marine phytoplankton, Mamiellales encompass several species from the genera Micromonas, Ostreococcus and Bathycoccus, which are important contributors to primary production. Previous studies based on single gene markers described their wide geographical distribution but led to discussion because of the uneven taxonomic resolution of the method. Here, we leverage genome sequences for six Mamiellales species, two from each genus Micromonas, Ostreococcus and Bathycoccus, to investigate their distribution across 133 stations sampled during the Tara Oceans expedition. Our study confirms the cosmopolitan distribution of Mamiellales and further suggests non-random distribution of species, with two triplets of co-occurring genomes associated with different temperatures: Ostreococcus lucimarinus, Bathycoccus prasinos and Micromonas pusilla were found in colder waters, whereas Ostreococcus spp. RCC809, Bathycoccus spp. TOSAG39-1 and Micromonas commoda were more abundant in warmer conditions. We also report the distribution of the two candidate mating-types of Ostreococcus for which the frequency of sexual reproduction was previously assumed to be very low. Indeed, both mating types were systematically detected together in agreement with either frequent sexual reproduction or the high prevalence of a diploid stage. Altogether, these analyses provide novel insights into Mamiellales' biogeography and raise novel testable hypotheses about their life cycle and ecology.
Subject(s)
Chlorophyta/genetics , Phylogeography/methods , Base Sequence , Demography/methods , Genome , Oceans and Seas , Phylogeny , Phytoplankton , Population Density , Seawater
ABSTRACT
Diatoms emerged in the Mesozoic period and presently constitute one of the main primary producers in the world's ocean and are of major economic importance. In the current study, using whole genome sequencing of ten accessions of the model diatom Phaeodactylum tricornutum, sampled at broad geospatial and temporal scales, we draw a comprehensive landscape of the genomic diversity within the species. We describe strong genetic subdivisions of the accessions into four genetic clades (A-D), with constituent populations of each clade possessing a conserved genetic and functional makeup, likely a consequence of the limited dispersal of P. tricornutum in the open ocean. We further suggest dominance of asexual reproduction across all the populations, as implied by high linkage disequilibrium. Finally, we show limited yet compelling signatures of genetic and functional convergence inducing changes in the selection pressure on many genes and metabolic pathways. We propose that these findings have significant implications for understanding the genetic structure of diatom populations in nature and provide a framework to assess the genomic underpinnings of their ecological success and impact on aquatic ecosystems, where they play a major role. Our work provides valuable resources for functional genomics and for exploiting the biotechnological potential of this model diatom species.
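The linkage disequilibrium invoked above is commonly summarized by the r² statistic between pairs of loci. The sketch below computes it from phased two-locus haplotypes coded 0/1; the haplotypes are hypothetical, and working from phased haplotypes is an assumption of this sketch, since the abstract does not say how linkage disequilibrium was measured.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci from phased (a, b) haplotypes coded 0/1.

    D = p_AB - p_A * p_B
    r^2 = D^2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Hypothetical samples (NOT data from the study).
fully_linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r_squared(fully_linked))
print(ld_r_squared(independent))
```

Values of r² that stay high across large physical distances are what one would expect when recombination is rare, which is the reasoning behind inferring predominantly asexual reproduction.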
Subject(s)
Diatoms/genetics , Diatoms/classification , Diatoms/metabolism , Ecosystem , Genome , Genomics , Metabolic Networks and Pathways/genetics , Polymorphism, Genetic
ABSTRACT
Virus-microbe interactions in the ocean are commonly described by "boom and bust" dynamics, whereby a numerically dominant microorganism is lysed and replaced by a virus-resistant one. Here, we isolated a microalga strain and its infective dsDNA virus whose dynamics are characterized instead by parallel growth of both the microalga and the virus. Experimental evolution of clonal lines revealed that this viral production originates from the lysis of a minority of virus-susceptible cells, which are regenerated from resistant cells. Whole-genome sequencing demonstrated that this resistant-susceptible switch involved a large deletion on one chromosome. Mathematical modeling explained how the switch maintains stable microalga-virus population dynamics consistent with their observed growth pattern. Comparative genomics confirmed an ancient origin of this "accordion" chromosome despite a lack of sequence conservation. Together, our results show how dynamic genomic rearrangements may account for a previously overlooked coexistence mechanism in microalgae-virus interactions.
Subject(s)
Genome , Genomics , Host-Pathogen Interactions , Phytoplankton/virology , Symbiosis , Algorithms , Genomics/methods , Microalgae/ultrastructure , Microalgae/virology , Models, Theoretical , Phytoplankton/ultrastructure
ABSTRACT
Benthic diatoms are the main primary producers in shallow freshwater and coastal environments, fulfilling important ecological functions such as nutrient cycling and sediment stabilization. However, little is known about their evolutionary adaptations to these highly structured but heterogeneous environments. Here, we report a reference genome for the marine biofilm-forming diatom Seminavis robusta, showing that gene family expansions are responsible for a quarter of all 36,254 protein-coding genes. Tandem duplications play a key role in extending the repertoire of specific gene functions, including light and oxygen sensing, which are probably central for its adaptation to benthic habitats. Genes differentially expressed during interactions with bacteria are strongly conserved in other benthic diatoms while many species-specific genes are strongly upregulated during sexual reproduction. Combined with re-sequencing data from 48 strains, our results offer insights into the genetic diversity and gene functions in benthic diatoms.