1.
Genome Biol ; 22(1): 256, 2021 09 03.
Article En | MEDLINE | ID: mdl-34479618

Currently, different sequencing platforms are used to generate plant genomes and no workflow has been properly developed to optimize time, cost, and assembly quality. We present LeafGo, a complete de novo plant genome workflow, that starts from tissue and produces genomes with modest laboratory and bioinformatic resources in approximately 7 days and using one long-read sequencing technology. LeafGo is optimized with ten different plant species, three of which are used to generate high-quality chromosome-level assemblies without any scaffolding technologies. Finally, we report the diploid genomes of Eucalyptus rudis and E. camaldulensis and the allotetraploid genome of Arachis hypogaea.


Genome, Plant , High-Throughput Nucleotide Sequencing/methods , Plant Leaves/genetics , Software , Arachis/genetics , DNA, Plant/genetics , DNA, Plant/isolation & purification , Diploidy , Species Specificity , Tetraploidy , Time Factors
2.
Sci Rep ; 9(1): 14646, 2019 10 10.
Article En | MEDLINE | ID: mdl-31601866

In this study, by exploring chromatin conformation capture data, we show that DNA sequence composition contributes to the nuclear segregation of Topologically Associated Domains (TADs). GC-peaks and valleys of TADs strongly influence interchromosomal interactions and chromatin 3D structure. To gain insight into the compositional and functional constraints associated with chromatin interactions and TAD formation, we analysed intra-TAD and intra-loop GC variations. This led to the identification of clear GC-gradients, along which the density of genes, super-enhancers, transcriptional activity, and CTCF binding site occupancy co-vary non-randomly. Further, the analysis of DNA base composition of nucleolar aggregates and nuclear speckles showed strong sequence-dependent effects. We conjecture that dynamic DNA binding affinity and flexibility underlie the emergence of chromatin condensates; their growth is likely promoted in mechanically soft regions (GC-rich) of the lowest chromatin and nucleosome densities. As a practical perspective, the strong linear association between sequence composition and interchromosomal contacts can help define consensus chromatin interactions, which in turn may be used to study alternative states of chromatin architecture.


CCCTC-Binding Factor/metabolism , Cell Nucleus/metabolism , Chromatin Assembly and Disassembly/genetics , Chromatin/metabolism , Transcription, Genetic , Base Composition/genetics , Binding Sites , Cell Line , Cell Nucleus/genetics , Chromatin/genetics , Datasets as Topic , Enhancer Elements, Genetic/genetics , Gene Expression Profiling , Genomics , Humans
3.
PLoS One ; 14(3): e0213278, 2019.
Article En | MEDLINE | ID: mdl-30865674

Recent findings established a link between DNA sequence composition and interphase chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse conformation capture and recombination rate data to study the relationship between chromatin architecture and the recombination landscape of human and mouse genomes. The results reveal that: (1) low recombination domains and blocks of elevated linkage disequilibrium tend to coincide with TADs and isochores, indicating co-evolving regulatory elements and genes in insulated neighbourhoods; (2) double strand break (DSB) and recombination frequencies increase in the short loops of GC-rich TADs, whereas recombination cold spots are typical of LADs; and (3) the binding and loading of proteins that are critical for DSB and meiotic recombination (SPO11, DMC1, H3K4me3 and PRDM9) are higher in GC-rich TADs. One explanation for these observations is that the occurrence of DSB and recombination in meiotic cells is associated with compositional and epigenetic features (genomic code) that influence DNA stiffness/flexibility and appear to be similar to those guiding the chromatin architecture in the interphase nucleus of pre-leptotene cells.


Chromatin/genetics , Chromosomes, Mammalian/genetics , Genomics/methods , Histones/genetics , Homologous Recombination , Meiosis , Animals , Chromatin/chemistry , Chromatin/metabolism , DNA Breaks, Double-Stranded , Humans , Isochores , Mice
4.
PLoS One ; 13(8): e0202022, 2018.
Article En | MEDLINE | ID: mdl-30148849

Genetic Generalized Epilepsy (GGE) and benign epilepsy with centro-temporal spikes or Rolandic Epilepsy (RE) are common forms of genetic epilepsies. Rare copy number variants have been recognized as important risk factors in brain disorders. We performed a systematic survey of rare deletions affecting protein-coding genes derived from exome data of patients with common forms of genetic epilepsies. We analysed exomes from 390 European patients (196 GGE and 194 RE) and 572 population controls to identify low-frequency genic deletions. We found that 75 (32 GGE and 43 RE) patients out of 390, i.e. ~19%, carried rare genic deletions. In particular, large deletions (>400 kb) represent a higher burden in both GGE and RE syndromes as compared to controls. The detected low-frequency deletions (1) share genes with brain-expressed exons that are under negative selection, (2) overlap with known autism and epilepsy-associated candidate genes, (3) are enriched for CNV intolerant genes recorded by the Exome Aggregation Consortium (ExAC) and (4) coincide with likely disruptive de novo mutations from the NPdenovo database. Employing several knowledge databases, we discuss the most prominent epilepsy candidate genes and their protein-protein networks for GGE and RE.


Epilepsy, Rolandic/genetics , Gene Deletion , Genetic Association Studies , Genetic Predisposition to Disease , Autistic Disorder/genetics , Autistic Disorder/metabolism , Chromosome Deletion , Comparative Genomic Hybridization , DNA Copy Number Variations , Epilepsy, Generalized/genetics , Epilepsy, Rolandic/metabolism , Exome , Genetic Association Studies/methods , Humans , Mutation , Protein Interaction Mapping , Protein Interaction Maps , Reproducibility of Results , Workflow
5.
Lancet Neurol ; 17(8): 699-708, 2018 08.
Article En | MEDLINE | ID: mdl-30033060

BACKGROUND: Genetic generalised epilepsy is the most common type of inherited epilepsy. Despite a high concordance rate of 80% in monozygotic twins, the genetic background is still poorly understood. We aimed to investigate the burden of rare genetic variants in genetic generalised epilepsy. METHODS: For this exome-based case-control study, we used three different genetic generalised epilepsy case cohorts and three independent control cohorts, all of European descent. Cases included in the study were clinically evaluated for genetic generalised epilepsy. Whole-exome sequencing was done for the discovery case cohort, a validation case cohort, and two independent control cohorts. The replication case cohort underwent targeted next-generation sequencing of the 19 known genes encoding subunits of GABAA receptors and was compared to the respective GABAA receptor variants of a third independent control cohort. Functional investigations were done with automated two-microelectrode voltage clamping in Xenopus laevis oocytes. FINDINGS: Statistical comparison of 152 familial index cases with genetic generalised epilepsy in the discovery cohort to 549 ethnically matched controls suggested an enrichment of rare missense (Nonsyn) variants in the ensemble of 19 genes encoding GABAA receptors in cases (odds ratio [OR] 2·40 [95% CI 1·41-4·10]; pNonsyn=0·0014, adjusted pNonsyn=0·019). Enrichment for these genes was validated in a whole-exome sequencing cohort of 357 sporadic and familial genetic generalised epilepsy cases and 1485 independent controls (OR 1·46 [95% CI 1·05-2·03]; pNonsyn=0·0081, adjusted pNonsyn=0·016). 
Comparison of genes encoding GABAA receptors in the independent replication cohort of 583 familial and sporadic genetic generalised epilepsy index cases, based on candidate-gene panel sequencing, with a third independent control cohort of 635 controls confirmed the overall enrichment of rare missense variants for 15 GABAA receptor genes in cases compared with controls (OR 1·46 [95% CI 1·02-2·08]; pNonsyn=0·013, adjusted pNonsyn=0·027). Functional studies for two selected genes (GABRB2 and GABRA5) showed significant loss-of-function effects with reduced current amplitudes in four of seven tested variants compared with wild-type receptors. INTERPRETATION: Functionally relevant variants in genes encoding GABAA receptor subunits constitute a significant risk factor for genetic generalised epilepsy. Examination of the role of specific gene groups and pathways can disentangle the complex genetic architecture of genetic generalised epilepsy. FUNDING: EuroEPINOMICS (European Science Foundation through national funding organisations), Epicure and EpiPGX (Sixth Framework Programme and Seventh Framework Programme of the European Commission), Research Unit FOR2715 (German Research Foundation and Luxembourg National Research Fund).


Epilepsy, Generalized/genetics , Exome Sequencing/methods , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Receptors, GABA-A/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Case-Control Studies , Child , Child, Preschool , Cohort Studies , Epilepsy, Generalized/ethnology , Europe , Family Health , Female , Humans , Infant , Infant, Newborn , International Cooperation , Male , Middle Aged , Models, Molecular , Young Adult
6.
Life (Basel) ; 8(1), 2018 Jan 30.
Article En | MEDLINE | ID: mdl-29385718

The CCCTC-binding factor (CTCF) is multi-functional, ubiquitously expressed, and highly conserved from Drosophila to human. It has important roles in transcriptional insulation and the formation of a high-dimensional chromatin structure. CTCF has a paralog called "Brother of Regulator of Imprinted Sites" (BORIS) or "CTCF-like" (CTCFL). It binds DNA at sites similar to those of CTCF. However, the expression profiles of the two proteins are quite different. We investigated the evolutionary trajectories of the two proteins after the duplication event using a phylogenomic and interactomic approach. We find that CTCF has 52 direct interaction partners while CTCFL only has 19. Almost all interactors already existed before the emergence of CTCF and CTCFL. The unique secondary loss of CTCF from several nematodes is paralleled by a loss of two of its interactors, the polycomb repressive complex subunit SuZ12 and the multifunctional transcription factor TYY1. In contrast to earlier studies reporting the absence of BORIS from birds, we present evidence for a multigene synteny block containing CTCFL that is conserved in mammals, reptiles, and several species of birds, indicating that not the entire lineage of birds experienced a loss of CTCFL. Within this synteny block, BORIS and its genomic neighbors seem to be partitioned into two nested chromatin loops. The high expression of SPO11, RAE1, RBM38, and PMEPA1 in male tissues suggests a possible link between CTCFL, meiotic recombination, and fertility-associated phenotypes. Using the 65,700 exomes and the 1000 genomes data, we observed a higher number of intergenic, non-synonymous, and loss-of-function mutations in CTCFL than in CTCF, suggesting a reduced strength of purifying selection, perhaps due to less functional constraint.

7.
Eur J Hum Genet ; 26(2): 258-264, 2018 02.
Article En | MEDLINE | ID: mdl-29358611

Rolandic epilepsy (RE) is the most common focal epilepsy in childhood. To date no hypothesis-free exome-wide mutational screen has been conducted for RE and atypical RE (ARE). Here we report on whole-exome sequencing of 194 unrelated patients with RE/ARE and 567 ethnically matched population controls. We identified an exome-wide significantly enriched burden for deleterious and loss-of-function variants only for the established RE/ARE gene GRIN2A. The statistical significance of the enrichment disappeared after removing ARE patients. For several disease-related gene-sets, an odds ratio >1 was detected for loss-of-function variants.


Epilepsy, Rolandic/genetics , Loss of Function Mutation , Receptors, N-Methyl-D-Aspartate/genetics , Adolescent , Child , Epilepsy, Rolandic/pathology , Exome , Female , Humans , Male
8.
PLoS One ; 12(1): e0168023, 2017.
Article En | MEDLINE | ID: mdl-28060840

A recent investigation showed the existence of correlations between the architectural features of mammalian interphase chromosomes and the compositional properties of isochores. This result prompted us to compare maps of the Topologically Associating Domains (TADs) and of the Lamina Associated Domains (LADs) with the corresponding isochore maps of mouse and human chromosomes. This approach revealed that: 1) TADs and LADs correspond to isochores, i.e., isochores are the genomic units that underlie chromatin domains; 2) the conservation of TADs and LADs in mammalian genomes is explained by the evolutionary conservation of isochores; 3) chromatin domains corresponding to GC-poor isochores (e.g., LADs) show not only self-interactions but also intrachromosomal interactions with other domains also corresponding to GC-poor isochores even if located far away; in contrast, chromatin domains corresponding to GC-rich isochores (e.g., TADs) show more localized chromosomal interactions, many of which are inter-chromosomal. In conclusion, this investigation establishes a link between DNA sequences and chromatin architecture, explains the evolutionary conservation of TADs and LADs and provides new information on the spatial distribution of GC-poor/gene-poor and GC-rich/gene-rich chromosomal regions in the interphase nucleus.


Chromatin , Isochores , Animals , Base Composition , Chromosomes, Mammalian , Cluster Analysis , Evolution, Molecular , GC Rich Sequence , Humans , Mice , Synteny
9.
Genomics ; 108(1): 31-6, 2016 07.
Article En | MEDLINE | ID: mdl-26772991

Epilepsy is a common complex disorder most frequently associated with psychiatric and neurological diseases. Massive parallel sequencing of individual or cohort genomes and exomes led to the identification of several disease-associated genes. We review here the candidate genes in epilepsy genetics with a focus on exome and gene panel data. Together with the examination of brain-expressed genes and the post-synaptic proteome, the results show that: (1) non-metabolic epilepsy and autism candidate genes tend to be AT-rich and (2) large transcript size and local AT-richness are characteristic features of genes involved in developmental brain disorders and synaptic functions. These results point to the preferential location of core epilepsy and autism candidate genes in late replicating, GC-poor chromosomal regions (isochores). These results indicate that the genomic alterations leading to some brain disorders are confined to responsive chromatin areas harboring brain-critical genes.


Autistic Disorder/genetics , Epilepsy/genetics , Genetic Predisposition to Disease/genetics , Genome, Human/genetics , Genomics/methods , Brain/metabolism , Gene Expression Profiling , Humans , Isochores/genetics , Proteome/genetics
10.
PLoS One ; 10(5): e0126321, 2015.
Article En | MEDLINE | ID: mdl-25942438

Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files.


Computational Biology/methods , Computing Methodologies , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Electronic Data Processing/methods , Humans , Workflow
11.
Science ; 345(6199): 950-3, 2014 Aug 22.
Article En | MEDLINE | ID: mdl-25146293

Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement.


Brassica napus/genetics , Chromosome Duplication , Evolution, Molecular , Genome, Plant , Polyploidy , Seeds/genetics , Brassica napus/cytology
12.
Nat Biotechnol ; 32(7): 656-62, 2014 Jul.
Article En | MEDLINE | ID: mdl-24908277

Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes--a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes--and show that cultivated types derive from two progenitor species. Although cultivated pummelos represent selections from one progenitor species, Citrus maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species Citrus reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, thus implying that wild mandarins were part of the early breeding germplasm. A Chinese wild 'mandarin' diverges substantially from C. reticulata, thus suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and facilitates sequence-directed genetic improvement.


Breeding , Citrus/classification , Citrus/genetics , Conserved Sequence/genetics , Crops, Agricultural/genetics , Genetic Variation/genetics , Genome, Plant/genetics , Base Sequence , Evolution, Molecular , Molecular Sequence Data , Sequence Analysis, DNA , Species Specificity
13.
Ann Neurol ; 75(5): 788-92, 2014 May.
Article En | MEDLINE | ID: mdl-24591017

Recent studies reported DEPDC5 loss-of-function mutations in different focal epilepsy syndromes. Here we identified 1 predicted truncation and 2 missense mutations in 3 children with rolandic epilepsy (3 of 207). In addition, we identified 3 families with unclassified focal childhood epilepsies carrying predicted truncating DEPDC5 mutations (3 of 82). The detected variants were all novel, inherited, and present in all tested affected (n=11) and in 7 unaffected family members, indicating low penetrance. Our findings extend the phenotypic spectrum associated with mutations in DEPDC5 and suggest that rolandic epilepsy, albeit rarely, and other nonlesional childhood epilepsies are among the associated syndromes.


Epilepsies, Partial/genetics , Mutation/genetics , TOR Serine-Threonine Kinases/genetics , Child , Child, Preschool , Epilepsies, Partial/diagnosis , Epilepsy, Rolandic/diagnosis , Epilepsy, Rolandic/genetics , Female , Genetic Variation/genetics , Humans , Intracellular Signaling Peptides and Proteins , Male , Pedigree , Phenotype
14.
PLoS Genet ; 10(2): e1004007, 2014 Feb.
Article En | MEDLINE | ID: mdl-24516393

Members of the family Trypanosomatidae infect many organisms, including animals, plants and humans. Plant-infecting trypanosomes are grouped under the single genus Phytomonas, failing to reflect the wide biological and pathological diversity of these protists. While some Phytomonas spp. multiply in the latex of plants, or in fruit or seeds without apparent pathogenicity, others colonize the phloem sap and afflict plants of substantial economic value, including the coffee tree, coconut and oil palms. Plant trypanosomes have not been studied extensively at the genome level, a major gap in understanding and controlling pathogenesis. We describe the genome sequences of two plant trypanosomatids, one pathogenic isolate from a Guianan coconut and one non-symptomatic isolate from Euphorbia collected in France. Although these parasites have extremely distinct pathogenic impacts, very few genes are unique to either, with the vast majority of genes shared by both isolates. Significantly, both Phytomonas spp. genomes consist essentially of single copy genes for the bulk of their metabolic enzymes, whereas other trypanosomatids e.g. Leishmania and Trypanosoma possess multiple paralogous genes or families. Indeed, comparison with other trypanosomatid genomes revealed a highly streamlined genome, encoding for a minimized metabolic system while conserving the major pathways, and with retention of a full complement of endomembrane organelles, but with no evidence for functional complexity. Identification of the metabolic genes of Phytomonas provides opportunities for establishing in vitro culturing of these fastidious parasites and new tools for the control of agricultural plant disease.


Kinetoplastida/genetics , Plant Diseases/genetics , Sequence Analysis, DNA , Trypanosomatina/genetics , Animals , Cocos/genetics , Cocos/parasitology , Coffee/genetics , Coffee/parasitology , France , Genome , Humans , Kinetoplastida/pathogenicity , Plant Diseases/parasitology , Seeds/parasitology , Trypanosomatina/pathogenicity
15.
Proc Natl Acad Sci U S A ; 110(13): 5247-52, 2013 Mar 26.
Article En | MEDLINE | ID: mdl-23503846

Red seaweeds are key components of coastal ecosystems and are economically important as food and as a source of gelling agents, but their genes and genomes have received little attention. Here we report the sequencing of the 105-Mbp genome of the florideophyte Chondrus crispus (Irish moss) and the annotation of the 9,606 genes. The genome features an unusual structure characterized by gene-dense regions surrounded by repeat-rich regions dominated by transposable elements. Despite its fairly large size, this genome shows features typical of compact genomes, e.g., on average only 0.3 introns per gene, short introns, low median distance between genes, small gene families, and no indication of large-scale genome duplication. The genome also gives insights into the metabolism of marine red algae and adaptations to the marine environment, including genes related to halogen metabolism, oxylipins, and multicellularity (microRNA processing and transcription factors). Particularly interesting are features related to carbohydrate metabolism, which include a minimalistic gene set for starch biosynthesis, the presence of cellulose synthases acquired before the primary endosymbiosis showing the polyphyly of cellulose synthesis in Archaeplastida, and cellulases absent in terrestrial plants as well as the occurrence of a mannosylglycerate synthase potentially originating from a marine bacterium. To explain the observations on genome structure and gene content, we propose an evolutionary scenario involving an ancestral red alga that was driven by early ecological forces to lose genes, introns, and intergenetic DNA; this loss was followed by an expansion of genome size as a consequence of activity of transposable elements.


Chondrus/genetics , Evolution, Molecular , Genes, Plant , Base Sequence , MicroRNAs/genetics , Molecular Sequence Data , Plant Proteins/genetics , RNA, Plant/genetics
16.
Nature ; 488(7410): 213-7, 2012 Aug 09.
Article En | MEDLINE | ID: mdl-22801500

Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.


Evolution, Molecular , Genome, Plant/genetics , Musa/genetics , Conserved Sequence/genetics , DNA Transposable Elements/genetics , Gene Duplication/genetics , Genes, Plant/genetics , Genotype , Haploidy , Molecular Sequence Data , Musa/classification , Phylogeny
17.
Genome Biol ; 11(8): R85, 2010.
Article En | MEDLINE | ID: mdl-20738856

BACKGROUND: Diatoms represent the predominant group of eukaryotic phytoplankton in the oceans and are responsible for around 20% of global photosynthesis. Two whole genome sequences are now available. Notwithstanding, our knowledge of diatom biology remains limited because only around half of their genes can be ascribed a function based on homology-based methods. High throughput tools are needed, therefore, to associate functions with diatom-specific genes. RESULTS: We have performed a systematic analysis of 130,000 ESTs derived from Phaeodactylum tricornutum cells grown in 16 different conditions. These include different sources of nitrogen, different concentrations of carbon dioxide, silicate and iron, and abiotic stresses such as low temperature and low salinity. Based on unbiased statistical methods, we have catalogued transcripts with similar expression profiles and identified transcripts differentially expressed in response to specific treatments. Functional annotation of these transcripts provides insights into expression patterns of genes involved in various metabolic and regulatory pathways and into the roles of novel genes with unknown functions. Specific growth conditions could be associated with enhanced gene diversity, known gene product functions, and over-representation of novel transcripts. Comparative analysis of data from the other sequenced diatom, Thalassiosira pseudonana, helped identify several unique diatom genes that are specifically regulated under particular conditions, thus facilitating studies of gene function, genome annotation and the molecular basis of species diversity. CONCLUSIONS: The digital gene expression database represents a new resource for identifying candidate diatom-specific genes involved in processes of major ecological relevance.


Adaptation, Physiological/genetics , Diatoms/genetics , Gene Expression Profiling/methods , Gene Expression Regulation/physiology , RNA, Messenger/analysis , Carbon Dioxide/metabolism , Environment , Expressed Sequence Tags , Iron/metabolism , Molecular Sequence Data , Nitrogen/metabolism , Salinity , Silicates/metabolism , Temperature
18.
New Phytol ; 188(1): 52-66, 2010 Oct.
Article En | MEDLINE | ID: mdl-20646219

• By comparative analyses we identify lineage-specific diversity in transcription factors (TFs) from stramenopile (or heterokont) genome sequences. We compared a pennate (Phaeodactylum tricornutum) and a centric diatom (Thalassiosira pseudonana) with those of other stramenopiles (oomycetes, Pelagophyceae, and Phaeophyceae (Ectocarpus siliculosus)) as well as to that of Emiliania huxleyi, a haptophyte that is evolutionarily related to the stramenopiles. • We provide a detailed description of diatom TF complements and report numerous peculiarities: in both diatoms, the heat shock factor (HSF) family is overamplified and constitutes the most abundant class of TFs; Myb and C2H2-type zinc finger TFs are the two most abundant TF families encoded in all the other stramenopile genomes investigated; the presence of diatom and lineage-specific gene fusions, in particular a class of putative photoreceptors with light-sensitive Per-Arnt-Sim (PAS) and DNA-binding (basic-leucine zipper, bZIP) domains and an HSF-AP2 domain fusion. • Expression data analysis shows that many of the TFs studied are transcribed and may be involved in specific responses to environmental stimuli. • Evolutionary and functional relevance of these observations are discussed.


Genome/genetics , Multigene Family/genetics , Photosynthesis/genetics , Stramenopiles/genetics , Transcription Factors/metabolism , Base Sequence , Gene Expression Regulation , Phylogeny , Protein Structure, Tertiary , Transcription Factors/chemistry , Transcription Factors/genetics
19.
Nature ; 465(7298): 617-21, 2010 Jun 03.
Article En | MEDLINE | ID: mdl-20520714

Brown algae (Phaeophyceae) are complex photosynthetic organisms with a very different evolutionary history to green plants, to which they are only distantly related. These seaweeds are the dominant species in rocky coastal ecosystems and they exhibit many interesting adaptations to these, often harsh, environments. Brown algae are also one of only a small number of eukaryotic lineages that have evolved complex multicellularity (Fig. 1). We report the 214 million base pair (Mbp) genome sequence of the filamentous seaweed Ectocarpus siliculosus (Dillwyn) Lyngbye, a model organism for brown algae, closely related to the kelps (Fig. 1). Genome features such as the presence of an extended set of light-harvesting and pigment biosynthesis genes and new metabolic processes such as halide metabolism help explain the ability of this organism to cope with the highly variable tidal environment. The evolution of multicellularity in this lineage is correlated with the presence of a rich array of signal transduction genes. Of particular interest is the presence of a family of receptor kinases, as the independent evolution of related molecules has been linked with the emergence of multicellularity in both the animal and green plant lineages. The Ectocarpus genome sequence represents an important step towards developing this organism as a model species, providing the possibility to combine genomic and genetic approaches to explore these and other aspects of brown algal biology further.


Algal Proteins/genetics , Biological Evolution , Genome/genetics , Phaeophyceae/cytology , Phaeophyceae/genetics , Animals , Eukaryota , Evolution, Molecular , Molecular Sequence Data , Phaeophyceae/metabolism , Phylogeny , Pigments, Biological/biosynthesis , Signal Transduction/genetics
20.
New Phytol ; 185(2): 446-58, 2010 Jan.
Article En | MEDLINE | ID: mdl-19912547

• Ten axenic cultures, referred to as Fibrocapsa japonica, were studied for their morphology, pigment composition, toxicity and phylogeny. • Morphologically, all 10 accessions were similar and displayed equivalent pigment contents. We identified chlorophylls a and c, beta-carotene and fucoxanthin as the dominant pigments, together with xanthophyll cycle carotenoids likely to be involved in photoprotection. • All 10 accessions caused mortality in the brine shrimp Artemia salina and displayed haemolytic and haemagglutination activities toward sheep erythrocytes. Our results indicate that haemagglutination activity is a key component of F. japonica toxicity. • Examination of a collection of F. japonica expressed sequence tags (ESTs) has led to the identification of candidate genes involved in F. japonica toxicity and/or growth control.


Eukaryota , Animals , Artemia , Carotenoids/analysis , Chlorophyll/analysis , Erythrocytes/drug effects , Eukaryota/chemistry , Eukaryota/genetics , Eukaryota/pathogenicity , Eukaryota/physiology , Expressed Sequence Tags , Genomics , Hemagglutination , Sheep , Xanthophylls/analysis
...