ABSTRACT
Oryza sativa (rice) plays an essential food security role for more than half of the world's population. Obtaining crops with high levels of disease resistance is a major challenge for breeders, especially today, given the urgent need for agriculture to be more sustainable. Plant resistance genes mainly belong to three large leucine-rich repeat (LRR)-containing receptor (LRR-CR) families: the LRR-receptor-like kinase (LRR-RLK), LRR-receptor-like protein (LRR-RLP) and nucleotide-binding LRR receptor (NLR) families. Using lrrprofiler, a pipeline that we developed to annotate and classify these proteins, we compared three publicly available annotations of the rice Nipponbare reference genome. The extensive discrepancies that we observed for LRR-CR gene models led us to perform an in-depth manual curation of their annotations while paying special attention to nonsense mutations. We then transferred this manually curated annotation to Kitaake, a cultivar that is closely related to Nipponbare, using an optimized strategy. Here, we discuss the breakthrough achieved by manual curation when comparing genomes and, in addition to 'functional' and 'structural' annotations, we propose that the community adopt this approach, which we call 'comprehensive' annotation. The resulting data are crucial for further studies on the natural variability and evolution of LRR-CR genes in order to promote their use in breeding future resilient varieties.
Subjects
Molecular Sequence Annotation, Oryza/genetics, Plant Proteins/genetics, Amino Acid Repetitive Sequences, Plant Genome, Genotype, Molecular Sequence Annotation/methods, Oryza/chemistry, Plant Proteins/chemistry
ABSTRACT
Multiple sequence alignment is a prerequisite for many evolutionary analyses. Multiple Alignment of Coding Sequences (MACSE) is a multiple sequence alignment program that explicitly accounts for the underlying codon structure of protein-coding nucleotide sequences. Its unique characteristic allows building reliable codon alignments even in the presence of frameshifts. This facilitates downstream analyses such as selection pressure estimation based on the ratio of nonsynonymous to synonymous substitutions. Here, we present MACSE v2, a major update with an improved version of the initial algorithm enriched with a complete toolkit to handle multiple alignments of protein-coding sequences. A graphical interface now provides user-friendly access to the different subprograms.
Subjects
Sequence Alignment, Software, Stop Codon, Frameshift Mutation
ABSTRACT
Gene duplications are an important factor in plant evolution, and lineage-specific expanded (LSE) genes are of particular interest. Receptor-like kinases expanded massively in land plants, and leucine-rich repeat receptor-like kinases (LRR-RLK) constitute the largest receptor-like kinase family. Based on the phylogeny of 7,554 LRR-RLK genes from 31 fully sequenced flowering plant genomes, the complex evolutionary dynamics of this family were characterized in depth. We studied the involvement of selection during the expansion of this family among angiosperms. LRR-RLK subgroups harbor extremely contrasting rates of duplication, retention, or loss, and LSE copies are predominantly found in subgroups involved in environmental interactions. Expansion rates also differ significantly depending on the time when rounds of expansion or loss occurred on the angiosperm phylogenetic tree. Finally, using a dN/dS-based test in a phylogenetic framework, we searched for selection footprints on LSE and single-copy LRR-RLK genes. Selective constraint appeared to be globally relaxed at LSE genes, and codons under positive selection were detected in 50% of them. Moreover, the leucine-rich repeat domains, and specifically four amino acids in them, were found to be the main targets of positive selection. Here, we provide an extensive overview of the expansion and evolution of this very large gene family.
Subjects
Molecular Evolution, Magnoliopsida/genetics, Multigene Family, Plant Proteins/genetics, Receptor Protein-Tyrosine Kinases/genetics, Amino Acid Repetitive Sequences, Amino Acid Motifs, Gene Duplication, Genetic Variation, Magnoliopsida/classification, Genetic Models, Phylogeny, Plant Proteins/classification, Receptor Protein-Tyrosine Kinases/classification, Genetic Selection, Species Specificity, Time Factors
ABSTRACT
Local climatic conditions likely constitute an important selective pressure on genes underlying important fitness-related traits such as flowering time, and in many species, flowering phenology and climatic gradients strongly covary. To test whether climate shapes the genetic variation on flowering time genes and to identify candidate flowering genes involved in the adaptation to environmental heterogeneity, we used a large Medicago truncatula core collection to examine the association between nucleotide polymorphisms at 224 candidate genes and both climate variables and flowering phenotypes. Unlike genome-wide studies, candidate gene approaches are expected to enrich for the number of meaningful trait associations because they specifically target genes that are known to affect the trait of interest. We found that flowering time mediates adaptation to climatic conditions mainly by variation at genes located upstream in the flowering pathways, close to the environmental stimuli. Variables related to the annual precipitation regime reflected selective constraints on flowering time genes better than the other variables tested (temperature, altitude, latitude or longitude). By comparing phenotype and climate associations, we identified 12 flowering genes as the most promising candidates responsible for phenological adaptation to climate. Four of these genes were located in the known flowering time QTL region on chromosome 7. However, climate and flowering associations also highlighted largely distinct gene sets, suggesting different genetic architectures for adaptation to climate and flowering onset.
Subjects
Acclimatization/genetics, Climate, Flowers/physiology, Medicago truncatula/genetics, Northern Africa, Europe, Population Genetics, Medicago truncatula/physiology, Genetic Models, Multigene Family, Phenotype, Single Nucleotide Polymorphism, Quantitative Trait Loci
ABSTRACT
BACKGROUND: Recurrent gene duplication and retention played an important role in angiosperm genome evolution. It has been hypothesized that these processes contribute significantly to plant adaptation, but so far this hypothesis has not been tested at the genome scale. RESULTS: We studied available sequenced angiosperm genomes to assess the frequency of positive selection footprints in lineage-specific expanded (LSE) gene families compared to single-copy genes using a dN/dS-based test in a phylogenetic framework. We found 5.38% of alignments in LSE genes with codons under positive selection. In contrast, we found no evidence for codons under positive selection in the single-copy reference set. An analysis at the branch level shows that purifying selection acted more strongly on single-copy genes than on LSE gene clusters. Moreover, we detected significantly more branches indicating evolution under positive selection and/or relaxed constraint in LSE genes than in single-copy genes. CONCLUSIONS: In this genome-scale study, the first to our knowledge, we provide strong empirical support for the hypothesis that LSE genes fuel adaptation in angiosperms. Our conservative approach for detecting selection footprints as well as our results can be of interest for further studies on (plant) gene family evolution.
Subjects
Physiological Adaptation/genetics, Gene Duplication, Plant Genome, Cluster Analysis, Codon/genetics, Genetic Databases, Molecular Sequence Annotation, Multigene Family, Mutation/genetics, Genetic Polymorphism, Genetic Selection, Time Factors
ABSTRACT
• The use of quantitative disease resistance (QDR) is a promising strategy for promoting durable resistance to plant pathogens, but genes involved in QDR are largely unknown. To identify genetic components and accelerate improvement of QDR in legumes to the root pathogen Aphanomyces euteiches, we took advantage of both the recently generated massive genomic data for Medicago truncatula and natural variation of this model legume. • A high-density (≈5.1 million single nucleotide polymorphisms (SNPs)) genome-wide association study (GWAS) was performed with both in vitro and glasshouse phenotyping data collected for 179 lines. • GWAS identified several candidate genes and pinpointed two independent major loci on the top of chromosome 3 that were detected in both phenotyping methods. Candidate SNPs in the most significant locus (σ(A)² = 23%) were in the promoter and coding regions of an F-box protein coding gene. Subsequent qRT-PCR and bioinformatic analyses performed on 20 lines demonstrated that resistance is associated with mutations directly affecting the interaction domain of the F-box protein rather than gene expression. • These results refine the position of previously identified QTL to specific candidate genes, suggest potential molecular mechanisms, and identify new loci explaining QDR against A. euteiches.
Subjects
Aphanomyces/physiology, Chromosome Mapping, Disease Resistance/genetics, F-Box Proteins/genetics, Genome-Wide Association Study, Medicago truncatula/genetics, Medicago truncatula/microbiology, Plant Diseases/immunology, Microbial Colony Count, Cytokinins/metabolism, F-Box Proteins/metabolism, Plant Gene Expression Regulation, Plant Genes/genetics, Medicago truncatula/growth & development, Medicago truncatula/immunology, Mutation/genetics, Plant Diseases/microbiology, Plant Proteins/genetics, Plant Proteins/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Ralstonia/physiology, Plant Root Nodules/metabolism, Plant Root Nodules/microbiology, Signal Transduction/genetics, Transcription Factors/genetics, Transcription Factors/metabolism, Genetic Transcription, Up-Regulation
ABSTRACT
BACKGROUND: Gene duplications are a molecular mechanism potentially mediating generation of functional novelty. However, the probabilities of maintenance and functional divergence of duplicated genes are shaped by selective pressures acting on gene copies immediately after the duplication event. The ratio of non-synonymous to synonymous substitution rates in protein-coding sequences provides a means to investigate selective pressures based on genic sequences. Three molecular signatures can reveal early stages of functional divergence between gene copies: change in the level of purifying selection between paralogous genes, occurrence of positive selection, and transient relaxed purifying selection following gene duplication. We studied three pairs of genes that are known to be involved in an interaction with symbiotic bacteria and were recently duplicated in the history of the Medicago genus (Fabaceae). We sequenced two pairs of polygalacturonase genes (Pg11-Pg3 and Pg11a-Pg11c) and one pair of auxin transporter-like genes (Lax2-Lax4) in 17 species belonging to the Medicago genus, and searched for molecular signatures of differentiation between copies. RESULTS: Selective histories revealed by these three signatures of molecular differentiation were found to be markedly different between each pair of paralogs. We found sites under positive selection in the Pg11 paralogs, while Pg3 has mainly evolved under purifying selection. The most recent paralogs examined, Pg11a and Pg11c, are both undergoing positive selection and might be acquiring new functions. Lax2 and Lax4 paralogs are both under strong purifying selection, but still underwent a temporary relaxation of purifying selection immediately after duplication. CONCLUSIONS: This study illustrates the variety of selective pressures undergone by duplicated genes and the effect of the age of the duplication. We found that relaxation of selective constraints immediately after duplication might promote adaptive divergence.
Subjects
Medicago/classification, Medicago/genetics, Genetic Selection, Gene Duplication, Membrane Transport Proteins/genetics, Molecular Sequence Data, Plant Proteins/genetics, Polygalacturonase/genetics
ABSTRACT
BACKGROUND: We studied patterns of molecular adaptation in the wild Mediterranean legume Medicago truncatula. We focused on two phenotypic traits that are not functionally linked: flowering time and perception of symbiotic microbes. Phenology is an important fitness component, especially for annual plants, and many instances of molecular adaptation have been reported for genes involved in flowering pathways. While perception of symbiotic microbes is also integral to adaptation in many plant species, very few reports of molecular adaptation exist for symbiotic genes. Here we used data from 57 individuals and 53 gene fragments to quantify the overall strength of both positive and purifying selection in M. truncatula and asked if footprints of positive selection can be detected at key genes of rhizobia recognition pathways. RESULTS: We examined nucleotide variation among 57 accessions from natural populations in 53 gene fragments: 5 genes involved in nitrogen-fixing bacteria recognition, 11 genes involved in flowering, and 37 genes used as control loci. We detected 1757 polymorphic sites yielding an average nucleotide diversity (pi) of 0.003 per site. Non-synonymous variation is under sizable purifying selection with 90% of amino-acid changing mutations being strongly selected against. Accessions were structured in two groups consistent with geographical origins. Each of these two groups harboured an excess of rare alleles, relative to expectations of a constant-sized population, suggesting recent population expansion. Using coalescent simulations and an approximate Bayesian computation framework we detected several instances of genes departing from selective neutrality within each group and showed that the polymorphism of two nodulation and four flowering genes has probably been shaped by recent positive selection. CONCLUSION: We quantify the intensity of purifying selection in the M. truncatula genome and show that putative footprints of natural selection can be detected at different time scales in both flowering and symbiotic pathways.
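The nucleotide diversity (pi) reported above is, in its usual definition, the average number of pairwise differences per site among sampled sequences. The sketch below illustrates that definition only; it is not the authors' pipeline. The function name and the toy alignment are hypothetical, and the sketch assumes equal-length aligned sequences with no gaps or missing data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi).

    Assumes equal-length, gap-free aligned sequences (a simplification).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    total_diff = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count sites at which the two sequences differ.
        total_diff += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diff / (n_pairs * length)


# Hypothetical toy alignment: four sequences, ten sites.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
```

On real data, gaps, missing calls and unequal sample sizes per site need explicit handling, which this sketch leaves out.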
Subjects
Molecular Evolution, Medicago truncatula/genetics, Plant Proteins/genetics, Genetic Polymorphism, Physiological Adaptation, Genetic Variation, Genotype, Medicago truncatula/microbiology, Medicago truncatula/physiology, Molecular Sequence Data, Plant Proteins/metabolism, Rhizobium/physiology, Genetic Selection, Symbiosis
ABSTRACT
Most genomic and evolutionary comparative analyses rely on accurate multiple sequence alignments. With their underlying codon structure, protein-coding nucleotide sequences pose a specific challenge for multiple sequence alignment. Multiple Alignment of Coding Sequences (MACSE) is a multiple sequence alignment program that provided the first automatic solution for aligning protein-coding gene datasets containing both functional and nonfunctional sequences (pseudogenes). Through its unique features, reliable codon alignments can be built in the presence of frameshifts and stop codons suitable for subsequent analysis of selection based on the ratio of nonsynonymous to synonymous substitutions. Here we offer a practical overview and guidelines on the use of MACSE v2. This major update of the initial algorithm now comes with a graphical interface providing user-friendly access to different subprograms to handle multiple alignments of protein-coding sequences. We also present new pipelines based on MACSE v2 subprograms to handle large datasets and distributed as Singularity containers. MACSE and associated pipelines are available at: https://bioweb.supagro.inra.fr/macse/ .
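To illustrate why frameshifts and stop codons matter for codon alignment, the following sketch flags the two symptoms in a single coding sequence: a length that is not a multiple of three and stop codons before the final codon. It is not MACSE's algorithm and does not call MACSE. The function name is hypothetical, and the standard genetic code's stop codons are assumed.

```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_frame_issues(seq):
    """Report simple signs that a coding sequence may be nonfunctional."""
    seq = seq.upper()
    # Split into complete codons read in frame 0; a trailing partial codon is ignored.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop in the last complete codon is expected; earlier ones are suspicious.
    internal_stops = [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
    return {
        "length_not_multiple_of_3": len(seq) % 3 != 0,
        "internal_stop_codons": internal_stops,
    }


intact = coding_frame_issues("ATGAAATGA")          # ATG AAA TGA
premature = coding_frame_issues("ATGTAAAAATGA")    # ATG TAA AAA TGA
shifted = coding_frame_issues("ATGAAAATGA")        # one extra base
```

A check like this only detects that something is wrong in a sequence; placing the frameshift correctly relative to the other sequences is the alignment problem that a codon-aware aligner addresses.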
Subjects
Computational Biology/methods, Sequence Alignment/methods, DNA Sequence Analysis/methods, Software, Algorithms, Amino Acid Sequence/genetics, Animals, Base Sequence/genetics, Molecular Evolution, Genomics, Phylogeny, Pseudogenes
ABSTRACT
Because of their high level of diversity and complex evolutionary histories, most studies on plant receptor-like kinase subfamilies have focused on their kinase domains. With the large amount of genome sequence data available today, particularly on basal land plants and Charophyta, more attention should be paid to primary events that shaped the diversity of the RLK gene family. We thus focus on the motifs and domains found in association with kinase domains to illustrate their origin, organization, and evolutionary dynamics. We discuss when these different domain associations first occurred and how they evolved, based on a literature review complemented by some of our unpublished results.
Subjects
Plant Proteins, Plants, Biological Evolution, Plant Genome, Phylogeny, Plant Proteins/genetics, Plants/genetics, Protein Serine-Threonine Kinases
ABSTRACT
BACKGROUND: Plant non-specific lipid transfer proteins (nsLTPs) are encoded by multigene families and possess physiological functions that remain unclear. Our objective was to characterize the complete nsLtp gene family in rice and arabidopsis and to perform wheat EST database mining for nsLtp gene discovery. RESULTS: In this study, we carried out a genome-wide analysis of nsLtp gene families in Oryza sativa and Arabidopsis thaliana and identified 52 rice nsLtp genes and 49 arabidopsis nsLtp genes. Here we present a complete overview of the genes and deduced protein features. Tandem duplication repeats, which represent 26 out of the 52 rice nsLtp genes and 18 out of the 49 arabidopsis nsLtp genes identified, support the complexity of the nsLtp gene families in these species. Phylogenetic analysis revealed that rice and arabidopsis nsLTPs are clustered in nine different clades. In addition, we performed comparative analysis of rice nsLtp genes and wheat (Triticum aestivum) EST sequences indexed in the UniGene database. We identified 156 putative wheat nsLtp genes, among which 91 were found in the 'Chinese Spring' cultivar. The 122 wheat non-redundant nsLTPs were organized in eight types and 33 subfamilies. Based on the observation that seven of these clades were present in arabidopsis, rice and wheat, we conclude that the major functional diversification within the nsLTP family predated the monocot/dicot divergence. In contrast, there are no type VII nsLTPs in arabidopsis, and type IX nsLTPs were only identified in arabidopsis. The larger number of nsLtp genes in wheat may simply be due to the hexaploid state of wheat but may also reflect extensive duplication of gene clusters as observed on rice chromosomes 11 and 12 and arabidopsis chromosome 5. CONCLUSION: Our current study provides fundamental information on the organization of the rice, arabidopsis and wheat nsLtp gene families. The multiplicity of nsLTP types provides new insights on arabidopsis, rice and wheat nsLtp gene families and will strongly support further transcript profiling or functional analyses of nsLtp genes. Until such time as specific physiological functions are defined, it seems relevant to categorize plant nsLTPs on the basis of sequence similarity and/or phylogenetic clustering.
Subjects
Arabidopsis/genetics, Carrier Proteins/genetics, Oryza/genetics, Plant Proteins/genetics, Triticum/genetics, Amino Acid Sequence, Arabidopsis Proteins/genetics, Genetic Databases, Expressed Sequence Tags, Plant Genes, Genetic Variation, Plant Genome, Genomics, Molecular Sequence Data, Multigene Family, Phylogeny, Amino Acid Sequence Homology, Species Specificity, Tandem Repeat Sequences
ABSTRACT
Plant cell walls play a fundamental role in several plant traits and also influence crop use for livestock nutrition or biofuel production. The Glycosyltransferase family 61 (GT61) is involved in the synthesis of cell wall xylans. In grasses (Poaceae), a copy number expansion was reported for the GT61 family, which raised the question of the evolutionary history of this gene family in a broader taxonomic context. A phylogenetic study was performed on GT61 members from 13 species representing the major angiosperm clades, in order to classify the genes, reconstruct the evolutionary history of this gene family and study its expansion in monocots. Four orthogroups (OG) were identified in angiosperms, with two of them displaying a copy number expansion in monocots. These copy number expansions resulted from both tandem and segmental duplications during the genome evolution of monocot lineages. Positive selection footprints were detected on the ancestral branch leading to one of the orthogroups, suggesting that the gene number expansion was accompanied, at least partially, by functional diversification. We propose an OG-based classification framework for the GT61 genes at different taxonomic levels within angiosperms, useful for any further functional or translational biology study.
ABSTRACT
Oaks are an important part of our natural and cultural heritage. Not only are they ubiquitous in our most common landscapes [1] but they have also supplied human societies with invaluable services, including food and shelter, since prehistoric times [2]. With 450 species spread throughout Asia, Europe and America [3], oaks constitute a critical global renewable resource. The longevity of oaks (several hundred years) probably underlies their emblematic cultural and historical importance. Such long-lived sessile organisms must persist in the face of a wide range of abiotic and biotic threats over their lifespans. We investigated the genomic features associated with such a long lifespan by sequencing, assembling and annotating the oak genome. We then used the growing number of whole-genome sequences for plants (including tree and herbaceous species) to investigate the parallel evolution of genomic characteristics potentially underpinning tree longevity. A further consequence of the long lifespan of trees is their accumulation of somatic mutations during mitotic divisions of stem cells present in the shoot apical meristems. Empirical [4] and modelling [5] approaches have shown that intra-organismal genetic heterogeneity can be selected for [6] and provides direct fitness benefits in the arms race with short-lived pests and pathogens through a patchwork of intra-organismal phenotypes [7]. However, there is no clear proof that large-statured trees consist of a genetic mosaic of clonally distinct cell lineages within and between branches. Through this case study of oak, we demonstrate the accumulation and transmission of somatic mutations and the expansion of disease-resistance gene families in trees.
Subjects
Plant Genome/genetics, Quercus/genetics, Biological Evolution, Plant DNA/genetics, Genetic Variation/genetics, Longevity/genetics, Mutation, Phylogeny, DNA Sequence Analysis
ABSTRACT
Domestication is known to strongly reduce genomic diversity through population bottlenecks. The resulting loss of polymorphism has been thoroughly documented in numerous cultivated species. Here we investigate the impact of domestication on the diversity of alternative transcript expression using RNAseq data obtained on cultivated and wild sorghum accessions (ten accessions for each pool). To this end, we focus on genes expressing two isoforms in sorghum and estimate the ratio between the expression levels of those isoforms in each accession. Notably, for a given gene, one isoform can be either overexpressed or underexpressed in some wild accessions, whereas in the cultivated accessions, the balance between the two isoforms of the same gene appears to be much more homogeneous. Indeed, we observe in sorghum significantly more variation in isoform expression balance among wild accessions than among domesticated accessions. The loss of nucleotide diversity due to domestication could affect regulatory elements controlling transcription or degradation of these isoforms. The impact on the isoform expression balance is discussed. As far as we know, this is the first time that the impact of domestication on transcript isoform balance has been studied at the genomic scale. This could pave the way towards the identification of key domestication genes with finely tuned isoform expression in domesticated accessions while being highly variable in their wild relatives.
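The comparison described above (an isoform expression ratio per accession, then its variability within each pool) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the log2 ratio, the pseudocount of 1, the use of sample variance as the variability measure, and all expression values are hypothetical choices made for the example.

```python
import math
from statistics import variance


def log_isoform_ratio(expr_a, expr_b, pseudocount=1.0):
    """log2 ratio of two isoform expression levels; the pseudocount avoids division by zero."""
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


def balance_variance(pairs):
    """Sample variance of the log2 isoform ratio across accessions of one pool."""
    ratios = [log_isoform_ratio(a, b) for a, b in pairs]
    return variance(ratios)


# Hypothetical (isoform A, isoform B) expression values for one gene,
# one tuple per accession.
wild = [(7, 1), (1, 7), (3, 3), (15, 1)]
cultivated = [(3, 3), (7, 3), (3, 3), (3, 7)]

wild_var = balance_variance(wild)
cultivated_var = balance_variance(cultivated)
```

With these toy values the wild pool shows the larger variance, mirroring the pattern the abstract reports. A real analysis would repeat this per gene and test the difference statistically, which the sketch does not do.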
Subjects
Alternative Splicing/genetics, Sorghum/genetics, Sorghum/metabolism, Alternative Splicing/physiology, Plant Proteins/genetics, Plant Proteins/metabolism, Protein Isoforms/genetics, Protein Isoforms/metabolism
ABSTRACT
[This corrects the article on p. 381 in vol. 8, PMID: 28424707.]
ABSTRACT
Leucine-rich repeat receptor-like kinase (LRR-RLK) genes represent a large and complex gene family in plants, mainly involved in development and stress responses. These receptors are composed of an LRR-containing extracellular domain (ECD), a transmembrane domain (TM) and an intracellular kinase domain (KD). To provide new perspectives on functional analyses of these genes in model and non-model plant species, we performed a phylogenetic analysis on 8,360 LRR-RLK receptors in 31 angiosperm genomes (8 monocots and 23 dicots). We identified 101 orthologous groups (OGs) of genes that are conserved among almost all monocot and dicot species analyzed. We observed that more than 10% of these OGs are absent in the Brassicaceae species studied. We show that the ECD structural features are not always conserved among orthologs, suggesting that functions may have diverged in some OG sets. Moreover, we looked at targets of positive selection footprints in 12 pairs of OGs and noticed that, depending on the subgroup, positive selection occurred more frequently either in the ECDs or in the KDs.
ABSTRACT
We produced a unique large data set of reference transcriptomes to obtain new knowledge about the evolution of plant genomes and crop domestication. For this purpose, we validated an RNA-Seq data assembly protocol to perform comparative population genomics. For the validation, we assessed and compared the quality of de novo Illumina short-read assemblies using data from two crops for which an annotated reference genome was available, namely grapevine and sorghum. We used the same protocol for the release of 26 new transcriptomes of crop plants and wild relatives, including still understudied crops such as yam, pearl millet and fonio. The species list has a wide taxonomic representation with the inclusion of 15 monocots and 11 eudicots. All contigs were annotated using BLAST, prot4EST and Blast2GO. A distinctive feature of the data set is that each crop is associated with closely related species, which will permit whole-genome comparative evolutionary studies between crops and their wild relatives. This large resource will thus serve research communities working on both crops and model organisms. All the data are available at http://arcad-bioinformatics.southgreen.fr/.
Subjects
Agricultural Crops/genetics, Plant Genome, Metagenomics, Transcriptome, Biological Evolution, Contig Mapping
ABSTRACT
Extensive genomic resources are available in the model legume Medicago truncatula. Here, we present the discovery and design of the first array of single-nucleotide polymorphism (SNP) markers in M. truncatula through large-scale Sanger resequencing of genomic fragments spanning the genome, in a diverse panel of 16 M. truncatula accessions. Both anonymous fragments and fragments targeting candidate genes for flowering phenology and symbiosis were surveyed for nucleotide variation in almost 230 kb of unique genomic regions. A set of 384 SNP markers was designed for an Illumina GoldenGate assay, genotyped on a collection of 192 inbred lines (CC192) representing the geographical range of the species and used to survey the diversity of two natural populations. Overall, 86% of the tested SNPs were of high quality and exhibited polymorphism in the CC192 collection. Even at the population level, we detected polymorphism for more than 50% of the selected SNPs. Analysis of the allele frequency spectrum in the CC192 showed a reduced ascertainment bias, mostly limited to very rare alleles (frequency <0.01). The substantial polymorphism detected at the species and population levels, the high marker quality and the potential to survey large samples of individuals make this set of SNP markers a valuable tool to improve our understanding of the effect of demographic and selective factors that shape the natural genetic diversity within the selfing species Medicago truncatula.
Subjects
Demography, Genetic Variation, Plant Genome/genetics, Medicago truncatula/genetics, Single Nucleotide Polymorphism/genetics, Base Sequence, DNA Primers/genetics, Gene Frequency, Genotype, Mediterranean Region, Molecular Sequence Data, Polymerase Chain Reaction, DNA Sequence Analysis
ABSTRACT
We studied the evolution of genes located in the same physical locus using the recently sequenced Ha locus in seven wheat genomes from diploid, tetraploid, and hexaploid species, and compared them with orthologous regions in barley and rice. We investigated both the conservation of microcolinearity and the molecular evolution of genes, including coding and noncoding sequences. Microcolinearity is restricted to two groups of genes (Unknown gene-2, VAMP, BGGP, Gsp-1, and Unknown gene-8 surrounded by several copies of ATPase), almost conserved in rice and barley, but in a different relative position. Genes that are highly conserved between wheat and rice sit alongside genes with different copy numbers and highly variable sequences between closely related wheat genomes. Coding sequence evolution appeared to be subject to heterogeneous selective pressure, and analysis of intronic sequences revealed that the molecular clock hypothesis is violated in most cases.
Subjects
Molecular Evolution, Plant Genes/genetics, Hordeum/genetics, Triticum/genetics, Base Sequence, Bacterial Artificial Chromosomes/genetics, Plant Chromosomes/genetics, Codon/genetics, Conserved Sequence, Intergenic DNA/genetics, Introns/genetics, Molecular Sequence Data, Oryza/genetics
ABSTRACT
Modern sugarcane (Saccharum spp.) is an important grass that contributes 60% of the raw sugar produced worldwide and has a high biofuel production potential. It was created about a century ago through hybridization of two highly polyploid species, namely S. officinarum and S. spontaneum. We investigated genome dynamics in this highly polyploid context by analyzing two homoeologous sequences (97 and 126 kb) in a region that has already been studied in several cereals. Our findings indicate that the two Saccharum species diverged 1.5-2 million years ago from one another and 8-9 million years ago from sorghum. The two sugarcane homoeologous haplotypes show perfect colinearity as well as high gene structure conservation. Apart from the insertion of a few retrotransposable elements, high homology was also observed for the non-transcribed regions. Relative to sorghum, the sugarcane sequences displayed colinearity, with the exception of two genes present only in sorghum, and striking homology in most non-coding parts of the genome. The gene distribution highlighted high synteny and colinearity with rice, and partial colinearity with each homoeologous maize region, which became perfect when the sequences were combined. The haplotypes observed in sugarcane may thus closely represent the ancestral Andropogoneae haplotype. This analysis of sugarcane haplotype organization at the sequence level suggests that the high ploidy in sugarcane did not induce generalized reshaping of its genome, thus challenging the idea that polyploidy quickly induces generalized rearrangement of genomes. These results also confirm the view that sorghum is the model of choice for sugarcane.