ABSTRACT
Polyploidization is important to the evolution of plants. Subgenome dominance is a distinct phenomenon associated with most allopolyploids. A gene on the dominant subgenome tends to be expressed at higher RNA levels in all organs than its syntenic paralogue (homoeolog). The mechanism that underlies the formation of subgenome dominance remains unknown, but there is evidence that differences in transposon/DNA methylation density near the genes of the parents are causal. A lower density of transposons and methylation near genes is positively associated with subgenome dominance. Here, we generated eight generations of allotetraploid progenies from the merging of the parental genomes Brassica rapa and Brassica oleracea. We found that the differences in transposon/methylation density near genes between the parents (rapa:oleracea) existed in the wide hybrid and persisted in the neotetraploids (the synthetic Brassica napus), but these neotetraploids did not show the expected subgenome dominance. This absence of B. rapa vs. B. oleracea subgenome dominance is particularly significant because, while there is no negative relationship between transposon/methylation level and subgenome dominance in the neotetraploids, the more ancient parental subgenomes for all Brassica did show differences in transposon/methylation densities near genes and did express, in the same samples of cells, biased gene expression diagnostic of subgenome dominance. We conclude that subgenome differences in methylated transposons near genes are not sufficient to initiate the biased gene expression that defines subgenome dominance. Our result was unexpected, and we suggest a "nuclear chimera" model to explain our data.
Subjects
Brassica napus, Brassica rapa, Brassica, Brassica/genetics, Plant Genome/genetics, Brassica rapa/genetics, Brassica napus/genetics, DNA Methylation/genetics, Polyploidy
ABSTRACT
Polyploidization plays a crucial role in plant evolution and is becoming increasingly important in breeding. Structural variations and epigenomic repatterning have been observed in synthetic polyploidizations. However, the mechanisms underlying the occurrence and their effects on gene expression and phenotype remain unknown. Here, we investigated genome-wide large deletion/duplication regions (DelDups) and genomic methylation dynamics in leaf organs of progeny from the first eight generations of synthetic tetraploids derived from Chinese cabbage (Brassica rapa L. ssp. pekinensis) and cabbage (Brassica oleracea L. var. capitata). One- or two-copy DelDups, with a mean size of 5.70 Mb (400 kb - 65.85 Mb), occurred from the first generation of selfing and thereafter. The duplication of a fragment in one subgenome consistently coincided with the deletion of its syntenic fragment in the other subgenome, and vice versa, indicating that these DelDups were generated by homoeologous exchanges (HEs). Interestingly, the larger the genomic syntenic region, the higher the frequency of DelDups, further suggesting that the pairing of large homoeologous fragments is crucial for HEs. Moreover, we found that the active transcription of continuously distributed genes in local regions is positively associated with the occurrence of HE breakpoints. In addition, the expression of genes within DelDups exhibited a dosage effect, and plants with extra parental genomic fragments generally displayed phenotypes biased towards the corresponding parent. Genome-wide methylation fluctuated remarkably, which did not clearly affect gene expression on a large scale. Our findings provide insights into the early evolution of polyploid genomes, offering valuable knowledge for polyploidization-based breeding.
ABSTRACT
Twenty-four-nucleotide (nt) small interfering RNAs (siRNAs) maintain asymmetric DNA methylation at thousands of euchromatic transposable elements in plant genomes in a process called RNA-directed DNA methylation (RdDM). RdDM is dispensable for growth and development in Arabidopsis thaliana, but is required for reproduction in other plants, such as Brassica rapa. The 24-nt siRNAs are abundant in maternal reproductive tissue, due largely to overwhelming expression from a few loci in the ovule and developing seed coat, termed siren loci. A recent study showed that 24-nt siRNAs produced in the anther tapetal tissue can methylate male meiocyte genes in trans. Here we show that in B. rapa, a similar process takes place in female tissue. siRNAs are produced from gene fragments embedded in some siren loci, and these siRNAs can trigger methylation in trans at related protein-coding genes. This trans-methylation is associated with silencing of some target genes and may be responsible for seed abortion in RdDM mutants. Furthermore, we demonstrate that a consensus sequence in at least two families of DNA transposons is associated with abundant siren expression, most likely through recruitment of CLASSY3, a putative chromatin remodeler. This research describes a mechanism whereby RdDM influences gene expression and sheds light on the role of RdDM during plant reproduction.
Subjects
Arabidopsis Proteins, Arabidopsis, Arabidopsis/genetics, Arabidopsis Proteins/metabolism, Chromatin/metabolism, DNA Methylation/genetics, DNA Transposable Elements/genetics, Plant Gene Expression Regulation/genetics, Nucleotides/metabolism, Ovule/genetics, Ovule/metabolism, Plant RNA/genetics, Small Interfering RNA/genetics, Small Interfering RNA/metabolism
ABSTRACT
The members of the tribe Brassiceae share a whole-genome triplication (WGT), and one proposed model for its formation is a two-step pair of hybridizations producing hexaploid descendants. However, evidence for this model is incomplete, and the evolutionary and functional constraints that drove evolution after the hexaploidy are even less understood. Here, we report a new genome sequence of Crambe hispanica, a species sister to most sequenced Brassiceae. Using this new genome and three others that share the hexaploidy, we traced the history of gene loss after the WGT using the Polyploidy Orthology Inference Tool (POInT). We confirm the two-step formation model and infer that there was a significant temporal gap between those two allopolyploidizations, with about a third of the gene losses from the first two subgenomes occurring before the arrival of the third. We also, for the 90,000 individual genes in our study, make parental subgenome assignments, inferring, with measured uncertainty, from which of the progenitor genomes of the allohexaploidy each gene derives. We further show that each subgenome has a statistically distinguishable rate of homoeolog losses. There is little indication of functional distinction between the three subgenomes: the individual subgenomes show no patterns of functional enrichment, no excess of shared protein-protein or metabolic interactions between their members, and no biases in their likelihood of having experienced a recent selective sweep. We propose a "mix and match" model of allopolyploidy, in which subgenome origin drives homoeolog loss propensities but where genes from different subgenomes function together without difficulty.
Subjects
Genome, Polyploidy, Molecular Evolution, Plant Genome, Humans, Genetic Hybridization, Phylogeny
ABSTRACT
Leaf heading is an important and economically valuable horticultural trait in many vegetables. The formation of a leafy head is a specialized leaf morphogenesis characterized by the emergence of the enlarged incurving leaves. However, the transcriptional regulation mechanisms underlying the transition to leaf heading remain unclear. We carried out large-scale time-series transcriptome assays covering the major vegetative growth phases of two heading Brassica crops, Chinese cabbage and cabbage, with the non-heading morphotype Taicai as the control. A regulatory transition stage that initiated the heading process is identified, accompanied by a developmental switch from rosette leaf to heading leaf in Chinese cabbages. This transition did not exist in the non-heading control. Moreover, we reveal that the heading transition stage is also conserved in the cabbage clade. Chinese cabbage acquired through domestication a leafy head independently from the origins of heading in other cabbages; phylogenetics supports that the ancestor of all cabbages is non-heading. The launch of the transition stage is closely associated with the ambient temperature. In addition, examination of the biological activities in the transition stage identified the ethylene pathway as particularly active, and we hypothesize that this pathway was targeted for selection for domestication to form the heading trait specifically in Chinese cabbage. In conclusion, our findings on the transcriptome transition that initiated the leaf heading in Chinese cabbage and cabbage provide a new perspective for future studies of leafy head crops.
Subjects
Brassica, Brassica/metabolism, Plant Leaves/metabolism, Plant Proteins/genetics, Plant Proteins/metabolism, Transcriptome
ABSTRACT
Small RNAs are abundant in plant reproductive tissues, especially 24-nucleotide (nt) small interfering RNAs (siRNAs). Most 24-nt siRNAs are dependent on RNA Pol IV and RNA-DEPENDENT RNA POLYMERASE 2 (RDR2) and establish DNA methylation at thousands of genomic loci in a process called RNA-directed DNA methylation (RdDM). In Brassica rapa, RdDM is required in the maternal sporophyte for successful seed development. Here, we demonstrate that a small number of siRNA loci account for over 90% of siRNA expression during B. rapa seed development. These loci exhibit unique characteristics with regard to their copy number and association with genomic features, but they resemble canonical 24-nt siRNA loci in their dependence on RNA Pol IV/RDR2 and role in RdDM. These loci are expressed in ovules before fertilization and in the seed coat, embryo, and endosperm following fertilization. We observed a similar pattern of 24-nt siRNA expression in diverse angiosperms despite rapid sequence evolution at siren loci. In the endosperm, siren siRNAs show a marked maternal bias, and siren expression in maternal sporophytic tissues is required for siren siRNA accumulation. Together, these results demonstrate that seed development occurs under the influence of abundant maternal siRNAs that might be transported to, and function in, filial tissues.
Subjects
Brassica rapa/embryology, Developmental Gene Expression Regulation/physiology, Plant Gene Expression Regulation/physiology, Plant RNA, Seeds/growth & development, Alleles, Arabidopsis/metabolism, Brassica rapa/genetics, Brassica rapa/growth & development, Brassica rapa/metabolism, Plant Proteins/genetics, Plant Proteins/metabolism, Small Interfering RNA, Seeds/genetics, Seeds/metabolism
ABSTRACT
Recent pangenome studies have revealed a large fraction of the gene content within a species exhibits presence-absence variation (PAV). However, coding regions alone provide an incomplete assessment of functional genomic sequence variation at the species level. Little to no attention has been paid to noncoding regulatory regions in pangenome studies, though these sequences directly modulate gene expression and phenotype. To uncover regulatory genetic variation, we generated chromosome-scale genome assemblies for thirty Arabidopsis thaliana accessions from multiple distinct habitats and characterized species level variation in Conserved Noncoding Sequences (CNS). Our analyses uncovered not only PAV and positional variation (PosV) but that diversity in CNS is nonrandom, with variants shared across different accessions. Using evolutionary analyses and chromatin accessibility data, we provide further evidence supporting roles for conserved and variable CNS in gene regulation. Additionally, our data suggests that transposable elements contribute to CNS variation. Characterizing species-level diversity in all functional genomic sequences may later uncover previously unknown mechanistic links between genotype and phenotype.
Subjects
Arabidopsis/genetics, Conserved Sequence, Molecular Evolution, Genetic Variation, Nucleic Acid Regulatory Sequences/genetics, Gene Duplication, Plant Genome, Genetic Selection
ABSTRACT
MOTIVATION: Over the last decade, RNA-Seq whole-genome sequencing has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues and cell types. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data generated for individual genomic analysis accessible and reusable at a gene-level scale for comparative analysis between genes, across different genomes and meta-analyses. RESULTS: To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller allows users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited extensively in the scientific literature, demonstrating its importance to researchers. Our new version of qTeller now supports multiple genomes for intergenomic comparisons, and includes capabilities for both mRNA and protein abundance datasets. Other new features include support for additional data formats, modernized interface and back-end database and an optimized framework for adoption by other organisms' databases. AVAILABILITY AND IMPLEMENTATION: The source code for qTeller is open-source and available through GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/qTeller). A maize instance of qTeller is available at the Maize Genetics and Genomics database (MaizeGDB) (https://qteller.maizegdb.org/), where we have mapped over 200 unique datasets from GenBank across 27 maize genomes. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genome, Genomics, Software, Nucleic Acid Databases, Zea mays/genetics, Gene Expression Profiling
ABSTRACT
Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a 'near-complete' draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
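For readers unfamiliar with the N50 statistic quoted above: it is the contig length L such that contigs of length L or longer together contain at least half of the assembled bases. The following is a minimal Python sketch of that definition. The function name `n50` and the toy contig lengths are illustrative assumptions and are not data from the Oropetium assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total,
    taken from the longest contig downward, first reaches half of
    the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why the statistic is commonly reported alongside the contig count.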
Subjects
Plant Genome/genetics, Poaceae/genetics, DNA Sequence Analysis/methods, Acclimatization/genetics, Contig Mapping, Dehydration, Desiccation, Droughts, Plant Genes/genetics, Genomics, Molecular Sequence Data
ABSTRACT
Small RNAs trigger repressive DNA methylation at thousands of transposable elements in a process called RNA-directed DNA methylation (RdDM). The molecular mechanism of RdDM is well characterized in Arabidopsis, yet the biological function remains unclear, as loss of RdDM in Arabidopsis causes no overt defects, even after generations of inbreeding. It is known that 24 nucleotide Pol IV-dependent siRNAs, the hallmark of RdDM, are abundant in flowers and developing seeds, indicating that RdDM might be important during reproduction. Here we show that, unlike Arabidopsis, mutations in the Pol IV-dependent small RNA pathway cause severe and specific reproductive defects in Brassica rapa. High rates of abortion occur when seeds have RdDM mutant mothers, but not when they have mutant fathers. Although abortion occurs after fertilization, RdDM function is required in maternal somatic tissue, not in the female gametophyte or the developing zygote, suggesting that siRNAs from the maternal soma might function in filial tissues. We propose that recently outbreeding species such as B. rapa are key to understanding the role of RdDM during plant reproduction.
Subjects
Brassica rapa/genetics, DNA Methylation, Small Interfering RNA/genetics, Seeds/genetics, Brassica rapa/embryology, Brassica rapa/enzymology, Brassica rapa/physiology, DNA Transposable Elements/genetics, DNA-Directed RNA Polymerases/genetics, DNA-Directed RNA Polymerases/metabolism, Diploidy, Genotype, Mutation, Phenotype, Plant Breeding, Plant Proteins/genetics, Plant Proteins/metabolism, Plant RNA/genetics, Reproduction, Seeds/embryology, Seeds/enzymology, Seeds/physiology
ABSTRACT
In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ~170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing-associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates.
Subjects
Molecular Evolution, Plants/genetics, Vertebrates/genetics, Alternative Splicing, Animals, Base Sequence, Conserved Sequence, Molecular Sequence Data, Nucleic Acid Conformation, Phylogeny, Plant RNA/chemistry, Plant RNA/genetics, Nucleic Acid Sequence Homology, Transcription Factors/genetics
ABSTRACT
Whole-genome duplications happen repeatedly in a typical flowering plant lineage. Following most ancient tetraploidies, the two subgenomes are distinguishable because one subgenome, the dominant subgenome, tends to have more genes than the other subgenome. Additionally, among retained pairs, the gene on the dominant subgenome tends to be expressed more than its recessive homeolog. Using comparative genomics, we show that genome dominance is heritable. The dominant subgenome of one postpolyploidy event remains dominant through a subsequent polyploidy event. We show that transposon-derived 24-nt RNAs target and cover the upstream region of retained genes preferentially when located on the recessive subgenome, and with little regard for a gene's level of expression. We hypothesize that small RNA (smRNA)-mediated silencing of transposons near genes causes position-effect down-regulation. Unlike 24-nt smRNA coverage, transposon coverage tracks gene expression, so not all transposons behave identically. We propose that successful ancient tetraploids begin as wide crosses between two lines, each evolved for different tradeoffs between transposon silencing and negative position effects on gene expression. We hypothesize that following a chaotic wide-cross/new tetraploid period, genes acquire their new expression balances based on differences in transposon coverage in the parents. We envision patches of silenceable transposon as quantitative cis-regulators of baseline transcription rate. Attractive solutions to heterosis and the C-value paradox are mentioned.
Subjects
Gene Regulatory Networks, Plant Genome, Polyploidy, DNA Transposable Elements, Plant RNA/genetics
ABSTRACT
Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species.
Subjects
Brassica rapa/genetics, DNA Transposable Elements, Genetic Epigenesis, Plant Gene Expression Regulation, Plant Genome
ABSTRACT
Whole genome duplications (WGDs) occurred in the distant evolutionary history of many lineages and are particularly frequent in the flowering plant lineages. Following paleopolyploidization in plants, most duplicated genes are deleted by intrachromosomal recombination, a process referred to as fractionation. In the examples studied so far, genes are disproportionately lost from one of the parental subgenomes (biased fractionation) and the subgenome having lost the lowest number of genes is more expressed (genome dominance). In the present study, we analyzed the pattern of gene deletion and gene expression following the most recent WGD in banana (alpha event) and extended our analyses to seven other sequenced plant genomes: poplar, soybean, medicago, arabidopsis, sorghum, brassica, and maize. We propose a new class of ancient WGD, with Musa (alpha), poplar, and soybean as members, where genes are both deleted and expressed to an equal extent (unbiased fractionation and genome equivalence). We suggest that WGDs with genome dominance and biased fractionation (Class I) may result from ancient allotetraploidies, while WGDs without genome dominance or biased fractionation (Class II) may result from ancient autotetraploidies.
Subjects
Plant Genome, Plants/genetics, Polyploidy, Amino Acid Substitution, Molecular Evolution, Gene Duplication, Plant Genes, Musa/genetics, Phylogeny, Plants/classification, Species Specificity, Transcriptome
ABSTRACT
Gene expression is controlled by the complex interaction of transcription factors binding to promoters and other regulatory DNA elements. One common characteristic of the genomic regions associated with regulatory proteins is a pronounced sensitivity to DNase I digestion. We generated genome-wide high-resolution maps of DNase I hypersensitive (DH) sites from both seedling and callus tissues of rice (Oryza sativa). Approximately 25% of the DH sites from both tissues were found in putative promoters, indicating that the vast majority of the gene regulatory elements in rice are not located in promoter regions. We found 58% more DH sites in the callus than in the seedling. For DH sites detected in both the seedling and callus, 31% displayed significantly different levels of DNase I sensitivity within the two tissues. Genes that are differentially expressed in the seedling and callus were frequently associated with DH sites in both tissues. The DNA sequences contained within the DH sites were hypomethylated, consistent with what is known about active gene regulatory elements. Interestingly, tissue-specific DH sites located in the promoters showed a higher level of DNA methylation than the average DNA methylation level of all the DH sites located in the promoters. A distinct elevation of H3K27me3 was associated with intergenic DH sites. These results suggest that epigenetic modifications play a role in the dynamic changes of the numbers and DNase I sensitivity of DH sites during development.
Subjects
Chromatin/genetics, Chromosome Mapping/methods, Plant Genome/genetics, Oryza/genetics, Seedlings/genetics, Deoxyribonuclease I/chemistry, Genome-Wide Association Study/methods
ABSTRACT
The knotted1 (kn1) homeobox (knox) gene family was first identified through gain-of-function dominant mutants in maize (Zea mays). Class I knox members are expressed in meristems but excluded from leaves. In maize, a loss-of-function phenotype has only been characterized for kn1. To assess the function of another knox member, we characterized a loss-of-function mutation of rough sheath1 (rs1). rs1-mum1 has no phenotype alone but exacerbates several aspects of the kn1 phenotype. In permissive backgrounds in which kn1 mutants grow to maturity, loss of a single copy of rs1 enhances the tassel branch reduction phenotype, while loss of both copies results in limited shoots. In less introgressed lines, double mutants can grow to maturity but are shorter. Using a KNOX antibody, we demonstrate that RS1 binds in vivo to some of the KN1 target genes, which could partially explain why KN1 binds many genes but modulates few. Our results demonstrate an unequal redundancy between knox genes, with a role for rs1 only revealed in the complete absence of kn1.
Subjects
Homeobox Genes, Plant Proteins/genetics, Zea mays/genetics, Plant Gene Expression Regulation, Homeodomain Proteins/genetics, Homeodomain Proteins/metabolism, Meristem/genetics, Multigene Family, Mutation, Phenotype, Plant Leaves/genetics, Plant Proteins/metabolism, Plant Shoots/genetics, Plant Shoots/metabolism, Zea mays/growth & development
ABSTRACT
Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.
Subjects
Molecular Evolution, Plant Genome/genetics, Poaceae/genetics, Sorghum/genetics, Arabidopsis/genetics, Plant Chromosomes/genetics, Gene Duplication, Plant Genes, Oryza/genetics, Populus/genetics, Genetic Recombination/genetics, Sequence Alignment, DNA Sequence Analysis, Sequence Deletion/genetics, Zea mays/genetics
ABSTRACT
Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
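The idea of finding identical segments of a minimum length shared by two genomes, without computing an alignment, can be illustrated with a small hash-index sketch in Python. This is not the authors' information-retrieval method, which the abstract does not describe in detail. The function name `shared_identical_segments`, the toy sequences, and the 5-bp threshold (the study used 100 bp) are assumptions made for illustration.

```python
from collections import defaultdict


def shared_identical_segments(a, b, min_len):
    """Return maximal identical substrings of length >= min_len present in both a and b."""
    # Index every window of length min_len in sequence a by its start position.
    index = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    found = set()
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # Skip seeds that could be extended to the left; the seed one
            # position earlier already reports the same segment.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the exact match to the right as far as it goes.
            end = min_len
            while i + end < len(a) and j + end < len(b) and a[i + end] == b[j + end]:
                end += 1
            found.add(a[i:i + end])
    return found


# Toy sequences, for illustration only.
seq_a = "TTACGTACGGAT"
seq_b = "CCACGTACGGCC"
print(shared_identical_segments(seq_a, seq_b, 5))
```

Because the index is keyed on exact windows, the sketch makes no use of synteny; it reports identical segments wherever they occur, which corresponds to the abstract's distinction between syntenic UCEs and nonsyntenic LIMEs only in that position is ignored.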
Subjects
Conserved Sequence/genetics, Molecular Evolution, Plant Genome/genetics, Genome/genetics, Amino Acid Sequence, Animals, Arabidopsis/genetics, Base Sequence, Cell Nucleus/genetics, Chromosome Mapping, Mammalian Chromosomes/genetics, Plant Chromosomes/genetics, Gene Regulatory Networks, Mitochondrial Genome/genetics, Humans, Mice, Genetic Models, Molecular Sequence Data, Rats, Species Specificity, Synteny
ABSTRACT
Certain types of gene families, such as those encoding most families of transcription factors, maintain their chromosomal syntenic positions throughout angiosperm evolutionary time. Other nonsyntenic gene families are prone to deletion, tandem duplication, and transposition. Here, we describe the chromosomal positional history of all genes in Arabidopsis thaliana throughout the rosid superorder. We introduce a public database where researchers can look up the positional history of their favorite A. thaliana gene or gene family. Finally, we show that specific gene families transposed at specific points in evolutionary time, particularly after whole-genome duplication events in the Brassicales, and suggest that genes in mobile gene families are under different selection pressure than syntenic genes.