ABSTRACT
Mitochondrial and plastid functions depend on the coordinated expression of proteins encoded by organellar and nuclear genomes, which differ radically in copy number. In polyploids, doubling of the nuclear genome may add challenges to maintaining balanced expression of proteins involved in cytonuclear interactions. Here, we use ribo-depleted RNA sequencing (RNA-seq) to analyze transcript abundance for nuclear and organellar genomes in leaf tissue from four different polyploid angiosperms and their close diploid relatives. We find that even though plastid genomes contain <1% of the number of genes in the nuclear genome, they generate the majority (69.9 to 82.3%) of messenger RNA (mRNA) transcripts in the cell. Mitochondrial genes are responsible for a much smaller percentage (1.3 to 3.7%) of the leaf mRNA pool but still produce much higher transcript abundances per gene than the nuclear genome. Nuclear genes encoding proteins that functionally interact with mitochondrial or plastid gene products exhibit mRNA expression levels that are consistently more than 10-fold lower than their organellar counterparts, indicating an extreme cytonuclear imbalance at the RNA level despite the predominance of equimolar interactions at the protein level. Nevertheless, interacting nuclear and organellar genes show strongly correlated transcript abundances across functional categories, suggesting that the observed mRNA stoichiometric imbalance does not preclude coordination of cytonuclear expression. Finally, we show that nuclear genome doubling does not alter the cytonuclear expression ratios observed in diploid relatives in consistent or systematic ways, indicating that successful polyploid plants are able to compensate for cytonuclear perturbations associated with nuclear genome doubling.
Subjects
Magnoliopsida; Plastids; Polyploidy; Transcription, Genetic; Cell Nucleus/genetics; Cell Nucleus/metabolism; Genome, Plant; Magnoliopsida/genetics; Plant Leaves/genetics; Plastids/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Plant/genetics; RNA, Plant/metabolism
ABSTRACT
Nuclear and plastid (chloroplast) genomes experience different mutation rates, levels of selection, and transmission modes, yet key cellular functions depend on their coordinated interactions. Functionally related proteins often show correlated changes in rates of sequence evolution across a phylogeny [evolutionary rate covariation (ERC)], offering a means to detect previously unidentified suites of coevolving and cofunctional genes. We performed phylogenomic analyses across angiosperm diversity, scanning the nuclear genome for genes that exhibit ERC with plastid genes. As expected, the strongest hits were highly enriched for genes encoding plastid-targeted proteins, providing evidence that cytonuclear interactions affect rates of molecular evolution at genome-wide scales. Many identified nuclear genes functioned in post-transcriptional regulation and the maintenance of protein homeostasis (proteostasis), including protein translation (in both the plastid and cytosol), import, quality control, and turnover. We also identified nuclear genes that exhibit strong signatures of coevolution with the plastid genome, but their encoded proteins lack organellar-targeting annotations, making them candidates for having previously undescribed roles in plastids. In sum, our genome-wide analyses reveal that plastid-nuclear coevolution extends beyond the intimate molecular interactions within chloroplast enzyme complexes and may be driven by frequent rewiring of the machinery responsible for maintenance of plastid proteostasis in angiosperms.
Subjects
Biological Evolution; Magnoliopsida/genetics; Plant Proteins/genetics; Cell Nucleus/genetics; Chloroplast Proteins/genetics; Chloroplast Proteins/metabolism; Genome, Plant; Genome, Plastid; Genome-Wide Association Study; Proteostasis
ABSTRACT
The chloroplast chaperone CLPC1 unfolds and delivers substrates to the stromal CLPPRT protease complex for degradation. We previously used an in vivo trapping approach to identify interactors with CLPC1 in Arabidopsis thaliana by expressing a STREPII-tagged copy of CLPC1 mutated in its Walker B domains (CLPC1-TRAP) followed by affinity purification and mass spectrometry. To create a larger pool of candidate substrates, adaptors, or regulators, we carried out a far more sensitive and comprehensive in vivo protein trapping analysis. We identified 59 highly enriched CLPC1 protein interactors, in particular proteins belonging to families of unknown functions (DUF760, DUF179, DUF3143, UVR-DUF151, HugZ/DUF2470), as well as the UVR domain proteins EXE1 and EXE2 implicated in singlet oxygen damage and signaling. Phylogenetic and functional domain analyses identified other members of these families that appear to localize (nearly) exclusively to plastids. In addition, several of these DUF proteins are of very low abundance as determined through the Arabidopsis PeptideAtlas (http://www.peptideatlas.org/builds/arabidopsis/), showing that enrichment in the CLPC1-TRAP was extremely selective. Evolutionary rate covariation indicated that the HugZ/DUF2470 family coevolved with the plastid CLP machinery, suggesting functional and/or physical interactions. Finally, mRNA-based coexpression networks showed that all 12 CLP protease subunits tightly coexpressed as a single cluster with deep connections to DUF760-3. Coexpression modules for other trapped proteins suggested specific functions in biological processes, e.g., UVR2 and UVR3 were associated with extraplastidic degradation, whereas DUF760-6 is likely involved in senescence. This study provides a strong foundation for discovery of substrate selection by the chloroplast CLP protease system.
Subjects
Arabidopsis Proteins; Arabidopsis; Chloroplast Proteins; Heat-Shock Proteins; Plastids; Arabidopsis/genetics; Arabidopsis/metabolism; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Chloroplast Proteins/genetics; Chloroplast Proteins/metabolism; Chloroplasts/genetics; Chloroplasts/metabolism; Endopeptidase Clp/metabolism; Heat-Shock Proteins/genetics; Heat-Shock Proteins/metabolism; Molecular Chaperones/metabolism; Phylogeny; Plastids/genetics; Plastids/metabolism; Proteomics
ABSTRACT
While the chloroplast (plastid) is known for its role in photosynthesis, it is also involved in many other metabolic pathways essential for plant survival. As such, plastids contain an extensive suite of enzymes required for non-photosynthetic processes. The evolution of the associated genes has been especially dynamic in flowering plants (angiosperms), including examples of gene duplication and extensive rate variation. We examined the role of ongoing gene duplication in two key plastid enzymes, the acetyl-CoA carboxylase (ACCase) and the caseinolytic protease (Clp), responsible for fatty acid biosynthesis and protein turnover, respectively. In plants, there are two ACCase complexes: a homomeric version present in the cytosol and a heteromeric version present in the plastid. Duplications of the nuclear-encoded homomeric ACCase gene and retargeting of one resultant protein to the plastid have been previously reported in multiple species. We find that these retargeted homomeric ACCase proteins exhibit elevated rates of sequence evolution, consistent with neofunctionalization and/or relaxation of selection. The plastid Clp complex catalytic core is composed of nine paralogous proteins that arose via ancient gene duplication in the cyanobacterial/plastid lineage. We show that further gene duplication occurred more recently in the nuclear-encoded core subunits of this complex, yielding additional paralogs in many species of angiosperms. Moreover, in six of eight cases, subunits that have undergone recent duplication display increased rates of sequence evolution relative to those that have remained single copy. We also compared substitution patterns between pairs of Clp core paralogs to gain insight into post-duplication evolutionary routes. These results show that gene duplication and rate variation continue to shape the plastid proteome.
Subjects
Acetyl-CoA Carboxylase; Magnoliopsida; Acetyl-CoA Carboxylase/genetics; Acetyl-CoA Carboxylase/metabolism; Gene Duplication; Magnoliopsida/genetics; Peptide Hydrolases; Phylogeny; Plastids/genetics; Plastids/metabolism
ABSTRACT
Small RNA-mediated chromatin modification is a conserved feature of eukaryotes. In flowering plants, the short interfering (si)RNAs that direct transcriptional silencing are abundant and subfunctionalization has led to specialized machinery responsible for synthesis and action of these small RNAs. In particular, plants possess polymerase (Pol) IV and Pol V, multi-subunit homologs of the canonical DNA-dependent RNA Pol II, as well as specialized members of the RNA-dependent RNA Polymerase (RDR), Dicer-like (DCL), and Argonaute (AGO) families. Together these enzymes are required for production and activity of Pol IV-dependent (p4-)siRNAs, which trigger RNA-directed DNA methylation (RdDM) at homologous sequences. p4-siRNAs accumulate highly in developing endosperm, a specialized tissue found only in flowering plants, and are rare in nonflowering plants, suggesting that the evolution of flowers might coincide with the emergence of specialized RdDM machinery. Through comprehensive identification of RdDM genes from species representing the breadth of the land plant phylogeny, we describe the ancient origin of Pol IV and Pol V, suggesting that a nearly complete and functional RdDM pathway could have existed in the earliest land plants. We also uncover innovations in these enzymes that are coincident with the emergence of seed plants and flowering plants, and recent duplications that might indicate additional subfunctionalization. Phylogenetic analysis reveals rapid evolution of Pol IV and Pol V subunits relative to their Pol II counterparts and suggests that duplicates were retained and subfunctionalized through Escape from Adaptive Conflict. Evolution within the carboxy-terminal domain of the Pol V largest subunit is particularly striking, where illegitimate recombination facilitated extreme sequence divergence.
Subjects
DNA-Directed RNA Polymerases/genetics; Phylogeny; Plant Proteins/genetics; Plants/enzymology; Plants/genetics; Amino Acid Sequence; DNA-Directed RNA Polymerases/chemistry; Evolution, Molecular; Flowers/genetics; Gene Duplication; Gene Silencing; Genes, Plant; Magnoliopsida/enzymology; Molecular Sequence Data; Plant Proteins/chemistry; Protein Structure, Tertiary; Protein Subunits/genetics; Species Specificity
ABSTRACT
Telomeres are repetitive TG-rich DNA elements essential for maintaining the stability of genomes and replicative capacity of cells in almost all eukaryotes. Most of what is known about telomeres in plants comes from the angiosperm Arabidopsis thaliana, which has become an important comparative model for telomere biology. Arabidopsis tolerates numerous insults to its genome, many of which are catastrophic or lethal in other eukaryotic systems such as yeast and vertebrates. Despite the importance of Arabidopsis in establishing a model for the structure and regulation of plant telomeres, only a handful of studies have used this information to assay components of telomeres from across land plants, or even among the closest relatives of Arabidopsis in the plant family Brassicaceae. Here, we determined how well Arabidopsis represents Brassicaceae by comparing multiple aspects of telomere biology in species that represent major clades in the family tree. Specifically, we determined the telomeric repeat sequence, measured bulk telomere length, and analyzed variation in telomere length on syntenic chromosome arms. In addition, we used a phylogenetic approach to infer the evolutionary history of putative telomere-binding proteins, CTC1, STN1, TEN1 (CST), telomere repeat-binding factor like (TRFL), and single Myb histone (SMH). Our analyses revealed conservation of the telomeric DNA repeat sequence, but considerable variation in telomere length among the sampled species, even in comparisons of syntenic chromosome arms. We also found that the single-stranded and double-stranded telomeric DNA-binding complexes CST and TRFL, respectively, differ in their pattern of gene duplication and loss. The TRFL and SMH gene families have undergone numerous duplication events, and these duplicate copies are often retained in the genome. In contrast, CST components occur as single-copy genes in all sampled genomes, even in species that experienced recent whole genome duplication events. Taken together, our results place the Arabidopsis model in the context of other species in Brassicaceae, making the family the best characterized plant group in regard to telomere architecture.
Subjects
Arabidopsis/genetics; Genes, Plant; Telomere/genetics; Arabidopsis/classification; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; DNA, Plant/genetics; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Histones/genetics; Histones/metabolism; Phylogeny; Sequence Alignment; Sequence Analysis, DNA; Telomere/metabolism; Telomere-Binding Proteins/genetics; Telomere-Binding Proteins/metabolism
ABSTRACT
The interaction between the nuclear and chloroplast genomes in plants is crucial for preserving essential cellular functions in the face of varying rates of mutation, levels of selection, and modes of transmission. Despite this, identifying nuclear genes that coevolve with chloroplast genomes at a genome-wide level has remained a challenge. In this study, we conducted an evolutionary rate covariation analysis to identify candidate nuclear genes coevolving with chloroplast genomes in Juglandaceae. Our analysis was based on 4,894 orthologous nuclear genes and 76 genes across seven chloroplast partitions in nine Juglandaceae species. Our results indicated that 1,369 (27.97%) of the nuclear genes demonstrated signatures of coevolution, with the Ycf1/2 partition yielding the largest number of hits (765) and the ClpP1 partition yielding the fewest (13). These hits were found to be significantly enriched in biological processes related to leaf development, photoperiodism, and response to abiotic stress. Among the seven partitions, the AccD, ClpP1, MatK, and RNA polymerase partitions and their respective hits exhibited a narrow range of dN/dS values, all below 1. In contrast, the Ribosomal, Photosynthesis, and Ycf1/2 partitions and their corresponding hits displayed a broader range of dN/dS values, with certain values exceeding 1. Our findings highlight the differences in the number of candidate nuclear genes coevolving with the seven chloroplast partitions in Juglandaceae species and the correlation between the evolutionary rates of these genes and their corresponding chloroplast partitions.
Subjects
Genome, Chloroplast; Juglandaceae; Phylogeny; Evolution, Molecular; Juglandaceae/genetics; Plastids/genetics; Genomics
ABSTRACT
There is remarkable variation in the rate at which genetic incompatibilities in molecular interactions accumulate. In some cases, minor changes, even single-nucleotide substitutions, create major incompatibilities when hybridization forces new variants to function in a novel genetic background from an isolated population. In other cases, genes or even entire functional pathways can be horizontally transferred between anciently divergent evolutionary lineages that span the tree of life with little evidence of incompatibilities. In this review, we explore whether there are general principles that can explain why certain genes are prone to incompatibilities while others maintain interchangeability. We summarize evidence pointing to four genetic features that may contribute to greater resistance to functional replacement: (1) function in multisubunit enzyme complexes and protein-protein interactions, (2) sensitivity to changes in gene dosage, (3) rapid rate of sequence evolution, and (4) overall importance to cell viability, which creates sensitivity to small perturbations in molecular function. We discuss the relative levels of support for these different hypotheses and lay out future directions that may help explain the striking contrasts in patterns of incompatibility and interchangeability throughout the history of molecular evolution.
Subjects
Evolution, Molecular; Hybridization, Genetic
ABSTRACT
Cytonuclear coevolution is a common feature among plants, which coordinates gene expression and protein products between the nucleus and organelles. Consequently, lineage-specific differences may result in incompatibilities between the nucleus and cytoplasm in hybrid taxa. Allopolyploidy is also a common phenomenon in plant evolution. The hybrid nature of allopolyploids may result in cytonuclear incompatibilities, but the massive nuclear redundancy created during polyploidy affords additional avenues for resolving cytonuclear conflict (i.e. cytonuclear accommodation). Here we evaluate expression changes in organelle-targeted nuclear genes for 6 allopolyploid lineages that represent 4 genera (i.e. Arabidopsis, Arachis, Chenopodium, and Gossypium) and encompass a range in polyploid ages. Because incompatibilities between the nucleus and cytoplasm could potentially result in biases toward the maternal homoeolog and/or maternal expression level, we evaluate patterns of homoeolog usage, expression bias, and expression-level dominance in cytonuclear genes relative to the background of noncytonuclear expression changes and to the diploid parents. Although we find subsets of cytonuclear genes in most lineages that match our expectations of maternal preference, these observations are not consistent among either allopolyploids or categories of organelle-targeted genes. Our results indicate that cytonuclear expression evolution may be subtle and variable among genera and genes, likely reflecting a diversity of mechanisms to resolve nuclear-cytoplasmic incompatibilities in allopolyploid species.
Subjects
Arabidopsis; Genes, Plant; Arabidopsis/genetics; Cytoplasm/genetics; Cytoplasm/metabolism; Evolution, Molecular; Genome, Plant; Gossypium/genetics; Polyploidy
ABSTRACT
Phylogenomic analyses are recovering previously hidden histories of hybridization, revealing the genomic consequences of these events on the architecture of extant genomes. We applied phylogenomic techniques and several complementary statistical tests to show that introgressive hybridization appears to have occurred between close relatives of Arabidopsis, resulting in cytonuclear discordance and impacting our understanding of species relationships in the group. The composition of introgressed and retained genes indicates that selection against incompatible cytonuclear and nuclear-nuclear interactions likely acted during introgression, whereas linkage also contributed to genome composition through the retention of ancient haplotype blocks. We also applied divergence-based tests to determine the species branching order and distinguish donor from recipient lineages. Surprisingly, these analyses suggest that cytonuclear discordance arose via extensive nuclear, rather than cytoplasmic, introgression. If true, this would mean that most of the nuclear genome was displaced during introgression whereas only a small proportion of native alleles were retained.
Subjects
Arabidopsis/genetics; Genetic Introgression; Genome, Chloroplast; Genome, Plant; Phylogeny; Genetic Linkage; Selection, Genetic
ABSTRACT
Introgressive hybridization results in the transfer of genetic material between species, often with fitness implications for the recipient species. The development of statistical methods for detecting the signatures of historical introgression in whole-genome data has been a major area of focus. Although existing techniques are able to identify the taxa that exchanged genes during introgression using a four-taxon system, most methods do not explicitly distinguish which taxon served as donor and which as recipient during introgression (i.e., polarization of introgression directionality). Existing methods that do polarize introgression are often only able to do so when there is a fifth taxon available and that taxon is sister to one of the taxa involved in introgression. Here, we present divergence-based introgression polarization (DIP), a method for polarizing introgression using patterns of sequence divergence across whole genomes, which operates in a four-taxon context. Thus, DIP can be applied to infer the directionality of introgression when additional taxa are not available. We use simulations to show that DIP can polarize introgression and identify potential sources of bias in the assignment of directionality, and we apply DIP to a well-described hominin introgression event.
Subjects
Biological Evolution; Cell Nucleus/genetics; Gene Flow; Genetic Introgression; Genome; Hominidae/genetics; Animals; DNA, Mitochondrial; Hominidae/classification; Humans
ABSTRACT
The telomerase ribonucleoprotein complex (RNP) is essential for genome stability and performs this role through the addition of repetitive DNA to the ends of chromosomes. The telomerase enzyme is composed of a reverse transcriptase (TERT), which utilizes a template domain in an RNA subunit (TER) to reiteratively add telomeric DNA at the ends of chromosomes. Multiple TERs have been identified in the model plant Arabidopsis thaliana. Here we combine a phylogenetic and biochemical approach to understand how the telomerase RNP has evolved in Brassicaceae, the family that includes A. thaliana. Because of the complex phylogenetic pattern of template domain loss and alteration at the previously characterized A. thaliana TER loci, TER1 and TER2, across the plant family Brassicaceae, we bred double mutants from plants with a template deletion at AtTER1 and T-DNA insertion at AtTER2. These double mutants exhibited no telomere length deficiency, a definitive indication that neither of these loci encodes a functional telomerase RNA. Moreover, we determined that the telomerase components TERT, Dyskerin, and the KU heterodimer are under strong purifying selection, consistent with the idea that the TER with which they interact is also conserved. To test this hypothesis further, we analyzed the substrate specificity of telomerase from species across Brassicaceae and determined that telomerase from close relatives binds and extends substrates in a similar manner, supporting the idea that TERs in different species are highly similar to one another and are likely encoded from an orthologous locus. In addition, TERT proteins from across Brassicaceae were able to complement loss-of-function tert mutants in vivo, indicating that TERTs from other species have the ability to recognize the native TER of A. thaliana. Finally, we immunoprecipitated the telomerase complex and identified associated RNAs via RNA-seq. Using our evolutionary data, we constrained our analyses to conserved RNAs within Brassicaceae that contained a template domain. These analyses revealed a highly expressed locus whose disruption by a T-DNA resulted in a telomeric phenotype similar to the loss of other telomerase core proteins, indicating that the RNA has an important function in telomere maintenance.
Subjects
Brassicaceae/genetics; Plant Proteins/genetics; Ribonucleoproteins/genetics; Telomerase/genetics; Evolution, Molecular; Phylogeny; Selection, Genetic
ABSTRACT
The function and evolution of eukaryotic cells depend upon direct molecular interactions between gene products encoded in nuclear and cytoplasmic genomes. Understanding how these cytonuclear interactions drive molecular evolution and generate genetic incompatibilities between isolated populations and species is of central importance to eukaryotic biology. Plants are an outstanding system to investigate such effects because of their two different genomic compartments present in the cytoplasm (mitochondria and plastids) and the extensive resources detailing subcellular targeting of nuclear-encoded proteins. However, the field lacks a consistent classification scheme for mitochondrial- and plastid-targeted proteins based on their molecular interactions with cytoplasmic genomes and gene products, which hinders efforts to standardize and compare results across studies. Here, we take advantage of detailed knowledge about the model angiosperm Arabidopsis thaliana to provide a curated database of plant cytonuclear interactions at the molecular level. CyMIRA (Cytonuclear Molecular Interactions Reference for Arabidopsis) is available at http://cymira.colostate.edu/ and https://github.com/dbsloan/cymira and will serve as a resource to aid researchers in partitioning evolutionary genomic data into functional gene classes based on organelle targeting and direct molecular interaction with cytoplasmic genomes and gene products. It includes 11 categories (and 27 subcategories) of different cytonuclear complexes and types of molecular interactions, and it reports residue-level information for cytonuclear contact sites. We hope that this framework will make it easier to standardize, interpret, and compare studies testing the functional and evolutionary consequences of cytonuclear interactions.
Subjects
Arabidopsis/metabolism; Cell Nucleus/metabolism; Cytoplasm/metabolism; Evolution, Molecular; Genome, Plant; Plant Proteins/metabolism; Arabidopsis/genetics; Cell Nucleus/genetics; Cytoplasm/genetics; Plant Proteins/genetics; Reference Standards
ABSTRACT
Mitochondria, a nearly ubiquitous feature of eukaryotes, are derived from an ancient symbiosis. Despite billions of years of cooperative coevolution, in what is arguably the most important mutualism in the history of life, the persistence of mitochondrial genomes also creates conditions for genetic conflict with the nucleus. Because mitochondrial genomes are present in numerous copies per cell, they are subject to both within- and among-organism levels of selection. Accordingly, 'selfish' genotypes that increase their own proliferation can rise to high frequencies even if they decrease organismal fitness. It has been argued that uniparental (often maternal) inheritance of cytoplasmic genomes evolved to curtail such selfish replication by minimizing within-individual variation and, hence, within-individual selection. However, uniparental inheritance creates conditions for cytonuclear conflict over sex determination and sex ratio, as well as conditions for sexual antagonism when mitochondrial variants increase transmission by enhancing maternal fitness but have the side-effect of being harmful to males (i.e., 'mother's curse'). Here, we review recent advances in understanding selfish replication and sexual antagonism in the evolution of mitochondrial genomes and the mechanisms that suppress selfish interactions, drawing parallels and contrasts with other organelles (plastids) and bacterial endosymbionts that arose more recently. Although cytonuclear conflict is widespread across eukaryotes, it can be cryptic due to nuclear suppression, highly variable, and lineage-specific, reflecting the diverse biology of eukaryotes and the varying architectures of their cytoplasmic genomes.
Subjects
Biological Evolution; Genome, Mitochondrial/physiology; Bacterial Physiological Phenomena; Eukaryota/physiology; Plastids/physiology; Symbiosis/physiology
ABSTRACT
Expansion of the cytochrome P450 gene family is often proposed to have a critical role in the evolution of metabolic complexity, in particular in microorganisms, insects and plants. However, the molecular mechanisms underlying the evolution of this complexity are poorly understood. Here we describe the evolutionary history of a plant P450 retrogene, which emerged and underwent fixation in the common ancestor of Brassicales, before undergoing tandem duplication in the ancestor of Brassicaceae. Duplication leads first to gain of dual functions in one of the copies. Both sister genes are retained through subsequent speciation but eventually return to a single copy in two of three diverging lineages. In the lineage in which both copies are maintained, the ancestral functions are split between paralogs and a novel function arises in the copy under relaxed selection. Our work illustrates how retrotransposition and gene duplication can favour the emergence of novel metabolic functions.
Subjects
Arabidopsis Proteins/genetics; Arabidopsis/genetics; Cytochrome P-450 Enzyme System/genetics; Evolution, Molecular; Fabaceae/genetics; Genes, Plant/genetics; Turnera/genetics; Arabidopsis/metabolism; Arabidopsis Proteins/metabolism; Cytochrome P-450 Enzyme System/metabolism; Fabaceae/metabolism; Gene Duplication/genetics; Genetic Variation/genetics; Retroelements/genetics; Turnera/metabolism
ABSTRACT
Transcriptomic analyses from across eukaryotes indicate that most of the genome is transcribed at some point in the developmental trajectory of an organism. One class of these transcripts is termed long intergenic noncoding RNAs (lincRNAs). Recently, attention has focused on understanding the evolutionary dynamics of lincRNAs, particularly their conservation within genomes. Here, we take a comparative genomic and phylogenetic approach to uncover factors influencing lincRNA emergence and persistence in the plant family Brassicaceae, to which Arabidopsis thaliana belongs. We searched 10 genomes across the family for evidence of >5000 lincRNA loci from A. thaliana. From loci conserved in the genomes of multiple species, we built alignments and inferred phylogeny. We then used gene tree/species tree reconciliation to examine the duplication history and timing of emergence of these loci. Emergence of lincRNA loci appears to be linked to local duplication events but, surprisingly, not to whole genome duplication (WGD) events or transposable elements. Interestingly, WGD events are associated with the loss of loci for species having undergone relatively recent polyploidy. Lastly, we identify 1180 loci of the 6480 previously annotated A. thaliana lincRNAs (18%) with elevated levels of conservation. These conserved lincRNAs show higher expression and are enriched for stress-responsiveness and cis-regulatory motifs known as conserved noncoding sequences (CNSs). These data highlight potential functional pathways and suggest that CNSs may regulate neighboring genes at both the genomic and transcriptomic level. In sum, we provide insight into processes that may influence lincRNA diversification by providing an evolutionary context for previously annotated lincRNAs.