Abstract
Vertebrates have greatly elaborated the basic chordate body plan and evolved highly distinctive genomes that have been sculpted by two whole-genome duplications. Here we sequence the genome of the Mediterranean amphioxus (Branchiostoma lanceolatum) and characterize DNA methylation, chromatin accessibility, histone modifications and transcriptomes across multiple developmental stages and adult tissues to investigate the evolution of the regulation of the chordate genome. Comparisons with vertebrates identify an intermediate stage in the evolution of differentially methylated enhancers, and a high conservation of gene expression and its cis-regulatory logic between amphioxus and vertebrates that occurs maximally at an earlier mid-embryonic phylotypic period. We analyse regulatory evolution after whole-genome duplications, and find that, in vertebrates, over 80% of broadly expressed gene families with multiple paralogues derived from whole-genome duplications have members that restricted their ancestral expression, and underwent specialization rather than subfunctionalization. Counter-intuitively, paralogues that restricted their expression increased the complexity of their regulatory landscapes. These data pave the way for a better understanding of the regulatory principles that underlie key vertebrate innovations.
Subjects
Gene Expression Regulation; Genomics; Lancelets/genetics; Vertebrates/genetics; Animals; Body Patterning/genetics; DNA Methylation; Humans; Lancelets/embryology; Molecular Sequence Annotation; Promoter Regions, Genetic; Transcriptome/genetics
Abstract
Correction of mis-splicing events is a growing therapeutic approach for neurological diseases such as spinal muscular atrophy or neuronal ceroid lipofuscinosis 7, which are caused by splicing-affecting mutations. Mis-spliced effector genes that do not harbour mutations are also good candidate therapeutic targets in diseases with more complex aetiologies such as cancer, autism, muscular dystrophies or neurodegenerative diseases. Next-generation RNA sequencing (RNA-seq) has boosted investigation of global mis-splicing in diseased tissue to identify such key pathogenic mis-spliced genes. Nevertheless, while analysis of tumour or dystrophic muscle biopsies can be informative on early stage pathogenic mis-splicing, for neurodegenerative diseases, these analyses are intrinsically hampered by neuronal loss and neuroinflammation in post-mortem brains. To infer splicing alterations relevant to Huntington's disease pathogenesis, here we performed intersect-RNA-seq analyses of human post-mortem striatal tissue and of an early symptomatic mouse model in which neuronal loss and gliosis are not yet present. Together with a human/mouse parallel motif scan analysis, this approach allowed us to identify the shared mis-splicing signature triggered by the Huntington's disease-causing mutation in both species and to infer upstream deregulated splicing factors. Moreover, we identified a plethora of downstream neurodegeneration-linked mis-spliced effector genes that, together with the deregulated splicing factors, become new possible therapeutic targets. In summary, here we report pathogenic global mis-splicing in Huntington's disease striatum captured by our new intersect-RNA-seq approach that can be readily applied to other neurodegenerative diseases for which bona fide animal models are available.
Subjects
Alternative Splicing/genetics; Huntingtin Protein/genetics; Huntington Disease/genetics; RNA Splicing Factors/genetics; Animals; Corpus Striatum/pathology; Humans; Huntington Disease/pathology; Mice; Sequence Analysis, RNA/methods
Abstract
Alternative splicing (AS) generates remarkable regulatory and proteomic complexity in metazoans. However, the functions of most AS events are not known, and programs of regulated splicing remain to be identified. To address these challenges, we describe the Vertebrate Alternative Splicing and Transcription Database (VastDB), the largest resource of genome-wide, quantitative profiles of AS events assembled to date. VastDB provides readily accessible quantitative information on the inclusion levels and functional associations of AS events detected in RNA-seq data from diverse vertebrate cell and tissue types, as well as developmental stages. The VastDB profiles reveal extensive new intergenic and intragenic regulatory relationships among different classes of AS and previously unknown and conserved landscapes of tissue-regulated exons. Contrary to recent reports concluding that nearly all human genes express a single major isoform, VastDB provides evidence that at least 48% of multiexonic protein-coding genes express multiple splice variants that are highly regulated in a cell/tissue-specific manner, and that >18% of genes simultaneously express multiple major isoforms across diverse cell and tissue types. Isoforms encoded by the latter set of genes are generally coexpressed in the same cells and are often engaged by translating ribosomes. Moreover, they are encoded by genes that are significantly enriched in functions associated with transcriptional control, implying they may have an important and wide-ranging role in controlling cellular activities. VastDB thus provides an unprecedented resource for investigations of AS function and regulation.
Subjects
Alternative Splicing; Databases, Nucleic Acid; Exons; Gene Regulatory Networks; Protein Isoforms; Animals; Chickens; Humans; Mice; Protein Isoforms/biosynthesis; Protein Isoforms/genetics
Abstract
Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis, and the lack of high-quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint in accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.
Subjects
Alternative Splicing; Arabidopsis/genetics; Genes, Insect; Transcriptome; Genetic Variation; Proteomics; RNA, Untranslated; Reference Values; Reproducibility of Results; Sequence Analysis, RNA; Transcription, Genetic
Abstract
Alternative splicing (AS) diversifies transcriptomes and proteomes and is widely recognized as a key mechanism for regulating gene expression. Previously, in an analysis of intron retention events in Arabidopsis, we found unusual AS events inside annotated protein-coding exons. Here, we also identify such AS events in human and use these two sets to analyse their features, regulation, functional impact, and evolutionary origin. As these events involve introns with features of both introns and protein-coding exons, we name them exitrons (exonic introns). Though exitrons were detected as a subset of retained introns, they are clearly distinguishable, and their splicing results in transcripts with different fates. About half of the 1002 Arabidopsis and 923 human exitrons have sizes of multiples of 3 nucleotides (nt). Splicing of these exitrons results in internally deleted proteins and affects protein domains, disordered regions, and various post-translational modification sites, thus broadly impacting protein function. Exitron splicing is regulated across tissues, in response to stress and in carcinogenesis. Intriguingly, annotated intronless genes can be also alternatively spliced via exitron usage. We demonstrate that at least some exitrons originate from ancestral coding exons. Based on our findings, we propose a "splicing memory" hypothesis whereby upon intron loss imprints of former exon borders defined by vestigial splicing regulatory elements could drive the evolution of exitron splicing. Altogether, our studies show that exitron splicing is a conserved strategy for increasing proteome plasticity in plants and animals, complementing the repertoire of AS events.
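As a brief illustration of the multiple-of-3 observation in the abstract above: an exitron whose length is divisible by 3 can be removed without shifting the downstream reading frame, which is why its splicing yields an internally deleted protein rather than a frameshift. The sketch below is only a minimal example under that assumption; the function names and the lengths used are hypothetical and do not come from the study.

```python
from typing import Iterable


def preserves_reading_frame(exitron_length_nt: int) -> bool:
    """Return True if removing an exitron of this length deletes whole codons only."""
    if exitron_length_nt <= 0:
        raise ValueError("exitron length must be a positive number of nucleotides")
    return exitron_length_nt % 3 == 0


def fraction_in_frame(lengths: Iterable[int]) -> float:
    """Fraction of exitrons whose length is a multiple of 3 (0.0 for an empty input)."""
    lengths = list(lengths)
    if not lengths:
        return 0.0
    in_frame = sum(1 for n in lengths if preserves_reading_frame(n))
    return in_frame / len(lengths)


if __name__ == "__main__":
    # Hypothetical exitron lengths in nucleotides, for illustration only.
    example_lengths = [90, 91, 93, 95]
    print(fraction_in_frame(example_lengths))
```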
Subjects
Alternative Splicing; Exons; Introns; Open Reading Frames; Proteomics; Arabidopsis/genetics; Arabidopsis/metabolism; Breast Neoplasms; Evolution, Molecular; Female; Gene Expression Regulation, Plant; Humans; Organ Specificity/genetics; Protein Biosynthesis; RNA Transport; Stress, Physiological/genetics; Transcriptome
Abstract
Transcript annotation in plant databases is incomplete and often inaccurate, leading to misinterpretation. As more and more RNA-seq data are generated, plant scientists need to be aware of potential pitfalls and understand the nature and impact of specific alternative splicing transcripts on protein production. A primary area of concern and the topic of this article is the (mis)annotation of open reading frames and premature termination codons. The basic message is that to adequately address expression and functions of transcript isoforms, it is necessary to be able to predict their fate in terms of whether protein isoforms are generated or specific transcripts are unproductive or degraded.
Subjects
Alternative Splicing; Plant Proteins/genetics; Plants/genetics; Protein Biosynthesis/genetics; Models, Genetic; Open Reading Frames/genetics; Protein Isoforms/genetics; RNA Stability; RNA, Messenger/genetics
Abstract
The formation of RNA-DNA hybrids, referred to as R-loops, can promote genome instability and cancer development. Yet the mechanisms by which R-loops compromise genome stability are poorly understood. Here, we establish roles for the evolutionarily conserved Nrl1 protein in pre-mRNA splicing regulation, R-loop suppression and in maintaining genome stability. nrl1Δ mutants exhibit endogenous DNA damage, are sensitive to exogenous DNA damage, and have defects in homologous recombination (HR) repair. Concomitantly, nrl1Δ cells display significant changes in gene expression, similar to those induced by DNA damage in wild-type cells. Further, we find that nrl1Δ cells accumulate high levels of R-loops, which co-localize with HR repair factors and require Rad51 and Rad52 for their formation. Together, our findings support a model in which R-loop accumulation and subsequent DNA damage sequesters HR factors, thereby compromising HR repair at endogenously or exogenously induced DNA damage sites, leading to genome instability.
Subjects
Alternative Splicing/genetics; Genomic Instability/genetics; Homologous Recombination/genetics; RNA Precursors/genetics; Schizosaccharomyces pombe Proteins/genetics; DNA/chemistry; DNA/genetics; DNA Repair/genetics; RNA/chemistry; RNA/genetics; Rad51 Recombinase/genetics; Rad52 DNA Repair and Recombination Protein/genetics; Schizosaccharomyces/genetics; Spliceosomes/genetics; Spliceosomes/metabolism
Abstract
Alternative splicing (AS) of precursor mRNAs (pre-mRNAs) from multiexon genes allows organisms to increase their coding potential and regulate gene expression through multiple mechanisms. Recent transcriptome-wide analysis of AS using RNA sequencing has revealed that AS is highly pervasive in plants. Pre-mRNAs from over 60% of intron-containing genes undergo AS to produce a vast repertoire of mRNA isoforms. The functions of most splice variants are unknown. However, emerging evidence indicates that splice variants increase the functional diversity of proteins. Furthermore, AS is coupled to transcript stability and translation through nonsense-mediated decay and microRNA-mediated gene regulation. Widespread changes in AS in response to developmental cues and stresses suggest a role for regulated splicing in plant development and stress responses. Here, we review recent progress in uncovering the extent and complexity of the AS landscape in plants, its regulation, and the roles of AS in gene regulation. The prevalence of AS in plants has raised many new questions that require additional studies. New tools based on recent technological advances are allowing genome-wide analysis of RNA elements in transcripts and of chromatin modifications that regulate AS. Application of these tools in plants will provide significant new insights into AS regulation and crosstalk between AS and other layers of gene regulation.
Subjects
Alternative Splicing; Gene Expression Regulation, Plant; Plant Development; Epigenesis, Genetic; RNA Precursors/genetics; RNA, Plant/genetics; Signal Transduction; Spliceosomes/metabolism
Abstract
Alternative splicing (AS) is a key regulatory mechanism that contributes to transcriptome and proteome diversity. As very few genome-wide studies analyzing AS in plants are available, we have performed high-throughput sequencing of a normalized cDNA library which resulted in a high coverage transcriptome map of Arabidopsis. We detect ~150,000 splice junctions derived mostly from typical plant introns, including an eightfold increase in the number of U12 introns (2069). Around 61% of multiexonic genes are alternatively spliced under normal growth conditions. Moreover, we provide experimental validation of 540 AS transcripts (from 256 genes coding for important regulatory factors) using high-resolution RT-PCR and Sanger sequencing. Intron retention (IR) is the most frequent AS event (~40%), but many IRs have relatively low read coverage and are less well-represented in assembled transcripts. Additionally, ~51% of Arabidopsis genes produce AS transcripts which do not involve IR. Therefore, the significance of IR in generating transcript diversity was generally overestimated in previous assessments. IR analysis allowed the identification of a large set of cryptic introns inside annotated coding exons. Importantly, a significant fraction of these cryptic introns are spliced out in frame, indicating a role in protein diversity. Furthermore, we show extensive AS coupled to nonsense-mediated decay in AFC2, encoding a highly conserved LAMMER kinase which phosphorylates splicing factors, thus establishing a complex loop in AS regulation. We provide the most comprehensive analysis of AS to date which will serve as a valuable resource for the plant community to study transcriptome complexity and gene regulation.
Subjects
Alternative Splicing; Arabidopsis/genetics; Genome, Plant; Introns; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Exons; Gene Library; High-Throughput Nucleotide Sequencing; Protein Serine-Threonine Kinases/genetics; Transcriptome
Abstract
RNA-sequencing (RNA-seq) allows global gene expression analysis at the individual transcript level. Accurate quantification of transcript variants generated by alternative splicing (AS) remains a challenge. We have developed a comprehensive, nonredundant Arabidopsis reference transcript dataset (AtRTD) containing over 74,000 transcripts for use with algorithms to quantify AS transcript isoforms in RNA-seq. The AtRTD was formed by merging transcripts from TAIR10 and novel transcripts identified in an AS discovery project. We have estimated transcript abundance in RNA-seq data using the transcriptome-based alignment-free programmes Sailfish and Salmon and have validated quantification of splicing ratios from RNA-seq by high-resolution reverse transcription polymerase chain reaction (HR RT-PCR). Good correlations between splicing ratios from RNA-seq and HR RT-PCR were obtained, demonstrating the accuracy of abundances calculated for individual transcripts in RNA-seq. The AtRTD is a resource that will have immediate utility in analysing Arabidopsis RNA-seq data to quantify differential transcript abundance and expression.
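To make the notion of a splicing ratio in the abstract above concrete, the sketch below computes each isoform's share of its gene's total abundance from per-transcript abundance estimates (for example, TPM values). It is a minimal illustration that assumes this simple ratio definition; it does not call Sailfish or Salmon, and the function name and transcript identifiers are hypothetical.

```python
from typing import Dict


def splicing_ratios(abundance_by_transcript: Dict[str, float]) -> Dict[str, float]:
    """Return each transcript's fraction of the gene's summed abundance.

    Input values are abundance estimates (e.g. TPM) for all isoforms of one gene.
    If the gene is not expressed at all, every ratio is reported as 0.0.
    """
    if any(value < 0 for value in abundance_by_transcript.values()):
        raise ValueError("abundances must be non-negative")
    total = sum(abundance_by_transcript.values())
    if total == 0:
        return {name: 0.0 for name in abundance_by_transcript}
    return {name: value / total for name, value in abundance_by_transcript.items()}


if __name__ == "__main__":
    # Hypothetical isoform abundances for a single gene, for illustration only.
    example = {"geneA.1": 30.0, "geneA.2": 10.0}
    for name, ratio in splicing_ratios(example).items():
        print(name, ratio)
```

Ratios of this kind can be compared with isoform proportions measured by an independent assay, which is the type of validation the abstract describes.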
Subjects
Alternative Splicing; Arabidopsis/genetics; Gene Expression Profiling/methods; High-Throughput Nucleotide Sequencing/methods; Protein Isoforms/analysis; RNA, Messenger/analysis; Sequence Analysis, RNA/methods; Algorithms; Base Sequence; Datasets as Topic; Genes, Plant; RNA Splicing; Reference Values; Reproducibility of Results; Reverse Transcriptase Polymerase Chain Reaction; Software; Transcriptome
Abstract
AtCyp59 is a multidomain cyclophilin containing a peptidyl-prolyl cis/trans isomerase (PPIase) domain and an evolutionarily highly conserved RRM domain. Deregulation of this class of cyclophilins has been shown to affect transcription and to influence phosphorylation of the C-terminal repeat domain of the largest subunit of RNA polymerase II. We used a genomic SELEX method for identifying RNA targets of AtCyp59. Analysis of the selected RNAs revealed an RNA-binding motif (G[U/C]N[G/A]CC[A/G]) and we show that it is evolutionarily conserved. Binding to this motif was verified by gel shift assays in vitro and by RNA immunoprecipitation assays of AtCyp59 in vivo. Most importantly, we show that binding also occurs on unprocessed transcripts in vivo and that binding of specific RNAs inhibits the PPIase activity of AtCyp59 in vitro. Surprisingly, genome-wide analysis showed that the RNA motif is present in about 70% of the annotated transcripts, preferentially in exons. Taken together, the available data suggest that these cyclophilins might have an important function in transcription regulation.
Subjects
Arabidopsis Proteins/metabolism; Cyclophilins/metabolism; RNA, Messenger/chemistry; RNA, Messenger/metabolism; RNA-Binding Proteins/metabolism; Genomics/methods; Nucleotide Motifs; RNA Polymerase II/metabolism; RNA, Plant/chemistry; RNA, Plant/metabolism
Abstract
Alternative splicing (AS) coupled to nonsense-mediated decay (NMD) is a post-transcriptional mechanism for regulating gene expression. We have used a high-resolution AS RT-PCR panel to identify endogenous AS isoforms which increase in abundance when NMD is impaired in the Arabidopsis NMD factor mutants, upf1-5 and upf3-1. Of 270 AS genes (950 transcripts) on the panel, 102 transcripts from 97 genes (32%) were identified as NMD targets. Extrapolating from these data, around 13% of intron-containing genes in the Arabidopsis genome are potentially regulated by AS/NMD. This cohort of naturally occurring NMD-sensitive AS transcripts also allowed the analysis of the signals for NMD in plants. We show the importance of AS in introns in 5' or 3' UTRs in modulating NMD-sensitivity of mRNA transcripts. In particular, we identified upstream open reading frames overlapping the main start codon as a new trigger for NMD in plants and determined that NMD is induced if 3' UTRs were >350 nt. Unexpectedly, although many intron retention transcripts possess NMD features, they are not sensitive to NMD. Finally, we have shown that AS/NMD regulates the abundance of transcripts of many genes important for plant development and adaptation including transcription factors, RNA processing factors and stress response genes.
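The NMD signals named in the abstract above (a 3' UTR longer than 350 nt, an upstream open reading frame overlapping the main start codon, and the apparent insensitivity of intron retention transcripts) can be written as a simple rule set. The sketch below is a deliberately simplified illustration under those stated assumptions, not the authors' classification procedure; the class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Threshold taken from the abstract: NMD was induced when 3' UTRs were >350 nt.
LONG_3UTR_THRESHOLD_NT = 350


@dataclass
class Transcript:
    utr3_length_nt: int            # length of the 3' UTR in nucleotides
    uorf_overlaps_start: bool      # upstream ORF overlapping the main start codon
    retains_intron: bool = False   # intron retention transcript


def nmd_features(t: Transcript) -> List[str]:
    """List the NMD-associated features carried by a transcript."""
    features = []
    if t.utr3_length_nt > LONG_3UTR_THRESHOLD_NT:
        features.append("long_3utr")
    if t.uorf_overlaps_start:
        features.append("overlapping_uorf")
    return features


def predicted_nmd_sensitive(t: Transcript) -> bool:
    """Simplified prediction: an NMD feature is required, and intron
    retention transcripts are treated as insensitive even when they carry one."""
    if t.retains_intron:
        return False
    return len(nmd_features(t)) > 0


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(predicted_nmd_sensitive(Transcript(400, False)))
    print(predicted_nmd_sensitive(Transcript(500, True, retains_intron=True)))
```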
Subjects
Alternative Splicing; Arabidopsis/genetics; Gene Expression Regulation, Plant; Genes, Regulator; Nonsense Mediated mRNA Decay; 3' Untranslated Regions; Arabidopsis/drug effects; Arabidopsis Proteins/genetics; Codon, Initiator; Codon, Nonsense; Cycloheximide/pharmacology; Genes, Plant; Introns; Nonsense Mediated mRNA Decay/drug effects; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA Helicases/genetics; RNA, Messenger/chemistry; RNA, Messenger/metabolism; Reverse Transcriptase Polymerase Chain Reaction
Abstract
Regulation of gene expression is arguably the main mechanism underlying the phenotypic diversity of tissues within and between species. Here we assembled an extensive transcriptomic dataset covering 8 tissues across 20 bilaterian species and performed analyses using a symmetric phylogeny that allowed the combined and parallel investigation of gene expression evolution between vertebrates and insects. We specifically focused on widely conserved ancestral genes, identifying strong cores of pan-bilaterian tissue-specific genes and even larger groups that diverged to define vertebrate and insect tissues. Systematic inferences of tissue-specificity gains and losses show that nearly half of all ancestral genes have been recruited into tissue-specific transcriptomes. This occurred during both ancient and, especially, recent bilaterian evolution, with several gains being associated with the emergence of unique phenotypes (for example, novel cell types). Such pervasive evolution of tissue specificity was linked to gene duplication coupled with expression specialization of one of the copies, revealing an unappreciated prolonged effect of whole-genome duplications on recent vertebrate evolution.
Subjects
Evolution, Molecular; Insects; Vertebrates; Animals; Insects/genetics; Vertebrates/genetics; Organ Specificity; Transcriptome; Phylogeny
Abstract
Alternative splicing (AS) can vastly expand animal transcriptomes and proteomes. Two main open questions in the field are how AS is regulated across cell/tissue types and disease, and what roles different AS events play. To facilitate AS research, we have created the computational VastDB framework, which comprises a series of complementary software and resources that we describe in this chapter. The VastDB framework is especially designed to aid biomedical researchers without a strong computational background. It offers tools and resources to: (a) quantify AS and identify differentially spliced AS events using RNA-seq data (vast-tools), (b) perform multiple genomic and sequence analyses for investigating AS events (Matt), (c) identify AS events with genomic and regulatory conservation among species (ExOrthist), and (d) help with the biological interpretation of the results, and, ultimately, with the identification of interesting AS events to design wet-lab experiments (VastDB and PastDB).
Subjects
Alternative Splicing; Software; Animals; Computational Biology/methods; Exons; Genome; Genomics/methods
Abstract
BACKGROUND: Accurate and comprehensive annotation of transcript sequences is essential for transcript quantification and differential gene and transcript expression analysis. Single-molecule long-read sequencing technologies provide improved integrity of transcript structures including alternative splicing, and transcription start and polyadenylation sites. However, accuracy is significantly affected by sequencing errors, mRNA degradation, or incomplete cDNA synthesis. RESULTS: We present a new and comprehensive Arabidopsis thaliana Reference Transcript Dataset 3 (AtRTD3). AtRTD3 contains over 169,000 transcripts (twice that of the best current Arabidopsis transcriptome) and includes over 1500 novel genes. Seventy-eight percent of transcripts are from Iso-seq with accurately defined splice junctions and transcription start and end sites. We develop novel methods to determine splice junctions and transcription start and end sites accurately. Mismatch profiles around splice junctions provide a powerful feature to distinguish correct splice junctions and remove false splice junctions. Stratified approaches identify high-confidence transcription start and end sites and remove fragmentary transcripts due to degradation. AtRTD3 is a major improvement over existing transcriptomes as demonstrated by analysis of an Arabidopsis cold response RNA-seq time-series. AtRTD3 provides higher resolution of transcript expression profiling and identifies cold-induced differential transcription start and polyadenylation site usage. CONCLUSIONS: AtRTD3 is the most comprehensive Arabidopsis transcriptome currently available. It improves the precision of differential gene and transcript expression, differential alternative splicing, and transcription start/end site usage analysis from RNA-seq data. The novel methods for identifying accurate splice junctions and transcription start/end sites are widely applicable and will improve single-molecule sequencing analysis from any species.
Subjects
Arabidopsis; Transcriptome; Alternative Splicing; Arabidopsis/genetics; Gene Expression Profiling/methods; RNA-Seq; Sequence Analysis, RNA/methods
Abstract
BACKGROUND: The legume-rhizobium symbiosis requires the formation of root nodules, specialized organs where the nitrogen fixation process takes place. Nodule development is accompanied by the induction of specific plant genes, referred to as nodulin genes. Important roles in processes such as morphogenesis and metabolism have been assigned to nodulins during the legume-rhizobium symbiosis. RESULTS: Here we report the purification and biochemical characterization of a novel nodulin from common bean (Phaseolus vulgaris L.) root nodules. This protein, called nodulin 41 (PvNod41) was purified through affinity chromatography and was partially sequenced. A genomic clone was then isolated via PCR amplification. PvNod41 is an atypical aspartyl peptidase of the A1B subfamily with an optimal hydrolytic activity at pH 4.5. We demonstrate that PvNod41 has limited peptidase activity against casein and is partially inhibited by pepstatin A. A PvNod41-specific antiserum was used to assess the expression pattern of this protein in different plant organs and throughout root nodule development, revealing that PvNod41 is found only in bean root nodules and is confined to uninfected cells. CONCLUSIONS: To date, only a small number of atypical aspartyl peptidases have been characterized in plants. Their particular spatial and temporal expression patterns along with their unique enzymatic properties imply a high degree of functional specialization. Indeed, PvNod41 is closely related to CDR1, an Arabidopsis thaliana extracellular aspartyl protease involved in defense against bacterial pathogens. PvNod41's biochemical properties and specific cell-type localization, in uninfected cells of the common bean root nodule, strongly suggest that this aspartyl peptidase has a key role in plant defense during the symbiotic interaction.
Subjects
Aspartic Acid Endopeptidases/metabolism; Membrane Proteins/metabolism; Phaseolus/enzymology; Plant Proteins/metabolism; Root Nodules, Plant/enzymology; Amino Acid Sequence; Aspartic Acid Endopeptidases/genetics; Base Sequence; Cloning, Molecular; Membrane Proteins/genetics; Molecular Sequence Data; Phaseolus/genetics; Phylogeny; Plant Proteins/genetics; RNA, Plant/genetics; Root Nodules, Plant/genetics; Sequence Alignment; Sequence Analysis, Protein
Abstract
BACKGROUND: Alternative splicing (AS) is a widespread regulatory mechanism in multicellular organisms. Numerous transcriptomic and single-gene studies in plants have investigated AS in response to specific conditions, especially environmental stress, unveiling substantial amounts of intron retention that modulate gene expression. However, a comprehensive study contrasting stress-response and tissue-specific AS patterns and directly comparing them with those of animal models is still missing. RESULTS: We generate a massive resource for Arabidopsis thaliana, PastDB, comprising AS and gene expression quantifications across tissues, development and environmental conditions, including abiotic and biotic stresses. Harmonized analysis of these datasets reveals that A. thaliana shows high levels of AS, similar to fruitflies, and that, compared to animals, disproportionately uses AS for stress responses. We identify core sets of genes regulated specifically by either AS or transcription upon stresses or among tissues, a regulatory specialization that is tightly mirrored by the genomic features of these genes. Unexpectedly, non-intron retention events, including exon skipping, are overrepresented across regulated AS sets in A. thaliana, being also largely involved in modulating gene expression through NMD and uORF inclusion. CONCLUSIONS: Non-intron retention events have likely been functionally underrated in plants. AS constitutes a distinct regulatory layer controlling gene expression upon internal and external stimuli whose target genes and master regulators are hardwired at the genomic level to specifically undergo post-transcriptional regulation. Given the higher relevance of AS in the response to different stresses when compared to animals, this molecular hardwiring is likely required for a proper environmental response in A. thaliana.
Subjects
Alternative Splicing; Arabidopsis/genetics; Gene Expression Regulation, Plant; Animals; Arabidopsis Proteins/metabolism; Exons; Introns; Sequence Analysis, RNA; Stress, Physiological/genetics
Abstract
Several bioinformatic tools have been developed for genome-wide identification of orthologous and paralogous genes. However, no corresponding tool allows the detection of exon homology relationships. Here, we present ExOrthist, a fully reproducible Nextflow-based software enabling inference of exon homologs and orthogroups, visualization of evolution of exon-intron structures, and assessment of conservation of alternative splicing patterns. ExOrthist evaluates exon sequence conservation and considers the surrounding exon-intron context to derive genome-wide multi-species exon homologies at any evolutionary distance. We demonstrate its use in different evolutionary scenarios: whole genome duplication in frogs and convergence of Nova-regulated splicing networks (https://github.com/biocorecrg/ExOrthist).
Subjects
Computational Biology; Evolution, Molecular; Exons; Software; Alternative Splicing; Animals; Conserved Sequence; Genome; Humans; Introns; Mice
Abstract
The causes and consequences of genome reduction in animals are unclear because our understanding of this process mostly relies on lineages with often exceptionally high rates of evolution. Here, we decode the compact 73.8-megabase genome of Dimorphilus gyrociliatus, a meiobenthic segmented worm. The D. gyrociliatus genome retains traits classically associated with larger and slower-evolving genomes, such as an ordered, intact Hox cluster, a generally conserved developmental toolkit and traces of ancestral bilaterian linkage. Unlike some other animals with small genomes, the analysis of the D. gyrociliatus epigenome revealed canonical features of genome regulation, excluding the presence of operons and trans-splicing. Instead, the gene-dense D. gyrociliatus genome presents a divergent Myc pathway, a key physiological regulator of growth, proliferation and genome stability in animals. Altogether, our results uncover a conservative route to genome compaction in annelids, reminiscent of that observed in the vertebrate Takifugu rubripes.
Subjects
Annelida; Evolution, Molecular; Animals; Annelida/genetics; Genetic Linkage; Genome; Takifugu/genetics
Abstract
The mechanisms by which entire programmes of gene regulation emerged during evolution are poorly understood. Neuronal microexons represent the most conserved class of alternative splicing in vertebrates, and are critical for proper brain development and function. Here, we discover neural microexon programmes in non-vertebrate species and trace their origin to bilaterian ancestors through the emergence of a previously uncharacterized 'enhancer of microexons' (eMIC) protein domain. The eMIC domain originated as an alternative, neural-enriched splice isoform of the pan-eukaryotic Srrm2/SRm300 splicing factor gene, and subsequently became fixed in the vertebrate and neuronal-specific splicing regulator Srrm4/nSR100 and its paralogue Srrm3. Remarkably, the eMIC domain is necessary and sufficient for microexon splicing, and functions by interacting with the earliest components required for exon recognition. The emergence of a novel domain with restricted expression in the nervous system thus resulted in the evolution of splicing programmes that qualitatively expanded the neuronal molecular complexity in bilaterians.