ABSTRACT
Alternative splicing (AS) generates remarkable regulatory and proteomic complexity in metazoans. However, the functions of most AS events are not known, and programs of regulated splicing remain to be identified. To address these challenges, we describe the Vertebrate Alternative Splicing and Transcription Database (VastDB), the largest resource of genome-wide, quantitative profiles of AS events assembled to date. VastDB provides readily accessible quantitative information on the inclusion levels and functional associations of AS events detected in RNA-seq data from diverse vertebrate cell and tissue types, as well as developmental stages. The VastDB profiles reveal extensive new intergenic and intragenic regulatory relationships among different classes of AS and previously unknown and conserved landscapes of tissue-regulated exons. Contrary to recent reports concluding that nearly all human genes express a single major isoform, VastDB provides evidence that at least 48% of multiexonic protein-coding genes express multiple splice variants that are highly regulated in a cell/tissue-specific manner, and that >18% of genes simultaneously express multiple major isoforms across diverse cell and tissue types. Isoforms encoded by the latter set of genes are generally coexpressed in the same cells and are often engaged by translating ribosomes. Moreover, they are encoded by genes that are significantly enriched in functions associated with transcriptional control, implying they may have an important and wide-ranging role in controlling cellular activities. VastDB thus provides an unprecedented resource for investigations of AS function and regulation.
Subject(s)
Alternative Splicing , Nucleic Acid Databases , Exons , Gene Regulatory Networks , Protein Isoforms , Animals , Chickens , Humans , Mice , Protein Isoforms/biosynthesis , Protein Isoforms/genetics
ABSTRACT
Long non-coding RNAs (lncRNAs) are functional non-translated molecules greater than 200 nt. Their roles are diverse and they are usually involved in transcriptional regulation. With few exceptions, lncRNAs remain largely uninvestigated in plants. Experimentally validated plant lncRNAs have been shown to regulate important agronomic traits such as phosphate starvation response, flowering time and interaction with symbiotic organisms, making them of great interest in plant biology and in breeding. LncRNA annotations are still lacking for most sequenced plant species, and where they exist, they were produced with different methods, which makes the lncRNAs less useful for comparisons within and between species. We developed a pipeline to annotate lncRNAs and applied it to 37 plant species and six algae, resulting in the annotation of more than 120 000 lncRNAs. To facilitate the study of lncRNAs for the plant research community, the information gathered is organised in the Green Non-Coding Database (GreeNC, http://greenc.sciencedesigners.com/).
Subject(s)
Nucleic Acid Databases , Plant Genome , Long Noncoding RNA/chemistry , Long Noncoding RNA/genetics , Molecular Sequence Annotation
ABSTRACT
Alternative splicing (AS) can vastly expand animal transcriptomes and proteomes. Two main open questions in the field are how AS is regulated across cell/tissue types and in disease, and what roles different AS events play. To facilitate AS research, we have created the computational VastDB framework, which comprises a set of complementary software tools and resources that we describe in this chapter. The VastDB framework is specifically designed to aid biomedical researchers without a strong computational background. It offers tools and resources to: (a) quantify AS and identify differentially spliced AS events using RNA-seq data (vast-tools), (b) perform multiple genomic and sequence analyses for investigating AS events (Matt), (c) identify AS events with genomic and regulatory conservation among species (ExOrthist), and (d) help with the biological interpretation of the results and, ultimately, with the identification of interesting AS events for the design of wet-lab experiments (VastDB and PastDB).
Subject(s)
Alternative Splicing , Software , Animals , Computational Biology/methods , Exons , Genome , Genomics/methods
ABSTRACT
Several bioinformatic tools have been developed for genome-wide identification of orthologous and paralogous genes. However, no corresponding tool allows the detection of exon homology relationships. Here, we present ExOrthist, a fully reproducible Nextflow-based software enabling inference of exon homologs and orthogroups, visualization of evolution of exon-intron structures, and assessment of conservation of alternative splicing patterns. ExOrthist evaluates exon sequence conservation and considers the surrounding exon-intron context to derive genome-wide multi-species exon homologies at any evolutionary distance. We demonstrate its use in different evolutionary scenarios: whole genome duplication in frogs and convergence of Nova-regulated splicing networks (https://github.com/biocorecrg/ExOrthist).
Subject(s)
Computational Biology , Molecular Evolution , Exons , Software , Alternative Splicing , Animals , Conserved Sequence , Genome , Humans , Introns , Mice
ABSTRACT
The turbot is a flatfish (Pleuronectiformes) with increasing commercial value, which has prompted active genomic research aimed at more efficient selection. Here we present the sequence and annotation of the turbot genome, which represents a milestone for both boosting breeding programmes and ascertaining the origin and diversification of flatfish. We compare the turbot genome with model fish genomes to investigate teleost chromosome evolution. We observe a conserved macrosyntenic pattern within Percomorpha and identify large syntenic blocks within the turbot genome related to the teleost genome duplication. We identify gene family expansions and positive selection of genes associated with vision and metabolism of membrane lipids, which suggests adaptation to demersal lifestyle and to cold temperatures, respectively. Our data indicate a quick evolution and diversification of flatfish to adapt to benthic life and provide clues for understanding their controversial origin. Moreover, we investigate the genomic architecture of growth, sex determination and disease resistance, key traits for understanding local adaptation and boosting turbot production, by mapping candidate genes and previously reported quantitative trait loci. The genomic architecture of these productive traits has allowed the identification of candidate genes and enriched pathways that may represent useful information for future marker-assisted selection in turbot.