1.
J Integr Bioinform ; 14(1)2017 Jun 05.
Article in English | MEDLINE | ID: mdl-28637930

ABSTRACT

The miRBase currently reports more than 25,000 microRNAs in several hundred genomes that belong to more than 1000 families of homologous sequences. Quantitative investigations of miRNA gene evolution require the construction of data sets that are consistent in their coverage and include those genomes that are of interest in a given study. Given the size and structure of the data, this can be achieved only with the help of a fully automatic pipeline that improves the available seed alignments, extends the set of available sequences by homology search, and reliably identifies true positive homology search results. Here we describe the current progress towards such a system, emphasizing the task of improving and completing the initial seed alignment.


Subject(s)
Evolution, Molecular , MicroRNAs/genetics , Sequence Alignment/methods , Sequence Homology , Animals , Automation , Databases, Genetic , Datasets as Topic , Genome/genetics , Humans
2.
Noncoding RNA ; 3(1)2017 Jan 05.
Article in English | MEDLINE | ID: mdl-29657275

ABSTRACT

The U3 small nucleolar RNA (snoRNA) is an essential player in the initial steps of ribosomal RNA biogenesis and is ubiquitously present in Eukarya. It is exceptional among the small nucleolar RNAs in its size, the presence of multiple conserved sequence boxes, a highly conserved secondary structure core, its biogenesis as an independent gene transcribed by polymerase III, and its involvement in pre-rRNA cleavage rather than chemical modification. Fungal U3 snoRNAs share many features with their sisters from other eukaryotic kingdoms but differ from them in particular in their 5' regions, which in fungi have a distinctive consensus structure and often harbour introns. Here we report on a comprehensive homology search and detailed analysis of the evolution of sequence and secondary structure features covering the entire kingdom Fungi.

3.
BMC Genomics ; 17(1): 969, 2016 11 24.
Article in English | MEDLINE | ID: mdl-27881081

ABSTRACT

BACKGROUND: Small nucleolar RNAs (snoRNAs) are one of the most ancient families amongst non-protein-coding RNAs. They are ubiquitous in Archaea and Eukarya but absent in bacteria. Their main function is to target chemical modifications of ribosomal RNAs. They fall into two classes, box C/D snoRNAs and box H/ACA snoRNAs, which are clearly distinguished by conserved sequence motifs and the type of chemical modification that they govern. Similarly to microRNAs, snoRNAs appear in distinct families of homologs that affect homologous targets. In animals, snoRNAs and their evolution have been studied in much detail. In plants, however, their evolution has attracted comparably little attention. RESULTS: In order to chart the phylogenetic distribution of individual snoRNA families in plants, we applied a sophisticated approach for identifying homologs of known plant snoRNAs across the plant kingdom. In response to the relatively fast evolution of snoRNAs, information on conserved sequence boxes, target sequences, and secondary structure is combined to identify additional snoRNAs. We identified 296 families of snoRNAs in 24 species and traced their evolution throughout the plant kingdom. Many of the plant snoRNA families comprise paralogs. We also found that targets are well-conserved for most snoRNA families. CONCLUSIONS: The sequence conservation of snoRNAs is sufficient to establish homologies between phyla. The degree of this conservation tapers off, however, between land plants and algae. Plant snoRNAs are frequently organized in highly conserved spatial clusters. As a resource for further investigations we provide carefully curated and annotated alignments for each snoRNA family under investigation.


Subject(s)
Multigene Family , Phylogeny , Plants/classification , Plants/genetics , RNA, Plant/genetics , RNA, Small Nucleolar/genetics , Base Sequence , Cluster Analysis , Computational Biology/methods , Conserved Sequence , Databases, Nucleic Acid , Evolution, Molecular
4.
Nucleic Acids Res ; 44(11): 5068-82, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27174936

ABSTRACT

Small nucleolar RNAs (snoRNAs) are a class of non-coding RNAs that guide the post-transcriptional processing of other non-coding RNAs (mostly ribosomal RNAs), but have also been implicated in processes ranging from microRNA-dependent gene silencing to alternative splicing. In order to construct an up-to-date catalog of human snoRNAs we have combined data from various databases, de novo prediction and extensive literature review. In total, we list more than 750 curated genomic loci that give rise to snoRNA and snoRNA-like genes. Utilizing small RNA-seq data from the ENCODE project, our study characterizes the plasticity of snoRNA expression, identifying both constitutively expressed and cell-type-specific snoRNAs. In particular, the comparison of malignant to non-malignant tissues and cell types shows a dramatic perturbation of the snoRNA expression profile. Finally, we developed a high-throughput variant of the reverse-transcriptase-based method for identifying 2'-O-methyl modifications in RNAs termed RimSeq. Using the data from this and other high-throughput protocols together with previously reported modification sites and state-of-the-art target prediction methods we re-estimate the snoRNA target RNA interaction network. Our current results assign a reliable modification site to 83% of the canonical snoRNAs, leaving only 76 snoRNA sequences as orphan.


Subject(s)
Gene Expression Profiling , RNA Processing, Post-Transcriptional , RNA, Small Nucleolar , Transcriptome , Cluster Analysis , Computational Biology/methods , Databases, Nucleic Acid , Gene Expression Regulation , Humans , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA, Untranslated
6.
Nat Genet ; 48(4): 427-37, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26950095

ABSTRACT

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.


Subject(s)
Fishes/genetics , Animals , Evolution, Molecular , Female , Fishes/metabolism , Genome , Humans , Karyotype , Models, Genetic , Organ Specificity , Sequence Analysis, DNA , Transcriptome
7.
RNA Biol ; 13(2): 119-27, 2016.
Article in English | MEDLINE | ID: mdl-26828373

ABSTRACT

U6 small nuclear RNAs are part of the splicing machinery. They exhibit several unique features setting them apart from other snRNAs. Reports of introns in structured non-coding RNAs have been very rare. U6 genes, however, were found to be interrupted by an intron in several Schizosaccharomyces species and in 2 Basidiomycota. We conducted a homology search across 147 currently available fungal genomes and identified the U6 genes in all but 2 of them. A detailed comparison of their sequences and predicted secondary structures showed that intron insertion events in the U6 snRNA were much more common in the fungal lineage than previously thought. Their positional distribution across the entire mature snRNA strongly suggests a large number of independent events. All the intron sequences reported here show canonical splice site and branch site motifs indicating that they require the spliceosomal pathway for their removal.


Subject(s)
Evolution, Molecular , Introns/genetics , RNA, Small Nuclear/genetics , Base Sequence , Genome, Fungal , Nucleic Acid Conformation , RNA Splicing , RNA, Small Nuclear/chemistry , Schizosaccharomyces/genetics , Sequence Homology, Nucleic Acid
8.
BMC Bioinformatics ; 17(Suppl 18): 464, 2016 Dec 15.
Article in English | MEDLINE | ID: mdl-28105919

ABSTRACT

BACKGROUND: snoReport uses RNA secondary structure prediction combined with machine learning as the basis to identify the two main classes of small nucleolar RNAs, the box H/ACA snoRNAs and the box C/D snoRNAs. Here, we present snoReport 2.0, which substantially improves and extends the original method by: extracting new features for both box C/D and H/ACA box snoRNAs; developing a more sophisticated technique in the SVM training phase with recent data from vertebrate organisms and a careful choice of the SVM parameters C and γ; and using updated versions of tools and databases used for the construction of the original version of snoReport. To validate the new version and to demonstrate its improved performance, we tested snoReport 2.0 in different organisms. RESULTS: Results of the training and test phases of boxes H/ACA and C/D snoRNAs, in both versions of snoReport, are discussed. Validation on real data was performed to evaluate the predictions of snoReport 2.0. Our program was applied to a set of previously annotated sequences, some of them experimentally confirmed, of humans, nematodes, drosophilids, platypus, chickens and leishmania. We significantly improved the predictions for vertebrates, since the training phase used information from these organisms, but H/ACA box snoRNA identification was also improved for the other organisms. CONCLUSION: We presented snoReport 2.0, to predict H/ACA box and C/D box snoRNAs, an efficient method to find true positives and avoid false positives in vertebrate organisms. The H/ACA box snoRNA classifier showed an F-score of 93 % (an improvement of 10 % over the previous version), while the C/D box snoRNA classifier showed an F-score of 94 % (an improvement of 14 %). Besides, both classifiers exhibited performance measures above 90 %. These results show that snoReport 2.0 avoids false positives and false negatives, allowing high-quality prediction of snoRNAs.
In the validation phase, snoReport 2.0 predicted 67.43 % of the snoRNAs of vertebrate organisms for both classes. For nematodes and drosophilids, 69 % and 76.67 % of H/ACA box snoRNAs were predicted, respectively, showing that snoReport 2.0 is well suited to identifying snoRNAs in vertebrates and also H/ACA box snoRNAs in invertebrate organisms.
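
The F-scores quoted in this abstract combine precision and recall into a single number. The following is a minimal sketch, assuming the standard F-beta definition (the abstract does not state which variant was used); the function name `f_score` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives and false negatives.

    Illustrative helper only; not part of snoReport.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of predictions that are correct
    recall = tp / (tp + fn)     # fraction of real snoRNAs that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 90 correct predictions, 10 false alarms, 10 misses.
score = f_score(tp=90, fp=10, fn=10)
```

With `beta=1.0` this is the harmonic mean of precision and recall, so a classifier scores highly only when it keeps both false positives and false negatives low.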


Subject(s)
Computational Biology/methods , Eukaryota/genetics , RNA, Small Nucleolar/chemistry , Support Vector Machine , Animals , Base Sequence , Computational Biology/instrumentation , Eukaryota/chemistry , Humans , Molecular Sequence Data , RNA, Small Nucleolar/genetics , Vertebrates/genetics
10.
PLoS One ; 10(3): e0121797, 2015.
Article in English | MEDLINE | ID: mdl-25822729

ABSTRACT

Here we present the results of a large-scale bioinformatics annotation of non-coding RNA loci in 48 avian genomes. Our approach uses probabilistic models of hand-curated families from the Rfam database to infer conserved RNA families within each avian genome. We supplement these annotations with predictions from the tRNA annotation tool, tRNAscan-SE and microRNAs from miRBase. We identify 34 lncRNA-associated loci that are conserved between birds and mammals and validate 12 of these in chicken. We report several intriguing cases where a reported mammalian lncRNA, but not its function, is conserved. We also demonstrate extensive conservation of classical ncRNAs (e.g., tRNAs) and more recently discovered ncRNAs (e.g., snoRNAs and miRNAs) in birds. Furthermore, we describe numerous "losses" of several RNA families, and attribute these to either genuine loss, divergence or missing data. In particular, we show that many of these losses are due to the challenges associated with assembling avian microchromosomes. These combined results illustrate the utility of applying homology-based methods for annotating novel vertebrate genomes.


Subject(s)
Birds/genetics , RNA, Untranslated/genetics , Animals , Chickens/genetics , Computational Biology , Conserved Sequence , Gene Dosage , Genetic Variation , Genome , Humans , Mammals/genetics , MicroRNAs/genetics , Molecular Sequence Annotation , Multigene Family , Pseudogenes , RNA, Small Nucleolar/genetics , Regulatory Elements, Transcriptional , Species Specificity
11.
Life (Basel) ; 5(1): 905-20, 2015 Mar 13.
Article in English | MEDLINE | ID: mdl-25780960

ABSTRACT

MicroRNAs are important regulatory small RNAs in many eukaryotes. Due to their small size and simple structure, they are readily innovated de novo. Throughout the evolution of animals, the emergence of novel microRNA families traces key morphological innovations. Here, we use a computational approach based on homology search and parsimony-based presence/absence analysis to draw a comprehensive picture of microRNA evolution in 159 animal species. We confirm previous observations regarding bursts of innovations accompanying the three rounds of genome duplications in vertebrate evolution and in the early evolution of placental mammals. With a much better resolution for the invertebrate lineage compared to large-scale studies, we observe additional bursts of innovation, e.g., in Rhabditoidea. More importantly, we see clear evidence that loss of microRNA families is not an uncommon phenomenon. The Enoplea may serve as a second dramatic example beyond the tunicates. The large-scale analysis presented here also highlights several generic technical issues in the analysis of very large gene families that will require further research.
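
A parsimony-based presence/absence analysis can be illustrated with a small sketch. The abstract does not say which parsimony variant was used, so the code below assumes a Dollo-style rule: each family is gained exactly once, at the last common ancestor of all species in which it is found. The toy tree, the node names and the function `gain_node` are hypothetical.

```python
def gain_node(parent: dict, present_leaves: set):
    """Return the tree node where a family is inferred to have been gained.

    `parent` maps each node to its parent; the root has no entry.
    Assumes a single gain (Dollo-style), placed at the last common
    ancestor of all leaves that carry the family.
    """
    def path_to_root(node):
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    leaves = sorted(present_leaves)
    if not leaves:
        return None
    # Start from the first leaf's ancestors and keep only those shared by all.
    common = path_to_root(leaves[0])
    for leaf in leaves[1:]:
        ancestors = set(path_to_root(leaf))
        common = [n for n in common if n in ancestors]
    return common[0]  # deepest shared ancestor


# Hypothetical toy tree with four species and three internal nodes.
toy_tree = {
    "human": "vertebrates", "zebrafish": "vertebrates",
    "fly": "protostomes", "worm": "protostomes",
    "vertebrates": "bilateria", "protostomes": "bilateria",
}
origin = gain_node(toy_tree, {"human", "zebrafish"})
```

Under this assumption, any species below the gain node that lacks the family counts as a loss, which is how lineage-specific losses such as those described above would show up.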

12.
Mol Cell ; 56(3): 389-399, 2014 Nov 06.
Article in English | MEDLINE | ID: mdl-25514182

ABSTRACT

Coilin protein scaffolds Cajal bodies (CBs), subnuclear compartments enriched in small nuclear RNAs (snRNAs), and promotes efficient spliceosomal snRNP assembly. The molecular function of coilin, which is intrinsically disordered with no defined motifs, is poorly understood. We use UV crosslinking and immunoprecipitation (iCLIP) to determine whether mammalian coilin binds RNA in vivo and to identify targets. Robust detection of snRNA transcripts correlated with coilin ChIP-seq peaks on snRNA genes, indicating that coilin binding to nascent snRNAs is a site-specific CB nucleator. Surprisingly, several hundred small nucleolar RNAs (snoRNAs) were identified as coilin interactors, including numerous unannotated mouse and human snoRNAs. We show that all classes of snoRNAs concentrate in CBs. Moreover, snoRNAs lacking specific CB retention signals traffic through CBs en route to nucleoli, consistent with the role of CBs in small RNP assembly. Thus, coilin couples snRNA and snoRNA biogenesis, making CBs the cellular hub of small ncRNA metabolism.


Subject(s)
Coiled Bodies/metabolism , Nuclear Proteins/metabolism , RNA, Small Untranslated/metabolism , Animals , Cell Cycle , Cell Nucleolus/metabolism , HeLa Cells , Humans , Mice , Protein Binding , RNA Transport
13.
Methods Mol Biol ; 1097: 437-56, 2014.
Article in English | MEDLINE | ID: mdl-24639171

ABSTRACT

The computational identification of novel microRNA (miRNA) genes is a challenging task in bioinformatics. Massive amounts of data describing unknown functional RNA transcripts have to be analyzed for putative miRNA candidates with automated computational pipelines. Beyond those miRNAs that meet the classical definition, high-throughput sequencing techniques have revealed additional miRNA-like molecules that are derived by alternative biogenesis pathways. Exhaustive bioinformatics analyses on such data involve statistical issues as well as precise sequence and structure inspection not only of the functional mature part but also of the whole precursor sequence of the putative miRNA. Apart from a considerable amount of species-specific miRNAs, the majority of all those genes are conserved at least among closely related organisms. Some miRNAs, however, can be traced back to very early points in the evolution of eukaryotic species. Thus, the investigation of the conservation of newly found miRNA candidates comprises an important step in the computational annotation of miRNAs. Topics covered in this chapter include a review on the obvious problem of miRNA annotation and family definition, recommended pipelines of computational miRNA annotation or detection, and an overview of current computer tools for the prediction of miRNAs and their limitations. The chapter closes by discussing how those bioinformatic approaches address the problem of faithful miRNA prediction and correct annotation.


Subject(s)
Computational Biology/methods , Genomics/methods , MicroRNAs/chemistry , MicroRNAs/genetics , Databases, Nucleic Acid , Internet , Software
14.
Mol Biol Evol ; 31(2): 455-67, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24162733

ABSTRACT

Ribosomal and small nuclear RNAs (snRNAs) comprise numerous modified nucleotides. The modification patterns are retained during evolution, making it even possible to project them from yeast onto human. The stringent conservation of modification sites and the slow evolution of rRNAs and snRNAs contrast with the rapid evolution of small nucleolar RNA (snoRNA) sequences. To explain this discrepancy, we investigated the coevolution of snoRNAs and their targeted sites throughout vertebrates. To measure and evaluate the conservation of RNA-RNA interactions, we defined the interaction conservation index (ICI). It combines the quality of individual interaction with the scope of its conservation in a set of species and serves as an efficient measure to evaluate the conservation of the interaction of snoRNA and target. We show that functions of homologous snoRNAs are evolutionarily stable; thus, members of the same snoRNA family guide equivalent modifications. The conservation of snoRNA sequences is high at target binding regions while the remaining sequence varies significantly. In addition to elucidating principles of correlated evolution, we were able, with the help of the ICI measure, to assign functions to previously orphan snoRNAs and to associate snoRNAs as partners to known chemical modifications unassigned to a given snoRNA. Furthermore, we used predictions of snoRNA functions in conjunction with sequence conservation to identify distant homologies. Because of the high overall entropy of snoRNA sequences, such relationships are hard to detect by means of sequence homology search methods alone.


Subject(s)
RNA, Ribosomal/metabolism , RNA, Small Nucleolar/chemistry , RNA, Small Nucleolar/genetics , Vertebrates/genetics , Animals , Binding Sites , Conserved Sequence , Evolution, Molecular , Humans , Models, Molecular , Nucleic Acid Conformation , Phylogeny , RNA, Ribosomal/genetics , Sequence Homology, Nucleic Acid , Vertebrates/metabolism
15.
Bioinformatics ; 30(1): 115-6, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24174566

ABSTRACT

MOTIVATION: Although small nucleolar RNAs form an important class of non-coding RNAs, no comprehensive annotation efforts have been undertaken, presumably because the task is complicated by both the large number of distinct small nucleolar RNA families and their relatively rapid pace of sequence evolution. RESULTS: With snoStrip we present an automatic annotation pipeline developed specifically for comparative genomics of small nucleolar RNAs. It makes use of sequence conservation, canonical box motifs as well as secondary structure and predicts putative targets. AVAILABILITY AND IMPLEMENTATION: The snoStrip web service and the download version are available at http://snostrip.bioinf.uni-leipzig.de/
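
As an illustration of what scanning for canonical box motifs can look like, here is a minimal sketch for box C/D candidates. It is not snoStrip's implementation: it assumes the commonly cited consensus motifs (box C `RUGAUGA`, box D `CUGA`), requires exact matches, and ignores secondary structure and target prediction. The function name `find_cd_boxes` and the example sequence are hypothetical.

```python
import re

# Commonly cited consensus motifs for box C/D snoRNAs (R = A or G).
BOX_C = re.compile(r"[AG]UGAUGA")
BOX_D = re.compile(r"CUGA")


def find_cd_boxes(seq: str):
    """Return (box_C_start, box_D_start) for a candidate, or None.

    Takes the first box C and the last box D downstream of it,
    reflecting their expected positions near the 5' and 3' ends.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    c_hit = BOX_C.search(rna)
    if c_hit is None:
        return None
    d_starts = [m.start() for m in BOX_D.finditer(rna, c_hit.end())]
    if not d_starts:
        return None
    return c_hit.start(), d_starts[-1]


# Hypothetical candidate: box C near the 5' end, box D near the 3' end.
candidate = "GG" + "AUGAUGA" + "C" * 10 + "CUGA" + "GG"
boxes = find_cd_boxes(candidate)
```

A real pipeline would tolerate mismatches in the boxes and combine the motif hits with sequence conservation and structural evidence, as the abstract describes.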


Subject(s)
High-Throughput Nucleotide Sequencing/methods , RNA, Small Nucleolar/genetics , Base Sequence , Conserved Sequence/genetics , RNA, Small Nucleolar/chemistry , Sequence Analysis, RNA , Software
16.
Curr Biol ; 22(14): 1309-13, 2012 Jul 24.
Article in English | MEDLINE | ID: mdl-22704986

ABSTRACT

The phylogeny of insects, one of the most spectacular radiations of life on earth, has received considerable attention. However, the evolutionary roots of one intriguing group of insects, the twisted-wing parasites (Strepsiptera), remain unclear despite centuries of study and debate. Strepsiptera exhibit exceptional larval developmental features, consistent with a predicted step from direct (hemimetabolous) larval development to complete metamorphosis that could have set the stage for the spectacular radiation of metamorphic (holometabolous) insects. Here we report the sequencing of a Strepsiptera genome and show that the analysis of sequence-based genomic data (comprising more than 18 million nucleotides from nearly 4,500 genes obtained from a total of 13 insect genomes), along with genomic metacharacters, clarifies the phylogenetic origin of Strepsiptera and sheds light on the evolution of holometabolous insect development. Our results provide overwhelming support for Strepsiptera as the closest living relatives of beetles (Coleoptera). They demonstrate that the larval developmental features of Strepsiptera, reminiscent of those of hemimetabolous insects, are the result of convergence. Our analyses solve the long-standing enigma of the evolutionary roots of Strepsiptera and reveal that the holometabolous mode of insect development is more malleable than previously thought.


Subject(s)
Genome, Insect , Insecta/classification , Insecta/genetics , Phylogeny , Animals , Biological Evolution , Genome, Mitochondrial , Insecta/anatomy & histology , Insecta/growth & development , Molecular Sequence Data , Sequence Alignment , Sequence Analysis, DNA , Sequence Analysis, Protein
17.
RNA Biol ; 9(3): 231-41, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22617875

ABSTRACT

The increase of bodyplan complexity in early bilaterian evolution correlates with the advent and diversification of microRNAs. These small RNAs guide animal development by regulating temporal transitions in gene expression involved in cell fate choices and transitions between pluripotency and differentiation. One of the two known microRNAs whose origins date back before the bilaterian ancestor is mir-100. In Bilateria, it appears stably associated in polycistronic transcripts with let-7 and mir-125, two key regulators of development. In vertebrates, these three microRNA families have expanded to form a complex system of developmental regulators. In this contribution, we disentangle the evolutionary history of the let-7 locus, which was restructured independently in nematodes, platyhelminths, and deuterostomes. The foundation of a second let-7 locus in the common ancestor of vertebrates and urochordates predates the vertebrate-specific genome duplications, which then caused a rapid expansion of the let-7 family.


Subject(s)
Evolution, Molecular , MicroRNAs/genetics , Multigene Family , Animals , Base Sequence , Cluster Analysis , Computational Biology/methods , Gnathostoma/genetics , Humans , Lung Neoplasms/genetics , Molecular Sequence Data , Phylogeny , Sequence Alignment
18.
Bioinformatics ; 26(5): 610-6, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20015949

ABSTRACT

MOTIVATION: Small nucleolar RNAs are an abundant class of non-coding RNAs that guide chemical modifications of rRNAs, snRNAs and some mRNAs. In the case of many 'orphan' snoRNAs, however, the targeted nucleotides remain unknown. The box H/ACA subclass determines uridine residues that are to be converted into pseudouridines via specific complementary binding in a well-defined secondary structure configuration that is outside the scope of common RNA (co-)folding algorithms. RESULTS: RNAsnoop implements a dynamic programming algorithm that computes thermodynamically optimal H/ACA-RNA interactions in an efficient scanning variant. Complemented by a support vector machine (SVM)-based machine learning approach to distinguish true binding sites from spurious solutions and a system to evaluate comparative information, it presents an efficient and reliable tool for the prediction of H/ACA snoRNA target sites. We apply RNAsnoop to identify the snoRNAs that are responsible for several of the remaining 'orphan' pseudouridine modifications in human rRNAs, and we assign a target to one of the five orphan H/ACA snoRNAs in Drosophila. AVAILABILITY: The C source code of RNAsnoop is freely available at http://www.tbi.univie.ac.at/ -htafer/RNAsnoop


Subject(s)
Genomics/methods , RNA, Small Nucleolar/chemistry , Software , Algorithms , Binding Sites , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Ribosomal/chemistry , Sequence Analysis, RNA
19.
BMC Genomics ; 10: 464, 2009 Oct 08.
Article in English | MEDLINE | ID: mdl-19814823

ABSTRACT

BACKGROUND: Schistosomes are trematode parasites of the phylum Platyhelminthes. They are considered the most important of the human helminth parasites in terms of morbidity and mortality. Draft genome sequences are now available for Schistosoma mansoni and Schistosoma japonicum. Non-coding RNA (ncRNA) plays a crucial role in gene expression regulation, cellular function and defense, homeostasis, and pathogenesis. The genome-wide annotation of ncRNAs is a non-trivial task unless well-annotated genomes of closely related species are already available. RESULTS: A homology search for structured ncRNA in the genome of S. mansoni resulted in 23 types of ncRNAs with conserved primary and secondary structure. Among these, we identified rRNA, snRNA, SL RNA, SRP, tRNAs and RNase P, and also possibly MRP and 7SK RNAs. In addition, we confirmed five miRNAs that have recently been reported in S. japonicum and found two additional homologs of known miRNAs. The tRNA complement of S. mansoni is comparable to that of the free-living planarian Schmidtea mediterranea, although for some amino acids differences of more than a factor of two are observed: Leu, Ser, and His are overrepresented, while Cys, Meth, and Ile are underrepresented in S. mansoni. On the other hand, the number of tRNAs in the genome of S. japonicum is reduced by more than a factor of four. Both schistosomes have a complete set of minor spliceosomal snRNAs. Several ncRNAs that are expected to exist in the S. mansoni genome were not found, among them the telomerase RNA, vault RNAs, and Y RNAs. CONCLUSION: The ncRNA sequences and structures presented here represent the most complete dataset of ncRNA from any lophotrochozoan reported so far. This data set provides an important reference for further analysis of the genomes of schistosomes and indeed eukaryotic genomes at large.


Subject(s)
Genome, Helminth , RNA, Helminth/genetics , RNA, Untranslated/genetics , Schistosoma japonicum/genetics , Schistosoma mansoni/genetics , Animals , Base Sequence , Conserved Sequence , MicroRNAs/genetics , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Ribosomal/genetics , RNA, Small Nucleolar/genetics , RNA, Spliced Leader/genetics , RNA, Transfer/genetics , Sequence Alignment , Sequence Analysis, RNA , Sequence Homology, Nucleic Acid
20.
Nucleic Acids Res ; 37(18): 6184-93, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19723687

ABSTRACT

Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are of course amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups.


Subject(s)
Phylogeny , RNA, Ribosomal/classification , Animals , Base Sequence , Echinodermata/genetics , Nucleic Acid Conformation , Primates/genetics , RNA, Ribosomal/chemistry , Sequence Alignment , Software