Results 1 - 20 of 32
1.
Mol Cell ; 56(3): 389-399, 2014 Nov 06.
Article in English | MEDLINE | ID: mdl-25514182

ABSTRACT

Coilin protein scaffolds Cajal bodies (CBs), subnuclear compartments enriched in small nuclear RNAs (snRNAs), and promotes efficient spliceosomal snRNP assembly. The molecular function of coilin, which is intrinsically disordered with no defined motifs, is poorly understood. We use UV crosslinking and immunoprecipitation (iCLIP) to determine whether mammalian coilin binds RNA in vivo and to identify targets. Robust detection of snRNA transcripts correlated with coilin ChIP-seq peaks on snRNA genes, indicating that coilin binding to nascent snRNAs is a site-specific CB nucleator. Surprisingly, several hundred small nucleolar RNAs (snoRNAs) were identified as coilin interactors, including numerous unannotated mouse and human snoRNAs. We show that all classes of snoRNAs concentrate in CBs. Moreover, snoRNAs lacking specific CB retention signals traffic through CBs en route to nucleoli, consistent with the role of CBs in small RNP assembly. Thus, coilin couples snRNA and snoRNA biogenesis, making CBs the cellular hub of small ncRNA metabolism.


Subjects
Coiled Bodies/metabolism , Nuclear Proteins/metabolism , Small Untranslated RNA/metabolism , Animals , Cell Cycle , Cell Nucleolus/metabolism , HeLa Cells , Humans , Mice , Protein Binding , RNA Transport
2.
Nucleic Acids Res ; 44(11): 5068-82, 2016 Jun 20.
Article in English | MEDLINE | ID: mdl-27174936

ABSTRACT

Small nucleolar RNAs (snoRNAs) are a class of non-coding RNAs that guide the post-transcriptional processing of other non-coding RNAs (mostly ribosomal RNAs), but have also been implicated in processes ranging from microRNA-dependent gene silencing to alternative splicing. In order to construct an up-to-date catalog of human snoRNAs, we have combined data from various databases, de novo prediction and extensive literature review. In total, we list more than 750 curated genomic loci that give rise to snoRNA and snoRNA-like genes. Utilizing small RNA-seq data from the ENCODE project, our study characterizes the plasticity of snoRNA expression, identifying both constitutively expressed and cell-type-specific snoRNAs. In particular, the comparison of malignant to non-malignant tissues and cell types shows a dramatic perturbation of the snoRNA expression profile. Finally, we developed a high-throughput variant, termed RimSeq, of the reverse-transcriptase-based method for identifying 2'-O-methyl modifications in RNAs. Using the data from this and other high-throughput protocols together with previously reported modification sites and state-of-the-art target prediction methods, we re-estimate the snoRNA target RNA interaction network. Our current results assign a reliable modification site to 83% of the canonical snoRNAs, leaving only 76 snoRNA sequences as orphans.


Subjects
Gene Expression Profiling , Post-Transcriptional RNA Processing , Small Nucleolar RNA , Transcriptome , Cluster Analysis , Computational Biology/methods , Nucleic Acid Databases , Gene Expression Regulation , Humans , Molecular Sequence Annotation , Nucleic Acid Conformation , Untranslated RNA
3.
BMC Bioinformatics ; 17(Suppl 18): 464, 2016 Dec 15.
Article in English | MEDLINE | ID: mdl-28105919

ABSTRACT

BACKGROUND: snoReport uses RNA secondary structure prediction combined with machine learning as the basis to identify the two main classes of small nucleolar RNAs, the box H/ACA snoRNAs and the box C/D snoRNAs. Here, we present snoReport 2.0, which substantially improves and extends the original method by: extracting new features for both box C/D and box H/ACA snoRNAs; developing a more sophisticated technique in the SVM training phase with recent data from vertebrate organisms and a careful choice of the SVM parameters C and γ; and using updated versions of the tools and databases used for the construction of the original version of snoReport. To validate the new version and to demonstrate its improved performance, we tested snoReport 2.0 in different organisms. RESULTS: Results of the training and test phases of box H/ACA and box C/D snoRNAs, in both versions of snoReport, are discussed. Validation on real data was performed to evaluate the predictions of snoReport 2.0. Our program was applied to a set of previously annotated sequences, some of them experimentally confirmed, of humans, nematodes, drosophilids, platypus, chickens and leishmania. We significantly improved the predictions for vertebrates, since the training phase used information from these organisms, and H/ACA box snoRNA identification was also improved for the other organisms. CONCLUSION: We presented snoReport 2.0, which predicts H/ACA box and C/D box snoRNAs, an efficient method to find true positives and avoid false positives in vertebrate organisms. The H/ACA box snoRNA classifier showed an F-score of 93 % (an improvement of 10 % over the previous version), while the C/D box snoRNA classifier showed an F-score of 94 % (an improvement of 14 %). Besides, both classifiers exhibited performance measures above 90 %. These results show that snoReport 2.0 avoids false positives and false negatives, allowing snoRNAs to be predicted with high quality. In the validation phase, snoReport 2.0 predicted 67.43 % of vertebrate organisms for both classes. For nematodes and drosophilids, 69 % and 76.67 % of H/ACA box snoRNAs were predicted, respectively, showing that snoReport 2.0 is good at identifying snoRNAs in vertebrates and also H/ACA box snoRNAs in invertebrate organisms.


Subjects
Computational Biology/methods , Eukaryota/genetics , Small Nucleolar RNA/chemistry , Support Vector Machine , Animals , Base Sequence , Computational Biology/instrumentation , Eukaryota/chemistry , Humans , Molecular Sequence Data , Small Nucleolar RNA/genetics , Vertebrates/genetics
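
The F-scores quoted in entry 3 summarize precision and recall in a single number. The abstract does not say which variant was used, so the sketch below assumes the balanced F1 score (the harmonic mean of precision and recall); the counts are hypothetical and are not taken from the paper.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the balanced F1 score: the harmonic mean of precision and recall."""
    if true_pos == 0:
        # No true positives means precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the calculation.
example = f_score(true_pos=90, false_pos=5, false_neg=10)
```

Because the score drops when either false positives or false negatives grow, a high F-score is consistent with the abstract's claim that both kinds of error are kept low.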
4.
BMC Genomics ; 17(1): 969, 2016 Nov 24.
Article in English | MEDLINE | ID: mdl-27881081

ABSTRACT

BACKGROUND: Small nucleolar RNAs (snoRNAs) are one of the most ancient families amongst non-protein-coding RNAs. They are ubiquitous in Archaea and Eukarya but absent in bacteria. Their main function is to target chemical modifications of ribosomal RNAs. They fall into two classes, box C/D snoRNAs and box H/ACA snoRNAs, which are clearly distinguished by conserved sequence motifs and the type of chemical modification that they govern. Similarly to microRNAs, snoRNAs appear in distinct families of homologs that affect homologous targets. In animals, snoRNAs and their evolution have been studied in much detail. In plants, however, their evolution has attracted comparatively little attention. RESULTS: In order to chart the phylogenetic distribution of individual snoRNA families in plants, we applied a sophisticated approach for identifying homologs of known plant snoRNAs across the plant kingdom. In response to the relatively fast evolution of snoRNAs, information on conserved sequence boxes, target sequences, and secondary structure is combined to identify additional snoRNAs. We identified 296 families of snoRNAs in 24 species and traced their evolution throughout the plant kingdom. Many of the plant snoRNA families comprise paralogs. We also found that targets are well-conserved for most snoRNA families. CONCLUSIONS: The sequence conservation of snoRNAs is sufficient to establish homologies between phyla. The degree of this conservation tapers off, however, between land plants and algae. Plant snoRNAs are frequently organized in highly conserved spatial clusters. As a resource for further investigations, we provide carefully curated and annotated alignments for each snoRNA family under investigation.


Subjects
Multigene Family , Phylogeny , Plants/classification , Plants/genetics , Plant RNA/genetics , Small Nucleolar RNA/genetics , Base Sequence , Cluster Analysis , Computational Biology/methods , Conserved Sequence , Nucleic Acid Databases , Molecular Evolution
5.
RNA Biol ; 13(2): 119-27, 2016.
Article in English | MEDLINE | ID: mdl-26828373

ABSTRACT

U6 small nuclear RNAs are part of the splicing machinery. They exhibit several unique features setting them apart from other snRNAs. Reports of introns in structured non-coding RNAs have been very rare. U6 genes, however, were found to be interrupted by an intron in several Schizosaccharomyces species and in 2 Basidiomycota. We conducted a homology search across 147 currently available fungal genomes and identified the U6 genes in all but 2 of them. A detailed comparison of their sequences and predicted secondary structures showed that intron insertion events in the U6 snRNA were much more common in the fungal lineage than previously thought. Their positional distribution across the entire mature snRNA strongly suggests a large number of independent events. All the intron sequences reported here show canonical splice site and branch site motifs, indicating that they require the spliceosomal pathway for their removal.


Subjects
Molecular Evolution , Introns/genetics , Small Nuclear RNA/genetics , Base Sequence , Fungal Genome , Nucleic Acid Conformation , RNA Splicing , Small Nuclear RNA/chemistry , Schizosaccharomyces/genetics , Nucleic Acid Sequence Homology
6.
Mol Biol Evol ; 31(2): 455-67, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24162733

ABSTRACT

Ribosomal and small nuclear RNAs (snRNAs) comprise numerous modified nucleotides. The modification patterns are retained during evolution, making it even possible to project them from yeast onto human. The stringent conservation of modification sites and the slow evolution of rRNAs and snRNAs contrast with the rapid evolution of small nucleolar RNA (snoRNA) sequences. To explain this discrepancy, we investigated the coevolution of snoRNAs and their targeted sites throughout vertebrates. To measure and evaluate the conservation of RNA-RNA interactions, we defined the interaction conservation index (ICI). It combines the quality of an individual interaction with the scope of its conservation in a set of species and serves as an efficient measure to evaluate the conservation of the interaction of snoRNA and target. We show that functions of homologous snoRNAs are evolutionarily stable; thus, members of the same snoRNA family guide equivalent modifications. The conservation of snoRNA sequences is high at target binding regions, while the remaining sequence varies significantly. In addition to elucidating principles of correlated evolution, we were able, with the help of the ICI measure, to assign functions to previously orphan snoRNAs and to associate snoRNAs as partners to known chemical modifications unassigned to a given snoRNA. Furthermore, we used predictions of snoRNA functions in conjunction with sequence conservation to identify distant homologies. Because of the high overall entropy of snoRNA sequences, such relationships are hard to detect by means of sequence homology search methods alone.


Subjects
Ribosomal RNA/metabolism , Small Nucleolar RNA/chemistry , Small Nucleolar RNA/genetics , Vertebrates/genetics , Animals , Binding Sites , Conserved Sequence , Molecular Evolution , Humans , Molecular Models , Nucleic Acid Conformation , Phylogeny , Ribosomal RNA/genetics , Nucleic Acid Sequence Homology , Vertebrates/metabolism
7.
Bioinformatics ; 30(1): 115-6, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24174566

ABSTRACT

MOTIVATION: Although small nucleolar RNAs form an important class of non-coding RNAs, no comprehensive annotation efforts have been undertaken, presumably because the task is complicated by both the large number of distinct small nucleolar RNA families and their relatively rapid pace of sequence evolution. RESULTS: With snoStrip we present an automatic annotation pipeline developed specifically for comparative genomics of small nucleolar RNAs. It makes use of sequence conservation, canonical box motifs as well as secondary structure and predicts putative targets. AVAILABILITY AND IMPLEMENTATION: The snoStrip web service and the download version are available at http://snostrip.bioinf.uni-leipzig.de/


Subjects
High-Throughput Nucleotide Sequencing/methods , Small Nucleolar RNA/genetics , Base Sequence , Conserved Sequence/genetics , Small Nucleolar RNA/chemistry , RNA Sequence Analysis , Software
8.
RNA Biol ; 9(3): 231-41, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22617875

ABSTRACT

The increase of body plan complexity in early bilaterian evolution correlates with the advent and diversification of microRNAs. These small RNAs guide animal development by regulating temporal transitions in gene expression involved in cell fate choices and transitions between pluripotency and differentiation. One of the two known microRNAs whose origins date back before the bilaterian ancestor is mir-100. In Bilateria, it appears stably associated in polycistronic transcripts with let-7 and mir-125, two key regulators of development. In vertebrates, these three microRNA families have expanded to form a complex system of developmental regulators. In this contribution, we disentangle the evolutionary history of the let-7 locus, which was restructured independently in nematodes, platyhelminths, and deuterostomes. The foundation of a second let-7 locus in the common ancestor of vertebrates and urochordates predates the vertebrate-specific genome duplications, which then caused a rapid expansion of the let-7 family.


Subjects
Molecular Evolution , MicroRNAs/genetics , Multigene Family , Animals , Base Sequence , Cluster Analysis , Computational Biology/methods , Gnathostoma/genetics , Humans , Lung Neoplasms/genetics , Molecular Sequence Data , Phylogeny , Sequence Alignment
9.
Bioinformatics ; 26(5): 610-6, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20015949

ABSTRACT

MOTIVATION: Small nucleolar RNAs are an abundant class of non-coding RNAs that guide chemical modifications of rRNAs, snRNAs and some mRNAs. For many 'orphan' snoRNAs, however, the targeted nucleotides remain unknown. The box H/ACA subclass determines uridine residues that are to be converted into pseudouridines via specific complementary binding in a well-defined secondary structure configuration that is outside the scope of common RNA (co-)folding algorithms. RESULTS: RNAsnoop implements a dynamic programming algorithm that computes thermodynamically optimal H/ACA-RNA interactions in an efficient scanning variant. Complemented by a support vector machine (SVM)-based machine learning approach to distinguish true binding sites from spurious solutions and a system to evaluate comparative information, it presents an efficient and reliable tool for the prediction of H/ACA snoRNA target sites. We apply RNAsnoop to identify the snoRNAs that are responsible for several of the remaining 'orphan' pseudouridine modifications in human rRNAs, and we assign a target to one of the five orphan H/ACA snoRNAs in Drosophila. AVAILABILITY: The C source code of RNAsnoop is freely available at http://www.tbi.univie.ac.at/~htafer/RNAsnoop


Subjects
Genomics/methods , Small Nucleolar RNA/chemistry , Software , Algorithms , Binding Sites , Molecular Sequence Data , Nucleic Acid Conformation , Ribosomal RNA/chemistry , RNA Sequence Analysis
10.
Nucleic Acids Res ; 37(18): 6184-93, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19723687

ABSTRACT

Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are of course amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups.


Subjects
Phylogeny , Ribosomal RNA/classification , Animals , Base Sequence , Echinodermata/genetics , Nucleic Acid Conformation , Primates/genetics , Ribosomal RNA/chemistry , Sequence Alignment , Software
11.
Nucleic Acids Res ; 37(5): 1602-15, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19151082

ABSTRACT

A detailed annotation of non-protein coding RNAs is typically missing in initial releases of newly sequenced genomes. Here we report on a comprehensive ncRNA annotation of the genome of Trichoplax adhaerens, the presumably most basal metazoan whose genome has been published to date. Since BLAST identified only a small fraction of the best-conserved ncRNAs (in particular rRNAs, tRNAs and some snRNAs), we developed a semi-global dynamic programming tool, GotohScan, to increase the sensitivity of the homology search. It successfully identified the full complement of major and minor spliceosomal snRNAs, the genes for RNase P and MRP RNAs, the SRP RNA, as well as several small nucleolar RNAs. We did not find any microRNA candidates homologous to known eumetazoan sequences. Interestingly, most ncRNAs, including the pol-III transcripts, appear as single-copy genes or with very small copy numbers in the Trichoplax genome.


Subjects
Genome , Placozoa/genetics , Untranslated RNA/genetics , Animals , Base Sequence , Endoribonucleases/chemistry , MicroRNAs/chemistry , Molecular Sequence Data , Nucleic Acid Conformation , Ribosomal RNA/genetics , Small Cytoplasmic RNA/chemistry , Small Nuclear RNA/chemistry , Small Nuclear RNA/genetics , Small Nucleolar RNA/chemistry , Small Nucleolar RNA/genetics , Transfer RNA/genetics , Ribonuclease P/genetics , Signal Recognition Particle/chemistry , Software
13.
Bioinformatics ; 25(18): 2298-301, 2009 Sep 15.
Article in English | MEDLINE | ID: mdl-19584066

ABSTRACT

MicroRNA-offset-RNAs (moRNAs) were recently detected as a highly abundant class of small RNAs in a basal chordate. Using short read sequencing data, we show here that moRNAs are also produced from human microRNA precursors, albeit at quite low expression levels. The expression levels of moRNAs are unrelated to those of the associated microRNAs. Surprisingly, microRNA precursors that also show moRNAs are typically evolutionarily old, comprising more than half of the microRNA families that were present in early Bilateria, while evidence for moRNAs was found only for a relatively small fraction of microRNA families of recent origin.


Subjects
MicroRNAs/chemistry , Small Interfering RNA/chemistry , RNA/chemistry , Humans , RNA Sequence Analysis
14.
Nucleic Acids Res ; 36(8): 2677-89, 2008 May.
Article in English | MEDLINE | ID: mdl-18346967

ABSTRACT

Small non-protein-coding RNAs (ncRNAs) have systematically been studied in various model organisms from Escherichia coli to Homo sapiens. Here, we analyse the small ncRNA transcriptome from the pathogenic filamentous fungus Aspergillus fumigatus. To that end, we experimentally screened for ncRNAs, expressed under various growth conditions or during specific developmental stages, by generating a specialized cDNA library from size-selected small RNA species. Our screen revealed 30 novel ncRNA candidates from known ncRNA classes such as small nuclear RNAs (snRNAs) and C/D box-type small nucleolar RNAs (C/D box snoRNAs). Additionally, several candidates for H/ACA box snoRNAs could be predicted by a bioinformatical screen. We also identified 15 candidates for ncRNAs, which could not be assigned to any known ncRNA class. Some of these ncRNA species are developmentally regulated, implying a possible novel function in A. fumigatus development. Surprisingly, in addition to full-length tRNAs, we also identified 5'- or 3'-halves of tRNAs alone, which are likely generated by tRNA cleavage within the anticodon loop. We show that conidiation induces tRNA cleavage resulting in tRNA depletion within conidia. Since conidia represent the resting state of A. fumigatus, we propose that conidial tRNA depletion might be a novel mechanism to down-regulate protein synthesis in a filamentous fungus.


Subjects
Aspergillus fumigatus/genetics , Fungal Gene Expression Regulation , Protein Biosynthesis , Untranslated RNA/metabolism , Aspergillus fumigatus/growth & development , Gene Expression Profiling , Gene Library , Small Nuclear RNA/metabolism , Small Nucleolar RNA/metabolism , Transfer RNA/metabolism , Untranslated RNA/classification
15.
BMC Genomics ; 10: 464, 2009 Oct 08.
Article in English | MEDLINE | ID: mdl-19814823

ABSTRACT

BACKGROUND: Schistosomes are trematode parasites of the phylum Platyhelminthes. They are considered the most important of the human helminth parasites in terms of morbidity and mortality. Draft genome sequences are now available for Schistosoma mansoni and Schistosoma japonicum. Non-coding RNA (ncRNA) plays a crucial role in gene expression regulation, cellular function and defense, homeostasis, and pathogenesis. The genome-wide annotation of ncRNAs is a non-trivial task unless well-annotated genomes of closely related species are already available. RESULTS: A homology search for structured ncRNA in the genome of S. mansoni resulted in 23 types of ncRNAs with conserved primary and secondary structure. Among these, we identified rRNA, snRNA, SL RNA, SRP, tRNAs and RNase P, and also possibly MRP and 7SK RNAs. In addition, we confirmed five miRNAs that have recently been reported in S. japonicum and found two additional homologs of known miRNAs. The tRNA complement of S. mansoni is comparable to that of the free-living planarian Schmidtea mediterranea, although for some amino acids differences of more than a factor of two are observed: Leu, Ser, and His are overrepresented, while Cys, Met, and Ile are underrepresented in S. mansoni. On the other hand, the number of tRNAs in the genome of S. japonicum is reduced by more than a factor of four. Both schistosomes have a complete set of minor spliceosomal snRNAs. Several ncRNAs that are expected to exist in the S. mansoni genome were not found, among them the telomerase RNA, vault RNAs, and Y RNAs. CONCLUSION: The ncRNA sequences and structures presented here represent the most complete dataset of ncRNA from any lophotrochozoan reported so far. This dataset provides an important reference for further analysis of the genomes of schistosomes and indeed eukaryotic genomes at large.


Subjects
Helminth Genome , Helminth RNA/genetics , Untranslated RNA/genetics , Schistosoma japonicum/genetics , Schistosoma mansoni/genetics , Animals , Base Sequence , Conserved Sequence , MicroRNAs/genetics , Molecular Sequence Data , Nucleic Acid Conformation , Ribosomal RNA/genetics , Small Nucleolar RNA/genetics , Spliced Leader RNA/genetics , Transfer RNA/genetics , Sequence Alignment , RNA Sequence Analysis , Nucleic Acid Sequence Homology
16.
Bioinformatics ; 24(2): 158-64, 2008 Jan 15.
Article in English | MEDLINE | ID: mdl-17895272

ABSTRACT

UNLABELLED: Unlike tRNAs and microRNAs, both classes of snoRNAs, which direct two distinct types of chemical modifications of uracil residues, have proved to be surprisingly difficult to find in genomic sequences. Most computational approaches so far have explicitly used the fact that snoRNAs predominantly target ribosomal RNAs and spliceosomal RNAs. The target is specified by a short stretch of sequence complementarity between the snoRNA and its target. This sequence complementarity to known targets crucially contributes to sensitivity and specificity of snoRNA gene finding algorithms. The discovery of 'orphan' snoRNAs, which either have no known target, or which target ordinary protein-coding mRNAs, however, raises the question of whether this class of 'housekeeping' non-coding RNAs is much more widespread and might have a diverse set of regulatory functions. In order to approach this question, we present here a combination of RNA secondary structure prediction and machine learning that is designed to recognize the two major classes of snoRNAs, box C/D and box H/ACA snoRNAs, among ncRNA candidate sequences. The snoReport approach deliberately avoids any usage of target information. We find that the combination of the conserved sequence boxes and secondary structure constraints as a pre-filter with SVM classifiers based on a small set of structural descriptors is sufficient for a reliable identification of snoRNAs. Tests of snoReport on data from several recent experimental surveys show that the approach is feasible; the application to a dataset from a large-scale comparative genomics survey for ncRNAs suggests that there are likely hundreds of previously undescribed 'orphan' snoRNAs still hidden in the human genome. AVAILABILITY: The snoReport software is implemented in ANSI C. The source code is available under the GNU Public License at http://www.bioinf.uni-leipzig.de/Software/snoReport.


Subjects
Algorithms , Artificial Intelligence , Gene Targeting/methods , Small Nucleolar RNA/genetics , RNA Sequence Analysis/methods , Software , Base Sequence , Conserved Sequence/genetics , Molecular Sequence Data , Automated Pattern Recognition
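
Entry 16 mentions a pre-filter built on conserved sequence boxes. As a rough illustration of what a box-motif check for box C/D candidates could look like, the sketch below searches for a C box near the 5' end and a D box near the 3' end. It is not snoReport's actual filter (which is written in ANSI C and also applies secondary structure constraints before the SVM step); the consensus motifs RUGAUGA and CUGA come from the general snoRNA literature rather than from the abstract, and the function name, window size and toy sequence are hypothetical.

```python
import re

# Consensus motifs assumed from the general snoRNA literature, not from the
# abstract: C box RUGAUGA (R = A or G) and D box CUGA.
C_BOX = re.compile(r"[AG]UGAUGA")
D_BOX = re.compile(r"CUGA")


def has_cd_boxes(seq: str, window: int = 20) -> bool:
    """Return True if a C box lies within `window` nt of the 5' end
    and a D box lies within `window` nt of the 3' end."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    five_prime = rna[:window]
    three_prime = rna[-window:]
    return bool(C_BOX.search(five_prime)) and bool(D_BOX.search(three_prime))


# Hypothetical toy sequence, constructed only to exercise the function.
toy = "GCAUGAUGAC" + "N" * 40 + "ACUGAGC"
```

A motif check of this kind is cheap and permissive, which is why it suits a pre-filter role: it discards obvious non-candidates and leaves the final decision to a classifier.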
17.
Genomics ; 92(1): 65-74, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18511233

ABSTRACT

Genome-wide multiple sequence alignments (MSAs) are a necessary prerequisite for an increasingly diverse collection of comparative genomic approaches. Here we present a versatile method that generates high-quality MSAs for non-protein-coding sequences. The NcDNAlign pipeline combines pairwise BLAST alignments to create initial MSAs, which are then locally improved and trimmed. The program is optimized for speed and hence is particularly well-suited to pilot studies. We demonstrate the practical use of NcDNAlign in three case studies: the search for ncRNAs in gammaproteobacteria and the analysis of conserved noncoding DNA in nematodes and teleost fish, in the latter case focusing on the fate of duplicated ultra-conserved regions. Compared to the currently widely used genome-wide alignment program TBA, our program results in a 20- to 30-fold reduction of CPU time necessary to generate gammaproteobacterial alignments. A showcase application of bacterial ncRNA prediction based on alignments of both algorithms results in similar sensitivity, false discovery rates, and up to 100 putatively novel ncRNA structures. Similar findings hold for our application of NcDNAlign to the identification of ultra-conserved regions in nematodes and teleosts. Both approaches yield conserved sequences of unknown function, result in novel evolutionary insights into conservation patterns among these genomes, and manifest the benefits of an efficient and reliable genome-wide alignment package. The software is available under the GNU Public License at http://www.bioinf.uni-leipzig.de/Software/NcDNAlign/.


Subjects
Genome , Sequence Alignment/methods , DNA Sequence Analysis/methods , RNA Sequence Analysis/methods , Software , Untranslated Regions/genetics , Animals , Base Sequence , Conserved Sequence , Fishes/genetics , Molecular Sequence Data , Nematoda/genetics , Bacterial RNA/chemistry , Bacterial RNA/genetics , Sensitivity and Specificity
18.
BMC Genomics ; 8: 406, 2007 Nov 08.
Article in English | MEDLINE | ID: mdl-17996037

ABSTRACT

BACKGROUND: Recent experimental and computational studies have provided overwhelming evidence for a plethora of diverse transcripts that are unrelated to protein-coding genes. One subclass consists of those RNAs that require distinctive secondary structure motifs to exert their biological function and hence exhibit distinctive patterns of sequence conservation characteristic for positive selection on RNA secondary structure. The deep-sequencing of 12 drosophilid species coordinated by the NHGRI provides an ideal data set for comparative computational approaches to determine those genomic loci that code for evolutionarily conserved RNA motifs. This class of loci includes the majority of the known small ncRNAs as well as structured RNA motifs in mRNAs. We report here on a genome-wide survey using RNAz. RESULTS: We obtain 16 000 high-quality predictions among which we recover the majority of the known ncRNAs. Taking a pessimistically estimated false discovery rate of 40% into account, this implies that at least some ten thousand loci in the Drosophila genome show the hallmarks of stabilizing selection acting on RNA structure, and hence are most likely functional at the RNA level. A subset of RNAz predictions overlapping with TRF1 and BRF binding sites [Isogai et al., EMBO J. 26: 79-89 (2007)], which are plausible candidates of Pol III transcripts, have been studied in more detail. Among these sequences we identify several "clusters" of ncRNA candidates with striking structural similarities. CONCLUSION: The statistical evaluation of the RNAz predictions in comparison with a similar analysis of vertebrate genomes [Washietl et al., Nat. Biotech. 23: 1383-1390 (2005)] shows that qualitatively similar fractions of structured RNAs are found in introns, UTRs, and intergenic regions. The intergenic RNA structures, however, are concentrated much more closely around known protein-coding loci, suggesting that flies have a significantly smaller complement of independent structured ncRNAs compared to mammals.


Subjects
Drosophila melanogaster/genetics , RNA/genetics , Animals , Humans , Nucleic Acid Conformation , Phylogeny , RNA/chemistry , Sensitivity and Specificity
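
The "some ten thousand loci" estimate in entry 18 follows from simple arithmetic on the two figures given in the abstract: 16 000 predictions and a false discovery rate of 40%. A minimal sketch of that calculation (the function name is a hypothetical helper):

```python
def expected_true_predictions(n_predictions: int, false_discovery_rate: float) -> int:
    """Return the expected number of correct predictions given an FDR."""
    # FDR is the expected fraction of predictions that are false,
    # so the remaining fraction (1 - FDR) is expected to be true.
    return round(n_predictions * (1 - false_discovery_rate))


# Figures taken from the abstract: 16 000 predictions, FDR of 40%.
true_loci = expected_true_predictions(16_000, 0.40)
```

With these inputs the result is 9600, in line with the abstract's rounded figure of about ten thousand.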
19.
Bioinformatics ; 22(14): e197-202, 2006 Jul 15.
Article in English | MEDLINE | ID: mdl-16873472

ABSTRACT

UNLABELLED: Recently, genome-wide surveys for non-coding RNAs have provided evidence for tens of thousands of previously undescribed evolutionarily conserved RNAs with distinctive secondary structures. The annotation of these putative ncRNAs, however, remains a difficult problem. Here we describe an SVM-based approach that, in conjunction with a non-stringent filter for consensus secondary structures, is capable of efficiently recognizing microRNA precursors in multiple sequence alignments. The software was applied to recent genome-wide RNAz surveys of mammals, urochordates, and nematodes. AVAILABILITY: The program RNAmicro is available as source code and can be downloaded from http://www.bioinf.uni-leipzig/Software/RNAmicro.


Subjects
Chromosome Mapping/methods , Genetic Databases , MicroRNAs/chemistry , RNA Precursors/chemistry , Sequence Alignment/methods , RNA Sequence Analysis/methods , Algorithms , Animals , Artificial Intelligence , Base Sequence , Conserved Sequence , Genomics/methods , Humans , Information Storage and Retrieval/methods , MicroRNAs/genetics , Molecular Sequence Data , Automated Pattern Recognition/methods , RNA Precursors/genetics , Nucleic Acid Sequence Homology , Software
20.
Noncoding RNA ; 3(1), 2017 Jan 05.
Article in English | MEDLINE | ID: mdl-29657275

ABSTRACT

The U3 small nucleolar RNA (snoRNA), which is ubiquitously present in Eukarya, is an essential player in the initial steps of ribosomal RNA biogenesis. It is exceptional among the small nucleolar RNAs in its size, the presence of multiple conserved sequence boxes, a highly conserved secondary structure core, its biogenesis as an independent gene transcribed by polymerase III, and its involvement in pre-rRNA cleavage rather than chemical modification. Fungal U3 snoRNAs share many features with their sisters from other eukaryotic kingdoms but differ from them in particular in their 5' regions, which in fungi have a distinctive consensus structure and often harbour introns. Here we report on a comprehensive homology search and detailed analysis of the evolution of sequence and secondary structure features covering the entire kingdom Fungi.
