Results 1 - 5 of 5
1.
Bioinformatics ; 29(5): 654-5, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23335014

ABSTRACT

MOTIVATION: Comparative modeling of RNA is known to be important for making accurate secondary structure predictions. RNA structure prediction tools such as PPfold or RNAalifold use an aligned set of sequences in predictions. Obtaining a multiple alignment from a set of sequences is itself a challenging problem, and the quality of the alignment can affect the quality of a prediction. By implementing RNA secondary structure prediction in a statistical alignment framework, and predicting structures from multiple alignment samples instead of a single fixed alignment, it may be possible to improve predictions. RESULTS: We have extended the program StatAlign to make use of RNA-specific features, which include RNA secondary structure prediction from multiple alignments using either a thermodynamic approach (RNAalifold) or a stochastic context-free grammar (SCFG) approach (PPfold). We also provide the user with scores relating to the quality of a secondary structure prediction, such as information entropy values for the combined space of secondary structures and sampled alignments, and a reliability score that predicts the expected number of correctly predicted base pairs. Finally, we have created RNA secondary structure visualization plugins and automated the process of setting up Markov chain Monte Carlo runs for RNA alignments in StatAlign. AVAILABILITY AND IMPLEMENTATION: The software is available from http://statalign.github.com/statalign/.


Subject(s)
RNA/chemistry; Sequence Alignment/methods; Sequence Analysis, RNA; Software; Algorithms; Base Pairing; Bayes Theorem; Markov Chains; Nucleic Acid Conformation; Thermodynamics
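
The abstract above mentions a reliability score that predicts the expected number of correctly predicted base pairs. The record does not say how StatAlign or PPfold compute it, so the following is only a minimal sketch of one natural reading: if posterior base-pair probabilities are available, the expected number of correct pairs in a predicted structure is the sum of the probabilities of its pairs (by linearity of expectation). The function names and the `pair_probs` dictionary layout are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def parse_dot_bracket(structure: str) -> List[Pair]:
    """Return the sorted (i, j) base pairs encoded by a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Pair] = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


def expected_correct_pairs(structure: str, pair_probs: Dict[Pair, float]) -> float:
    """Sum the posterior probabilities of the predicted pairs.

    Assumption: pair_probs maps (i, j) with i < j to a posterior probability;
    pairs missing from the dictionary are treated as probability 0.
    """
    return sum(pair_probs.get(pair, 0.0) for pair in parse_dot_bracket(structure))


if __name__ == "__main__":
    # Toy posterior probabilities, invented for the example.
    probs = {(0, 5): 0.9, (1, 4): 0.5, (2, 3): 0.2}
    print(expected_correct_pairs("((..))", probs))
```

Dividing the result by the number of predicted pairs would give a per-pair average, which may be easier to compare across sequences of different lengths.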
2.
Bioinformatics ; 29(6): 704-10, 2013 Mar 15.
Article in English | MEDLINE | ID: mdl-23396120

ABSTRACT

MOTIVATION: Many computational methods for RNA secondary structure prediction, and, in particular, for the prediction of a consensus structure of an alignment of RNA sequences, have been developed. Most methods, however, ignore biophysical factors, such as the kinetics of RNA folding; no current implementation considers both evolutionary information and folding kinetics, thus losing information that, when considered, might lead to better predictions. RESULTS: We present an iterative algorithm, Oxfold, in the framework of stochastic context-free grammars, that emulates the kinetics of RNA folding in a simplified way, in combination with a molecular evolution model. This method improves considerably on existing grammatical models that do not consider folding kinetics. Additionally, the model compares favourably to non-kinetic thermodynamic models.


Subject(s)
Algorithms; RNA Folding; RNA/chemistry; Bayes Theorem; Evolution, Molecular; Kinetics; Models, Molecular; Sequence Alignment; Sequence Analysis, RNA/methods; Stochastic Processes; Thermodynamics
3.
BMC Bioinformatics ; 14 Suppl 2: S22, 2013.
Article in English | MEDLINE | ID: mdl-23368905

ABSTRACT

Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/.


Subject(s)
Algorithms; Models, Theoretical; Nucleic Acid Conformation; RNA/chemistry; Base Sequence; Computational Biology/methods; Entropy; Probability; Software
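
To make the quantity in the abstract above concrete, here is a minimal sketch of the Shannon entropy of a probability distribution over secondary structures. It is a brute-force illustration over an explicitly listed toy distribution, not the efficient phylo-SCFG computation the paper describes (which avoids enumerating structures). The function name and the toy probabilities are assumptions made for the example.

```python
import math
from typing import Dict


def shannon_entropy(distribution: Dict[str, float], base: float = 2.0) -> float:
    """Entropy of a distribution given as {structure: weight}.

    Weights are normalised first, so unnormalised scores are accepted.
    Zero-weight structures contribute nothing to the sum.
    """
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("distribution must have positive total weight")
    entropy = 0.0
    for weight in distribution.values():
        if weight < 0:
            raise ValueError("weights must be non-negative")
        if weight > 0:
            p = weight / total
            entropy -= p * math.log(p, base)
    return entropy


if __name__ == "__main__":
    # Toy distribution over three dot-bracket structures (invented values).
    toy = {"((..))": 0.5, "(....)": 0.25, "......": 0.25}
    print(shannon_entropy(toy))
```

For the toy distribution the entropy is about 1.5 bits. A distribution concentrated on one structure has entropy 0, while a flat distribution over many structures has high entropy, which is one way a single reported structure can be unreliable.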
4.
BMC Bioinformatics ; 14: 149, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23634662

ABSTRACT

BACKGROUND: With the advancement of next-generation sequencing and transcriptomics technologies, regulatory effects involving RNA, in particular RNA structural changes, are being detected. These results often rely on RNA secondary structure predictions. However, current approaches to RNA secondary structure modelling produce predictions with a high variance in predictive accuracy, and we have little quantifiable knowledge about the reasons for these variances. RESULTS: In this paper we explore a number of factors that can contribute to poor RNA secondary structure prediction quality. We establish a quantified relationship between alignment quality and loss of accuracy. Furthermore, we define two new measures to quantify uncertainty in alignment-based structure predictions. One of the measures improves on the "reliability score" reported by PPfold, and considers alignment uncertainty as well as base-pair probabilities. The other measure considers the information entropy for SCFGs over a space of input alignments. CONCLUSIONS: Our predictive accuracy improves on the PPfold reliability score. We can successfully characterize many of the underlying reasons for and variances in poor prediction. However, there is still variability unaccounted for, which we therefore suggest comes from the RNA secondary structure predictive model itself.


Subject(s)
RNA/chemistry; Sequence Alignment/methods; Sequence Analysis, RNA; Algorithms; Base Pairing; Evolution, Molecular; Nucleic Acid Conformation; Probability; Reproducibility of Results; Sequence Alignment/standards
5.
BMC Bioinformatics ; 13: 260, 2012 Oct 09.
Article in English | MEDLINE | ID: mdl-23043260

ABSTRACT

BACKGROUND: RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three-dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. RESULTS: In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods on several data sets totalling 769 real and predicted single-structure targets, and on 292 two-structure targets. It performed as well as or better than all existing methods at finding sequences that folded in silico into the target structure, without the heavy bias towards CG base pairs that was observed for all other top-performing methods. On the two-structure targets it also performed well, generating a perfect design for about 80% of the targets. CONCLUSIONS: Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two-structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein.


Subject(s)
Algorithms; Computational Biology/methods; RNA Folding/genetics; RNA/chemistry; RNA/genetics; Software; Base Pairing; Base Sequence; Computer Simulation; Riboswitch
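
Any search-based inverse folding method needs a way to score a candidate sequence against the target structure. The abstract above does not state which objective Frnakenstein uses, so the sketch below is not its implementation; it only illustrates two generic quantities under stated assumptions: base-pair distance between a predicted fold and the target (0 means a perfect design), and the fraction of target pairs occupied by G-C, which relates to the CG bias the abstract mentions. All function names are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]


def base_pairs(structure: str) -> Set[Pair]:
    """Return the set of (i, j) pairs encoded by a dot-bracket string."""
    stack = []
    pairs: Set[Pair] = set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {pos}")
            pairs.add((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def base_pair_distance(predicted: str, target: str) -> int:
    """Number of pairs present in exactly one of the two structures."""
    if len(predicted) != len(target):
        raise ValueError("structures must have the same length")
    return len(base_pairs(predicted) ^ base_pairs(target))


def gc_pair_fraction(sequence: str, structure: str) -> float:
    """Fraction of paired positions in the structure that form G-C pairs."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    pairs = base_pairs(structure)
    if not pairs:
        return 0.0
    gc = sum(1 for i, j in pairs if {sequence[i], sequence[j]} == {"G", "C"})
    return gc / len(pairs)


if __name__ == "__main__":
    print(base_pair_distance("((..))", "(....)"))
    print(gc_pair_fraction("GAAAUC", "((..))"))
```

In a genetic algorithm, a distance like this could serve as the fitness to minimise, with the predicted structure supplied by whatever folding program is in use; for multi-target design, the distances to each target could be combined, though how to weight them is a design choice the abstract does not specify.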