Knotty: efficient and accurate prediction of complex RNA pseudoknot structures.
Bioinformatics
; 34(22): 3849-3856, 2018 11 15.
Article
en En
| MEDLINE
| ID: mdl-29868872
ABSTRACT
Motivation The computational prediction of RNA secondary structure by free energy minimization has become an important tool in RNA research. However in practice, energy minimization is mostly limited to pseudoknot-free structures or rather simple pseudoknots, not covering many biologically important structures such as kissing hairpins. Algorithms capable of predicting sufficiently complex pseudoknots (for sequences of length n) used to have extreme complexities, e.g. Pknots has O(n6) time and O(n4) space complexity. The algorithm CCJ dramatically improves the asymptotic run time for predicting complex pseudoknots (handling almost all relevant pseudoknots, while being slightly less general than Pknots), but this came at the cost of large constant factors in space and time, which strongly limited its practical application (â¼200 bases already require 256 GB space). Results:
We present a CCJ-type algorithm, Knotty, that handles the same comprehensive pseudoknot class of structures as CCJ with improved space complexity of Θ(n3+Z)-due to the applied technique of sparsification, the number of 'candidates', Z, appears to grow significantly slower than n4 on our benchmark set (which include pseudoknotted RNAs up to 400 nt). In terms of run time over this benchmark, Knotty clearly outperforms Pknots and the original CCJ implementation, CCJ 1.0; Knotty's space consumption fundamentally improves over CCJ 1.0, being on a par with the space-economic Pknots. By comparing to CCJ 2.0, our unsparsified Knotty variant, we demonstrate the isolated effect of sparsification. Moreover, Knotty employs the state-of-the-art energy model of 'HotKnots DP09', which results in superior prediction accuracy over Pknots. Availability and implementation Our software is available at https//github.com/HosnaJabbari/Knotty. Supplementary information Supplementary data are available at Bioinformatics online.
Texto completo:
1
Colección:
01-internacional
Banco de datos:
MEDLINE
Asunto principal:
Programas Informáticos
/
ARN
Tipo de estudio:
Prognostic_studies
/
Risk_factors_studies
Idioma:
En
Revista:
Bioinformatics
Asunto de la revista:
INFORMATICA MEDICA
Año:
2018
Tipo del documento:
Article
País de afiliación:
Estados Unidos