Results 1 - 20 of 44
1.
Nucleic Acids Res ; 52(W1): W348-W353, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38587206

ABSTRACT

Alignment of 3D molecular structures involves overlaying their sets of atoms in space in such a way as to minimize the distance between the corresponding atoms. The purpose of this procedure is usually to analyze and assess structural similarity on a global (e.g. evaluating predicted 3D models and clustering structures) or a local level (e.g. searching for common substructures). Although the idea of alignment is simple, combinatorial algorithms that implement it require considerable computational resources, even when processing relatively small structures. In this paper, we introduce RNAhugs, a web server for custom and flexible alignment of 3D RNA structures. Using two efficient heuristics, GEOS and GENS, it finds the longest corresponding fragments within 3D structures, which may differ in size and are given in the PDB or PDBx/mmCIF formats, that align with user-specified accuracy (i.e. with an RMSD not exceeding a cutoff value given as an input parameter). A distinctive advantage of the system lies in its ability to process multi-model files and compare the results of 1-25 alignments in a single task. RNAhugs has an intuitive interface and is publicly available at https://rnahugs.cs.put.poznan.pl/.


Subject(s)
Internet , Nucleic Acid Conformation , RNA , Sequence Alignment , Software , RNA/chemistry , Sequence Alignment/methods , Algorithms , Models, Molecular
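
The acceptance criterion mentioned in the abstract above (an RMSD not exceeding a user-given cutoff) can be illustrated with a minimal Python sketch. It is not the GEOS or GENS heuristic: it assumes the two structures are already superimposed and that their atoms are listed in corresponding order. The function names `rmsd` and `within_cutoff` and the sample coordinates are hypothetical and introduced here only for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between corresponding 3D points.

    Assumption: both structures are already superimposed and
    coords_a[i] corresponds to coords_b[i].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between one pair of corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def within_cutoff(coords_a, coords_b, cutoff):
    """True if the fragment pair aligns with the requested accuracy."""
    return rmsd(coords_a, coords_b) <= cutoff

# Hypothetical two-atom fragments; the model is shifted by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, model))                # 1.0
print(within_cutoff(reference, model, 1.5))  # True
```

A real alignment tool would first search for the atom correspondence and the optimal superposition; the sketch only shows the final accuracy check.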
2.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37096592

ABSTRACT

Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL) and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.


Subject(s)
Machine Learning , RNA , Humans , RNA/genetics , RNA/chemistry , Algorithms , Benchmarking
3.
PLoS Comput Biol ; 20(6): e1011959, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38900780

ABSTRACT

Unlike proteins, RNAs deposited in the Protein Data Bank do not contain topological knots. Although the first trefoil knot and some lasso-type conformations have recently been found in experimental RNA structures, these are still exceptional cases. Meanwhile, algorithms predicting 3D RNA models form knotted structures fairly often. Interestingly, machine learning-based predictors seem to be more prone to generate knotted RNA folds than traditional methods. A similar situation is observed for the entanglements of structural elements. In this paper, we analyze all models submitted to the CASP15 competition in the 3D RNA structure prediction category. We show what types of topological knots and structure element entanglements appear in the submitted models and highlight what methods are behind the generation of such conformations. We also study the structural aspect of susceptibility to entanglement. We suggest that predictors evaluate their RNA models to avoid publishing structures with artifacts, such as unusual entanglements, that result from hallucinations of predictive algorithms.


Subject(s)
Algorithms , Artifacts , Computational Biology , Models, Molecular , Nucleic Acid Conformation , RNA , RNA/chemistry , Computational Biology/methods , Machine Learning , Databases, Protein
4.
Nucleic Acids Res ; 51(W1): W607-W612, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37158242

ABSTRACT

Quadruplexes are four-stranded DNA/RNA motifs of high functional significance that fold into complex shapes. They are widely recognized as important regulators of genomic processes and are among the most frequently investigated potential drug targets. Despite interest in quadruplexes, few studies focus on automatic tools that help to understand the many unique features of their 3D folds. In this paper, we introduce WebTetrado, a web server for analyzing 3D structures of quadruplexes. It has a user-friendly interface and offers many advanced features, including automatic identification, annotation, classification, and visualization of the motif. The program applies to the experimental or in silico generated 3D models provided in the PDB and PDBx/mmCIF files. It supports canonical G-quadruplexes as well as non-G-based quartets. It can process unimolecular, bimolecular, and tetramolecular quadruplexes. WebTetrado is implemented as a publicly available web server with an intuitive interface and can be freely accessed at https://webtetrado.cs.put.poznan.pl/.


Subject(s)
Computational Biology , Computer Simulation , Data Visualization , G-Quadruplexes , Software , Nucleic Acid Conformation , Nucleotide Motifs , Internet , Computational Biology/instrumentation , Computational Biology/methods
5.
Nucleic Acids Res ; 51(18): 9522-9532, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37702120

ABSTRACT

The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement to build off the success of AlphaFold and predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches. We believe that there are challenges preventing the successful development of deep learning-based methods like AlphaFold for RNA in the short term. Broadly speaking, the challenges are the limited number of structures and alignments making data-hungry deep learning methods unlikely to succeed. Additionally, there are several issues with the existing structure and sequence data, as they are often of insufficient quality, highly biased and missing key information. Here, we discuss these challenges in detail and suggest some steps to remedy the situation. We believe that it is possible to create an accurate RNA structure prediction method, but it will require solving several data quality and volume issues, usage of data beyond simple sequence alignments, or the development of new less data-hungry machine learning methods.

6.
RNA ; 28(2): 250-262, 2022 02.
Article in English | MEDLINE | ID: mdl-34819324

ABSTRACT

In silico prediction is a well-established approach to derive a general shape of an RNA molecule based on its sequence or secondary structure. This paper reports an analysis of the stereochemical quality of the RNA three-dimensional models predicted using dedicated computer programs. The stereochemistry of 1052 RNA 3D structures, including 1030 models predicted by fully automated and human-guided approaches within 22 RNA-Puzzles challenges and reference structures, is analyzed. The evaluation is based on standards of RNA stereochemistry that the Protein Data Bank requires from deposited experimental structures. Deviations from standard bond lengths and angles, planarity, or chirality are quantified. A reduction in the number of such deviations should help in the improvement of RNA 3D structure modeling approaches.


Subject(s)
Molecular Dynamics Simulation/standards , RNA/chemistry , Animals , Humans , Nucleic Acid Conformation
7.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-37166444

ABSTRACT

MOTIVATION: Tertiary structure alignment is one of the main challenges in the computer-aided comparative study of molecular structures. Its aim is to optimally overlay the 3D shapes of two or more molecules in space to find the correspondence between their nucleotides. Alignment is the starting point for most algorithms that assess structural similarity or find common substructures. Thus, it has applications in solving a variety of bioinformatics problems, e.g. in the search for structural patterns, structure clustering, identifying structural redundancy, and evaluating the prediction accuracy of 3D models. To date, several tools have been developed to align 3D structures of RNA. However, most of them are not applicable to arbitrarily large structures and do not allow users to parameterize the optimization algorithm. RESULTS: We present two customizable heuristics for flexible alignment of 3D RNA structures, geometric search (GEOS), and genetic algorithm (GENS). They work in sequence-dependent/independent mode and find the suboptimal alignment of expected quality (below a predefined RMSD threshold). We compare their performance with those of state-of-the-art methods for aligning RNA structures. We show the results of quantitative and qualitative tests run for all of these algorithms on benchmark sets of RNA structures. AVAILABILITY AND IMPLEMENTATION: Source codes for both heuristics are hosted at https://github.com/RNApolis/rnahugs.


Subject(s)
RNA , Software , RNA/chemistry , Heuristics , Algorithms , Nucleic Acid Conformation
8.
Nucleic Acids Res ; 50(D1): D253-D258, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34986600

ABSTRACT

ONQUADRO is an advanced database system that supports the study of the structures of canonical and non-canonical quadruplexes. It combines a relational database that collects comprehensive information on tetrads, quadruplexes, and G4-helices; programs to compute structure parameters and visualise the data; scripts for statistical analysis; automatic updates and newsletter modules; and a web application that provides a user interface. The database is a self-updating resource, with new information arriving once a week. The preliminary data are downloaded from the Protein Data Bank, processed, annotated, and completed. As of August 2021, ONQUADRO contains 1,661 tetrads, 518 quadruplexes, and 30 G4-helices found in 467 experimentally determined 3D structures of nucleic acids. Users can view and download their description: sequence, secondary structure (dot-bracket, classical diagram, arc diagram), tertiary structure (ball-and-stick, surface or vdw-ball model, layer diagram), planarity, twist, rise, chi angle (value and type), loop characteristics, strand directionality, metal ions, ONZ, and Webba da Silva classification (the latter by loop topology and tetrad combination), origin structure ID, assembly ID, experimental method, and molecule type. The database is freely available at https://onquadro.cs.put.poznan.pl/. It can be used on both desktop computers and mobile devices.


Subject(s)
DNA/chemistry , Databases, Nucleic Acid , G-Quadruplexes , Nucleic Acid Conformation , RNA/chemistry , User-Computer Interface , Animals , Base Sequence , Computer Graphics , DNA/genetics , DNA/metabolism , Humans , Internet , RNA/genetics , RNA/metabolism
9.
Nucleic Acids Res ; 50(W1): W663-W669, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35349710

ABSTRACT

Advances in experimental and computational techniques enable the exploration of large and complex RNA 3D structures. These, in turn, reveal previously unstudied properties and motifs not characteristic for small molecules with simple architectures. Examples include entanglements of structural elements in RNA molecules and knot-like folds discovered, among others, in the genomes of RNA viruses. Recently, we presented the first classification of entanglements, determined by their topology and the type of entangled structural elements. Here, we introduce RNAspider - a web server to automatically identify, classify, and visualize primary and higher-order entanglements in RNA tertiary structures. The program applies to evaluate RNA 3D models obtained experimentally or by computational prediction. It supports the analysis of uncommon topologies in the pseudoknotted RNA structures. RNAspider is implemented as a publicly available tool with a user-friendly interface and can be freely accessed at https://rnaspider.cs.put.poznan.pl/.


Subject(s)
RNA , Software , RNA/chemistry , Nucleic Acid Conformation , Sequence Analysis, RNA
10.
Postepy Biochem ; 70(2): 128-138, 2024 07 01.
Article in Polish | MEDLINE | ID: mdl-39083468

ABSTRACT

Structural biology is focused on understanding the architecture of biomolecules, such as proteins and nucleic acids. Deciphering the structure helps to understand their function in the cell at a very precise, molecular level. This makes it possible to not only determine the basis of diseases but also to propose therapeutic strategies and tools. Such a strong motivation for the development of structural biology has led to the development of a number of methods, which enable determination of the structures of the molecules of life. The continuous progress has been enabled by the integration of biology, chemistry, physics, and computer science, making structural biology extremely interdisciplinary. In its 35-year history, the Institute of Bioorganic Chemistry of the Polish Academy of Sciences in Poznan has become one of the key Polish institutions conducting research in the field of structural biology. On one hand, the research has brought international recognition, and on the other hand, it has forced the implementation and development of cutting-edge methods. This review discusses the methods used in structural biology at the Institute.


Subject(s)
Proteins , Poland , Proteins/chemistry , Molecular Biology , Nucleic Acids/chemistry , Humans
11.
Proteins ; 91(12): 1790-1799, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37615316

ABSTRACT

As CASP15 participants, in the new category of 3D RNA structure prediction, we applied expert modeling with the support of our proprietary system RNAComposer. Although RNAComposer is primarily known as an automated web server, its features allow it to be used interactively, for example, for homology-based modeling or assembling models from user-provided structural elements. In the paper, we present various scenarios of applying the system to predict the 3D RNA structures that we employed. Their combination with expert input, comparative analysis of models, and routines to select representative resultant structures form a ready-for-reuse workflow. With selected examples, we demonstrate its application for the in silico modeling of natural and synthetic RNA molecules targeted in CASP15.


Subject(s)
RNA , Software , Humans , RNA/chemistry , Nucleic Acid Conformation , Models, Molecular , Computer Simulation
12.
Proteins ; 91(12): 1550-1557, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37306011

ABSTRACT

Prediction categories in the Critical Assessment of Structure Prediction (CASP) experiments change with the need to address specific problems in structure modeling. In CASP15, four new prediction categories were introduced: RNA structure, ligand-protein complexes, accuracy of oligomeric structures and their interfaces, and ensembles of alternative conformations. This paper lists technical specifications for these categories and describes their integration in the CASP data management system.


Subject(s)
Computational Biology , Proteins , Protein Conformation , Proteins/chemistry , Models, Molecular , Ligands
13.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32898859

ABSTRACT

Interest in quadruplexes (G4s) grows with the number of identified G4 structures and knowledge about their biomedical potential. These unique motifs form in many organisms, including humans, where their appearance correlates with various diseases. Scientists store and analyze quadruplexes using recently developed bioinformatic tools, many of them focused on DNA structures. With an expanding collection of G4 RNAs, we check how existing tools deal with them. We review all available bioinformatics resources dedicated to quadruplexes and examine their usefulness in G4 RNA analysis. We distinguish the following subsets of resources: databases, tools to predict putative quadruplex sequences, tools to predict secondary structure with quadruplexes and tools to analyze and visualize quadruplex structures. We share the results obtained from processing specially created RNA datasets with these tools. Contact: mszachniuk@cs.put.poznan.pl Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , G-Quadruplexes , RNA/chemistry , Algorithms , Base Sequence , Computer Simulation , DNA/chemistry , DNA/genetics , Humans , Models, Molecular , RNA/genetics , Reproducibility of Results
14.
Bioinformatics ; 38(15): 3835-3836, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35703937

ABSTRACT

MOTIVATION: Quadruplexes are specific 3D structures found in nucleic acids. Due to the exceptional properties of these motifs, their exploration with the general-purpose bioinformatics methods can be problematic or insufficient. The same applies to visualizing their structure. A hand-drawn layer diagram is the most common way to represent the quadruplex anatomy. No molecular visualization software generates such a structural model based on atomic coordinates. RESULTS: DrawTetrado is an open-source Python program for automated visualization targeting the structures of quadruplexes and G4-helices. It generates static layer diagrams that represent structural data in a pseudo-3D perspective. The possibility to set color schemes, nucleotide labels, inter-element distances or angle of view allows for easy customization of the output drawing. AVAILABILITY AND IMPLEMENTATION: The program is available under the MIT license at https://github.com/RNApolis/drawtetrado.


Subject(s)
Nucleic Acids , Software , Computational Biology , Protein Structure, Secondary , Nucleotides
15.
Bioinformatics ; 38(14): 3668-3670, 2022 07 11.
Article in English | MEDLINE | ID: mdl-35674373

ABSTRACT

MOTIVATION: The development of algorithms dedicated to RNA three-dimensional (3D) structures contributes to the demand for training, testing and benchmarking data. A reliable source of such data derived from computational prediction is the RNA-Puzzles repository. In contrast, the largest resource with experimentally determined structures is the Protein Data Bank. However, files in this archive often contain other molecular data in addition to the RNA structure itself, which, to be used by RNA processing algorithms, should be removed. RESULTS: RNAsolo is a self-updating database dedicated to RNA bioinformatics. It systematically collects experimentally determined RNA 3D structures stored in the PDB, cleans them from non-RNA chains, and groups them into equivalence classes. It allows users to download various subsets of data (clustered by resolution, source, data format, etc.) for further processing and analysis with a single click. AVAILABILITY AND IMPLEMENTATION: The repository is publicly available at https://rnasolo.cs.put.poznan.pl.


Subject(s)
RNA , Software , RNA/chemistry , Nucleic Acid Conformation , Computational Biology , Databases, Protein
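
The cleaning step described in the abstract above (removing non-RNA content from PDB files) can be sketched as follows. This is a simplified illustration under stated assumptions, not the RNAsolo pipeline: it keeps only `ATOM` records whose residue name (columns 18-20 of the PDB format) is one of the four standard ribonucleotides, so modified residues, ligands, water, and proteins are all dropped. The function name `keep_rna_atoms` and the sample records are hypothetical.

```python
# Standard ribonucleotide residue names as written in PDB files.
RNA_RESIDUES = {"A", "C", "G", "U"}

def keep_rna_atoms(pdb_lines):
    """Return only ATOM records that belong to standard RNA residues.

    Assumption: fixed-column PDB format, where the residue name
    occupies columns 18-20 (Python slice 17:20).
    """
    kept = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[17:20].strip() in RNA_RESIDUES:
            kept.append(line)
    return kept

# Hypothetical records: one guanosine atom, one alanine atom, one water.
sample = [
    "ATOM      1  P     G A   1      10.000  10.000  10.000  1.00  0.00           P",
    "ATOM      2  CA  ALA B   1      11.000  11.000  11.000  1.00  0.00           C",
    "HETATM    3  O   HOH A 101      12.000  12.000  12.000  1.00  0.00           O",
]
rna_only = keep_rna_atoms(sample)
print(len(rna_only))  # 1
```

A production tool would more likely filter at the chain level and handle modified nucleotides and the PDBx/mmCIF format, which this sketch deliberately ignores.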
16.
Bioinformatics ; 38(17): 4200-4205, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35809063

ABSTRACT

MOTIVATION: Knowledge of the 3D structure of RNA supports discovering its functions and is crucial for designing drugs and modern therapeutic solutions. Thus, much attention is devoted to experimental determination and computational prediction targeting the global fold of RNA and its local substructures. The latter include multi-branched loops, functionally significant elements that highly affect the spatial shape of the entire molecule. Unfortunately, their computational modeling constitutes a weak point of structural bioinformatics. A remedy for this is in collecting these motifs and analyzing their features. RESULTS: RNAloops is a self-updating database that stores multi-branched loops identified in the PDB-deposited RNA structures. A description of each loop includes angular data: planar and Euler angles computed between pairs of adjacent helices to allow studying their mutual arrangement in space. The system enables search and analysis of multiloops, presents their structure details numerically and visually, and computes data statistics. AVAILABILITY AND IMPLEMENTATION: RNAloops is freely accessible at https://rnaloops.cs.put.poznan.pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA , Software , RNA/chemistry , Nucleic Acid Conformation , Sequence Analysis, RNA , Databases, Factual
17.
Nucleic Acids Res ; 49(17): 9625-9632, 2021 09 27.
Article in English | MEDLINE | ID: mdl-34432024

ABSTRACT

Computational methods to predict RNA 3D structure have more and more practical applications in molecular biology and medicine. Therefore, it is crucial to intensify efforts to improve the accuracy and quality of predicted three-dimensional structures. A significant role in this is played by the RNA-Puzzles initiative that collects, evaluates, and shares RNAs built computationally within currently nearly 30 challenges. RNA-Puzzles datasets, subjected to multi-criteria analysis, allow revealing the strengths and weaknesses of computer prediction methods. Here, we study the issue of entangled RNA fragments in the predicted RNA 3D structure models. By entanglement, we mean an arrangement of two structural elements such that one of them passes through the other. We propose the classification of entanglements driven by their topology and components. It distinguishes two general classes, interlaces and lassos, and subclasses characterized by element types (loops, dinucleotide steps, open single-stranded fragments) and puncture multiplicity. Our computational pipeline for entanglement detection, applied for 1,017 non-redundant models from RNA-Puzzles, has shown the frequency of different entanglements and allowed identifying 138 structures with intersected assemblies.


Subject(s)
Models, Molecular , RNA/chemistry , Computational Biology , Nucleic Acid Conformation
18.
RNA ; 26(8): 982-995, 2020 08.
Article in English | MEDLINE | ID: mdl-32371455

ABSTRACT

RNA-Puzzles is a collective endeavor dedicated to the advancement and improvement of RNA 3D structure prediction. With agreement from crystallographers, the RNA structures are predicted by various groups before the publication of the crystal structures. We now report the prediction of 3D structures for six RNA sequences: four nucleolytic ribozymes and two riboswitches. Systematic protocols for comparing models and crystal structures are described and analyzed. In these six puzzles, we discuss (i) the comparison between the automated web servers and human experts; (ii) the prediction of coaxial stacking; (iii) the prediction of structural details and ligand binding; (iv) the development of novel prediction methods; and (v) the potential improvements to be made. We show that correct prediction of coaxial stacking and tertiary contacts is essential for the prediction of RNA architecture, while ligand binding modes can only be predicted with low resolution and simultaneous prediction of RNA structure with accurate ligand binding still remains out of reach. All the predicted models are available for the future development of force field parameters and the improvement of comparison and assessment tools.


Subject(s)
Aptamers, Nucleotide/chemistry , RNA, Catalytic/chemistry , RNA/chemistry , Base Sequence , Ligands , Nucleic Acid Conformation , Riboswitch/genetics
19.
Bioinformatics ; 36(22-23): 5507-5513, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33367605

ABSTRACT

MOTIVATION: Viruses are the most abundant biological entities and constitute a large reservoir of genetic diversity. In recent years, knowledge about them has increased significantly as a result of dynamic development in life sciences and rapid technological progress. This knowledge is scattered across various data repositories, making a comprehensive analysis of viral data difficult. RESULTS: In response to the need for gathering a comprehensive knowledge of viruses and viral sequences, we developed Virxicon, a lexicon of all experimentally acquired sequences for RNA and DNA viruses. The ability to quickly obtain data for entire viral groups, searching sequences by levels of taxonomic hierarchy (according to the Baltimore classification and ICTV taxonomy), and tracking the distribution of viral data and its growth over time are unique features of our database compared to the other tools. AVAILABILITY AND IMPLEMENTATION: Virxicon is a publicly available resource, updated weekly. It has an intuitive web interface and can be freely accessed at http://virxicon.cs.put.poznan.pl/.

20.
Nucleic Acids Res ; 48(2): 576-588, 2020 01 24.
Article in English | MEDLINE | ID: mdl-31799609

ABSTRACT

Significant improvements have been made in the efficiency and accuracy of RNA 3D structure prediction methods during the succeeding challenges of RNA-Puzzles, a community-wide effort on the assessment of blind prediction of RNA tertiary structures. The RNA-Puzzles contest has shown, among others, that the development and validation of computational methods for RNA fold prediction strongly depend on the benchmark datasets and the structure comparison algorithms. Yet, there has been no systematic benchmark set or decoy structures available for the 3D structure prediction of RNA, hindering the standardization of comparative tests in the modeling of RNA structure. Furthermore, there has not been a unified set of tools that allows deep and complete RNA structure analysis, and at the same time, that is easy to use. Here, we present RNA-Puzzles toolkit, a computational resource including (i) decoy sets generated by different RNA 3D structure prediction methods (raw, for-evaluation and standardized datasets), (ii) 3D structure normalization, analysis, manipulation, visualization tools (RNA_format, RNA_normalizer, rna-tools) and (iii) 3D structure comparison metric tools (RNAQUA, MCQ4Structures). This resource provides a full list of computational tools as well as a standard RNA 3D structure prediction assessment protocol for the community.


Subject(s)
Computational Biology , Nucleic Acid Conformation , RNA/chemistry , Software , Algorithms , Benchmarking , RNA/genetics