Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 26
Filtrar
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38436560

RESUMO

RNA is a complex macromolecule that plays central roles in the cell. While it is well known that its structure is directly related to its functions, understanding and predicting RNA structures is challenging. Assessing the real or predictive quality of a structure is also at stake with the complex 3D possible conformations of RNAs. Metrics have been developed to measure model quality while scoring functions aim at assigning quality to guide the discrimination of structures without a known and solved reference. Throughout the years, many metrics and scoring functions have been developed, and no unique assessment is used nowadays. Each developed assessment method has its specificity and might be complementary to understanding structure quality. Therefore, to evaluate RNA 3D structure predictions, it would be important to calculate different metrics and/or scoring functions. For this purpose, we developed RNAdvisor, a comprehensive automated software that integrates and enhances the accessibility of existing metrics and scoring functions. In this paper, we present our RNAdvisor tool, as well as state-of-the-art existing metrics, scoring functions and a set of benchmarks we conducted for evaluating them. Source code is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr.


Assuntos
Benchmarking , RNA , Modelos Estruturais , RNA/genética , Software
2.
Brief Bioinform ; 24(4)2023 07 20.
Artigo em Inglês | MEDLINE | ID: mdl-37337745

RESUMO

RNAs can interact with other molecules in their environment, such as ions, proteins or other RNAs, to form complexes with important biological roles. The prediction of the structure of these complexes is therefore an important issue and a difficult task. We are interested in RNA complexes composed of several (more than two) interacting RNAs. We show how available knowledge on the considered RNAs can help predict their secondary structure. We propose an interactive tool for the prediction of RNA complexes, called C-RCPRed, that considers user knowledge and probing data (which can be generated experimentally or artificially). C-RCPred is based on a multi-objective optimization algorithm. Through an extensive benchmarking procedure, which includes state-of-the-art methods, we show the efficiency of the multi-objective approach and the positive impact of considering user knowledge and probing data on the prediction results. C-RCPred is freely available as an open-source program and web server on the EvryRNA website (https://evryrna.ibisc.univ-evry.fr).


Assuntos
RNA , Software , RNA/química , Análise de Sequência de RNA/métodos , Algoritmos , Estrutura Secundária de Proteína , Conformação de Ácido Nucleico
3.
PLoS Comput Biol ; 20(9): e1012446, 2024 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-39264986

RESUMO

The involvement of non-coding RNAs in biological processes and diseases has made the exploration of their functions crucial. Most non-coding RNAs have yet to be studied, creating the need for methods that can rapidly classify large sets of non-coding RNAs into functional groups, or classes. In recent years, the success of deep learning in various domains led to its application to non-coding RNA classification. Multiple novel architectures have been developed, but these advancements are not covered by current literature reviews. We present an exhaustive comparison of the different methods proposed in the state-of-the-art and describe their associated datasets. Moreover, the literature lacks objective benchmarks. We perform experiments to fairly evaluate the performance of various tools for non-coding RNA classification on popular datasets. The robustness of methods to non-functional sequences and sequence boundary noise is explored. We also measure computation time and CO2 emissions. With regard to these results, we assess the relevance of the different architectural choices and provide recommendations to consider in future methods.


Assuntos
Benchmarking , Biologia Computacional , Aprendizado Profundo , RNA não Traduzido , Benchmarking/métodos , Biologia Computacional/métodos , RNA não Traduzido/genética , RNA não Traduzido/classificação , Humanos , Algoritmos
4.
Nucleic Acids Res ; 51(W1): W281-W288, 2023 07 05.
Artigo em Inglês | MEDLINE | ID: mdl-37158254

RESUMO

Recent advances have shown that some biologically active non-coding RNAs (ncRNAs) are actually translated into polypeptides that have a physiological function as well. This paradigm shift requires adapted computational methods to predict this new class of 'bifunctional RNAs'. Previously, we developed IRSOM, an open-source algorithm to classify non-coding and coding RNAs. Here, we use the binary statistical model of IRSOM as a ternary classifier, called IRSOM2, to identify bifunctional RNAs as a rejection of the two other classes. We present its easy-to-use web interface, which allows users to perform predictions on large datasets of RNA sequences in a short time, to re-train the model with their own data, and to visualize and analyze the classification results thanks to the implementation of self-organizing maps (SOM). We also propose a new benchmark of experimentally validated RNAs that play both protein-coding and non-coding roles, in different organisms. Thus, IRSOM2 showed promising performance in detecting these bifunctional transcripts among ncRNAs of different types, such as circRNAs and lncRNAs (in particular those of shorter lengths). The web server is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr.


Assuntos
Algoritmos , Biologia Computacional , Simulação por Computador , RNA , RNA Longo não Codificante/química , Análise de Sequência de RNA/métodos , Biologia Computacional/instrumentação , Biologia Computacional/métodos , RNA/química , RNA/classificação , Internet
5.
Bioinformatics ; 37(9): 1218-1224, 2021 06 09.
Artigo em Inglês | MEDLINE | ID: mdl-33135044

RESUMO

MOTIVATION: Applied research in machine learning progresses faster when a clean dataset is available and ready to use. Several datasets have been proposed and released over the years for specific tasks such as image classification, speech-recognition and more recently for protein structure prediction. However, for the fundamental problem of RNA structure prediction, information is spread between several databases depending on the level we are interested in: sequence, secondary structure, 3D structure or interactions with other macromolecules. In order to speed-up advances in machine-learning based approaches for RNA secondary and/or 3D structure prediction, a dataset integrating all this information is required, to avoid spending time on data gathering and cleaning. RESULTS: Here, we propose the first attempt of a standardized and automatically generated dataset dedicated to RNA combining together: RNA sequences, homology information (under the form of position-specific scoring matrices) and information derived by annotation of available 3D structures (including secondary structure, canonical and non-canonical interactions and backbone torsion angles). The data are retrieved from public databases PDB, Rfam and SILVA. The paper describes the procedure to build such dataset and the RNA structure descriptors we provide. Some statistical descriptions of the resulting dataset are also provided. AVAILABILITY AND IMPLEMENTATION: The dataset is updated every month and available online (in flat-text file format) on the EvryRNA software platform (https://evryrna.ibisc.univ-evry.fr/evryrna/rnanet). An efficient parallel pipeline to build the dataset is also provided for easy reproduction or modification. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
RNA , Software , Algoritmos , Proteínas/genética , RNA/genética , Análise de Sequência de RNA , Homologia de Sequência
6.
Bioinformatics ; 36(8): 2451-2457, 2020 04 15.
Artigo em Inglês | MEDLINE | ID: mdl-31913439

RESUMO

MOTIVATION: RNA loops have been modelled and clustered from solved 3D structures into ordered collections of recurrent non-canonical interactions called 'RNA modules', available in databases. This work explores what information from such modules can be used to improve secondary structure prediction. We propose a bi-objective method for predicting RNA secondary structures by minimizing both an energy-based and a knowledge-based potential. The tool, called BiORSEO, outputs secondary structures corresponding to the optimal solutions from the Pareto set. RESULTS: We compare several approaches to predict secondary structures using inserted RNA modules information: two module data sources, Rna3Dmotif and the RNA 3D Motif Atlas, and different ways to score the module insertions: module size, module complexity or module probability according to models like JAR3D and BayesPairing. We benchmark them against a large set of known secondary structures, including some state-of-the-art tools, and comment on the usefulness of the half physics-based, half data-based approach. AVAILABILITY AND IMPLEMENTATION: The software is available for download on the EvryRNA website, as well as the datasets. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Algoritmos , RNA , Conformação de Ácido Nucleico , Motivos de Nucleotídeos , Análise de Sequência de RNA , Software
7.
BMC Bioinformatics ; 20(Suppl 3): 128, 2019 Mar 29.
Artigo em Inglês | MEDLINE | ID: mdl-30925864

RESUMO

BACKGROUND: RNAs can interact and form complexes, which have various biological roles. The secondary structure prediction of those complexes is a first step towards the identification of their 3D structure. We propose an original approach that takes advantage of the high number of RNA secondary structure and RNA-RNA interaction prediction tools. We formulate the problem of RNA complex prediction as the determination of the best combination (according to the free energy) of predicted RNA secondary structures and RNA-RNA interactions. RESULTS: We model those predicted structures and interactions as a graph in order to have a combinatorial optimization problem that is a constrained maximum weight clique problem. We propose an heuristic based on Breakout Local Search to solve this problem and a tool, called RCPred, that returns several solutions, including motifs like internal and external pseudoknots. On a large number of complexes, RCPred gives competitive results compared to the methods of the state of the art. CONCLUSIONS: We propose in this paper a method called RCPred for the prediction of several secondary structures of RNA complexes, including internal and external pseudoknots. As further works we will propose an improved computation of the global energy and the insertion of 3D motifs in the RNA complexes.


Assuntos
Algoritmos , RNA/química , Bases de Dados de Ácidos Nucleicos , Conformação de Ácido Nucleico , Análise de Sequência de RNA/métodos
8.
Bioinformatics ; 34(17): i620-i628, 2018 09 01.
Artigo em Inglês | MEDLINE | ID: mdl-30423081

RESUMO

Motivation: Non-coding RNAs (ncRNAs) play important roles in many biological processes and are involved in many diseases. Their identification is an important task, and many tools exist in the literature for this purpose. However, almost all of them are focused on the discrimination of coding and ncRNAs without giving more biological insight. In this paper, we propose a new reliable method called IRSOM, based on a supervised Self-Organizing Map (SOM) with a rejection option, that overcomes these limitations. The rejection option in IRSOM improves the accuracy of the method and also allows identifing the ambiguous transcripts. Furthermore, with the visualization of the SOM, we analyze the rejected predictions and highlight the ambiguity of the transcripts. Results: IRSOM was tested on datasets of several species from different reigns, and shown better results compared to state-of-art. The accuracy of IRSOM is always greater than 0.95 for all the species with an average specificity of 0.98 and an average sensitivity of 0.99. Besides, IRSOM is fast (it takes around 254 s to analyze a dataset of 147 000 transcripts) and is able to handle very large datasets. Availability and implementation: IRSOM is implemented in Python and C++. It is available on our software platform EvryRNA (http://EvryRNA.ibisc.univ-evry.fr).


Assuntos
Algoritmos , RNA não Traduzido/genética , Software
9.
BMC Bioinformatics ; 19(1): 13, 2018 01 15.
Artigo em Inglês | MEDLINE | ID: mdl-29334887

RESUMO

BACKGROUND: RNA structure prediction is an important field in bioinformatics, and numerous methods and tools have been proposed. Pseudoknots are specific motifs of RNA secondary structures that are difficult to predict. Almost all existing methods are based on a single model and return one solution, often missing the real structure. An alternative approach would be to combine different models and return a (small) set of solutions, maximizing its quality and diversity in order to increase the probability that it contains the real structure. RESULTS: We propose here an original method for predicting RNA secondary structures with pseudoknots, based on integer programming. We developed a generic bi-objective integer programming algorithm allowing to return optimal and sub-optimal solutions optimizing simultaneously two models. This algorithm was then applied to the combination of two known models of RNA secondary structure prediction, namely MEA and MFE. The resulting tool, called BiokoP, is compared with the other methods in the literature. The results show that the best solution (structure with the highest F1-score) is, in most cases, given by BiokoP. Moreover, the results of BiokoP are homogeneous, regardless of the pseudoknot type or the presence or not of pseudoknots. Indeed, the F1-scores are always higher than 70% for any number of solutions returned. CONCLUSION: The results obtained by BiokoP show that combining the MEA and the MFE models, as well as returning several optimal and several sub-optimal solutions, allow to improve the prediction of secondary structures. One perspective of our work is to combine better mono-criterion models, in particular to combine a model based on the comparative approach with the MEA and the MFE models. This leads to develop in the future a new multi-objective algorithm to combine more than two models. BiokoP is available on the EvryRNA platform: https://EvryRNA.ibisc.univ-evry.fr .


Assuntos
Biologia Computacional/métodos , Conformação de Ácido Nucleico , RNA/química , Algoritmos , Bases de Dados Genéticas , Modelos Moleculares , Fatores de Tempo
10.
Nucleic Acids Res ; 44(W1): W181-4, 2016 07 08.
Artigo em Inglês | MEDLINE | ID: mdl-27242364

RESUMO

Computational methods are required for prediction of non-coding RNAs (ncRNAs), which are involved in many biological processes, especially at post-transcriptional level. Among these ncRNAs, miRNAs have been largely studied and biologists need efficient and fast tools for their identification. In particular, ab initio methods are usually required when predicting novel miRNAs. Here we present a web server dedicated for miRNA precursors identification at a large scale in genomes. It is based on an algorithm called miRNAFold that allows predicting miRNA hairpin structures quickly with high sensitivity. miRNAFold is implemented as a web server with an intuitive and user-friendly interface, as well as a standalone version. The web server is freely available at: http://EvryRNA.ibisc.univ-evry.fr/miRNAFold.


Assuntos
Algoritmos , Genoma , MicroRNAs/genética , Precursores de RNA/genética , Software , Animais , Gráficos por Computador , Humanos , Armazenamento e Recuperação da Informação , Internet , MicroRNAs/classificação , Plantas/genética , Dobramento de RNA , Precursores de RNA/classificação , Análise de Sequência de RNA
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA