Results 1 - 20 of 21
1.
Methods Mol Biol ; 2726: 105-124, 2024.
Article in English | MEDLINE | ID: mdl-38780729

ABSTRACT

The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters which define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.


Subject(s)
Algorithms , Computational Biology , Nucleic Acid Conformation , RNA , Thermodynamics , RNA/chemistry , RNA/genetics , Computational Biology/methods , Software
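
As a rough illustration of the pseudoenergy conversion mentioned in the abstract above, the sketch below uses the log-linear form often applied to SHAPE reactivities, pseudoenergy = m * ln(reactivity + 1) + b. The abstract does not name the data type or the formula, so the function name, the choice of this form, and the slope and intercept values are all assumptions made for illustration.

```python
import math


def pseudoenergy(reactivity, m=2.6, b=-0.8):
    """Log-linear pseudoenergy (kcal/mol) for one nucleotide.

    m (slope) and b (intercept) are illustrative defaults, not values
    taken from the abstract.
    """
    if reactivity < 0:
        # Assumption: negative values mark missing data and add no bonus or penalty.
        return 0.0
    return m * math.log(reactivity + 1.0) + b


# Hypothetical reactivities, evaluated under two hypothetical (m, b) settings
# to show how the parameter choice changes the per-nucleotide terms.
reactivities = [0.0, 0.5, 2.0]
for m, b in [(2.6, -0.8), (1.8, -0.6)]:
    print(m, b, [round(pseudoenergy(r, m, b), 3) for r in reactivities])
```

Because each term is added to the energy model wherever the nucleotide is paired, changing m or b can change which structure has the lowest total energy, which is the sensitivity the abstract describes.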
2.
ArXiv ; 2023 Mar 27.
Article in English | MEDLINE | ID: mdl-37033453

ABSTRACT

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.

3.
J Mol Biol ; 435(14): 168047, 2023 07 15.
Article in English | MEDLINE | ID: mdl-36933824

ABSTRACT

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.


Subject(s)
RNA , RNA Sequence Analysis , Algorithms , Base Pairing , Base Sequence , Cluster Analysis , Nucleic Acid Conformation , RNA/chemistry
4.
ArXiv ; 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36994148

ABSTRACT

The branching of an RNA molecule is an important structural characteristic yet difficult to predict correctly, especially for longer sequences. Using plane trees as a combinatorial model for RNA folding, we consider the thermodynamic cost, known as the barrier height, of transitioning between branching configurations. Using branching skew as a coarse energy approximation, we characterize various types of paths in the discrete configuration landscape. In particular, we give sufficient conditions for a path to have both minimal length and minimal branching skew. The proofs offer some biological insights, notably the potential importance of both hairpin stability and domain architecture to higher resolution RNA barrier height analyses.

5.
Bioinformatics ; 37(20): 3660-3661, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33823536

ABSTRACT

SUMMARY: We present a new graphical tool for RNA secondary structure analysis. The central feature is the ability to visually compare/contrast up to three base pairing configurations for a given sequence in a compact, standardized circular arc diagram layout. This is complemented by a built-in CT-style file viewer and radial layout substructure viewer which are directly linked to the arc diagram window via the zoom selection tool. Additional functionality includes the computation of some numerical information, and the ability to export images and data for later use. This tool should be of use to researchers seeking to better understand similarities and differences between structural alternatives for an RNA sequence. AVAILABILITY AND IMPLEMENTATION: https://github.com/gtDMMB/RNAStructViz/wiki.

6.
Genes (Basel) ; 12(4)2021 03 25.
Article in English | MEDLINE | ID: mdl-33805944

ABSTRACT

Minimum free energy prediction of RNA secondary structures is based on the Nearest Neighbor Thermodynamics Model. While such predictions are typically good, the accuracy can vary widely even for short sequences, and the branching thermodynamics are an important factor in this variance. Recently, the simplest model for multiloop energetics, a linear function of the number of branches and unpaired nucleotides, was found to be the best. Subsequently, a parametric analysis demonstrated that per-family accuracy can be improved by changing the weightings in this linear function. However, the extent of improvement was not known due to the ad hoc method used to find the new parameters. Here we develop a branch-and-bound algorithm that finds the set of optimal parameters with the highest average accuracy for a given set of sequences. Our analysis shows that the previous ad hoc parameters are nearly optimal for tRNA and 5S rRNA sequences on both training and testing sets. Moreover, cross-family improvement is possible but more difficult because competing parameter regions favor different families. The results also indicate that restricting the unpaired nucleotide penalty to small values is warranted. This reduction makes analyzing longer sequences using the present techniques more feasible.


Subject(s)
5S Ribosomal RNA/chemistry , Transfer RNA/chemistry , RNA/chemistry , Algorithms , Entropy , Humans , Nucleic Acid Conformation , RNA/genetics , 5S Ribosomal RNA/genetics , Transfer RNA/genetics , Thermodynamics
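
The linear multiloop function described in the abstract above can be sketched as an offset plus per-branch and per-unpaired-nucleotide weights. The toy example below shows how two weightings can favor different structures. It is not the branch-and-bound algorithm from the paper, and every name, energy, and parameter value in it is hypothetical.

```python
def multiloop_penalty(branches, unpaired, offset, per_branch, per_unpaired):
    """Linear multiloop term: offset + per_branch*branches + per_unpaired*unpaired."""
    return offset + per_branch * branches + per_unpaired * unpaired


def total_energy(structure, params):
    """Energy of a structure given as (energy of all other loops, list of multiloops)."""
    rest, multiloops = structure
    return rest + sum(multiloop_penalty(br, un, *params) for br, un in multiloops)


def predicted(structures, params):
    """Name of the minimum-energy structure under the given multiloop parameters."""
    return min(structures, key=lambda name: total_energy(structures[name], params))


# Two hypothetical candidate structures for one sequence (kcal/mol).
structures = {
    "branched": (-20.0, [(3, 4)]),   # one multiloop: 3 branches, 4 unpaired
    "unbranched": (-18.0, []),       # no multiloop
}

# Parameters are (offset, per_branch, per_unpaired); both triples are made up.
print(predicted(structures, (3.4, 0.4, 0.0)))
print(predicted(structures, (1.0, 0.1, 0.0)))
```

With the first triple the multiloop costs enough that the unbranched candidate wins, while the second triple makes the branched one optimal, a small-scale picture of parameter regions favoring different predictions.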
7.
Bull Math Biol ; 82(10): 133, 2020 10 07.
Article in English | MEDLINE | ID: mdl-33029669

ABSTRACT

A growing number of RNA sequences are now known to exist in some distribution with two or more different stable structures. Recent algorithms attempt to reconstruct such mixtures using the list of nucleotides in a sequence in conjunction with auxiliary experimental footprinting data. In this paper, we demonstrate some challenges which remain in addressing this problem; in particular we consider the difficulty of reconstructing a mixture of two RNA structures across a spectrum of different relative abundances. Although progress has been made in identifying the stable structures present, it remains nontrivial to predict the relative abundance of each within the experimentally sampled mixture. Because the ratio of structures present can change depending on experimental conditions, it is the footprinting data, and not the sequence, which must encode information on changes in the relative abundance. Here, we use simulated experimental data to demonstrate that there exist RNA sequences and relative abundance combinations which cannot be recovered by current methods. We then prove that this is not a single exception, but rather part of the rule. In particular, we show, using a Nussinov-Jacobson model, that recovering the relative abundances is difficult for a large proportion of RNA structure pairs. Lastly, we use information theory to establish a framework for quantifying how useful auxiliary data is in predicting the relative abundance of a structure. Together, these results demonstrate that aspects of the problem of reconstructing a mixture of RNA structures from experimental data remain open.


Subject(s)
Biological Models , RNA , Algorithms , Base Sequence , Mathematical Concepts , Nucleic Acid Conformation , Nucleotides , RNA/chemistry , RNA/genetics
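
For readers unfamiliar with the Nussinov-Jacobson model named in the abstract above, here is a minimal sketch of the classical recursion, which maximizes the number of nested base pairs. The abstract does not say which variant the authors use, so the allowed pairs (Watson-Crick plus G-U) and the minimum hairpin loop size are assumptions.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested base pairs under a Nussinov-style recursion.

    min_loop is the assumed minimum number of unpaired bases in a hairpin.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] holds the maximum pair count for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0


print(nussinov("GGGAAAUCC"))
```

Counting pairs rather than summing loop energies keeps the model simple enough for proofs, at the cost of ignoring the stacking and loop terms of the nearest neighbor model.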
8.
J Struct Biol ; 210(1): 107475, 2020 04 01.
Article in English | MEDLINE | ID: mdl-32032754

ABSTRACT

Prediction of RNA base pairings yields insight into molecular structure, and therefore function. The most common methods predict an optimal structure under the standard thermodynamic model. One component of this model is the equation which governs the cost of branching, where three or more helical "arms" radiate out from a multiloop (also known as a junction). The multiloop initiation equation has three parameters; changing those values can significantly alter the predicted structure. We give a complete analysis of the prediction accuracy, stability, and robustness for all possible parameter combinations for a diverse set of tRNA sequences, and also for 5S rRNA. We find that the accuracy can often be substantially improved on a per sequence basis. However, simultaneous improvement within families, and most especially between families, remains a challenge.


Subject(s)
Ribosomal RNA/chemistry , RNA/chemistry , Algorithms , Nucleic Acid Conformation , Thermodynamics
9.
Comput Math Biophys ; 7(1): 48-63, 2019 Jan.
Article in English | MEDLINE | ID: mdl-34113790

ABSTRACT

A riboswitch is a type of RNA molecule that regulates important biological functions by changing structure, typically under ligand binding. We assess the extent to which these ligand-bound structural alternatives are present in the Boltzmann sample, a standard RNA secondary structure prediction method, for three riboswitch test cases. We use the cluster analysis tool RNAStructProfiling to characterize the different modalities present among the suboptimal structures sampled. We compare these modalities to the putative base pairing models obtained from independent experiments using NMR or fluorescence spectroscopy. We find, somewhat unexpectedly, that profiling the Boltzmann sample captures evidence of ligand-bound conformations for two of three riboswitches studied. Moreover, this agreement between predicted modalities and experimental models is consistent with the classification of riboswitches into thermodynamic versus kinetic regulatory mechanisms. Our results support cluster analysis of Boltzmann samples by RNAStructProfiling as a possible basis for de novo identification of thermodynamic riboswitches, while highlighting the challenges for kinetic ones.

10.
Biophys J ; 113(2): 321-329, 2017 Jul 25.
Article in English | MEDLINE | ID: mdl-28629618

ABSTRACT

Understanding how RNA secondary structure prediction methods depend on the underlying nearest-neighbor thermodynamic model remains a fundamental challenge in the field. Minimum free energy (MFE) predictions are known to be "ill conditioned" in that small changes to the thermodynamic model can result in significantly different optimal structures. Hence, the best practice is now to sample from the Boltzmann distribution, which generates a set of suboptimal structures. Although the structural signal of this Boltzmann sample is known to be robust to stochastic noise, the conditioning and robustness under thermodynamic perturbations have yet to be addressed. We present here a mathematically rigorous model for conditioning inspired by numerical analysis, and also a biologically inspired definition for robustness under thermodynamic perturbation. We demonstrate the strong correlation between conditioning and robustness and use its tight relationship to define quantitative thresholds for well versus ill conditioning. These resulting thresholds demonstrate that the majority of the sequences are at least sample robust, which verifies the assumption of sampling's improved conditioning over the MFE prediction. Furthermore, because we find no correlation between conditioning and MFE accuracy, the presence of both well- and ill-conditioned sequences indicates the continued need for both thermodynamic model refinements and alternate RNA structure prediction methods beyond the physics-based ones.


Subject(s)
Molecular Models , Nucleic Acid Conformation , RNA , Thermodynamics , RNA/chemistry , Stochastic Processes
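
The contrast between an ill-conditioned MFE prediction and a more stable Boltzmann sample, as described in the abstract above, can be seen in a toy setting. The sketch below enumerates two hypothetical structures explicitly. Real tools sample through a partition-function recursion, and the conditioning and robustness measures of the paper are not reproduced here. The energies, names, and perturbation size are assumptions.

```python
import math
import random

RT = 0.0019872 * 310.15  # gas constant (kcal/(mol K)) times 37 C in kelvin


def boltzmann_probabilities(energies, rt=RT):
    """Probabilities proportional to exp(-E / RT), shifted by min(E) for stability."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def sample_structures(names, energies, size, seed=0):
    """Draw `size` structure names from the Boltzmann distribution."""
    rng = random.Random(seed)
    return rng.choices(names, weights=boltzmann_probabilities(energies), k=size)


names = ["A", "B"]
# Hypothetical energies before and after a 0.2 kcal/mol perturbation of B.
for energies in ([-10.0, -9.9], [-10.0, -10.1]):
    mfe = names[energies.index(min(energies))]
    probs = [round(p, 3) for p in boltzmann_probabilities(energies)]
    counts = sample_structures(names, energies, 1000).count("A")
    print(energies, "MFE:", mfe, "probabilities:", probs, "A in sample:", counts)
```

The MFE label flips from A to B under the perturbation, while both structures stay well represented in the sample, which is the intuition behind preferring sampling over a single optimal structure.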
11.
Wiley Interdiscip Rev RNA ; 7(3): 278-94, 2016 05.
Article in English | MEDLINE | ID: mdl-26971529

ABSTRACT

A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this 'fuzzier' view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing. WIREs RNA 2016, 7:278-294. doi: 10.1002/wrna.1334


Subject(s)
Computational Biology/methods , Nucleic Acid Conformation , RNA/chemistry , Cluster Analysis
12.
Nucleic Acids Res ; 42(22): e171, 2014 Dec 16.
Article in English | MEDLINE | ID: mdl-25392423

ABSTRACT

As the biomedical impact of small RNAs grows, so does the need to understand competing structural alternatives for regions of functional interest. Suboptimal structure analysis provides significantly more RNA base pairing information than a single minimum free energy prediction. Yet computational enhancements like Boltzmann sampling have not been fully adopted by experimentalists since identifying meaningful patterns in this data can be challenging. Profiling is a novel approach to mining RNA suboptimal structure data which makes the power of ensemble-based analysis accessible in a stable and reliable way. Balancing abstraction and specificity, profiling identifies significant combinations of base pairs which dominate low-energy RNA secondary structures. By design, critical similarities and differences are highlighted, yielding crucial information for molecular biologists. The code is freely available via http://gtfold.sourceforge.net/profiling.html.


Subject(s)
Small Untranslated RNA/chemistry , RNA Sequence Analysis/methods , Base Pairing , Statistical Data Interpretation , Molecular Models , Nucleic Acid Conformation , Bacterial RNA/chemistry , Vibrio cholerae/genetics
13.
J Math Biol ; 69(6-7): 1743-72, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24384698

ABSTRACT

We analyze the distribution of RNA secondary structures given by the Knudsen-Hein stochastic context-free grammar used in the prediction program Pfold. Our main theorem gives relations between the expected number of these motifs, independent of the grammar probabilities. These relations are a consequence of proving that the distribution of base pairs, of helices, and of different types of loops is asymptotically Gaussian in this model of RNA folding. Proof techniques use singularity analysis of probability generating functions. We also demonstrate that these asymptotic results capture well the expected number of RNA base pairs in native ribosomal structures, and certain other aspects of their predicted secondary structures. In particular, we find that the predicted structures largely satisfy the expected relations, although the native structures do not.


Subject(s)
Chemical Models , Nucleic Acid Conformation , RNA Folding , RNA/chemistry , Algorithms , Base Pairing , Normal Distribution , Stochastic Processes , Thermodynamics
14.
J Biol Phys ; 39(2): 163-72, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23860866

ABSTRACT

There are two important problems in the assembly of small, icosahedral RNA viruses. First, how does the capsid protein select the viral RNA for packaging, when there are so many other candidate RNA molecules available? Second, what is the mechanism of assembly? With regard to the first question, there are a number of cases where a particular RNA sequence or structure (often one or more stem-loops) either promotes assembly or is required for assembly, but there are others where specific packaging signals are apparently not required. With regard to the assembly pathway, in those cases where stem-loops are involved, the first step is generally believed to be binding of the capsid proteins to these "fingers" of the RNA secondary structure. In the mature virus, the core of the RNA would then occupy the center of the viral particle, and the stem-loops would reach outward, towards the capsid, like stalagmites reaching up from the floor of a grotto towards the ceiling. Those viruses whose assembly does not depend on protein binding to stem-loops could have a different structure, with the core of the RNA lying just under the capsid, and the fingers reaching down into the interior of the virus, like stalactites. We review the literature on these alternative structures, focusing on RNA selectivity and the assembly mechanism, and we propose experiments aimed at determining, in a given virus, which of the two structures actually occurs.


Subject(s)
Viral Genome , RNA Viruses/genetics , Levivirus/chemistry , Levivirus/genetics , Molecular Models , RNA Viruses/chemistry , Tobacco Mosaic Satellite Virus/chemistry , Tobacco Mosaic Satellite Virus/genetics
15.
Nucleic Acids Res ; 41(5): 2807-16, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23325843

ABSTRACT

Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence.


Subject(s)
Computer Simulation , Molecular Models , 16S Ribosomal RNA/chemistry , 18S Ribosomal RNA/chemistry , Software , Algorithms , Animals , Likelihood Functions , Nucleic Acid Conformation , Archaeal RNA/chemistry , Bacterial RNA/chemistry , Stochastic Processes , Thermodynamics
16.
BMC Res Notes ; 5: 341, 2012 Jul 02.
Article in English | MEDLINE | ID: mdl-22747589

ABSTRACT

BACKGROUND: Accurate and efficient RNA secondary structure prediction remains an important open problem in computational molecular biology. Historically, advances in computing technology have enabled faster and more accurate RNA secondary structure predictions. Previous parallelized prediction programs achieved significant improvements in runtime, but their implementations were not portable from niche high-performance computers or easily accessible to most RNA researchers. With the increasing prevalence of multi-core desktop machines, a new parallel prediction program is needed to take full advantage of today's computing technology. FINDINGS: We present here the first implementation of RNA secondary structure prediction by thermodynamic optimization for modern multi-core computers. We show that GTfold predicts secondary structure in less time than UNAfold and RNAfold, without sacrificing accuracy, on machines with four or more cores. CONCLUSIONS: GTfold supports advances in RNA structural biology by reducing the timescales for secondary structure prediction. The difference will be particularly valuable to researchers working with lengthy RNA sequences, such as RNA viral genomes.


Subject(s)
Algorithms , Computational Biology/methods , RNA/chemistry , Software , Computational Biology/instrumentation , Nucleic Acid Conformation , RNA Sequence Analysis , Thermodynamics
17.
J Struct Biol ; 180(1): 110-6, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22750417

ABSTRACT

Satellite tobacco mosaic virus (STMV) is an icosahedral T=1 single-stranded RNA virus with a genome containing 1058 nucleotides. X-ray crystallography revealed a structure containing 30 double-helical RNA segments, with each helix having nine base pairs and an unpaired nucleotide at the 3' end of each strand. Based on this structure, Larson and McPherson proposed a model of 30 hairpin-loop elements occupying the edges of the icosahedron and connected by single-stranded regions. More recently, Schroeder et al. have combined the results of chemical probing with a novel helix searching algorithm to propose a specific secondary structure for the STMV genome, compatible with the Larson-McPherson model. Here we report an all-atom model of STMV, using the complete protein and RNA sequences and the Schroeder RNA secondary structure. As far as we know, this is the first all-atom model for the complete structure of any virus (100% of the atoms) using the natural genomic sequence.


Subject(s)
Capsid/ultrastructure , Molecular Models , Viral RNA/ultrastructure , Tobacco Mosaic Satellite Virus/ultrastructure , Capsid/chemistry , X-Ray Crystallography , Inverted Repeat Sequences , Nucleic Acid Conformation , Quaternary Protein Structure , Viral RNA/chemistry
18.
Bull Math Biol ; 73(4): 754-76, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21207176

ABSTRACT

Motivated by recent work in parametric sequence alignment, we study the parameter space for scoring RNA folds and construct an RNA polytope. A vertex of this polytope corresponds to RNA secondary structures with common branching. We use this polytope and its normal fan to study the effect of varying three parameters in the free energy model that are not determined experimentally. Our results indicate that variation of these specific parameters does not have a dramatic effect on the structures predicted by the free energy model. We additionally map a collection of known RNA secondary structures to the RNA polytope.


Subject(s)
Molecular Models , Nucleic Acid Conformation , RNA/chemistry , Thermodynamics , Algorithms , Base Sequence , Nucleic Acid Databases
19.
Nucleic Acids Res ; 37(4): e29, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19158187

ABSTRACT

The identification of small structural motifs and their organization into larger subassemblies is of fundamental interest in the analysis, prediction and design of 3D structures of large RNAs. This problem has been studied only sparsely, as most of the existing work is limited to the characterization and discovery of motifs in RNA secondary structures. We present a novel geometric method for the characterization and identification of structural motifs in 3D rRNA molecules. This method enables the efficient recognition of known 3D motifs, such as tetraloops, E-loops, kink-turns and others. Furthermore, it provides a new way of characterizing complex 3D motifs, notably junctions, that have been defined and identified in the secondary structure but have not been analyzed and classified in three dimensions. We demonstrate the relevance and utility of our approach by applying it to the Haloarcula marismortui large ribosomal unit. Pending the implementation of a dedicated web server, the code accompanying this article, written in JAVA, is available upon request from the contact author.


Subject(s)
Ribosomal RNA/chemistry , Computational Biology/methods , Haloarcula marismortui/genetics , Molecular Models , Nucleic Acid Conformation , Ribosomal RNA/classification , RNA Sequence Analysis
20.
Bull Math Biol ; 71(1): 84-106, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19083065

ABSTRACT

We give a Large Deviation Principle (LDP) with explicit rate function for the distribution of vertex degrees in plane trees, a combinatorial model of RNA secondary structures. We calculate the typical degree distributions based on nearest neighbor free energies, and compare our results with the branching configurations found in two sets of large RNA secondary structures. We find substantial agreement overall, with some interesting deviations which merit further study.


Subject(s)
Molecular Models , Nucleic Acid Conformation , 23S Ribosomal RNA/ultrastructure , Viral RNA/ultrastructure , Statistical Data Interpretation , Decision Trees , Neural Networks (Computer) , Picornaviridae/genetics , Probability , Thermodynamics