ABSTRACT
MOTIVATION: Protein motions play an essential role in many biochemical processes. Lab studies often quantify these motions in terms of their kinetics such as the speed at which a protein folds or the population of certain interesting states like the native state. Kinetic metrics give quantifiable measurements of the folding process that can be compared across a group of proteins such as a wild-type protein and its mutants. RESULTS: We present two new techniques, map-based master equation solution and map-based Monte Carlo simulation, to study protein kinetics through folding rates and population kinetics from approximate folding landscapes, models called maps. From these two new techniques, interesting metrics that describe the folding process, such as reaction coordinates, can also be studied. In this article we focus on two metrics, formation of helices and structure formation around tryptophan residues. These two metrics are often studied in the lab through circular dichroism (CD) spectra analysis and tryptophan fluorescence experiments, respectively. The approximated landscape models we use here are the maps of protein conformations and their associated transitions that we have presented and validated previously. In contrast to other methods such as the traditional master equation and Monte Carlo simulation, our techniques are both fast and can easily be computed for full-length detailed protein models. We validate our map-based kinetics techniques by comparing folding rates to known experimental results. We also look in depth at the population kinetics, helix formation and structure near tryptophan residues for a variety of proteins. AVAILABILITY: We invite the community to help us enrich our publicly available database of motions and kinetics analysis by submitting to our server: http://parasol.tamu.edu/foldingserver/.
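As an illustration of the map-based master equation idea described above, the following is a minimal sketch, not the authors' implementation. It assumes a tiny three-state map, Metropolis-style transition rates, and forward-Euler time stepping; the names `build_rate_matrix` and `integrate_master_equation`, the energies, and the parameters `kT`, `k0`, `dt` and `steps` are all hypothetical choices made for the example.

```python
import math

def build_rate_matrix(energies, edges, kT=0.6, k0=1.0):
    """Return rates[i][j], the transition rate from state i to state j."""
    n = len(energies)
    rates = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        for a, b in ((i, j), (j, i)):
            dE = energies[b] - energies[a]
            # Metropolis-style rate: an assumption, not the papers' formula.
            rates[a][b] = k0 * min(1.0, math.exp(-dE / kT))
    return rates

def integrate_master_equation(rates, p0, dt=0.01, steps=5000):
    """Forward-Euler integration of dP_i/dt = sum_j (k_ji P_j - k_ij P_i)."""
    n = len(p0)
    p = list(p0)
    for _ in range(steps):
        dp = [0.0] * n
        for i in range(n):
            for j in range(n):
                if i != j:
                    flux = rates[i][j] * p[i] * dt
                    dp[i] -= flux  # population leaving state i
                    dp[j] += flux  # same population arriving in state j
        p = [pi + di for pi, di in zip(p, dp)]
    return p

# Toy map: state 0 is unfolded, state 2 is the native state (lowest energy).
energies = [3.0, 1.5, 0.0]
edges = [(0, 1), (1, 2)]
rates = build_rate_matrix(energies, edges)
final_population = integrate_master_equation(rates, [1.0, 0.0, 0.0])
```

Starting with all population in the unfolded state, the population of each state can be recorded at every step to obtain population kinetics curves; a production solver would more likely use an eigen-decomposition of the rate matrix than explicit time stepping.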
Subjects
Algorithms , Chemical Models , Protein Folding , Proteins/chemistry , Proteins/ultrastructure , Protein Sequence Analysis/methods , Computer Simulation , Kinetics , Molecular Models , Motion
ABSTRACT
Protein motions, ranging from molecular flexibility to large-scale conformational change, play an essential role in many biochemical processes. Despite the explosion in our knowledge of structural and functional data, our understanding of protein movement is still very limited. In previous work, we developed and validated a motion planning based method for mapping protein folding pathways from unstructured conformations to the native state. In this paper, we propose a novel method based on rigidity theory to sample conformation space more effectively, and we describe extensions of our framework to automate the process and to map transitions between specified conformations. Our results show that these additions both improve the accuracy of our maps and enable us to study a broader range of motions for larger proteins. For example, we show that rigidity-based sampling results in maps that capture subtle folding differences between protein G and its mutants, NuG1 and NuG2, and we illustrate how our technique can be used to study large-scale conformational changes in calmodulin, a 148 residue signaling protein known to undergo conformational changes when binding to Ca(2+). Finally, we announce our web-based protein folding server which includes a publicly available archive of protein motions: (http://parasol.tamu.edu/foldingserver/).
Subjects
Calmodulin/chemistry , Computational Biology , GTP-Binding Proteins/chemistry , Calmodulin/metabolism , Computer Simulation , GTP-Binding Proteins/genetics , GTP-Binding Proteins/metabolism , Molecular Models , Statistical Models , Protein Conformation , Protein Folding , Protein Secondary Structure , Thermodynamics
ABSTRACT
BACKGROUND: Simulating protein folding motions is an important problem in computational biology. Motion planning algorithms, such as Probabilistic Roadmap Methods, have been successful in modeling the folding landscape. Probabilistic Roadmap Methods and variants contain several phases (i.e., sampling, connection, and path extraction). Most of the time is spent in the connection phase and selecting which variant to employ is a difficult task. Global machine learning has been applied to the connection phase but is inefficient in situations with varying topology, such as those typical of folding landscapes. RESULTS: We develop a local learning algorithm that exploits the past performance of methods within the neighborhood of the current connection attempts as a basis for learning. It is sensitive not only to different types of landscapes but also to differing regions in the landscape itself, removing the need to explicitly partition the landscape. We perform experiments on 23 proteins of varying secondary structure makeup with 52-114 residues. We compare the success rate when using our methods and other methods. We demonstrate a clear need for learning (i.e., only learning methods were able to validate against all available experimental data) and show that local learning is superior to global learning producing, in many cases, significantly higher quality results than the other methods. CONCLUSIONS: We present an algorithm that uses local learning to select appropriate connection methods in the context of roadmap construction for protein folding. Our method removes the burden of deciding which method to use, leverages the strengths of the individual input methods, and it is extendable to include other future connection methods.
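The local learning idea in the abstract above, choosing a connection method from the past performance of methods near the current connection attempt, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published algorithm: positions are one-dimensional numbers, the neighborhood is the `k` nearest past attempts, and the function `choose_method`, the record fields and the smoothing rule are hypothetical.

```python
import random

def choose_method(history, position, methods, k=3, rng=random):
    """Pick a connection method from success rates among the k nearest past attempts.

    Returns the chosen method and the per-method weights used for the choice.
    """
    nearest = sorted(history, key=lambda rec: abs(rec["pos"] - position))[:k]
    weights = []
    for method in methods:
        tries = [rec for rec in nearest if rec["method"] == method]
        wins = sum(1 for rec in tries if rec["success"])
        # Laplace smoothing keeps locally untried methods selectable.
        weights.append((wins + 1) / (len(tries) + 2))
    return rng.choices(methods, weights=weights)[0], weights

# Hypothetical log of past connection attempts in two regions of the landscape.
history = [
    {"pos": 0.0, "method": "A", "success": True},
    {"pos": 0.5, "method": "A", "success": True},
    {"pos": 1.0, "method": "B", "success": False},
    {"pos": 9.0, "method": "B", "success": True},
    {"pos": 9.5, "method": "B", "success": True},
    {"pos": 10.0, "method": "A", "success": False},
]
methods = ["A", "B"]
_, weights_near_0 = choose_method(history, 0.0, methods)
_, weights_near_10 = choose_method(history, 10.0, methods)
```

In this toy log, method A is favored near position 0 and method B near position 10, which is the sense in which a local learner adapts to different regions without an explicit partition of the landscape.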
Subjects
Computational Biology/methods , Machine Learning , Protein Folding , Proteins/chemistry , Molecular Models , Movement , Protein Conformation , Proteins/metabolism , Thermodynamics
ABSTRACT
We propose a novel, motion planning based approach to approximately map the energy landscape of an RNA molecule. A key feature of our method is that it provides a sparse map that captures the main features of the energy landscape which can be analyzed to compute folding kinetics. Our method is based on probabilistic roadmap motion planners that we have previously successfully applied to protein folding. In this paper, we provide evidence that this approach is also well suited to RNA. We compute population kinetics and transition rates on our roadmaps using the master equation for a few moderately sized RNA and show that our results compare favorably with results of other existing methods.
Subjects
Computational Biology , Biological Models , Chemical Models , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Thermodynamics , Kinetics
ABSTRACT
We investigate a novel approach for studying protein folding that has evolved from robotics motion planning techniques called probabilistic roadmap methods (PRMs). Our focus is to study issues related to the folding process, such as the formation of secondary and tertiary structures, assuming we know the native fold. A feature of our PRM-based framework is that the large sets of folding pathways in the roadmaps it produces, in just a few hours on a desktop PC, provide global information about the protein's energy landscape. This is an advantage over other simulation methods such as molecular dynamics or Monte Carlo methods which require more computation and produce only a single trajectory in each run. In our initial studies, we obtained encouraging results for several small proteins. In this paper, we investigate more sophisticated techniques for analyzing the folding pathways in our roadmaps. In addition to more formally revalidating our previous results, we present a case study showing that our technique captures known folding differences between the structurally similar proteins G and L.
Subjects
Biophysics/methods , Computational Biology/methods , Protein Folding , Animals , Computer Simulation , Humans , Biological Models , Theoretical Models , Monte Carlo Method , Motion , Probability , Protein Conformation , Protein Secondary Structure , Software , Thermodynamics
ABSTRACT
Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 20 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on two popular modern scoring functions and show that for most cases they contain a greater or equal number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.
Subjects
Computational Biology/methods , Protein Databases , Protein Folding , Proteins/chemistry , Algorithms , Computer Simulation , Protein Conformation
ABSTRACT
We present a framework for studying protein folding pathways and potential landscapes which is based on techniques recently developed in the robotics motion planning community. Our focus in this work is to study the protein folding mechanism assuming we know the native fold. That is, instead of performing fold prediction, we aim to study issues related to the folding process, such as the formation of secondary and tertiary structure, and the dependence of the folding pathway on the initial denatured conformation. Our work uses probabilistic roadmap (PRM) motion planning techniques which have proven successful for problems involving high-dimensional configuration spaces. A strength of these methods is their efficiency in rapidly covering the planning space without becoming trapped in local minima. We have applied our PRM technique to several small proteins (~60 residues) and validated the pathways computed by comparing the secondary structure formation order on our paths to known hydrogen exchange experimental results. An advantage of the PRM framework over other simulation methods is that it enables one to easily and efficiently compute folding pathways from any denatured starting state to the (known) native fold. This aspect makes our approach ideal for studying global properties of the protein's potential landscape, most of which are difficult to simulate and study with other methods. For example, in the proteins we study, the folding pathways starting from different denatured states sometimes share common portions when they are close to the native fold, and moreover, the formation order of the secondary structure appears largely independent of the starting denatured conformation. 
Another feature of our technique is that the distribution of the sampled conformations is correlated with the formation of secondary structure and, in particular, appears to differentiate situations in which secondary structure clearly forms first and those in which the tertiary structure is obtained more directly. Overall, our results applying PRM techniques are very encouraging and indicate the promise of our approach for studying proteins for which experimental results are not available.
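The three PRM phases this abstract relies on (sampling, connection, and path extraction) can be sketched on a toy problem. The example below is only an illustration under loud assumptions: conformations are stand-in 2D points, the energy is the squared distance to a "native" point at the origin, nodes are connected to their `k` nearest neighbors, and pathways are extracted with Dijkstra's algorithm. None of the names (`energy`, `build_roadmap`, `shortest_path`) or parameter values come from the paper.

```python
import heapq
import math
import random

def energy(conf):
    """Toy energy: squared distance to the 'native' point at the origin."""
    return conf[0] ** 2 + conf[1] ** 2

def build_roadmap(n_samples=200, k=6, seed=0):
    """Sample nodes and connect each to its k nearest neighbors."""
    rng = random.Random(seed)
    # Node 0 is the native state; the rest are uniform random samples.
    nodes = [(0.0, 0.0)] + [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                            for _ in range(n_samples)]
    edges = {i: {} for i in range(len(nodes))}
    for i, a in enumerate(nodes):
        nearest = sorted((j for j in range(len(nodes)) if j != i),
                         key=lambda j: math.dist(a, nodes[j]))[:k]
        for j in nearest:
            # Edge weight: distance plus absolute energy difference (assumed).
            w = math.dist(a, nodes[j]) + abs(energy(nodes[j]) - energy(a))
            edges[i][j] = w
            edges[j][i] = w
    return nodes, edges

def shortest_path(edges, start, goal):
    """Dijkstra's algorithm; returns a list of node indices or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None

nodes, edges = build_roadmap()
# Use the highest-energy sample as a stand-in for a denatured start state.
start = max(range(len(nodes)), key=lambda i: energy(nodes[i]))
pathway = shortest_path(edges, start, 0)
```

Because the roadmap is built once, `shortest_path` can be called from any start node to the native node, which mirrors the abstract's point about computing pathways from many denatured states on a single map. A k-nearest-neighbor graph is not guaranteed to be connected, so callers should handle a `None` result.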
Subjects
Computational Biology , Protein Folding , Molecular Models , Statistical Models , Protein Conformation , Protein Secondary Structure , Thermodynamics
ABSTRACT
We investigate a novel approach for studying the kinetics of protein folding. Our framework has evolved from robotics motion planning techniques called probabilistic roadmap methods (PRMs) that have been applied in many diverse fields with great success. In our previous work, we presented our PRM-based technique and obtained encouraging results studying protein folding pathways for several small proteins. In this paper, we describe how our motion planning framework can be used to study protein folding kinetics. In particular, we present a refined version of our PRM-based framework and describe how it can be used to produce potential energy landscapes, free energy landscapes, and many folding pathways all from a single roadmap which is computed in a few hours on a desktop PC. Results are presented for 14 proteins. Our ability to produce large sets of unrelated folding pathways may potentially provide crucial insight into some aspects of folding kinetics, such as proteins that exhibit both two-state and three-state kinetics that are not captured by other theoretical techniques.
Subjects
Computational Biology , Protein Folding , Kinetics
ABSTRACT
This paper presents a generalized framework for dynamic simulation realized in a prototype simulator called the Interactive Generalized Motion Simulator (I-GMS), which can simulate motions of multirigid-body systems with contact interaction in virtual environments. I-GMS is designed to meet two important goals: generality and interactivity. By generality, we mean a dynamic simulator which can easily support various systems of rigid bodies, ranging from a single free-flying rigid object to complex linkages such as those needed for robotic systems or human body simulation. To provide this generality, we have developed I-GMS in an object-oriented framework. The user interactivity is supported through a haptic interface for articulated bodies, introducing interactive dynamic simulation schemes. This user-interaction is achieved by performing push and pull operations via the PHANToM haptic device, which runs as an integrated part of I-GMS. Also, a hybrid scheme was used for simulating internal contacts (between bodies in the multirigid-body system) in the presence of friction, which could avoid the nonexistent solution problem often faced when solving contact problems with Coulomb friction. In our hybrid scheme, two impulse-based methods are exploited so that different methods are applied adaptively, depending on whether the current contact situation is characterized as "bouncing" or "steady." We demonstrate the user-interaction capability of I-GMS through on-line editing of trajectories of a 6-degree of freedom (dof) articulated structure.
Subjects
Algorithms , Computer Simulation , Joints/physiology , Biological Models , Movement/physiology , Robotics/methods , User-Computer Interface , Biomechanical Phenomena/methods , Humans , Nonlinear Dynamics
ABSTRACT
We present a general computational approach to simulate RNA folding kinetics that can be used to extract population kinetics, folding rates and the formation of particular substructures that might be intermediates in the folding process. Simulating RNA folding kinetics can provide unique insight into RNA whose functions are dictated by folding kinetics and not always by nucleotide sequence or the structure of the lowest free-energy state. The method first builds an approximate map (or model) of the folding energy landscape from which the population kinetics are analyzed by solving the master equation on the map. We present results obtained using an analysis technique, map-based Monte Carlo simulation, which stochastically extracts folding pathways from the map. Our method compares favorably with other computational methods that begin with a comprehensive free-energy landscape, illustrating that the smaller, approximate map captures the major features of the complete energy landscape. As a result, our method scales to larger RNAs. For example, here we validate kinetics of RNA of more than 200 nucleotides. Our method accurately computes the kinetics-based functional rates of wild-type and mutant ColE1 RNAII and MS2 phage RNAs showing excellent agreement with experiment.
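Map-based Monte Carlo simulation, as described above, extracts folding pathways stochastically from the map. A minimal sketch of that idea is a biased random walk over map nodes that stops on reaching the native node. The four-node map, the Metropolis-style move probabilities, and the names `transition_probabilities` and `extract_pathway` are assumptions made for illustration, not the published procedure.

```python
import math
import random

def transition_probabilities(energies, neighbors, node, kT=0.6):
    """Metropolis-style move probabilities from `node`; the remainder is a self-loop."""
    nbrs = neighbors[node]
    probs = {}
    for nb in nbrs:
        dE = energies[nb] - energies[node]
        probs[nb] = min(1.0, math.exp(-dE / kT)) / len(nbrs)
    # Guard against a tiny negative remainder from floating-point rounding.
    probs[node] = max(0.0, 1.0 - sum(probs.values()))
    return probs

def extract_pathway(energies, neighbors, start, native, rng, max_steps=10000):
    """Walk the map stochastically until the native node is reached."""
    path = [start]
    node = start
    for _ in range(max_steps):
        if node == native:
            break
        probs = transition_probabilities(energies, neighbors, node)
        nxt = rng.choices(list(probs), weights=list(probs.values()))[0]
        if nxt != node:  # record only actual moves, not rejected ones
            path.append(nxt)
        node = nxt
    return path

# Toy map: node 0 is unfolded, node 3 is native, nodes 1 and 2 are intermediates.
energies = {0: 3.0, 1: 1.5, 2: 2.0, 3: 0.0}
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
pathway = extract_pathway(energies, neighbors, 0, 3, random.Random(0))
```

Repeating the walk with different seeds gives an ensemble of pathways from which quantities such as the fraction of runs passing through a given intermediate can be estimated.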
Subjects
Computer Simulation , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Animals , Base Sequence , Kinetics , Molecular Sequence Data , RNA/genetics , Spliced Leader RNA/chemistry , Spliced Leader RNA/genetics , Reproducibility of Results , Thermodynamics , Time Factors , Trypanosomatina
ABSTRACT
We investigate a novel approach for studying protein folding that has evolved from robotics motion planning techniques called probabilistic roadmap methods (PRMs). Our focus is to study issues related to the folding process, such as the formation of secondary and tertiary structure, assuming we know the native fold. A feature of our PRM-based framework is that the large sets of folding pathways in the roadmaps it produces, in a few hours on a desktop PC, provide global information about the protein's energy landscape. This is an advantage over other simulation methods such as molecular dynamics or Monte Carlo methods which require more computation and produce only a single trajectory in each run. In our initial studies, we obtained encouraging results for several small proteins. In this paper, we investigate more sophisticated techniques for analyzing the folding pathways in our roadmaps. In addition to more formally revalidating our previous results, we present a case study showing that our technique captures known folding differences between the structurally similar proteins G and L.