Results 1 - 11 of 11
1.
Proteins ; 83(9): 1616-24, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26095680

ABSTRACT

Knowledge-based protein potentials are simplified potentials designed to improve the quality of protein models, which is important as more accurate models are more useful for biological and pharmaceutical studies. Consequently, knowledge-based potentials often are designed to be efficient in ordering a given set of deformed structures, denoted decoys, according to how close they are to the relevant native protein structure. This, however, does not necessarily imply that energy minimization of this potential will bring the decoys closer to the native structure. In this study, we introduce an iterative strategy to improve the convergence of decoy structures. It works by adding energy-optimized decoys to the pool of decoys used to construct the next and improved knowledge-based potential. We demonstrate that this strategy results in significantly improved decoy convergence on Titan high resolution decoys and refinement targets from Critical Assessment of protein Structure Prediction competitions. Our potential is formulated in Cartesian coordinates and has a fixed backbone potential that restricts motions to be close to those of a dihedral model, a fixed hydrogen-bonding potential, and a variable coarse-grained carbon alpha potential consisting of a pair potential and a novel solvent potential, both of which are B-spline based because we use explicit gradients and Hessians for efficient energy optimization.


Subject(s)
Algorithms , Computational Biology/methods , Protein Conformation , Proteins/chemistry , Hydrogen Bonding , Molecular Models , Reproducibility of Results , Thermodynamics
2.
Proteins ; 81(5): 841-51, 2013 May.
Article in English | MEDLINE | ID: mdl-23280479

ABSTRACT

Protein structure prediction techniques proceed in two steps, namely the generation of many structural models for the protein of interest, followed by an evaluation of all these models to identify those that are native-like. In theory, the second step is easy, as native structures correspond to minima of their free energy surfaces. It is well known however that the situation is more complicated as the current force fields used for molecular simulations fail to distinguish native states from misfolded structures. In an attempt to solve this problem, we follow an alternate approach and derive a new potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins. At the short range level, it captures and quantifies the mapping between the sequences and structures of short (7-mer) fragments of protein backbones through the introduction of a new local energy term. The local energy term is then augmented with a nonlocal residue-based pairwise potential, and a solvent potential. Secondly, it is optimized to yield a maximized correlation between the energy of a structural model and its root mean square (RMS) deviation from the native structure of the corresponding protein. We have shown that MPP yields high correlation values between RMS and energy and that it is able to retrieve the native structure of a protein from a set of high-resolution decoys.
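The correlation between model energy and RMS deviation that this abstract optimizes can be illustrated with a plain Pearson coefficient. The following is a minimal sketch, not the MPP implementation: the function name `pearson` and the decoy values are hypothetical, chosen only to show the quantity being measured.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

# Hypothetical decoy set (illustrative numbers, not data from the paper):
# RMS deviation from the native structure in angstroms, and model energy.
rms = [0.5, 1.0, 2.0, 3.5, 5.0]
energy = [-120.0, -115.0, -101.0, -90.0, -70.0]

r = pearson(rms, energy)
print(f"energy/RMS correlation: {r:.3f}")
```

A coefficient close to 1 means that lower energies go with smaller deviations from the native structure, which is the behavior such a potential is trained to show.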


Subject(s)
Proteins/chemistry , Algorithms , Protein Conformation , Solvents , Thermodynamics
3.
Bioinformatics ; 28(4): 510-5, 2012 Feb 15.
Article in English | MEDLINE | ID: mdl-22199383

ABSTRACT

MOTIVATION: Clustering protein structures is an important task in structural bioinformatics. De novo structure prediction, for example, often involves a clustering step for finding the best prediction. Other applications include assigning proteins to fold families and analyzing molecular dynamics trajectories. RESULTS: We present Pleiades, a novel approach to clustering protein structures with a rigorous mathematical underpinning. The method approximates clustering based on the root mean square deviation by first mapping structures to Gauss integral vectors (which were introduced by Røgen and co-workers) and subsequently performing K-means clustering. CONCLUSIONS: Compared to current methods, Pleiades dramatically improves on the time needed to perform clustering, and can cluster a significantly larger number of structures, while providing state-of-the-art results. The number of low energy structures generated in a typical folding study, which is in the order of 50,000 structures, can be clustered within seconds to minutes.
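The second stage described above, K-means clustering of fixed-length descriptor vectors, can be sketched with plain Lloyd's algorithm. This is an illustrative sketch only, not the Pleiades code: the two-dimensional vectors below are hypothetical stand-ins for Gauss integral vectors, and the function names and the seeded random initialization are assumptions.

```python
import random

def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    # Initialize centers from k distinct input vectors (seeded for repeatability).
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return labels, centers

# Hypothetical descriptor vectors forming two well-separated groups.
vectors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
           (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(vectors, k=2)
print(labels)
```

Because each structure is reduced to one vector before clustering, the cost per iteration grows with the number of structures times the number of clusters, and no pairwise structure superposition is needed.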


Subject(s)
Cluster Analysis , Computational Biology/methods , Proteins/chemistry , Adenylate Kinase/chemistry , Candida/chemistry , Escherichia coli/enzymology , Fungal Proteins/chemistry , Molecular Dynamics Simulation
4.
Algorithms Mol Biol ; 16(1): 1, 2021 Feb 27.
Article in English | MEDLINE | ID: mdl-33639968

ABSTRACT

BACKGROUND: In computational structural biology, structure comparison is fundamental for our understanding of proteins. Structure comparison is, e.g., algorithmically the starting point for computational studies of structural evolution and it guides our efforts to predict protein structures from their amino acid sequences. Most methods for structural alignment of protein structures optimize the distances between aligned and superimposed residue pairs, i.e., the distances traveled by the aligned and superimposed residues during linear interpolation. Considering such a linear interpolation, these methods do not differentiate if there is room for the interpolation, if it causes steric clashes, or more severely, if it changes the topology of the compared protein backbone curves. RESULTS: To distinguish such cases, we analyze the linear interpolation between two aligned and superimposed backbones. We quantify the amount of steric clashes and find all self-intersections in a linear backbone interpolation. To determine if the self-intersections alter the protein's backbone curve significantly or not, we present a path-finding algorithm that checks if there exists a self-avoiding path in a neighborhood of the linear interpolation. A new path is constructed by altering the linear interpolation using a novel interpretation of Reidemeister moves from knot theory working on three-dimensional curves rather than on knot diagrams. Either the algorithm finds a self-avoiding path or it returns a smallest set of essential self-intersections. Each of these indicates a significant difference between the folds of the aligned protein structures. As expected, we find at least one essential self-intersection separating most unknotted structures from a knotted structure, and we find even larger motions in proteins connected by obstruction free linear interpolations. 
We also find examples of homologous proteins that are differently threaded, and we find many distinct folds connected by longer but simple deformations. TM-align is one of the most restrictive alignment programs. With standard parameters, it only aligns residues superimposed within 5 Ångström distance. We find 42165 topological obstructions between aligned parts in 142068 TM-alignments. Thus, this restrictive alignment procedure still allows topological dissimilarity of the aligned parts. CONCLUSIONS: Based on the data we conclude that our program ProteinAlignmentObstruction provides significant additional information to alignment scores based solely on distances between aligned and superimposed residue pairs.
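The linear interpolation between two aligned and superimposed backbones, and a crude check for steric clashes along it, can be sketched as follows. This is a simplified proxy, not the ProteinAlignmentObstruction algorithm: it samples the interpolation at discrete steps and only measures distances between non-adjacent points, so it does not detect segment self-intersections or apply Reidemeister-style moves. All function names and coordinates are hypothetical.

```python
import math

def interpolate(a, b, t):
    """Linear interpolation between two aligned coordinate lists, t in [0, 1]."""
    return [
        tuple((1 - t) * p + t * q for p, q in zip(pa, pb))
        for pa, pb in zip(a, b)
    ]

def min_nonadjacent_distance(coords, min_sep=2):
    """Smallest distance between points at least min_sep apart along the chain."""
    best = math.inf
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            best = min(best, math.dist(coords[i], coords[j]))
    return best

def closest_approach(a, b, steps=10):
    """Minimum non-adjacent point distance over the sampled interpolation."""
    return min(
        min_nonadjacent_distance(interpolate(a, b, s / steps))
        for s in range(steps + 1)
    )

# Hypothetical three-point chains: the first point moves straight through
# the position of the third point on its way from structure a to structure b.
a = [(0.0, 0.0, 1.0), (5.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
b = [(0.0, 0.0, -1.0), (5.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

print(min_nonadjacent_distance(a), closest_approach(a, b))
```

Both end structures are clash-free here, yet the straight-line path between them is not, which is the kind of case that a distance-only alignment score cannot distinguish.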

5.
Trends Biochem Sci ; 30(1): 13-9, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15653321

ABSTRACT

The mechanism by which proteins fold to their native states has been the focus of intense research in recent years. The rate-limiting event in the folding reaction is the formation of a conformation in a set known as the transition-state ensemble. The structural features present within such ensembles have now been analysed for a series of proteins using data from a combination of biochemical and biophysical experiments together with computer-simulation methods. These studies show that the topology of the transition state is determined by a set of interactions involving a small number of key residues and, in addition, that the topology of the transition state is closer to that of the native state than to that of any other fold in the protein universe. Here, we review the evidence for these conclusions and suggest a molecular mechanism that rationalizes these findings by presenting a view of protein folds that is based on the topological features of the polypeptide backbone, rather than the conventional view that depends on the arrangement of different types of secondary-structure elements. By linking the folding process to the organization of the protein structure universe, we propose an explanation for the overwhelming importance of topology in the transition states for protein folding.


Subject(s)
Molecular Models , Protein Folding , Proteins/chemistry , Animals , Humans , Tertiary Protein Structure , Protein Structural Homology
6.
PeerJ ; 8: e9159, 2020.
Article in English | MEDLINE | ID: mdl-32566389

ABSTRACT

The native structure of a protein is important for its function, and therefore methods for exploring protein structures have attracted much research. However, rather few methods are sensitive to topologic-geometric features, the examples being knots, slipknots, lassos, links, and pokes, and with each method aimed only at a specific set of such configurations. We here propose a general method which transforms a structure into a "fingerprint of topological-geometric values" consisting of a series of real-valued descriptors from mathematical Knot Theory. The extent to which a structure contains unusual configurations can then be judged from this fingerprint. The method is not confined to a particular pre-defined topology or geometry (like a knot or a poke), and so, unlike existing methods, it is general. To achieve this, our new algorithm, GISA, as a key novelty produces the descriptors, so-called Gauss integrals, not only for the full chains of a protein but for all its sub-chains. This allows fingerprinting on any scale from local to global. The Gauss integrals are known to be effective descriptors of global protein folds. Applying GISA to sets of several thousand high resolution structures, we first show how the most basic Gauss integral, the writhe, enables swift identification of pre-defined geometries such as pokes and links. We then apply GISA with no restrictions on geometry, to show how it allows identifying rare conformations by finding rare invariant values only. In this unrestricted search, pokes and links are still found, but also knotted conformations, as well as more highly entangled configurations not previously described. Thus, an application of the basic scan method in GISA's tool-box revealed 10 known cases of knots as the top positive writhe cases, while placing at the top of the negative writhe 14 cases in cis-trans isomerases sharing a spatial motif of little secondary structure content, which possibly has gone unnoticed.
Possible general applications of GISA are fold classification and structural alignment based on local Gauss integrals. Others include finding errors in protein models and identifying unusual conformations that might be important for protein folding and function. By its broad potential, we believe that GISA will be of general benefit to the structural bioinformatics community. GISA is coded in C and comes as a command line tool. Source and compiled code for GISA plus read-me and examples are publicly available at GitHub (https://github.com).
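The writhe, named above as the most basic Gauss integral, can be evaluated for an open polygonal chain by summing a signed solid-angle term over all pairs of non-adjacent segments. The sketch below is in Python and is not the GISA code (which the abstract says is written in C). It uses the standard closed-form segment-pair expression for polygonal curves; the function names, the helix test curve, and the choice to skip degenerate collinear configurations are assumptions of this illustration.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def unit(v):
    """Normalized vector, or None if v is (numerically) zero."""
    n = math.sqrt(dot(v, v))
    return None if n < 1e-12 else (v[0] / n, v[1] / n, v[2] / n)

def segment_pair_term(p1, p2, p3, p4):
    """Signed solid angle divided by 4*pi for segments p1->p2 and p3->p4."""
    r12, r34 = sub(p2, p1), sub(p4, p3)
    r13, r14 = sub(p3, p1), sub(p4, p1)
    r23, r24 = sub(p3, p2), sub(p4, p2)
    normals = [unit(cross(r13, r14)), unit(cross(r14, r24)),
               unit(cross(r24, r23)), unit(cross(r23, r13))]
    if any(n is None for n in normals):
        return 0.0  # degenerate (collinear) case: skipped in this sketch
    omega = sum(
        math.asin(max(-1.0, min(1.0, dot(normals[k], normals[(k + 1) % 4]))))
        for k in range(4)
    )
    triple = dot(cross(r34, r12), r13)
    sign = (triple > 0) - (triple < 0)
    return sign * omega / (4 * math.pi)

def writhe(points):
    """Writhe of the open polygonal curve through `points`."""
    total = 0.0
    for i in range(len(points) - 1):
        for j in range(i + 2, len(points) - 1):  # adjacent segments contribute 0
            total += segment_pair_term(points[i], points[i + 1],
                                       points[j], points[j + 1])
    return 2.0 * total

# Hypothetical test curve: three turns of a right-handed helix.
helix = [(math.cos(k * math.pi / 5), math.sin(k * math.pi / 5), k / 10)
         for k in range(31)]
print(f"writhe of helix: {writhe(helix):.3f}")
```

Applying `writhe` to every contiguous slice of a chain gives one value per sub-chain, which is the kind of local-to-global scan the abstract describes, although a production tool would reuse the pairwise terms rather than recompute them for each slice.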

7.
Math Biosci ; 182(2): 167-81, 2003 Apr.
Article in English | MEDLINE | ID: mdl-12591623

ABSTRACT

A family of global geometric measures is constructed for protein structure classification. These measures originate from integral formulas of Vassiliev knot invariants and give rise to a unique classification scheme. Our measures can better discriminate between many known protein structures than the simple measures of the secondary structure content of these protein structures.


Subject(s)
Chemical Models , Protein Folding , Proteins/chemistry , Molecular Models , Normal Distribution , Secondary Protein Structure , Proteins/classification , Nonparametric Statistics , Torsion Abnormality
8.
PLoS One ; 9(11): e109335, 2014.
Article in English | MEDLINE | ID: mdl-25411785

ABSTRACT

Knowledge-based potentials are energy functions derived from the analysis of databases of protein structures and sequences. They can be divided into two classes. Potentials from the first class are based on a direct conversion of the distributions of some geometric properties observed in native protein structures into energy values, while potentials from the second class are trained to mimic quantitatively the geometric differences between incorrectly folded models and native structures. In this paper, we focus on the relationship between energy and geometry when training the second class of knowledge-based potentials. We assume that the difference in energy between a decoy structure and the corresponding native structure is linearly related to the distance between the two structures. We trained two distance-based knowledge-based potentials accordingly, one based on all inter-residue distances (PPD), while the other had the set of all distances filtered to reflect consistency in an ensemble of decoys (PPE). We tested four types of metric to characterize the distance between the decoy and the native structure, two based on extrinsic geometry (RMSD and GTD-TS*), and two based on intrinsic geometry (Q* and MT). The corresponding eight potentials were tested on a large collection of decoy sets. We found that it is usually better to train a potential using an intrinsic distance measure. We also found that PPE outperforms PPD, emphasizing the benefits of capturing consistent information in an ensemble. The relevance of these results for the design of knowledge-based potentials is discussed.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Knowledge Bases , Protein Conformation
9.
Proc Natl Acad Sci U S A ; 100(1): 119-24, 2003 Jan 07.
Article in English | MEDLINE | ID: mdl-12506205

ABSTRACT

We introduce a method of looking at, analyzing, and comparing protein structures. The topology of a protein is captured by 30 numbers inspired by Vassiliev knot invariants. To illustrate the simplicity and power of this topological approach, we construct a measure (scaled Gauss metric, SGM) of similarity of protein shapes. Under this metric, protein chains naturally separate into fold clusters. We use SGM to construct an automatic classification procedure for the CATH2.4 database. The method is very fast because it requires neither alignment of the chains nor any chain-chain comparison. It also has only one adjustable parameter. We assign 95.51% of the chains into the proper C (class), A (architecture), T (topology), and H (homologous superfamily) fold, find all new folds, and detect no false geometric positives. Using the SGM, we display a "map" of the space of folds projected onto two dimensions, show the relative locations of the major structural classes, and "zoom into" the space of proteins to show architecture, topology, and fold clusters. The existence of a simple measure of a protein fold computed from the chain path will have a major impact on automatic fold classification.


Subject(s)
Protein Conformation , Proteins/chemistry , Molecular Models , Normal Distribution , Probability , Secondary Protein Structure , Reproducibility of Results
10.
J Chem Inf Comput Sci ; 43(6): 1740-7, 2003.
Article in English | MEDLINE | ID: mdl-14632419

ABSTRACT

The large-scale 3D structure of a protein can be represented by the polygonal curve through the carbon alpha atoms of the protein backbone. We introduce an algorithm for computing the average number of times that a given configuration of crossings on such polygonal curves is seen, the average being taken over all directions in space. Hereby, we introduce a new family of global geometric measures of protein structures, which we compare with the so-called generalized Gauss integrals.
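The idea of averaging crossings over all directions in space can be illustrated for the simplest configuration, a single crossing, by Monte Carlo sampling of projection directions. This sketch is not the algorithm introduced in the paper, which is not described in enough detail in the abstract to reproduce; random sampling is swapped in here as an assumption, and all function names and test curves are hypothetical.

```python
import math
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def random_direction(rng):
    """Uniform random unit vector, from normalized Gaussian samples."""
    while True:
        v = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
        n = math.sqrt(dot(v, v))
        if n > 1e-9:
            return (v[0] / n, v[1] / n, v[2] / n)

def project(points, d):
    """Project 3D points onto the plane perpendicular to unit vector d."""
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = cross(d, helper)
    n = math.sqrt(dot(u, u))
    u = (u[0] / n, u[1] / n, u[2] / n)
    w = cross(d, u)  # unit length because d and u are orthonormal
    return [(dot(p, u), dot(p, w)) for p in points]

def segments_cross(a, b, c, d):
    """True if the 2D segments ab and cd properly intersect."""
    def orient(p, q, r):
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)

def average_crossings(points, samples=1000, seed=0):
    """Monte Carlo estimate of the mean number of crossings per projection."""
    rng = random.Random(seed)
    total = 0
    n = len(points)
    for _ in range(samples):
        flat = project(points, random_direction(rng))
        for i in range(n - 1):
            for j in range(i + 2, n - 1):  # adjacent segments cannot cross
                if segments_cross(flat[i], flat[i + 1], flat[j], flat[j + 1]):
                    total += 1
    return total / samples

# Hypothetical test curves: a planar convex arc and a three-turn helix.
arc = [(math.cos(k * math.pi / 8), math.sin(k * math.pi / 8), 0.0)
       for k in range(9)]
helix = [(math.cos(k * math.pi / 5), math.sin(k * math.pi / 5), k / 10)
         for k in range(31)]
print(average_crossings(arc), average_crossings(helix))
```

The planar convex arc never crosses itself in any projection, while the helix shows crossings from most viewing directions, so the two curves are separated by this single number.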


Subject(s)
Pharmaceutical Preparations/analysis , Substance Abuse Detection , Substance-Related Disorders/diagnosis , Adolescent , Child , Preschool Child , Female , Gas Chromatography-Mass Spectrometry , Humans , Immunoassay , Male , Substance-Related Disorders/epidemiology
11.
J Chem Inf Comput Sci ; 44(3): 856-61, 2004.
Article in English | MEDLINE | ID: mdl-15154750

ABSTRACT

A recurrent problem in organic chemistry is the generation of new molecular structures that conform to some predetermined set of structural constraints that are imposed in an endeavor to build certain required properties into the newly generated structure. An example of this is the pharmacophore model, used in medicinal chemistry to guide de novo design or selection of suitable structures from compound databases. We propose here a method that efficiently links up a selected number of required atom positions while at the same time directing the emergent molecular skeleton to avoid forbidden positions. The linkage process takes place on a lattice whose unit step length and overall geometry is designed to match typical architectures of organic molecules. We use an optimization method to select from the many different graphs possible. The approach is demonstrated in an example where crystal structures of the same (in this case rigid) ligand complexed with different proteins are available.
