Results 1 - 20 of 44
1.
Proc Natl Acad Sci U S A ; 119(41): e2210249119, 2022 10 11.
Article in English | MEDLINE | ID: mdl-36191203

ABSTRACT

Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution.
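
As a rough illustration of sampling a precomputed energy landscape by Monte Carlo, the sketch below runs a Metropolis walk over a handful of discrete poses. It is a toy under stated assumptions, not the protocol of the paper: the pose labels, the energies (standing in for docking scores), the neighbor lists, and the function name `metropolis_walk` are all hypothetical.

```python
import math
import random

def metropolis_walk(energies, neighbors, start, n_steps, kT=1.0, seed=0):
    """Sample discrete poses with the Metropolis acceptance criterion.

    energies  : dict pose -> energy (hypothetical stand-in for docking scores)
    neighbors : dict pose -> list of poses reachable in one move
    """
    rng = random.Random(seed)
    state = start
    trajectory = [state]
    for _ in range(n_steps):
        candidate = rng.choice(neighbors[state])
        delta = energies[candidate] - energies[state]
        # Downhill moves are always accepted; uphill moves are accepted
        # with Boltzmann probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            state = candidate
        trajectory.append(state)
    return trajectory

# Toy landscape (assumed values): four poses on a ring, "C" is the deepest minimum.
energies = {"A": 0.0, "B": -1.0, "C": -5.0, "D": -0.5}
neighbors = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}

trajectory = metropolis_walk(energies, neighbors, start="A", n_steps=2000)
occupancy = {p: trajectory.count(p) / len(trajectory) for p in energies}
```

Because every pose proposes each of its two neighbors with equal probability, the walk should spend most of its steps in the lowest-energy pose; mapping step counts to physical time would need a calibration that this sketch does not attempt.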


Subject(s)
Molecular Dynamics Simulation; Proteins; Algorithms; Biophysical Phenomena; Kinetics; Monte Carlo Method
2.
Proteins ; 90(6): 1259-1266, 2022 06.
Article in English | MEDLINE | ID: mdl-35072956

ABSTRACT

Protein docking protocols typically involve a global docking scan, followed by re-ranking of the scan predictions by more accurate scoring functions that are either computationally too expensive or algorithmically impossible to include in the global scan. Development and validation of scoring methodologies are often performed on scoring benchmark sets (docking decoys), which offer a concise and nonredundant representation of the global docking scan output for a large and diverse set of protein-protein complexes. Two such protein-protein scoring benchmarks were built for the Dockground resource, which contains various datasets for the development and testing of protein docking methodologies. One set was generated based on the Dockground unbound docking benchmark 4, and the other based on protein models from the Dockground model-model benchmark 2. The docking decoys were designed to reflect the reality of real-case docking applications (e.g., correct docking predictions defined as near-native rather than native structures), and to minimize applicability of approaches not directly related to the development of scoring functions (reducing clustering of predictions in the binding funnel and disparity in structural quality of the near-native and nonnative matches). The sets were further characterized by the source organism and the function of the protein-protein complexes. The sets, freely available to the research community on the Dockground webpage, present a unique, user-friendly resource for developing and testing protein-protein scoring approaches.


Subject(s)
Benchmarking; Proteins; Molecular Docking Simulation; Protein Binding; Protein Conformation; Proteins/chemistry
3.
Bioinformatics ; 37(4): 497-505, 2021 05 01.
Article in English | MEDLINE | ID: mdl-32960948

ABSTRACT

MOTIVATION: Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, absence of post-processing of the spotted residues reduced usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. RESULTS: We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to use the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. AVAILABILITY AND IMPLEMENTATION: The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Mining; Machine Learning; Proteins; PubMed; Support Vector Machine
4.
Proteins ; 88(8): 1070-1081, 2020 08.
Article in English | MEDLINE | ID: mdl-31994759

ABSTRACT

Comparative docking is based on experimentally determined structures of protein-protein complexes (templates), following the paradigm that proteins with similar sequences and/or structures form similar complexes. Modeling utilizing structure similarity of target monomers to template complexes significantly expands structural coverage of the interactome. Template-based docking by structure alignment can be performed for the entire structures or by aligning targets to the bound interfaces of the experimentally determined complexes. Systematic benchmarking of docking protocols based on full and interface structure alignment showed that both protocols perform similarly, with top 1 docking success rate 26%. However, in terms of the models' quality, the interface-based docking performed marginally better. The interface-based docking is preferable when one would suspect a significant conformational change in the full protein structure upon binding, for example, a rearrangement of the domains in multidomain proteins. Importantly, if the same structure is selected as the top template by both full and interface alignment, the docking success rate increases 2-fold for both top 1 and top 10 predictions. Matching structural annotations of the target and template proteins for template detection, as a computationally less expensive alternative to structural alignment, did not improve the docking performance. Sophisticated remote sequence homology detection added templates to the pool of those identified by structure-based alignment, suggesting that for practical docking, the combination of the structure alignment protocols and the remote sequence homology detection may be useful in order to avoid potential flaws in generation of the structural templates library.


Subject(s)
Molecular Docking Simulation; Peptides/chemistry; Proteins/chemistry; Software; Amino Acid Sequence; Animals; Benchmarking; Binding Sites; Dogs; Escherichia coli/chemistry; Humans; Ligands; Peptides/metabolism; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Protein Multimerization; Proteins/metabolism; Research Design; Structural Homology, Protein; Thermodynamics
5.
Proteins ; 88(9): 1180-1188, 2020 09.
Article in English | MEDLINE | ID: mdl-32170770

ABSTRACT

Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real-case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.


Subject(s)
Benchmarking/statistics & numerical data; Molecular Docking Simulation; Proteins/chemistry; Software; Amino Acid Sequence; Binding Sites; Databases, Protein; Protein Binding; Protein Structure, Secondary
6.
Proteins ; 87(3): 245-253, 2019 03.
Article in English | MEDLINE | ID: mdl-30520123

ABSTRACT

Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although the structural similarity, generally, yields good performance, other characteristics of the interacting proteins (eg, function, biological process, and localization) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains-biological process, molecular function, and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound, and modeled proteins. The results indicate that the combined structural and GO-terms functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.


Subject(s)
Computational Biology; Gene Ontology; Protein Conformation; Proteins/genetics; Binding Sites/genetics; Databases, Protein; Humans; Models, Molecular; Molecular Docking Simulation; Protein Binding/genetics; Protein Interaction Mapping; Protein Interaction Maps/genetics; Proteins/chemistry; Software; Structural Homology, Protein
7.
Biophys J ; 115(5): 809-821, 2018 09 04.
Article in English | MEDLINE | ID: mdl-30122295

ABSTRACT

The energy function is the key component of protein modeling methodology. This work presents a semianalytical approach to the development of contact potentials for protein structure modeling. Residue-residue and atom-atom contact energies were derived by maximizing the probability of observing native sequences in a nonredundant set of protein structures. The optimization task was formulated as an inverse statistical mechanics problem applied to the Potts model. Its solution by pseudolikelihood maximization provides consistent estimates of coupling constants at atomic and residue levels. The best performance was achieved when interacting atoms were grouped according to their physicochemical properties. For individual protein structures, the performance of the contact potentials in distinguishing near-native structures from the decoys is similar to the top-performing scoring functions. The potentials also yielded significant improvement in the protein docking success rates. The potentials recapitulated experimentally determined protein stability changes upon point mutations and protein-protein binding affinities. The approach offers a different perspective on knowledge-based potentials and may serve as the basis for their further development.
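
For orientation, a generic pseudolikelihood objective for a Potts-type contact model can be written as below. This is a textbook-style sketch, not the exact formulation of the paper: the symbols $s_i$, $c_{ij}$, $\varepsilon$, and $h$, the one-body term, and the choice of $kT = 1$ are all assumed notation.

```latex
% Assumed notation (not taken from the abstract):
%   s_i               residue or atom type at position i
%   c_{ij} \in {0,1}  1 if positions i and j are in contact in the known structure
%   \varepsilon(a,b)  contact energy (coupling constant) between types a and b
%   h(a)              one-body term for type a; energies in units of kT
\begin{align}
P(s_i = a \mid s_{\setminus i})
  &= \frac{\exp\!\big[-h(a) - \sum_{j \ne i} c_{ij}\,\varepsilon(a, s_j)\big]}
          {\sum_{b} \exp\!\big[-h(b) - \sum_{j \ne i} c_{ij}\,\varepsilon(b, s_j)\big]} \\
\ln \mathcal{L}_{\mathrm{pseudo}}(h, \varepsilon)
  &= \sum_{\text{structures}} \; \sum_{i} \ln P(s_i \mid s_{\setminus i})
\end{align}
```

The point of the approximation is that each conditional is normalized over the types at a single position, so the intractable partition function over all sequences never has to be evaluated; the couplings are then estimated by maximizing $\ln \mathcal{L}_{\mathrm{pseudo}}$.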


Subject(s)
Models, Molecular; Proteins/chemistry; Proteins/metabolism; Likelihood Functions; Point Mutation; Protein Conformation; Protein Stability; Proteins/genetics; Thermodynamics
8.
BMC Bioinformatics ; 19(1): 84, 2018 03 05.
Article in English | MEDLINE | ID: mdl-29506465

ABSTRACT

BACKGROUND: Structural modeling of protein-protein interactions produces a large number of putative configurations of the protein complexes. Identification of the near-native models among them is a serious challenge. Publicly available results of biomedical research may provide constraints on the binding mode, which can be essential for the docking. Our text-mining (TM) tool, which extracts binding site residues from the PubMed abstracts, was successfully applied to protein docking (Badal et al., PLoS Comput Biol, 2015; 11: e1004630). Still, many extracted residues were not relevant to the docking. RESULTS: We present an extension of the TM tool, which utilizes natural language processing (NLP) for analyzing the context of the residue occurrence. The procedure was tested using generic and specialized dictionaries. The results showed that the keyword dictionaries designed for identification of protein interactions are not adequate for the TM prediction of the binding mode. However, our dictionary designed to distinguish keywords relevant to the protein binding sites led to considerable improvement in the TM performance. We investigated the utility of several methods of context analysis, based on dissection of the sentence parse trees. The machine learning-based NLP filtered the pool of the mined residues significantly more efficiently than the rule-based NLP. Constraints generated by NLP were tested in docking of unbound proteins from the DOCKGROUND X-ray benchmark set 4. The output of the global low-resolution docking scan was post-processed, separately, by constraints from the basic TM, constraints re-ranked by NLP, and the reference constraints. The quality of a match was assessed by the interface root-mean-square deviation. The results showed significant improvement of the docking output when using the constraints generated by the advanced TM with NLP. 
CONCLUSIONS: The basic TM procedure for extracting protein-protein binding site residues from the PubMed abstracts was significantly advanced by the deep parsing (NLP techniques for contextual analysis) in purging of the initial pool of the extracted residues. Benchmarking showed a substantial increase of the docking success rate based on the constraints generated by the advanced TM with NLP.


Subject(s)
Data Mining; Models, Molecular; Natural Language Processing; Proteins/chemistry; Machine Learning; Protein Binding; Semantics; Support Vector Machine
9.
Proteins ; 86 Suppl 1: 302-310, 2018 03.
Article in English | MEDLINE | ID: mdl-28905425

ABSTRACT

The paper presents an analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates the incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, biological interface relevance, etc. The scoring function still does not distinguish biological from crystal packing interfaces well, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking, however, is very sensitive to the quality of the individual models. Still, our newly developed contact potential detected approximate locations of the binding sites.


Subject(s)
Computational Biology/methods; Databases, Protein; Models, Molecular; Protein Conformation; Protein Multimerization; Proteins/chemistry; Software; Humans; Protein Binding; Sequence Analysis, Protein
10.
J Comput Chem ; 39(24): 2012-2021, 2018 09 15.
Article in English | MEDLINE | ID: mdl-30226647

ABSTRACT

Protein-protein docking procedures typically perform the global scan of the proteins' relative positions, followed by the local refinement of the putative matches. Because of the size of the search space, the global scan is usually implemented as rigid-body search, using computationally inexpensive intermolecular energy approximations. An adequate refinement has to take into account structural flexibility. Since the refinement performs conformational search of the interacting proteins, it is extremely computationally challenging, given the enormous number of internal degrees of freedom. Different approaches limit the search space by restricting the search to the side chains, rotameric states, coarse-grained structure representation, principal normal modes, and so on. Still, even with the approximations, the refinement presents an extreme computational challenge due to the very large number of the remaining degrees of freedom. Given the complexity of the search space, the advantage of the exhaustive search is obvious. The obstacle to such search is computational feasibility. However, the growing computational power of modern computers, especially due to the increasing utilization of Graphics Processing Units (GPUs) with large numbers of specialized computing cores, extends the range of applicability of the brute-force search methods. This proof-of-concept study demonstrates computational feasibility of an exhaustive search of side-chain conformations in protein docking. The procedure, implemented on the GPU architecture, was used to generate the optimal conformations in a large representative set of protein-protein complexes. © 2018 Wiley Periodicals, Inc.
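
The idea of an exhaustive (brute-force) search over discrete side-chain states can be sketched on the CPU as follows. This is a minimal illustration under assumed inputs, not the GPU procedure of the paper: the residue labels, rotamer names, energy tables, and the function `exhaustive_rotamer_search` are hypothetical.

```python
from itertools import product

def exhaustive_rotamer_search(rotamers, self_energy, pair_energy):
    """Enumerate every rotamer combination and return the lowest-energy one.

    rotamers    : dict residue -> list of rotamer labels
    self_energy : dict (residue, rotamer) -> energy
    pair_energy : dict (res_i, rot_i, res_j, rot_j) -> energy, with res_i < res_j;
                  missing pairs are treated as non-interacting (0.0)
    """
    residues = sorted(rotamers)
    best_energy, best_assignment = float("inf"), None
    # The Cartesian product visits all prod(len(rotamers[r])) combinations,
    # which is why the search is only feasible for small systems on a CPU.
    for combo in product(*(rotamers[r] for r in residues)):
        energy = sum(self_energy[(r, rot)] for r, rot in zip(residues, combo))
        for i in range(len(residues)):
            for j in range(i + 1, len(residues)):
                key = (residues[i], combo[i], residues[j], combo[j])
                energy += pair_energy.get(key, 0.0)
        if energy < best_energy:
            best_energy, best_assignment = energy, dict(zip(residues, combo))
    return best_assignment, best_energy

# Toy input (assumed values): two residues with 2 and 3 rotamers.
rotamers = {"R1": ["a", "b"], "R2": ["a", "b", "c"]}
self_energy = {("R1", "a"): 0.0, ("R1", "b"): -1.0,
               ("R2", "a"): 0.5, ("R2", "b"): 0.0, ("R2", "c"): -0.5}
pair_energy = {("R1", "b", "R2", "c"): 3.0, ("R1", "a", "R2", "c"): -1.0}

best, energy = exhaustive_rotamer_search(rotamers, self_energy, pair_energy)
```

In this toy case the individually best rotamers (`b` and `c`) clash, so the global minimum pairs `a` with `c`; an exhaustive scan finds this without any heuristics. The combinations are independent of one another, which is what makes this kind of search a natural fit for massively parallel hardware.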


Subject(s)
Algorithms; Computational Biology; Protein Conformation; Proteins/chemistry; Feasibility Studies; Protein Binding
11.
J Comput Aided Mol Des ; 32(7): 769-779, 2018 07.
Article in English | MEDLINE | ID: mdl-30003468

ABSTRACT

Modulating protein interaction pathways may lead to the cure of many diseases. Known protein-protein inhibitors bind to large pockets on the protein-protein interface. Such large pockets are also detected in the protein-protein complexes without known inhibitors, making such complexes potentially druggable. The inhibitor-binding site is primarily defined by the side chains that form the largest pocket in the protein-bound conformation. Low-resolution ligand docking shows that the success rate for the protein-bound conformation is close to the one for the ligand-bound conformation, and significantly higher than for the apo conformation. The conformational change on the protein interface upon binding to the other protein results in a pocket employed by the ligand when it binds to that interface. This proof-of-concept study suggests that rather than using computational pocket-opening procedures, one can opt for an experimentally determined structure of the target co-crystallized protein-protein complex as a starting point for drug design.


Subject(s)
Molecular Docking Simulation; Proteins/antagonists & inhibitors; Proteins/chemistry; Binding Sites; Crystallization; Databases, Protein; Drug Design; Ligands; Proof of Concept Study; Protein Binding; Protein Conformation
12.
Proteins ; 85(3): 470-478, 2017 03.
Article in English | MEDLINE | ID: mdl-27701777

ABSTRACT

Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and the template-based docking is much less sensitive to inaccuracies of protein models than the free docking. Proteins 2017; 85:470-478. © 2016 Wiley Periodicals, Inc.


Subject(s)
Algorithms; Computational Biology/methods; Molecular Docking Simulation/methods; Proteins/chemistry; Software; Amino Acid Motifs; Benchmarking; Binding Sites; Crystallography, X-Ray; Protein Binding; Protein Conformation; Research Design; Thermodynamics
13.
Proteins ; 85(1): 39-45, 2017 01.
Article in English | MEDLINE | ID: mdl-27756103

ABSTRACT

Structural characterization of protein-protein interactions is essential for understanding life processes at the molecular level. However, only a fraction of protein interactions have experimentally resolved structures. Thus, reliable computational methods for structural modeling of protein interactions (protein docking) are important for generating such structures and understanding the principles of protein recognition. Template-based docking techniques that utilize structural similarity between target protein-protein interaction and cocrystallized protein-protein complexes (templates) are gaining popularity due to generally higher reliability than that of the template-free docking. However, the template-based approach lacks explicit penalties for intermolecular penetration, as opposed to the typical free docking where such penalty is inherent due to the shape complementarity paradigm. Thus, template-based docking models are commonly assumed to require special treatment to remove large structural penetrations. In this study, we compared clashes in the template-based and free docking of the same proteins, with crystallographically determined and modeled structures. The results show that for the less accurate protein models, free docking produces fewer clashes than the template-based approach. However, contrary to the common expectation, in acceptable and better quality docking models of unbound crystallographically determined proteins, the clashes in the template-based docking are comparable to those in the free docking, due to the overall higher quality of the template-based docking predictions. This suggests that the free docking refinement protocols can in principle be applied to the template-based docking predictions as well. Proteins 2016; 85:39-45. © 2016 Wiley Periodicals, Inc.


Subject(s)
Molecular Docking Simulation; Proteins/chemistry; Binding Sites; Computational Biology/methods; Crystallography, X-Ray; Protein Binding; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Protein Structure, Secondary; Software; Structural Homology, Protein
14.
PLoS Comput Biol ; 12(9): e1005120, 2016 09.
Article in English | MEDLINE | ID: mdl-27662342

ABSTRACT

Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand such proteins are curated in humans, and many novel RNA-binding proteins remain to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches proved to be useful, in general, the template-based approaches provide higher quality of predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes.


Subject(s)
Computational Biology/methods; Models, Molecular; RNA-Binding Proteins; RNA; Software; Amino Acid Sequence; Cluster Analysis; Humans; RNA/chemistry; RNA/genetics; RNA/metabolism; RNA-Binding Proteins/chemistry; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Sequence Alignment
15.
PLoS Comput Biol ; 11(12): e1004630, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26650466

ABSTRACT

The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set, significantly increasing the docking success rate.
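
To make the bag-of-words representation concrete, the sketch below turns a sentence into a vector of word counts over a fixed vocabulary; vectors of this kind are what a classifier such as an SVM would take as input. The vocabulary, the example sentence, and the function name `bag_of_words` are invented for illustration and are not taken from the study.

```python
import re
from collections import Counter

def bag_of_words(text, vocabulary):
    """Return counts of each vocabulary word in `text`, ignoring case and word order."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary and sentence.
vocabulary = ["binding", "residue", "interface"]
sentence = "Residue Arg45 is at the binding interface; binding is lost upon mutation."

features = bag_of_words(sentence, vocabulary)
```

Word order and grammar are discarded by design, which keeps the features cheap to compute but also means the context of a residue mention is not captured.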

16.
BMC Bioinformatics ; 16: 243, 2015 Jul 31.
Article in English | MEDLINE | ID: mdl-26227548

ABSTRACT

BACKGROUND: Proteins play an important role in biological processes in living organisms. Many protein functions are based on interaction with other proteins. The structural information is important for adequate description of these interactions. Sets of protein structures determined in both bound and unbound states are essential for benchmarking of the docking procedures. However, the number of such proteins in PDB is relatively small. A radical expansion of such sets is possible if the unbound structures are computationally simulated. RESULTS: The DOCKGROUND public resource provides data to improve our understanding of protein-protein interactions and to assist in the development of better tools for structural modeling of protein complexes, such as docking algorithms and scoring functions. A large set of simulated unbound protein structures was generated from the bound structures. The modeling protocol was based on 1 ns Langevin dynamics simulation. The simulated structures were validated on the ensemble of experimentally determined unbound and bound structures. The set is intended for large scale benchmarking of docking algorithms and scoring functions. CONCLUSIONS: A radical expansion of the unbound protein docking benchmark set was achieved by simulating the unbound structures. The simulated unbound structures were selected according to criteria from systematic comparison of experimentally determined bound and unbound structures. The set is publicly available at http://dockground.compbio.ku.edu.


Subject(s)
Benchmarking; Computational Biology/methods; Computer Simulation; Proteins/chemistry; Algorithms; Binding Sites; Internet; Molecular Docking Simulation; Protein Interaction Domains and Motifs; Protein Structure, Tertiary; Proteins/metabolism; User-Computer Interface
17.
Proteins ; 83(9): 1563-70, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25488330

ABSTRACT

Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, nonredundant library of templates containing 4950 full structures of binary complexes and 5936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu.
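
Extracting an interface with a distance cut-off can be sketched as below. The abstract does not say between which atoms the 12 Å distance is measured, so this example assumes one representative atom per residue; the coordinates and the function name `interface_residues` are hypothetical.

```python
import math

def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return the residue ids of each chain lying within `cutoff` Å of the other chain.

    chain_a, chain_b : dict residue_id -> (x, y, z) of one representative atom
                       (an assumption; the source does not specify the atoms used)
    """
    near_a, near_b = set(), set()
    for id_a, xyz_a in chain_a.items():
        for id_b, xyz_b in chain_b.items():
            if math.dist(xyz_a, xyz_b) <= cutoff:
                near_a.add(id_a)
                near_b.add(id_b)
    return sorted(near_a), sorted(near_b)

# Toy coordinates in Å (assumed values).
chain_a = {1: (0.0, 0.0, 0.0), 2: (30.0, 0.0, 0.0)}
chain_b = {10: (5.0, 0.0, 0.0), 11: (0.0, 11.0, 0.0), 12: (0.0, 0.0, 50.0)}

iface_a, iface_b = interface_residues(chain_a, chain_b)
```

The all-pairs loop is quadratic in the number of residues, which is acceptable for a single complex; a spatial grid or k-d tree would be the usual choice when processing a library of thousands of structures.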


Subject(s)
Computational Biology/methods , Molecular Docking Simulation , Protein Interaction Mapping/methods , Protein Structure, Tertiary , Proteins/chemistry , Binding Sites , Cluster Analysis , Crystallography, X-Ray , Databases, Protein , Internet , Protein Binding , Proteins/classification , Proteins/metabolism , Reproducibility of Results
18.
Proteins ; 83(5): 891-7, 2015 May.
Article in English | MEDLINE | ID: mdl-25712716

ABSTRACT

Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have predefined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the "real case scenario," as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu.
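The model-to-native Cα RMSD used to grade the models can be sketched as follows. This is a minimal illustration that assumes the two Cα coordinate lists are already optimally superimposed and matched residue by residue; the abstract does not describe the superposition step, and the function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math

def ca_rmsd(model, native):
    """Root mean square deviation between two equally long lists of
    C-alpha coordinates, each an (x, y, z) tuple in angstroms.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if not model or len(model) != len(native):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared per-residue displacements
    sq_sum = sum(math.dist(m, n) ** 2 for m, n in zip(model, native))
    return math.sqrt(sq_sum / len(model))

# Toy example: every C-alpha atom of the model is shifted by 2 angstroms along y
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(x, y + 2.0, z) for x, y, z in native]
print(ca_rmsd(model, native))
```

Under these assumptions, a model built for a given accuracy level of the benchmark would be expected to give a value close to that level (1, 2, and so on up to 6 Å).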


Subject(s)
Molecular Docking Simulation/standards , Amino Acid Sequence , Molecular Sequence Data , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Protein Structure, Secondary , Proteins/chemistry , Reference Standards
19.
Proc Natl Acad Sci U S A ; 109(24): 9438-41, 2012 Jun 12.
Article in English | MEDLINE | ID: mdl-22645367

ABSTRACT

Traditional approaches to protein-protein docking sample the binding modes with no regard to similar experimentally determined structures (templates) of protein-protein complexes. Emerging template-based docking approaches utilize such similar complexes to determine the docking predictions. The docking problem assumes the knowledge of the participating proteins' structures. Thus, it provides the possibility of aligning the structures of the proteins and the template complexes. The progress in the development of template-based docking and the vast experience in template-based modeling of individual proteins show that, generally, such approaches are more reliable than free modeling. The key aspect of this modeling paradigm is the availability of the templates. The current common perception is that due to the difficulties in experimental structure determination of protein-protein complexes, the pool of docking templates is insignificant, and thus a broad application of template-based docking is possible only at some future time. The results of our large-scale, systematic study show that, surprisingly, in spite of the limited number of protein-protein complexes in the Protein Data Bank, docking templates can be found for complexes representing almost all the known protein-protein interactions, provided the components themselves have a known structure or can be homology-built. About one-third of the templates are of good quality when they are compared to experimental structures in test sets extracted from the Protein Data Bank and would be useful starting points in modeling the complexes. This finding dramatically expands our ability to model protein interactions, and has far-reaching implications for the protein docking field in general.


Subject(s)
Models, Molecular , Proteins/chemistry , Databases, Protein
20.
Proteins ; 82(2): 278-87, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23934791

ABSTRACT

Characterization of life processes at the molecular level requires structural details of protein-protein interactions (PPIs). The number of experimentally determined protein structures accounts only for a fraction of known proteins. This gap has to be bridged by modeling, typically using experimentally determined structures as templates to model related proteins. The fraction of experimentally determined PPI structures is even smaller than that for the individual proteins, due to a larger number of interactions than the number of individual proteins, and a greater difficulty of crystallizing protein-protein complexes. The approaches to structural modeling of PPI (docking) often have to rely on modeled structures of the interactors, especially in the case of large PPI networks. Structures of modeled proteins are typically less accurate than the ones determined by X-ray crystallography or nuclear magnetic resonance. Thus the utility of approaches to dock these structures should be assessed by thorough benchmarking, specifically designed for protein models. To be credible, such benchmarking has to be based on carefully curated sets of structures with levels of distortion typical for modeled proteins. This article presents such a suite of models built for the benchmark set of the X-ray structures from the Dockground resource (http://dockground.bioinformatics.ku.edu) by a combination of homology modeling and Nudged Elastic Band method. For each monomer, six models were generated with predefined Cα root mean square deviation from the native structure (1, 2, …, 6 Å). The sets and the accompanying data provide a comprehensive resource for the development of docking methodology for modeled proteins.


Subject(s)
Molecular Docking Simulation/standards , Amino Acid Sequence , Binding Sites , Molecular Sequence Data , Protein Interaction Mapping , Protein Structure, Secondary , Protein Structure, Tertiary , Proteins/chemistry , Reference Standards , Software