Results 1 - 20 of 29
1.
Nat Methods ; 18(2): 156-164, 2021 02.
Article in English | MEDLINE | ID: mdl-33542514

ABSTRACT

This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density.


Subject(s)
Cryoelectron Microscopy/methods; Models, Molecular; Crystallography, X-Ray; Protein Conformation; Proteins/chemistry
2.
Bioinformatics ; 35(6): 937-944, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30169622

ABSTRACT

MOTIVATION: Measuring discrepancies between protein models and native structures is at the heart of development of protein structure prediction methods and comparison of their performance. A number of different evaluation methods have been developed; however, their comprehensive and unbiased comparison has not been performed. RESULTS: We carried out a comparative analysis of several popular model assessment methods (RMSD, TM-score, GDT, QCS, CAD-score, LDDT, SphereGrinder and RPF) to reveal their relative strengths and weaknesses. The analysis, performed on a large and diverse model set derived in the course of the three latest community-wide CASP experiments (CASP10-12), had two major directions. First, we looked at general differences between the scores by analyzing distribution, correspondence and correlation of their values as well as differences in selecting best models. Second, we examined the score differences taking into account various structural properties of models (stereochemistry, hydrogen bonds, packing of domains and chain fragments, missing residues, protein length and secondary structure). Our results provide a solid basis for an informed selection of the most appropriate score or combination of scores depending on the task at hand. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology; Proteins/chemistry; Models, Molecular; Protein Conformation
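Of the scores compared in the entry above, GDT is simple enough to illustrate in a few lines. The sketch below assumes the commonly used GDT_TS form: the mean, over the 1, 2, 4 and 8 Å cutoffs, of the percentage of residues whose Cα atom lies within the cutoff of its position in the reference structure. It also assumes that per-residue deviations from one fixed superposition are already available. The function name `gdt_ts` and its input are illustrative and not taken from the paper; production implementations search over many superpositions for each cutoff, so this simplified version can only underestimate the true score.

```python
def gdt_ts(deviations: list[float]) -> float:
    """Simplified GDT_TS on a 0-100 scale.

    `deviations` holds one Calpha distance (in angstroms) per residue,
    measured after a single, fixed superposition (an assumption of this
    sketch; real tools optimise the superposition per cutoff).
    """
    if not deviations:
        raise ValueError("at least one residue deviation is required")

    cutoffs = (1.0, 2.0, 4.0, 8.0)  # standard GDT_TS distance cutoffs
    n_residues = len(deviations)

    # Fraction of residues within each cutoff.
    fractions = [
        sum(1 for d in deviations if d <= cutoff) / n_residues
        for cutoff in cutoffs
    ]

    # GDT_TS is the mean of the four fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Four residues: one within 1 A, two within 2 A, three within 4 A and 8 A.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

Because the score averages several cutoffs, a model with one badly misplaced region can still score well if the rest is accurate, which is one reason the abstract contrasts it with RMSD.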
3.
Proteins ; 87(12): 1351-1360, 2019 12.
Article in English | MEDLINE | ID: mdl-31436360

ABSTRACT

Scoring model structure is an essential component of protein structure prediction that can affect the prediction accuracy tremendously. Users of protein structure prediction results also need to score models to select the best models for their application studies. In Critical Assessment of techniques for protein Structure Prediction (CASP), model accuracy estimation methods have been tested in a blind fashion by providing models submitted by the tertiary structure prediction servers for scoring. In CASP13, model accuracy estimation results were evaluated in terms of both global and local structure accuracy. Global structure accuracy estimation was evaluated by the quality of the models selected by the global structure scores and by the absolute estimates of the global scores. Residue-wise, local structure accuracy estimations were evaluated by three different measures. A new measure introduced in CASP13 evaluates the ability to predict inaccurately modeled regions that may be improved by refinement. An intensive comparative analysis on CASP13 and the previous CASPs revealed that the tertiary structure models generated by the CASP13 servers show very distinct features. Higher consensus toward models of higher global accuracy appeared even for free modeling targets, and many models of high global accuracy were not well optimized at the atomic level. This is related to the new technology in CASP13, deep learning for tertiary contact prediction. The tertiary model structures generated by deep learning pose a new challenge for EMA (estimation of model accuracy) method developers. Model accuracy estimation itself is also an area where deep learning can potentially have an impact, although current EMA methods have not fully explored that direction.


Subject(s)
Computational Biology; Models, Molecular; Protein Conformation; Proteins/ultrastructure; Algorithms; Databases, Protein; Deep Learning; Proteins/chemistry; Proteins/genetics; Sequence Analysis, Protein; Software
4.
Proteins ; 87(12): 1021-1036, 2019 12.
Article in English | MEDLINE | ID: mdl-31294862

ABSTRACT

Protein target structures for the Critical Assessment of Structure Prediction round 13 (CASP13) were split into evaluation units (EUs) based on their structural domains, the domain organization of available templates, and the performance of servers on whole targets compared to split target domains. Eighty targets were split into 112 EUs. The EUs were classified into categories suitable for assessment of high accuracy modeling (or template-based modeling [TBM]) and topology (or free modeling [FM]) based on target difficulty. Assignment into assessment categories considered the following criteria: (a) the evolutionary relationship of target domains to existing fold space as defined by the Evolutionary Classification of Protein Domains (ECOD) database; (b) the clustering of target domains using eight objective sequence, structure, and performance measures; and (c) the placement of target domains in a scatter plot of target difficulty against server performance used in the previous CASP. Generally, target domains with good server predictions had close template homologs and were classified as TBM. Alternately, targets with poor server predictions represent a mixture of fast evolving homologs, structure analogs, and new folds, and were classified as FM or FM/TBM overlap.


Subject(s)
Amino Acid Sequence/genetics; Computational Biology; Protein Structure, Tertiary/genetics; Proteins/ultrastructure; Databases, Protein; Models, Molecular; Protein Conformation; Protein Folding; Proteins/chemistry; Proteins/genetics; Sequence Alignment
5.
Proteins ; 87(12): 1190-1199, 2019 12.
Article in English | MEDLINE | ID: mdl-31374138

ABSTRACT

We present the assembly category assessment in the 13th edition of the CASP community-wide experiment. For the second time, protein assemblies constitute an independent assessment category. Compared to the last edition we see a clear uptake in participation, more oligomeric targets released, and consistent, albeit modest, improvement in prediction quality. Looking at the tertiary structure predictions, we observe that ignoring the oligomeric state of the targets hinders modeling success. We also note that some contact prediction groups successfully predicted homomeric interfacial contacts, though it appears that these predictions were not used for assembly modeling. Homology modeling with sizeable human intervention appears to form the basis of the assembly prediction techniques in this round of CASP. Future developments should see more integrated approaches where subunits are modeled in the context of the assemblies they form.


Subject(s)
Computational Biology; Protein Conformation; Proteins/ultrastructure; Software; Algorithms; Humans; Molecular Dynamics Simulation; Proteins/chemistry; Proteins/genetics; Sequence Analysis, Protein
6.
Proteins ; 87(12): 1058-1068, 2019 12.
Article in English | MEDLINE | ID: mdl-31587357

ABSTRACT

The accuracy of sequence-based tertiary contact predictions was assessed in a blind prediction experiment at the CASP13 meeting. After 4 years of significant improvements in prediction accuracy, another dramatic advance has taken place since CASP12 was held 2 years ago. The precision of predicting the top L/5 contacts in the free modeling category, where L is the corresponding length of the protein in residues, has exceeded 70%. As a comparison, the best-performing group at CASP12 with a 47% precision would have finished below the top 1/3 of the CASP13 groups. Extensively trained deep neural network approaches dominate the top performing algorithms, which appear to efficiently integrate information on coevolving residues and interacting fragments or possibly utilize memories of sequence similarities and sometimes can deliver accurate results even in the absence of virtually any target specific evolutionary information. If the current performance is evaluated by F-score on L contacts, it stands around 24% right now, which, despite the tremendous impact and advance in improving its utility for structure modeling, also suggests that there is much room left for further improvement.


Subject(s)
Computational Biology/methods; Congresses as Topic/statistics & numerical data; Protein Conformation; Proteins/chemistry; Sequence Analysis, Protein/methods; Algorithms; Congresses as Topic/standards; Crystallography, X-Ray; Entropy; Humans; Models, Molecular; Reproducibility of Results
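The two headline numbers in the entry above, precision on the top L/5 contacts and F-score on L contacts, can be made concrete with a small sketch. It assumes predictions arrive as (i, j, probability) triples and native contacts as a set of (i, j) residue pairs with i < j. The function names and data layout are hypothetical, and the sketch applies none of the sequence-separation or distance criteria that assessors use to define the contact set.

```python
Prediction = tuple[int, int, float]  # (residue i, residue j, probability)
Pair = tuple[int, int]               # (i, j) with i < j


def top_contacts(predicted: list[Prediction], length: int, divisor: int = 1) -> list[Pair]:
    """Return the L/divisor highest-probability pairs, normalised to i < j."""
    k = max(1, length // divisor)
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)
    return [(min(i, j), max(i, j)) for i, j, _ in ranked[:k]]


def precision_recall(predicted: list[Prediction], native: set[Pair],
                     length: int, divisor: int = 1) -> tuple[float, float]:
    """Precision and recall of the top L/divisor predicted contacts."""
    selected = top_contacts(predicted, length, divisor)
    hits = sum(1 for pair in selected if pair in native)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(native) if native else 0.0
    return precision, recall


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    preds = [(1, 8, 0.9), (9, 2, 0.8), (3, 10, 0.1)]
    native = {(1, 8), (3, 10)}
    # Top L/5 for a 10-residue protein keeps the two most confident pairs.
    print(precision_recall(preds, native, length=10, divisor=5))
    # F-score over the top L predictions.
    print(f_score(*precision_recall(preds, native, length=10, divisor=1)))
```

Precision on a short list rewards a few confident, correct predictions, whereas the F-score also penalises native contacts that were never predicted, which is why the two figures quoted in the abstract differ so widely.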
7.
Proteins ; 87(12): 1128-1140, 2019 12.
Article in English | MEDLINE | ID: mdl-31576602

ABSTRACT

Structures of seven CASP13 targets were determined using cryo-electron microscopy (cryo-EM) technique with resolution between 3.0 and 4.0 Å. We provide an overview of the experimentally derived structures and describe results of the numerical evaluation of the submitted models. The evaluation is carried out by comparing coordinates of models to those of reference structures (CASP-style evaluation), as well as checking goodness-of-fit of modeled structures to the cryo-EM density maps. The performance of contributing research groups in the CASP-style evaluation is measured in terms of backbone accuracy, all-atom local geometry and similarity of inter-subunit interfaces. The results on the cryo-EM targets are compared with those on the whole set of eighty CASP13 targets. A posteriori refinement of the best models in their corresponding cryo-EM density maps resulted in structures that are very close to the reference structure, including some regions with better fit to the density.


Subject(s)
Protein Conformation; Proteins/ultrastructure; Cryoelectron Microscopy; Models, Molecular; Proteins/chemistry; Proteins/genetics
8.
Proteins ; 87(12): 1298-1314, 2019 12.
Article in English | MEDLINE | ID: mdl-31589784

ABSTRACT

Small angle X-ray scattering (SAXS) measures comprehensive distance information on a protein's structure, which can constrain and guide computational structure prediction algorithms. Here, we evaluate structure predictions of 11 monomeric and oligomeric proteins for which SAXS data were collected and provided to predictors in the 13th round of the Critical Assessment of protein Structure Prediction (CASP13). The category for SAXS-assisted predictions made gains in certain areas for CASP13 compared to CASP12. Improvements included higher quality data with size exclusion chromatography-SAXS (SEC-SAXS) and better selection of targets and communication of results by CASP organizers. In several cases, we can track improvements in model accuracy with use of SAXS data. For hard multimeric targets where regular folding algorithms were unsuccessful, SAXS data helped predictors to build models better resembling the global shape of the target. For most models, however, no significant improvement in model accuracy at the domain level was registered from use of SAXS data, when rigorously comparing SAXS-assisted models to the best regular server predictions. To promote future progress in this category, we identify successes, challenges, and opportunities for improved strategies in prediction, assessment, and communication of SAXS data to predictors. An important observation is that, for many targets, SAXS data were inconsistent with crystal structures, suggesting that these proteins adopt different conformation(s) in solution. This CASP13 result, if representative of PDB structures and future CASP targets, may have substantive implications for the structure training databases used for machine learning, CASP, and use of prediction models for biology.


Subject(s)
Computational Biology; Protein Conformation; Proteins/ultrastructure; Algorithms; Models, Molecular; Protein Folding; Proteins/chemistry; Proteins/genetics; Scattering, Small Angle; Solutions/chemistry; X-Ray Diffraction
9.
Proteins ; 87(12): 1283-1297, 2019 12.
Article in English | MEDLINE | ID: mdl-31569265

ABSTRACT

With the advance of experimental procedures, obtaining chemical crosslinking information is becoming a fast and routine practice. Information on crosslinks can greatly enhance the accuracy of protein structure modeling. Here, we review the current state of the art in modeling protein structures with the assistance of experimentally determined chemical crosslinks within the framework of the 13th meeting of Critical Assessment of Structure Prediction approaches. This largest-to-date blind assessment reveals benefits of using data assistance in difficult-to-model protein structure prediction cases. However, in a broader context, it also suggests that with the unprecedented advance in accuracy to predict contacts in recent years, experimental crosslinks will be useful only if their specificity and accuracy are further improved and they are better integrated into computational workflows.


Subject(s)
Computational Biology/methods; Cross-Linking Reagents/chemistry; Models, Molecular; Protein Conformation; Proteins/chemistry; Algorithms; Chromatography, Liquid; Models, Chemical; Reproducibility of Results; Tandem Mass Spectrometry
10.
Proteins ; 86 Suppl 1: 345-360, 2018 03.
Article in English | MEDLINE | ID: mdl-28833563

ABSTRACT

A record-high 42 model accuracy estimation methods were tested in CASP12. The paper presents results of the assessment of these methods in the whole-model and per-residue accuracy modes. Scores from four different model evaluation packages were used as the "ground truth" for assessing accuracy of methods' estimates. They include a rigid-body score (GDT_TS) and three local-structure-based scores (LDDT, CAD and SphereGrinder). The ability of methods to identify best models from among several available, predict a model's absolute accuracy score, distinguish between good and bad models, predict accuracy of the coordinate error self-estimates, and discriminate between reliable and unreliable regions in the models was assessed. Single-model methods advanced to the point where they are better than clustering methods in picking the best models from decoy sets. On the other hand, consensus methods, taking advantage of the availability of a large number of models for the same target protein, are still better in distinguishing between good and bad models and predicting local accuracy of models. The best accuracy estimation methods were shown to perform better with respect to the frozen-in-time reference clustering method and the results of the best method in the corresponding class of methods from the previous CASP. Top performing single-model methods were shown to do better than all but three CASP12 tertiary structure predictors when evaluated as model selectors.


Subject(s)
Computational Biology/methods; Models, Molecular; Protein Conformation; Proteins/chemistry; Cluster Analysis; Databases, Protein; Humans; Sequence Alignment; Sequence Analysis, Protein
11.
Proteins ; 86 Suppl 1: 51-66, 2018 03.
Article in English | MEDLINE | ID: mdl-29071738

ABSTRACT

Following up on the encouraging results of residue-residue contact prediction in the CASP11 experiment, we present the analysis of predictions submitted for CASP12. The submissions include predictions of 34 groups for 38 domains classified as free modeling targets which are not accessible to homology-based modeling due to a lack of structural templates. CASP11 saw a rise of coevolution-based methods outperforming other approaches. The improvement of these methods coupled to machine learning and sequence database growth are most likely the main driver for a significant improvement in average precision from 27% in CASP11 to 47% in CASP12. In more than half of the targets, especially those with many homologous sequences accessible, precisions above 90% were achieved with the best predictors reaching a precision of 100% in some cases. We furthermore tested the impact of using these contacts as restraints in ab initio modeling of 14 single-domain free modeling targets using Rosetta. Adding contacts to the Rosetta calculations resulted in improvements of up to 26% in GDT_TS within the top five structures.


Subject(s)
Algorithms; Computational Biology/methods; Models, Molecular; Protein Conformation; Protein Interaction Domains and Motifs; Proteins/chemistry; Crystallography, X-Ray; Databases, Protein; Humans; Machine Learning; Protein Folding; Software
12.
Proteins ; 86 Suppl 1: 321-334, 2018 03.
Article in English | MEDLINE | ID: mdl-29159950

ABSTRACT

The article describes results of numerical evaluation of CASP12 models submitted on targets for which structural templates could be identified and for which servers produced models of relatively high accuracy. The emphasis is on analysis of details of models, and how well the models compete with experimental structures. Performance of contributing research groups is measured in terms of backbone accuracy, all-atom local geometry, and the ability to estimate local errors in models. Separate analyses for all participating groups and automatic servers were carried out. Compared with the last CASP, two years ago, there have been significant improvements in a number of areas, particularly the accuracy of protein backbone atoms, accuracy of sequence alignment between models and available structures, increased accuracy over that which can be obtained from simple copying of a closest template, and accuracy of modeling of sub-structures not present in the closest template. These advancements are likely associated with more effective strategies to build non-template regions of the targets ab initio, better algorithms to combine information from multiple templates, enhanced refinement methods, and better methods for estimating model accuracy.


Subject(s)
Computational Biology/methods; Models, Molecular; Models, Statistical; Protein Conformation; Proteins/chemistry; Databases, Protein; Humans; Sequence Alignment; Sequence Analysis, Protein
13.
Proteins ; 86 Suppl 1: 97-112, 2018 03.
Article in English | MEDLINE | ID: mdl-29139163

ABSTRACT

We present our assessment of CASP12 modeling efforts for targets with no obvious templates of high sequence/structure similarity in the PDB, that is for evaluation units of the free modeling (FM) and free modeling/template-based modeling (FM/TBM) categories. Models were clustered and ranked using the Global Distance Test-Total Score and 5 additional metrics developed in previous CASP rounds, producing short lists of models that were subject to visual inspection in comparison to the target structures. The whole procedure was implemented as a web app that facilitates model selection and visual inspection, and could become useful to facilitate and standardize future assessments. We describe cases of (1) targets with remarkably good predictions, (2) targets whose models captured some global shape and topology features, and (3) targets for which models fail to capture even coarse features. We note that despite this CASP being among the most challenging ones, a measurable improvement of the top predictions is apparent, that we attribute to the emergence of accurate contact prediction methods and the increased number of available sequences. We also briefly discuss current limitations in tertiary structure prediction exemplified by CASP12 targets. Overall, the Baker, Zhang, and Lee manual groups and servers were identified as the top global performing groups.


Subject(s)
Algorithms; Computational Biology/methods; Models, Molecular; Protein Conformation; Proteins/chemistry; Crystallography, X-Ray; Databases, Protein; Humans; Protein Folding; Sequence Alignment; Sequence Analysis, Protein
14.
Proteins ; 86 Suppl 1: 16-26, 2018 03.
Article in English | MEDLINE | ID: mdl-29044714

ABSTRACT

For assessment purposes, CASP targets are split into evaluation units. We herein present the official definition of CASP12 evaluation units (EUs) and their classification into difficulty categories. Each target can be evaluated as one EU (the whole target) and/or several EUs (separate structural domains or groups of structural domains). The specific scenario for a target split is determined by the domain organization of available templates, the difference in server performance on separate domains versus combination of the domains, and visual inspection. In the end, 71 targets were split into 96 EUs. Classification of the EUs into difficulty categories was done semi-automatically with the assistance of metrics provided by the Prediction Center. These metrics account for sequence and structural similarities of the EUs to potential structural templates from the Protein Data Bank, and for the baseline performance of automated server predictions. The metrics readily separate the 96 EUs into 38 EUs that should be straightforward for template-based modeling (TBM) and 39 that are expected to be hard for homology modeling and are thus left for free modeling (FM). The remaining 19 borderline evaluation units were dubbed FM/TBM, and were inspected case by case. The article also overviews structural and evolutionary features of selected targets relevant to our accompanying article presenting the assessment of FM and FM/TBM predictions, and overviews structural features of the hardest evaluation units from the FM category. We finally suggest improvements for the EU definition and classification procedures.


Subject(s)
Computational Biology/methods; Models, Molecular; Protein Structure, Tertiary; Proteins/chemistry; Proteins/classification; Databases, Protein; Humans; Molecular Dynamics Simulation
15.
Proteins ; 86 Suppl 1: 247-256, 2018 03.
Article in English | MEDLINE | ID: mdl-29071742

ABSTRACT

We present the results of the first independent assessment of protein assemblies in CASP. A total of 1624 oligomeric models were submitted by 108 predictor groups for the 30 oligomeric targets in the CASP12 edition. We evaluated the accuracy of oligomeric predictions by comparison to their reference structures at the interface patch and residue contact levels. We find that interface patches are more reliably predicted than the specific residue contacts. Whereas none of the 15 hard oligomeric targets have successful predictions for the residue contacts at the interface, six have models with resemblance in the interface patch. Successful predictions of interface patch and contacts exist for all targets suitable for homology modeling, with at least one group improving over the best available template for each target. However, the participation in protein assembly prediction is low and uneven. Three human groups are closely ranked at the top by overall performance, but a server outperforms all other predictors for targets suitable for homology modeling. The state of the art of protein assembly prediction methods is in development and has apparent room for improvement, especially for assemblies without templates.


Subject(s)
Computational Biology/methods; Databases, Protein; Models, Molecular; Molecular Dynamics Simulation; Protein Conformation; Proteins/chemistry; Algorithms; Humans; Protein Folding; Sequence Analysis, Protein
16.
Proteins ; 84 Suppl 1: 15-9, 2016 09.
Article in English | MEDLINE | ID: mdl-26857434

ABSTRACT

We outline the role of the Protein Structure Prediction Center (predictioncenter.org) in conducting the CASP11 and CASP ROLL experiments, discuss the experiment statistics, and provide an overview of the present CASP infrastructure. The biggest changes compared to the previous CASPs are the implementation of the evaluation system incorporating practically all evaluation measures, statistical tests, and visualization tools historically used by the CASP assessors, the expansion of the infrastructure to incorporate new categories of contact-assisted and multimeric predictions, and the redesign of the assessors' web-workspace enabling assessments based on multiple measures for different group categories and target sets. Proteins 2016; 84(Suppl 1):15-19. © 2016 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data; Models, Molecular; Models, Statistical; Proteins/chemistry; Software; Benchmarking; Computational Biology/methods; Computer Graphics; Humans; International Cooperation; Internet; Protein Conformation; Protein Folding
17.
Proteins ; 84 Suppl 1: 349-69, 2016 09.
Article in English | MEDLINE | ID: mdl-26344049

ABSTRACT

The article presents assessment of the model accuracy estimation methods participating in CASP11. The results of the assessment are expected to be useful both to developers of the methods and to users, who all too often are presented with structural models without annotations of accuracy. The main emphasis is placed on the ability of techniques to identify the best models from among several available. Bivariate descriptive statistics and ROC analysis are used to additionally assess the overall correctness of the predicted model accuracy scores, the correlation between the predicted and observed accuracy of models, the effectiveness in distinguishing between good and bad models, the ability to discriminate between reliable and unreliable regions in models, and the accuracy of the coordinate error self-estimates. A rigid-body measure (GDT_TS) and three local-structure-based scores (LDDT, CADaa, and SphereGrinder) are used as reference measures for evaluating methods' performance. Consensus methods, taking advantage of the availability of several models for the same target protein, perform well on the majority of tasks. Methods that predict accuracy on the basis of a single model perform comparably to consensus methods in picking the best models and in estimating how accurate the local structure is. More groups than in previous experiments submitted reasonable error estimates of their own models, most likely in response to a recommendation from CASP and the increasing demand from users. Proteins 2016; 84(Suppl 1):349-369. © 2015 Wiley Periodicals, Inc.


Subject(s)
Benchmarking; Computational Biology/statistics & numerical data; Models, Molecular; Models, Statistical; Proteins/chemistry; Software; Algorithms; Computational Biology/methods; Humans; Internet; Protein Folding; Protein Structure, Secondary; Protein Structure, Tertiary; ROC Curve; Thermodynamics
18.
Proteins ; 84 Suppl 1: 131-44, 2016 09.
Article in English | MEDLINE | ID: mdl-26474083

ABSTRACT

This article provides a report on the state-of-the-art in the prediction of intra-molecular residue-residue contacts in proteins based on the assessment of the predictions submitted to the CASP11 experiment. The assessment emphasis is placed on the accuracy in predicting long-range contacts. Twenty-nine groups participated in contact prediction in CASP11. At least eight of them used the recently developed evolutionary coupling techniques, with the top group (CONSIP2) reaching precision of 27% on target proteins that could not be modeled by homology. This result indicates a breakthrough in the development of methods based on the correlated mutation approach. Successful prediction of contacts was shown to be practically helpful in modeling three-dimensional structures; in particular target T0806 was modeled exceedingly well with accuracy not yet seen for ab initio targets of this size (>250 residues). Proteins 2016; 84(Suppl 1):131-144. © 2015 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data; Escherichia coli Proteins/chemistry; Models, Molecular; Models, Statistical; Proteins/chemistry; Software; Algorithms; Amino Acid Sequence; Bacteria/chemistry; Computational Biology/methods; Computer Simulation; Databases, Protein; Humans; International Cooperation; Internet; Protein Folding; Protein Interaction Domains and Motifs; Protein Structure, Secondary; Sequence Alignment
19.
Proteins ; 84 Suppl 1: 51-66, 2016 09.
Article in English | MEDLINE | ID: mdl-26677002

ABSTRACT

We present an assessment of 'template-free modeling' (FM) in CASP11 and ROLL. Community-wide server performance suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment. The CASP11 FM category included several outstanding examples, including successful prediction by the Baker group of a 256-residue target (T0806-D1) that lacked sequence similarity to any existing template. The top server model prediction by Zhang's Quark, which was apparently selected and refined by several manual groups, encompassed the entire fold of target T0837-D1. Methods from the same two groups tended to dominate overall CASP11 FM and ROLL rankings. Comparison of top FM predictions with those from the previous CASP experiment revealed progress in the category, particularly reflected in high prediction accuracy for larger protein domains. FM prediction models for two cases were sufficient to provide functional insights that were otherwise not obtainable by traditional sequence analysis methods. Importantly, CASP11 abstracts revealed that alignment-based contact prediction methods brought about much of the CASP11 progress, producing both of the functionally relevant models as well as several of the other outstanding structure predictions. These methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites. Proteins 2016; 84(Suppl 1):51-66. © 2015 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data; Models, Molecular; Models, Statistical; Proteins/chemistry; Software; Amino Acid Sequence; Bacteria/chemistry; Computational Biology/methods; Computer Graphics; Databases, Protein; Humans; International Cooperation; Internet; Protein Folding; Protein Interaction Domains and Motifs; Protein Multimerization; Protein Structure, Secondary; Sequence Alignment; Viruses/chemistry
20.
Proteins ; 84 Suppl 1: 164-80, 2016 09.
Article in English | MEDLINE | ID: mdl-26889875

ABSTRACT

We present an overview of contact-assisted predictions in the eleventh round of critical assessment of protein structure prediction (CASP11), which included four categories: predicted contacts (Tp), correct contacts (Tc), simulated sparse NMR contacts (Ts), and cross-linking contacts (Tx). Comparison of assisted to unassisted model quality highlighted a relatively poor overall performance in CASP11 using predicted Tp and crosslinked Tx contact information. However, average model quality significantly improved in the correct Tc and simulated NMR Ts categories for most targets, where maximum improvement of unassisted models reached an impressive 70 GDT_TS. Comparison of the performance in the correct Tc category to CASP10 suggested the improvement in CASP11 model quality originated from an increased number of provided contacts per target. Group rankings based on a combination of scores used in the CASP11 free modeling (FM) assessment for each category highlight four top-performing groups, with three from the Lee lab and one from the Baker lab. We used the overall performance of these groups in each category to develop hypotheses for their relative outperformance in the correct Tc and simulated NMR Ts categories, which stemmed from the fraction of correct contacts provided (correct Tc category) and a reduced fraction of correct contacts offset by an increased coverage of the correct contacts (simulated NMR Ts category). Proteins 2016; 84(Suppl 1):164-180. © 2016 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data; Cross-Linking Reagents/chemistry; Models, Molecular; Models, Statistical; Proteins/chemistry; Software; Algorithms; Computational Biology/methods; Computer Simulation; Databases, Protein; International Cooperation; Internet; Protein Folding; Protein Interaction Domains and Motifs; Protein Structure, Secondary