Results 1 - 20 of 112
1.
Nature ; 622(7983): 646-653, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37704037

ABSTRACT

We are now entering a new era in protein sequence and structure annotation, with hundreds of millions of predicted protein structures made available through the AlphaFold database [1]. These models cover nearly all proteins that are known, including those challenging to annotate for function or putative biological role using standard homology-based approaches. In this study, we examine the extent to which the AlphaFold database has structurally illuminated this 'dark matter' of the natural protein universe at high predicted accuracy. We further describe the protein diversity that these models cover as an annotated interactive sequence similarity network, accessible at https://uniprot3d.org/atlas/AFDB90v4. By searching for novelties from sequence, structure and semantic perspectives, we uncovered the β-flower fold, added several protein families to the Pfam database [2] and experimentally demonstrated that one of these belongs to a new superfamily of translation-targeting toxin-antitoxin systems, TumE-TumA. This work underscores the value of large-scale efforts in identifying, annotating and prioritizing new protein families. By leveraging the recent deep learning revolution in protein bioinformatics, we can now shed light on uncharted areas of the protein universe at an unprecedented scale, paving the way to innovations in life sciences and biotechnology.


Subjects
Protein Databases, Deep Learning, Molecular Sequence Annotation, Protein Folding, Proteins, Protein Structural Homology, Amino Acid Sequence, Internet, Proteins/chemistry, Proteins/classification, Proteins/metabolism
2.
Nucleic Acids Res ; 52(W1): W318-W323, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38634802

ABSTRACT

The 'structure assessment' web server is a one-stop shop for interactive evaluation and benchmarking of structural models of macromolecular complexes including proteins and nucleic acids. A user-friendly web dashboard links sequence with structure information and results from a variety of state-of-the-art tools, which facilitates the visual exploration and evaluation of structure models. The dashboard integrates stereochemistry information, secondary structure information, global and local model quality assessment of the tertiary structure of comparative protein models, as well as prediction of membrane location. In addition, a benchmarking mode is available where a model can be compared to a reference structure, providing easy access to scores that have been used in recent CASP experiments and CAMEO. The structure assessment web server is available at https://swissmodel.expasy.org/assess.


Subjects
Internet, Molecular Models, Software, Proteins/chemistry, Benchmarking, Protein Conformation
3.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38175775

ABSTRACT

MOTIVATION: Language models are routinely used for text classification and generative tasks. Recently, the same architectures were applied to protein sequences, unlocking powerful new approaches in the bioinformatics field. Protein language models (pLMs) generate high-dimensional embeddings on a per-residue level and encode a "semantic meaning" of each individual amino acid in the context of the full protein sequence. These representations have been used as a starting point for downstream learning tasks and, more recently, for identifying distant homologous relationships between proteins. RESULTS: In this work, we introduce a new method that generates embedding-based protein sequence alignments (EBA) and show how these capture structural similarities even in the twilight zone, outperforming both classical methods and other approaches based on pLMs. The method shows excellent accuracy despite the absence of training and parameter optimization. We demonstrate that the combination of pLMs with alignment methods is a valuable approach for the detection of relationships between proteins in the twilight zone. AVAILABILITY AND IMPLEMENTATION: The code to run EBA and reproduce the analysis described in this article is available at: https://git.scicore.unibas.ch/schwede/EBA and https://git.scicore.unibas.ch/schwede/eba_benchmark.


Subjects
Amino Acids, Proteins, Proteins/chemistry, Amino Acid Sequence, Sequence Alignment, Language
4.
Proteins ; 92(1): 3-14, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37465978

ABSTRACT

Most proteins found in the outer membrane of gram-negative bacteria share a common domain: the transmembrane β-barrel. These outer membrane β-barrels (OMBBs) occur in multiple sizes and different families with a wide range of functions evolved independently by amplification from a pool of homologous ancestral ββ-hairpins. This is part of the reason why predicting their three-dimensional (3D) structure, especially by homology modeling, is a major challenge. Recently, DeepMind's AlphaFold v2 (AF2) became the first structure prediction method to reach close-to-experimental atomic accuracy in CASP even for difficult targets. However, membrane proteins, especially OMBBs, were not abundant during its training, raising the question of how accurate the predictions are for these families. In this study, we assessed the performance of AF2 in the prediction of OMBBs and OMBB-like folds of various topologies using an in-house-developed tool for the analysis of OMBB 3D structures, barrOs. In agreement with previous studies on other membrane protein classes, our results indicate that AF2 predicts transmembrane β-barrel structures at high accuracy independently of the use of templates, even for novel topologies absent from the training set. These results provide confidence in the models generated by AF2 and open the door to the structural elucidation of novel transmembrane β-barrel topologies identified in high-throughput OMBB annotation studies or designed de novo.


Subjects
Furylfuramide, Membrane Proteins, Humans, Membrane Proteins/chemistry, Bacterial Outer Membrane Proteins/chemistry
5.
Proteomics ; 23(17): e2200323, 2023 09.
Article in English | MEDLINE | ID: mdl-37365936

ABSTRACT

Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.


Subjects
Proteins, Reproducibility of Results, Proteins/metabolism, Protein Binding
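
The evaluation described in this abstract, combining several interface scores into a consensus and measuring discrimination with the area under the ROC curve, can be illustrated with a minimal sketch. It is not the authors' pipeline: the min-max scaling, the plain averaging, the function names `roc_auc` and `consensus`, and the toy scores and labels are all assumptions made for illustration.

```python
from statistics import mean


def roc_auc(scores, labels):
    """ROC AUC via the pairwise (Mann-Whitney) identity.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def consensus(score_columns):
    """Average several scores after min-max scaling each one to [0, 1].

    This combination rule is an assumption; the abstract only says that
    a simple consensus was built from the best score of each group.
    """
    scaled = []
    for column in score_columns:
        lo, hi = min(column), max(column)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.5 for v in column])
    return [mean(values) for values in zip(*scaled)]


# Hypothetical toy data: 1 = physiological dimer, 0 = non-physiological.
labels = [1, 1, 1, 0, 0, 0]
score_a = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
score_b = [5.0, 2.0, 4.0, 1.0, 3.0, 0.5]
combined = consensus([score_a, score_b])

auc_a = roc_auc(score_a, labels)
auc_b = roc_auc(score_b, labels)
auc_combined = roc_auc(combined, labels)
print(auc_a, auc_b, auc_combined)
```

In this toy case each individual score misranks one pair, while the averaged score separates the two classes completely, which is the kind of gain a consensus is meant to provide.
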
6.
Proteins ; 91(12): 1850-1860, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37858934

ABSTRACT

Predicting model quality is a fundamental component of any modeling procedure, and blind assessment of these methods constitutes a crucial aspect of the Critical Assessment of Protein Structure Prediction (CASP) experiment. Historically, the main focus was on assessing methods that predict global and per-residue accuracies in tertiary structure models. This focus shifted with the community's increased efforts in modeling complexes and assemblies. We asked the community to process the models from the CASP15 assembly category and provide estimates of the accuracy of the predicted quaternary structure, both globally and at the local interface level. Besides identifying remarkable accuracy of modeling groups in assessing their own predictions, we set up a benchmarking pipeline to highlight different aspects of quaternary structure models and introduced a simple consensus EMA method as baseline. While participating methods showed commendable performance, the baseline was difficult to surpass. It is important to point out that prediction performance varies for the individual CASP targets, highlighting potential areas of improvement and challenges ahead.


Subjects
Computational Biology, Proteins, Protein Conformation, Molecular Models, Computational Biology/methods, Proteins/chemistry, Benchmarking
7.
Proteins ; 91(12): 1912-1924, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37885318

ABSTRACT

The prediction of protein-ligand complexes (PLC), using both experimental and predicted structures, is an active and important area of research, underscored by the inclusion of the Protein-Ligand Interaction category in the latest round of the Critical Assessment of Protein Structure Prediction experiment CASP15. The prediction task in CASP15 consisted of predicting both the three-dimensional structure of the receptor protein as well as the position and conformation of the ligand. This paper addresses the challenges and proposed solutions for devising automated benchmarking techniques for PLC prediction. The reliability of experimentally solved PLC as ground truth reference structures is assessed using various validation criteria. Similarity of PLC to previously released complexes is employed to judge PLC diversity and the difficulty of a PLC as a prediction target. We show that the commonly used PDBBind time-split test-set is inappropriate for comprehensive PLC evaluation, with state-of-the-art tools showing conflicting results on a more representative and high-quality dataset constructed for benchmarking purposes. We also show that redocking on crystal structures is a much simpler task than docking into predicted protein models, demonstrated by the two PLC-prediction-specific scoring metrics created. Finally, we introduce a fully automated pipeline that predicts PLC and evaluates the accuracy of the protein structure, ligand pose, and protein-ligand interactions.


Subjects
Benchmarking, Proteins, Binding Sites, Protein Binding, Ligands, Reproducibility of Results, Molecular Docking Simulation, Proteins/chemistry, Protein Conformation
8.
Proteins ; 91(12): 1539-1549, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37920879

ABSTRACT

Computing protein structure from amino acid sequence information has been a long-standing grand challenge. Critical assessment of structure prediction (CASP) conducts community experiments aimed at advancing solutions to this and related problems. Experiments are conducted every 2 years. The 2020 experiment (CASP14) saw major progress, with the second generation of deep learning methods delivering accuracy comparable with experiment for many single proteins. There is an expectation that these methods will have much wider application in computational structural biology. Here we summarize results from the most recent experiment, CASP15, in 2022, with an emphasis on new deep learning-driven progress. Other papers in this special issue of PROTEINS provide more detailed analysis. For single protein structures, the AlphaFold2 deep learning method is still superior to other approaches, but there are two points of note. First, although AlphaFold2 was the core of all the most successful methods, there was a wide variety of implementation and combination with other methods. Second, using the standard AlphaFold2 protocol and default parameters only produces the highest quality result for about two thirds of the targets, and more extensive sampling is required for the others. The major advance in this CASP is the enormous increase in the accuracy of computed protein complexes, achieved by the use of deep learning methods, although overall these do not fully match the performance for single proteins. Here too, AlphaFold2-based methods perform best, and again more extensive sampling than the defaults is often required. Also of note are the encouraging early results on the use of deep learning to compute ensembles of macromolecular structures. Critically for the usability of computed structures, for both single proteins and protein complexes, deep learning derived estimates of both local and global accuracy are of high quality; however, the estimates in interface regions are slightly less reliable. CASP15 also included computation of RNA structures for the first time. Here, the classical approaches produced better agreement with experiment than the new deep learning ones, and accuracy is limited. Also, for the first time, CASP included the computation of protein-ligand complexes, an area of special interest for drug design. Here too, classical methods were still superior to deep learning ones. Many new approaches were discussed at the CASP conference, and it is clear methods will continue to advance.


Subjects
Computational Biology, Proteins, Protein Conformation, Molecular Models, Proteins/chemistry, Amino Acid Sequence, Computational Biology/methods
9.
Proteins ; 91(12): 1811-1821, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37795762

ABSTRACT

CASP15 introduced a new category, ligand prediction, where participants were provided with a protein or nucleic acid sequence, SMILES line notation, and stoichiometry for ligands and tasked with generating computational models for the three-dimensional structure of the corresponding protein-ligand complex. These models were subsequently compared with experimental structures determined by x-ray crystallography or cryoEM. To assess these predictions, two novel scores were developed. The Binding-Site Superposed, Symmetry-Corrected Pose Root Mean Square Deviation (BiSyRMSD) evaluated the absolute deviations of the models from the experimental structures. At the same time, the Local Distance Difference Test for Protein-Ligand Interactions (lDDT-PLI) assessed the ability of models to reproduce the protein-ligand interactions in the experimental structures. The ligands evaluated in this challenge range from single-atom ions to large flexible organic molecules. More than 1800 submissions were evaluated for their ability to predict 23 different protein-ligand complexes. Overall, the best models could faithfully reproduce the geometries of more than half of the prediction targets. The ligands' size and flexibility were the primary factors influencing the predictions' quality. Small ions and organic molecules with limited flexibility were predicted with high fidelity, while reproducing the binding poses of larger, flexible ligands proved more challenging.


Subjects
Molecular Models, Humans, Ligands, Binding Sites, Ions, Protein Binding, X-Ray Crystallography
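
The "symmetry-corrected" part of a pose RMSD such as the one named in this abstract can be sketched as follows. This is a simplified illustration and not the published BiSyRMSD implementation: it assumes the model ligand has already been placed in the reference frame (for example by superposing the binding site), and that the symmetry-equivalent atom mappings are supplied by the caller (in practice they would come from the automorphisms of the ligand graph). The function names and the three-atom example are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def symmetry_corrected_rmsd(model, reference, mappings):
    """Smallest RMSD over all symmetry-equivalent atom mappings.

    Each mapping is a sequence in which entry i is the index of the model
    atom paired with reference atom i. No superposition is done here.
    """
    return min(rmsd([model[i] for i in mapping], reference) for mapping in mappings)


# Hypothetical carboxylate-like fragment: one carbon and two equivalent oxygens.
reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
# Same pose, but the two oxygens are listed in the opposite order.
model = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]
mappings = [(0, 1, 2), (0, 2, 1)]  # identity and the oxygen swap

naive = rmsd(model, reference)
corrected = symmetry_corrected_rmsd(model, reference, mappings)
print(naive, corrected)
```

Without the correction the swapped atom labels are penalized as if the pose were wrong; taking the minimum over equivalent mappings removes that labelling artifact.
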
10.
Proteins ; 91(12): 1550-1557, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37306011

ABSTRACT

Prediction categories in the Critical Assessment of Structure Prediction (CASP) experiments change with the need to address specific problems in structure modeling. In CASP15, four new prediction categories were introduced: RNA structure, ligand-protein complexes, accuracy of oligomeric structures and their interfaces, and ensembles of alternative conformations. This paper lists technical specifications for these categories and describes their integration in the CASP data management system.


Subjects
Computational Biology, Proteins, Protein Conformation, Proteins/chemistry, Molecular Models, Ligands
11.
PLoS Comput Biol ; 17(1): e1008667, 2021 01.
Article in English | MEDLINE | ID: mdl-33507980

ABSTRACT

Computational methods for protein structure modelling are routinely used to complement experimental structure determination, and thus help to address a broad spectrum of scientific questions in biomedical research. The most accurate methods today are based on homology modelling, i.e. detecting a homologue to the desired target sequence that can be used as a template for modelling. Here we present a versatile open source homology modelling toolbox as foundation for flexible and computationally efficient modelling workflows. ProMod3 is a fully scriptable software platform that can perform all steps required to generate a protein model by homology. Its modular design aims at fast prototyping of novel algorithms and implementing flexible modelling pipelines. Common modelling tasks, such as loop modelling, sidechain modelling or generating a full protein model by homology, are provided as production-ready pipelines, forming the starting point for users' own developments and enhancements. ProMod3 is the central software component of the widely used SWISS-MODEL web-server.


Subjects
Computational Biology/methods, Molecular Models, Proteins/chemistry, Software, Protein Structural Homology, Algorithms, Protein Databases, Internet, Protein Conformation
12.
Proteins ; 89(12): 1607-1617, 2021 12.
Article in English | MEDLINE | ID: mdl-34533838

ABSTRACT

Critical assessment of structure prediction (CASP) is a community experiment to advance methods of computing three-dimensional protein structure from amino acid sequence. Core components are rigorous blind testing of methods and evaluation of the results by independent assessors. In the most recent experiment (CASP14), deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. In this sense, the results represent a solution to the classical protein-folding problem, at least for single proteins. The models have already been shown to be capable of providing solutions for problematic crystal structures, and there are broad implications for the rest of structural biology. Other research groups also substantially improved performance. Here, we describe these results and outline some of the many implications. Other related areas of CASP, including modeling of protein complexes, structure refinement, estimation of model accuracy, and prediction of inter-residue contacts and distances, are also described.


Subjects
Protein Conformation, Protein Folding, Proteins, Software, Amino Acid Sequence, Computational Biology, Statistical Models, Molecular Dynamics Simulation, Proteins/chemistry, Proteins/metabolism, Protein Sequence Analysis
13.
Proteins ; 89(12): 1977-1986, 2021 12.
Article in English | MEDLINE | ID: mdl-34387007

ABSTRACT

The Continuous Automated Model EvaluatiOn (CAMEO) platform complements the biennial CASP experiment by conducting fully automated blind evaluations of three-dimensional protein prediction servers based on the weekly prerelease of sequences of those structures, which are going to be published in the upcoming release of the Protein Data Bank. While in CASP14, significant success was observed in predicting the structures of individual protein chains with high accuracy, significant challenges remain in correctly predicting the structures of complexes. By implementing fully automated evaluation of predictions for protein-protein complexes, as well as for proteins in complex with ligands, peptides, nucleic acids, or proteins containing noncanonical amino acid residues, CAMEO will assist new developments in those challenging areas of active research.


Subjects
Computational Biology/methods, Protein Databases, Protein Conformation, Protein Sequence Analysis, Software, Benchmarking, Cluster Analysis, Molecular Models, Proteins/chemistry, Proteins/genetics
14.
Proteins ; 89(12): 1647-1672, 2021 12.
Article in English | MEDLINE | ID: mdl-34561912

ABSTRACT

The biological and functional significance of selected Critical Assessment of Techniques for Protein Structure Prediction 14 (CASP14) targets is described by the authors of the structures. The authors highlight the most relevant features of the target proteins and discuss how well these features were reproduced in the respective submitted predictions. The overall ability to predict three-dimensional structures of proteins has improved remarkably in CASP14, and many difficult targets were modeled with impressive accuracy. For the first time in the history of CASP, the experimentalists not only highlighted that computational models can accurately reproduce the most critical structural features observed in their targets, but also envisaged that models could serve as guidance for further studies of biologically relevant properties of proteins.


Subjects
Molecular Models, Protein Conformation, Proteins/chemistry, Software, Amino Acid Sequence, Computational Biology, Cryoelectron Microscopy, X-Ray Crystallography, Protein Sequence Analysis
15.
Bioinformatics ; 36(6): 1765-1771, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31697312

ABSTRACT

MOTIVATION: Methods that estimate the quality of a 3D protein structure model in the absence of an experimental reference structure are crucial to determine a model's utility and potential applications. Single model methods assess individual models whereas consensus methods require an ensemble of models as input. In this work, we extend the single model composite score QMEAN that employs statistical potentials of mean force and agreement terms by introducing a consensus-based distance constraint (DisCo) score. RESULTS: DisCo exploits distance distributions from experimentally determined protein structures that are homologous to the model being assessed. Feed-forward neural networks are trained to adaptively weigh contributions by the multi-template DisCo score and classical single model QMEAN parameters. The result is the composite score QMEANDisCo, which combines the accuracy of consensus methods with the broad applicability of single model approaches. We also demonstrate that, despite being the de-facto standard for structure prediction benchmarking, CASP models are not the ideal data source to train predictive methods for model quality estimation. For performance assessment, QMEANDisCo is continuously benchmarked within the CAMEO project and participated in CASP13. For both, it ranks among the top performers and excels with low response times. AVAILABILITY AND IMPLEMENTATION: QMEANDisCo is available as web-server at https://swissmodel.expasy.org/qmean. The source code can be downloaded from https://git.scicore.unibas.ch/schwede/QMEAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteins, Software, Molecular Models, Computer Neural Networks, Protein Conformation
16.
Nucleic Acids Res ; 46(W1): W296-W303, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29788355

ABSTRACT

Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.


Subjects
Internet, Protein Conformation, Proteins/genetics, Software, Protein Databases, Chemical Models, Molecular Dynamics Simulation, Proteins/chemistry, Amino Acid Sequence Homology, Protein Structural Homology
17.
Int J Mol Sci ; 21(14)2020 Jul 21.
Article in English | MEDLINE | ID: mdl-32708196

ABSTRACT

(1) Background: Virtual screening studies on the therapeutically relevant proteins of the severe acute respiratory syndrome Coronavirus 2 (SARS-CoV-2) require a detailed characterization of their druggable binding sites, and, more generally, a convenient pocket mapping represents a key step for structure-based in silico studies; (2) Methods: Along with a careful literature search on SARS-CoV-2 protein targets, the study presents a novel strategy for pocket mapping based on the combination of pocket searches (as performed by the well-known FPocket tool) and docking searches (as performed by the PLANTS or AutoDock/Vina engines); such an approach is implemented by the Pockets 2.0 plug-in for the VEGA ZZ suite of programs; (3) Results: The literature analysis allowed the identification of 16 promising binding cavities within the SARS-CoV-2 proteins, and the approach proposed here was able to recognize them, performing clearly better than pocket detection alone; and (4) Conclusions: Even though the presented strategy requires more extensive validation, it proved successful in precisely characterizing a set of SARS-CoV-2 druggable binding pockets including both orthosteric and allosteric sites, which are clearly amenable to virtual screening campaigns and drug repurposing studies. All results generated by the study and the Pockets 2.0 plug-in are available for download.


Subjects
Antiviral Agents/chemistry, Betacoronavirus/drug effects, Coronavirus Infections/drug therapy, Viral Pneumonia/drug therapy, Viral Proteins/chemistry, Binding Sites/drug effects, COVID-19, Drug Repositioning, Humans, Molecular Docking Simulation, Pandemics, Protein Binding/drug effects, Protein Conformation, SARS-CoV-2
18.
Proteins ; 87(12): 1011-1020, 2019 12.
Article in English | MEDLINE | ID: mdl-31589781

ABSTRACT

CASP (critical assessment of structure prediction) assesses the state of the art in modeling protein structure from amino acid sequence. The most recent experiment (CASP13 held in 2018) saw dramatic progress in structure modeling without use of structural templates (historically "ab initio" modeling). Progress was driven by the successful application of deep learning techniques to predict inter-residue distances. In turn, these results drove dramatic improvements in three-dimensional structure accuracy: With the proviso that there are an adequate number of sequences known for the protein family, the new methods essentially solve the long-standing problem of predicting the fold topology of monomeric proteins. Further, the number of sequences required in the alignment has fallen substantially. There is also substantial improvement in the accuracy of template-based models. Other areas (model refinement, accuracy estimation, and the structure of protein assemblies) have again yielded interesting results. CASP13 placed increased emphasis on the use of sparse data together with modeling, and chemical crosslinking, SAXS, and NMR all yielded more mature results. This paper summarizes the key outcomes of CASP13. The special issue of PROTEINS contains papers describing the CASP13 assessments in each modeling category and contributions from the participants.


Subjects
Amino Acid Sequence/genetics, Computational Biology, Protein Conformation, Proteins/ultrastructure, Humans, Molecular Models, Proteins/chemistry, Proteins/genetics, Small Angle Scattering, X-Ray Diffraction
19.
Proteins ; 87(12): 1378-1387, 2019 12.
Article in English | MEDLINE | ID: mdl-31571280

ABSTRACT

Critical blind assessment of structure prediction techniques is crucial for the scientific community to establish the state of the art, identify bottlenecks, and guide future developments. In Critical Assessment of Techniques in Structure Prediction (CASP), human experts assess the performance of participating methods in relation to the difficulty of the prediction task in a biennial experiment on approximately 100 targets. Yet, the development of automated computational modeling methods requires more frequent evaluation cycles and larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements CASP by conducting fully automated blind prediction evaluations based on the weekly pre-release of sequences of those structures, which are going to be published in the next release of the Protein Data Bank (PDB). Each week, CAMEO publishes benchmarking results for predictions corresponding to a set of about 20 targets collected during a 4-day prediction window. CAMEO benchmarking data are generated consistently for all methods at the same point in time, enabling developers to cross-validate their methods' performance and to refer to their results in publications. Many successful participants of CASP have used CAMEO, either by directly benchmarking their methods within the system or by comparing their own performance to CAMEO reference data. CAMEO offers a variety of scores reflecting different aspects of structure modeling, for example, binding site accuracy, homo-oligomer interface quality, or accuracy of local model confidence estimates. By introducing the "bestSingleTemplate" method based on structure superpositions as a reference for the accuracy of 3D modeling predictions, CAMEO facilitates objective comparison of techniques and fosters the development of advanced methods.


Subjects
Computational Biology, Protein Conformation, Proteins/ultrastructure, Software, Algorithms, Benchmarking, Binding Sites, Protein Databases, Humans, Molecular Models, Protein Folding, Proteins/chemistry, Proteins/genetics, Protein Sequence Analysis
20.
Proteins ; 87(12): 1361-1377, 2019 12.
Article in English | MEDLINE | ID: mdl-31265154

ABSTRACT

Methods to reliably estimate the accuracy of 3D models of proteins are both a fundamental part of most protein folding pipelines and important for reliable identification of the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA) as seen from the progress of the most successful methods in CASP13. We show small but clear progress, that is, several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some progress is driven by applying deep learning and residue-residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there exists a great potential to improve this further. Also, according to the evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single model accuracy methods perform relatively better than consensus-based methods.


Subjects
Computational Biology, Protein Conformation, Proteins/ultrastructure, Software, Algorithms, Protein Databases, Molecular Models, Protein Folding, Proteins/chemistry, Proteins/genetics, Sequence Alignment, Protein Sequence Analysis
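
For readers unfamiliar with lDDT, the local similarity score mentioned in this abstract, the idea can be sketched in a few lines. This is a simplified illustration, not the reference implementation: it assumes one point per residue (for example Cα atoms), skips the stereochemical checks of the full score, and uses the commonly cited defaults of a 15 Å inclusion radius and tolerance thresholds of 0.5, 1, 2 and 4 Å. The function name and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def lddt(model, reference, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified lDDT on one point per residue.

    Considers every pair of points closer than `radius` in the reference
    and reports the fraction of those distances preserved in the model,
    averaged over the tolerance thresholds. No superposition is needed.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same length")
    pairs = [
        (i, j)
        for i, j in combinations(range(len(reference)), 2)
        if math.dist(reference[i], reference[j]) < radius
    ]
    if not pairs:
        raise ValueError("no reference distances within the inclusion radius")
    preserved = 0
    for i, j in pairs:
        deviation = abs(
            math.dist(model[i], model[j]) - math.dist(reference[i], reference[j])
        )
        preserved += sum(deviation < t for t in thresholds)
    return preserved / (len(pairs) * len(thresholds))


# Hypothetical three-residue example with points 3.8 Å apart on a line.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in reference]
distorted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (10.6, 0.0, 0.0)]

score_identical = lddt(reference, reference)
score_shifted = lddt(shifted, reference)
score_distorted = lddt(distorted, reference)
print(score_identical, score_shifted, score_distorted)
```

Because only internal distances are compared, a rigidly translated model scores the same as the reference itself, while moving the last point by 3 Å lowers the score for the two distances that involve it.
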