Results 1 - 18 of 18
1.
J Struct Biol ; 201(2): 130-138, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29017817

ABSTRACT

In recent years, a number of new protein structures that possess tandem repeats have emerged. Many of these proteins are composed of tandem arrays of β-hairpins. Today, the amount and variety of the data on these β-hairpin repeat (BHR) structures have reached a level that requires detailed analysis and further classification. In this paper, we classified the BHR proteins and compared their structures, repeat-motif sequences, functions, and distribution across the major taxonomic kingdoms of life and within organisms. As a result, we identified six different BHR folds in tandem repeat proteins of Class III (elongated structures) and one BHR fold (up-and-down β-barrel) in Class IV ("closed" structures). Our survey reveals the high incidence of the BHR proteins among bacteria and viruses and their possible relationship to the structures of amyloid fibrils. It indicates that BHR folds will be an attractive target for future structural studies, especially in the context of age-related amyloidosis and emerging infectious diseases. This work allowed us to update the RepeatsDB database, which contains annotated tandem repeat protein structures, and to construct sequence profiles based on BHR structural alignments.


Subject(s)
Protein Folding , Proteins/chemistry , Proteins/classification , Amino Acid Motifs , Amyloid/chemistry , Bacterial Proteins/chemistry , Databases, Protein , Humans , Internet , Models, Molecular , Prions/chemistry , Protein Conformation , Protein Conformation, beta-Strand , Repetitive Sequences, Amino Acid , Viral Proteins/chemistry , Zinc/metabolism
2.
Proteins ; 86 Suppl 1: 335-344, 2018 Mar.
Article in English | MEDLINE | ID: mdl-28748648

ABSTRACT

Our aim in CASP12 was to improve our Template-Based Modeling (TBM) methods through better model selection, accuracy self-estimate (ASE) scores and refinement. To meet this aim, we developed two new automated methods, which we used to score, rank, and improve upon the provided server models. The first is the ModFOLD6_rank method, for improved global Quality Assessment (QA), model ranking and the detection of local errors. The second is the ReFOLD method, for fixing errors through iterative QA-guided refinement. For our automated predictions we developed the IntFOLD4-TS protocol, which integrates the ModFOLD6_rank method for scoring the multiple-template models that were generated using a number of alternative sequence-structure alignments. Overall, our selection of top models and ASE scores using ModFOLD6_rank was an improvement on our previous approaches. In addition, it was worthwhile attempting to repair the detected errors in the top selected models using ReFOLD, which gave us an overall gain in performance. According to the assessors' formula, the IntFOLD4 server ranked 3rd/5th (average Z-score > 0.0/-2.0) on the server-only targets, and our manual predictions (McGuffin group) ranked 1st/2nd (average Z-score > -2.0/0.0) compared to all other groups.
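The ranking by "average Z-score > 0.0/-2.0" quoted above can be illustrated with a minimal Python sketch. The assessors' actual formula is not given in the abstract, so this assumes one common reading: each target's scores are standardized across groups, values below the threshold are raised to the threshold, and the results are averaged per group. The score table and function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Standardize one target's scores across all groups."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def average_floored_z(per_target_z, floor):
    """Average a group's Z-scores after raising values below `floor` to `floor`.

    Assumption: this is how the "> floor" threshold is applied.
    """
    return mean(max(z, floor) for z in per_target_z)


# Hypothetical model-quality scores: rows are targets, columns are groups.
targets = [
    [60.0, 55.0, 40.0],
    [80.0, 82.0, 30.0],
]

# Transpose so that each entry holds one group's Z-scores over all targets.
per_group = list(zip(*(z_scores(row) for row in targets)))

# Rank group indices from best to worst with a floor of -2.0.
ranking = sorted(
    range(len(per_group)),
    key=lambda g: average_floored_z(per_group[g], floor=-2.0),
    reverse=True,
)
```

Changing `floor` to 0.0 ignores below-average predictions entirely, which is why a group can rank differently under the two thresholds.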


Subject(s)
Computational Biology/methods , Models, Molecular , Protein Conformation , Protein Folding , Proteins/chemistry , Software , Databases, Protein , Humans , Sequence Alignment , Sequence Analysis, Protein
3.
Malar J ; 16(1): 493, 2017 Dec 19.
Article in English | MEDLINE | ID: mdl-29258508

ABSTRACT

BACKGROUND: Plasmodium falciparum malaria is one of the most widespread parasitic infections in humans and remains a leading global health concern. Malaria elimination efforts are threatened by the emergence and spread of resistance to artemisinin-based combination therapy, the first-line treatment of malaria. Promising molecular markers and pathways associated with artemisinin drug resistance have been identified, but the underlying molecular mechanisms of resistance remain unknown. Genomic data from the early period of emergence of artemisinin resistance (2008-2011) were evaluated with the aim of defining the k13-associated genetic background in Cambodia, the country identified as the epicentre of anti-malarial drug resistance, through characterization of 167 parasite isolates using a panel of 21,257 SNPs. RESULTS: Eight subpopulations were identified, suggesting a process of acquisition of artemisinin resistance consistent with an emergence-selection-diffusion model, supported by the shifting balance theory. Identification of population-specific mutations facilitated the characterization of a core set of 57 background genes associated with artemisinin resistance and associated pathways. The analysis indicates that the background of artemisinin resistance was not acquired after drug pressure, but rather is the result of fixation followed by selection on the daughter subpopulations derived from the ancestral population. CONCLUSIONS: Functional analysis of artemisinin resistance subpopulations illustrates the strong interplay between ubiquitination and cell division or differentiation in artemisinin-resistant parasites. The relationship of these pathways with the P. falciparum resistant subpopulation, and the presence of drug resistance markers in addition to k13, highlight the major role of the admixed parasite population in the diffusion of the artemisinin-resistant background. The diffusion of resistant genes in the Cambodian admixed population after selection resulted from mating of gametocytes of sensitive and resistant parasite populations.


Subject(s)
Artemisinins/pharmacology , Drug Resistance , Malaria, Falciparum/epidemiology , Plasmodium falciparum/drug effects , Plasmodium falciparum/genetics , Antimalarials/pharmacology , Cambodia/epidemiology , Genotype , Humans , Malaria, Falciparum/parasitology , Mutation , Plasmodium falciparum/classification , Plasmodium falciparum/metabolism , Polymorphism, Single Nucleotide , Protozoan Proteins/genetics
4.
Protein Sci ; 26(9): 1864-1869, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28685932

ABSTRACT

There has been increased interest in computational methods for amyloid and/or aggregate prediction, due to the prevalence of these aggregates in numerous diseases and their recently discovered functional importance. To evaluate these methods, several datasets have been compiled. Typically, aggregation-prone regions of proteins, which form aggregates or amyloids in vivo, are more than 15 residues long and intrinsically disordered. However, the number of such experimentally established amyloid-forming and non-forming sequences is limited, not exceeding one hundred entries in existing databases. In this work, we parsed all available NMR-resolved protein structures from the PDB and assembled a new, sevenfold larger dataset of unfolded sequences that are soluble at high concentrations. We propose using these sequences as a negative set for evaluating methods for predicting aggregation in vivo. We also present the results of benchmarking cutting-edge tools for the prediction of aggregation versus solubility propensity.


Subject(s)
Algorithms , Amyloid/chemistry , Amyloid/metabolism , Databases, Protein , Amyloid/analysis , Models, Statistical , Nuclear Magnetic Resonance, Biomolecular , Solubility
6.
Nucleic Acids Res ; 45(D1): D219-D227, 2017 Jan 04.
Article in English | MEDLINE | ID: mdl-27899601

ABSTRACT

The Database of Protein Disorder (DisProt, URL: www.disprot.org) has been significantly updated and upgraded since its last major renewal in 2007. The current release holds information on more than 800 entries of IDPs/IDRs, i.e. intrinsically disordered proteins or regions that exist and function without a well-defined three-dimensional structure. We have re-curated previous entries to purge DisProt of conflicting cases, and also upgraded the functional classification scheme to reflect the continuous advance in the field over the past 10 years or so. We define IDPs as proteins that are disordered along their entire sequence, i.e. entirely lack structural elements, and IDRs as regions that are at least five consecutive residues without well-defined structure. We base our assessment of disorder strictly on experimental evidence, such as X-ray crystallography and nuclear magnetic resonance (primary techniques) and a broad range of other experimental approaches (secondary techniques). Confident and ambiguous annotations are highlighted separately. DisProt 7.0 presents classified knowledge regarding the experimental characterization and functional annotations of IDPs/IDRs, and is intended to provide an invaluable resource for the research community for a better understanding of structural disorder and for developing better computational tools for studying disordered proteins.
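The IDP and IDR definitions in this abstract can be expressed as a short Python sketch. It assumes a hypothetical per-residue annotation string in which "D" marks a residue without well-defined structure and any other character marks an ordered residue; this encoding and the function names are illustrative and are not the DisProt data format.

```python
def find_idrs(disorder_mask, min_length=5):
    """Return 1-based inclusive (start, end) spans of disordered runs.

    A run counts as an IDR only if it covers at least `min_length`
    consecutive residues (five, following the definition in the abstract).
    """
    regions = []
    start = None
    for i, flag in enumerate(disorder_mask):
        if flag == "D":
            if start is None:
                start = i  # a new disordered run begins here
        elif start is not None:
            if i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(disorder_mask) - start >= min_length:
        regions.append((start + 1, len(disorder_mask)))
    return regions


def is_idp(disorder_mask):
    """True if the protein is disordered along its entire sequence."""
    return len(disorder_mask) > 0 and all(f == "D" for f in disorder_mask)
```

For example, `find_idrs("DDDDDOOODDDOODDDDDDD")` reports the first and last runs but skips the three-residue run in the middle, because it is shorter than five residues.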


Subject(s)
Databases, Protein , Intrinsically Disordered Proteins , Animals , Crystallography, X-Ray , Fluorescence Resonance Energy Transfer , Forecasting , Forms and Records Control , Humans , Intrinsically Disordered Proteins/classification , Nuclear Magnetic Resonance, Biomolecular , Protein Conformation
7.
Int J Mol Sci ; 17(12), 2016 Dec 19.
Article in English | MEDLINE | ID: mdl-27999363

ABSTRACT

Despite the ever-increasing role of pesticides in modern agriculture, their deleterious effects are still underexplored. Here we examine the effect of A6, a pesticide derived from the naturally-occurring α-terthienyl, and structurally related to the endocrine disrupting pesticides anilinopyrimidines, on living zebrafish larvae. We show that both A6 and an anilinopyrimidine, cyprodinyl, decrease larval survival and affect central neurons at micromolar concentrations. Focusing on a superficial and easily observable sensory system, the lateral line system, we found that defects in axonal and sensory cell regeneration can be observed at much lower doses, in the nanomolar range. We also show that A6 accumulates preferentially in lateral line neurons and hair cells. We examined whether A6 affects the expression of putative target genes, and found that genes involved in apoptosis/cell proliferation are down-regulated, as well as genes reflecting estrogen receptor activation, consistent with previous reports that anilinopyrimidines act as endocrine disruptors. On the other hand, canonical targets of endocrine signaling are not affected, suggesting that the neurotoxic effect of A6 may be due to the binding of this compound to a recently identified, neuron-specific estrogen receptor.


Subject(s)
Biological Control Agents/toxicity , Endocrine Disruptors/toxicity , Larva/drug effects , Lateral Line System/drug effects , Nerve Regeneration/drug effects , Pyrimidines/toxicity , Pyrimidinones/toxicity , Thiophenes/toxicity , Zebrafish/embryology , Animals , Apoptosis/drug effects , Apoptosis/genetics , Cell Proliferation/drug effects , Cell Proliferation/genetics , Gene Expression Regulation , Mechanoreceptors/drug effects , Receptors, Estrogen/genetics , Receptors, Estrogen/metabolism , Spinal Cord/cytology , Spinal Cord/drug effects , Thiophenes/chemistry
8.
FEBS Lett ; 589(19 Pt A): 2611-9, 2015 Sep 14.
Article in English | MEDLINE | ID: mdl-26320412

ABSTRACT

In recent years, there has been an emergence of new 3D structures of proteins containing tandem repeats (TRs), as a result of improved expression and crystallization strategies. Databases focused on structure classifications (PDB, SCOP, CATH) do not provide an easy solution for selection of these structures from PDB. Several approaches have been developed, but no best approach exists to identify the whole range of 3D TRs. Here we describe the TAndem PrOtein detector (TAPO) that uses periodicities of atomic coordinates and other types of structural representation, including strings generated by conformational alphabets, residue contact maps, and arrangements of vectors of secondary structure elements. The benchmarking shows the superior performance of TAPO over the existing programs. In accordance with our analysis of PDB using TAPO, 19% of proteins contain 3D TRs. This analysis allowed us to identify new families of 3D TRs, suggesting that TAPO can be used to regularly update the collection and classification of existing repetitive structures.
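One of the ideas named in this abstract, detecting periodicity in a string generated by a conformational alphabet, can be sketched in Python as follows. This is not TAPO's algorithm, which the abstract does not spell out. It is a simple self-matching score under the assumption that each residue has already been encoded as one letter; the letters "H" and "E" and the function names are hypothetical.

```python
def self_match_fraction(s, shift):
    """Fraction of positions i where s[i] equals s[i + shift]."""
    n = len(s) - shift
    if n <= 0:
        return 0.0
    return sum(1 for i in range(n) if s[i] == s[i + shift]) / n


def best_period(s, min_period=2, max_period=None):
    """Return (period, score) for the shift that best maps the string onto itself.

    Ties are broken in favour of the shortest period, so a perfect repeat
    of length 7 is reported as 7 rather than 14.
    """
    if max_period is None:
        max_period = len(s) // 2
    scores = {
        p: self_match_fraction(s, p)
        for p in range(min_period, max_period + 1)
    }
    if not scores:
        return None, 0.0
    period = min(scores, key=lambda p: (-scores[p], p))
    return period, scores[period]
```

A score close to 1.0 at some shift suggests a repeat unit of that length; real structures would need a tolerance for insertions and imperfect repeats, which this sketch does not handle.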


Subject(s)
Computational Biology/methods , Proteins/chemistry , Repetitive Sequences, Amino Acid , Tandem Repeat Sequences , Algorithms , Databases, Protein , Models, Molecular , Protein Conformation , Protein Structure, Secondary , Reproducibility of Results
9.
Nucleic Acids Res ; 43(W1): W169-73, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25820431

ABSTRACT

IntFOLD is an independent web server that integrates our leading methods for structure and function prediction. The server provides a simple unified interface that aims to make complex protein modelling data more accessible to life scientists. The server web interface is designed to be intuitive and integrates a complex set of quantitative data, so that 3D modelling results can be viewed on a single page and interpreted by non-expert modellers at a glance. The only required input to the server is an amino acid sequence for the target protein. Here we describe major performance and user interface updates to the server, which comprises an integrated pipeline of methods for: tertiary structure prediction, global and local 3D model quality assessment, disorder prediction, structural domain prediction, function prediction and modelling of protein-ligand interactions. The server has been independently validated during numerous CASP (Critical Assessment of Techniques for Protein Structure Prediction) experiments, as well as being continuously evaluated by the CAMEO (Continuous Automated Model Evaluation) project. The IntFOLD server is available at: http://www.reading.ac.uk/bioinf/IntFOLD/.


Subject(s)
Models, Molecular , Protein Conformation , Software , Algorithms , Internet , Ligands , Protein Structure, Tertiary , Proteins/physiology , Sequence Analysis, Protein
10.
Nucleic Acids Res ; 41(Web Server issue): W303-7, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23761453

ABSTRACT

The FunFOLD2 server is a new independent server that integrates our novel protein-ligand binding site and quality assessment protocols for the prediction of protein function (FN) from sequence via structure. Our guiding principles were, first, to provide a simple unified resource to make our function prediction software easily accessible to all via a simple web interface and, second, to produce integrated output for predictions that can be easily interpreted. The server provides a clean web interface so that results can be viewed on a single page and interpreted by non-experts at a glance. The output for the prediction is an image of the top predicted tertiary structure annotated to indicate putative ligand-binding site residues. The results page also includes a list of the most likely binding site residues and the types of predicted ligands and their frequencies in similar structures. The protein-ligand interactions can also be interactively visualized in 3D using the Jmol plug-in. The raw machine-readable data, which comply with the Critical Assessment of Techniques for Protein Structure Prediction data standards for FN predictions, are provided for developers. The FunFOLD2 webserver is freely available to all at the following web site: http://www.reading.ac.uk/bioinf/FunFOLD/FunFOLD_form_2_0.html.


Subject(s)
Proteins/chemistry , Software , Algorithms , Aminopeptidases/chemistry , Binding Sites , Internet , Ligands , Models, Molecular , Protein Conformation , Proteins/metabolism
11.
Nucleic Acids Res ; 41(Web Server issue): W368-72, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23620298

ABSTRACT

Once you have generated a 3D model of a protein, how do you know whether it bears any resemblance to the actual structure? To determine the usefulness of 3D models of proteins, they must be assessed in terms of their quality by methods that predict their similarity to the native structure. The ModFOLD4 server is the latest version of our leading independent server for the estimation of both the global and local (per-residue) quality of 3D protein models. The server produces both machine readable and graphical output, providing users with intuitive visual reports on the quality of predicted protein tertiary structures. The ModFOLD4 server is freely available to all at: http://www.reading.ac.uk/bioinf/ModFOLD/.


Subject(s)
Models, Molecular , Protein Structure, Tertiary , Software , Internet , Sequence Analysis, Protein
12.
PLoS One ; 7(5): e38219, 2012.
Article in English | MEDLINE | ID: mdl-22666491

ABSTRACT

The estimation of prediction quality is important because without quality measures, it is difficult to determine the usefulness of a prediction. Currently, methods for ligand binding site residue predictions are assessed in the function prediction category of the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiment, utilizing the Matthews Correlation Coefficient (MCC) and Binding-site Distance Test (BDT) metrics. However, the assessment of ligand binding site predictions using such metrics requires the availability of solved structures with bound ligands. Thus, we have developed a ligand binding site quality assessment tool, FunFOLDQA, which utilizes protein feature analysis to predict ligand binding site quality prior to the experimental solution of the protein structures and their ligand interactions. The FunFOLDQA feature scores were combined using simple linear combinations, multiple linear regression and a neural network. The neural network produced significantly better results for correlations to both the MCC and BDT scores, according to Kendall's τ, Spearman's ρ and Pearson's r correlation coefficients, when tested on both the CASP8 and CASP9 datasets. The neural network also produced the largest Area Under the Curve score (AUC) when Receiver Operator Characteristic (ROC) analysis was undertaken for the CASP8 dataset. Furthermore, the FunFOLDQA algorithm incorporating the neural network is shown to add value to FunFOLD when both methods are employed in combination. This results in a statistically significant improvement over all of the best server methods, the FunFOLD method (6.43%), and one of the top manual groups (FN293) tested on the CASP8 dataset. The FunFOLDQA method was also found to be competitive with the top server methods when tested on the CASP9 dataset. To the best of our knowledge, FunFOLDQA is the first attempt to develop a method that can be used to assess ligand binding site prediction quality in the absence of experimental data.


Subject(s)
Algorithms , Computational Biology/methods , Proteins/chemistry , Proteins/metabolism , Binding Sites , Ligands , Linear Models , Models, Molecular , Neural Networks, Computer , Protein Binding , Protein Conformation , Quality Control , ROC Curve
13.
Bioinformatics ; 28(14): 1851-7, 2012 Jul 15.
Article in English | MEDLINE | ID: mdl-22592378

ABSTRACT

MOTIVATION: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. RESULTS: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS. AVAILABILITY AND IMPLEMENTATION: Source code and binaries can be freely downloaded from http://www.reading.ac.uk/bioinf/downloads/


Subject(s)
Computational Biology/methods , Models, Molecular , Protein Structure, Tertiary , Proteins/chemistry , Cluster Analysis , Sequence Alignment/methods
14.
Proteins ; 79 Suppl 10: 137-46, 2011.
Article in English | MEDLINE | ID: mdl-22069035

ABSTRACT

The IntFOLD-TS method was developed according to the guiding principle that the model quality assessment (QA) would be the most critical stage for our template-based modeling pipeline. Thus, the IntFOLD-TS method first generates numerous alternate models, using in-house versions of several different sequence-structure alignment methods, which are then ranked in terms of global quality using our top-performing QA method, ModFOLDclust2. In addition to the predicted global quality scores, the predictions of local errors are also provided in the resulting coordinate files, using scores that represent the predicted deviation of each residue in the model from the equivalent residue in the native structure. The IntFOLD-TS method was found to generate high-quality 3D models for many of the CASP9 targets, whilst also providing highly accurate predictions of their per-residue errors. This important information may help to make the 3D models that are produced by the IntFOLD-TS method more useful for guiding future experimental work.


Subject(s)
Computational Biology/methods , Models, Molecular , Proteins/chemistry , Cluster Analysis , Protein Folding , Protein Structure, Tertiary , Software
15.
BMC Bioinformatics ; 12: 160, 2011 May 16.
Article in English | MEDLINE | ID: mdl-21575183

ABSTRACT

BACKGROUND: The accurate prediction of ligand binding residues from amino acid sequences is important for the automated functional annotation of novel proteins. In the previous two CASP experiments, the most successful methods in the function prediction category were those which used structural superpositions of 3D models and related templates with bound ligands in order to identify putative contacting residues. However, whilst most of this prediction process can be automated, visual inspection and manual adjustments of parameters, such as the distance thresholds used for each target, have often been required to prevent over-prediction. Here we describe a novel method, FunFOLD, which uses an automatic approach for cluster identification and residue selection. The software provided can easily be integrated into existing fold recognition servers, requiring only a 3D model and list of templates as inputs. A simple web interface is also provided, allowing access to non-expert users. The method has been benchmarked against the top servers and manual prediction groups tested at both CASP8 and CASP9. RESULTS: The FunFOLD method shows a significant improvement over the best available servers and is shown to be competitive with the top manual prediction groups that were tested at CASP8. The FunFOLD method is also competitive with both the top server and manual methods tested at CASP9. When tested using common subsets of targets, the predictions from FunFOLD are shown to achieve significantly higher mean Matthews Correlation Coefficient (MCC) scores and Binding-site Distance Test (BDT) scores than all server methods that were tested at CASP8. Testing on the CASP9 set showed no statistically significant separation in performance between FunFOLD and the other top server groups tested. CONCLUSIONS: The FunFOLD software is freely available as both a standalone package and a prediction server, providing competitive ligand binding site residue predictions for expert and non-expert users alike. The software provides a new fully automated approach for structure-based function prediction using 3D models of proteins.
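The role of the distance threshold mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the FunFOLD implementation: it assumes the template ligand has already been superposed onto the model's coordinate frame, and it uses a hypothetical fixed cutoff of 4.0 Å, whereas the abstract describes an automatic approach to cluster identification and residue selection. The data layout and function name are illustrative.

```python
import math


def binding_residues(model_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue numbers with any atom within `cutoff` of the ligand.

    model_atoms:  list of (residue_number, (x, y, z)) tuples for the 3D model.
    ligand_atoms: list of (x, y, z) tuples for the superposed ligand.
    cutoff:       contact distance in angstroms (an assumed value).
    """
    hits = set()
    for res_num, coord in model_atoms:
        if res_num in hits:
            continue  # residue already selected through another atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            hits.add(res_num)
    return sorted(hits)
```

Raising `cutoff` selects more residues, which is the over-prediction risk that manual threshold adjustment was meant to control.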


Subject(s)
Proteins/chemistry , Software , Amino Acid Sequence , Binding Sites , Cluster Analysis , Internet , Models, Molecular , Protein Binding , Protein Conformation , Sequence Homology, Amino Acid , Structural Homology, Protein
16.
Nucleic Acids Res ; 39(Web Server issue): W171-6, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21459847

ABSTRACT

The IntFOLD server is a novel independent server that integrates several cutting-edge methods for the prediction of structure and function from sequence. Our guiding principles behind the server development were as follows: (i) to provide a simple unified resource that makes our prediction software accessible to all and (ii) to produce integrated output for predictions that can be easily interpreted. The output for predictions is presented as a simple table that summarizes all results graphically via plots and annotated 3D models. The raw machine-readable data files for each set of predictions, which comply with the Critical Assessment of Techniques for Protein Structure Prediction (CASP) data standards, are also provided for developers. The server comprises an integrated suite of five novel methods: nFOLD4, for tertiary structure prediction; ModFOLD 3.0, for model quality assessment; DISOclust 2.0, for disorder prediction; DomFOLD 2.0, for domain prediction; and FunFOLD 1.0, for ligand binding site prediction. Predictions from the IntFOLD server were found to be competitive in several categories in the recent CASP9 experiment. The IntFOLD server is available at the following web site: http://www.reading.ac.uk/bioinf/IntFOLD/.


Subject(s)
Models, Molecular , Protein Conformation , Software , Binding Sites , Internet , Ligands , Protein Folding , Protein Structure, Tertiary , Sequence Analysis, Protein
17.
Bioinformatics ; 26(22): 2920-1, 2010 Nov 15.
Article in English | MEDLINE | ID: mdl-20861025

ABSTRACT

MOTIVATION: We propose a novel method for scoring the accuracy of protein binding site predictions: the Binding-site Distance Test (BDT) score. Recently, the Matthews Correlation Coefficient (MCC) has been used to evaluate binding site predictions, both by developers of new methods and by the assessors for the community-wide prediction experiment, CASP8. While being a rigorous scoring method, the MCC does not take into account the actual 3D location of the predicted residues relative to the observed binding site. Thus, an incorrectly predicted site that is nevertheless close to the observed binding site will obtain an identical score to the same number of non-binding residues predicted at random. The MCC is also somewhat affected by the subjectivity of determining observed binding residues and the ambiguity of choosing distance cutoffs. By contrast, the BDT method produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site will score higher than those more distant, providing a better reflection of the true accuracy of predictions. The CASP8 function predictions were evaluated using both the MCC and BDT methods and the scores were compared. The BDT was found to strongly correlate with the MCC scores while also being less susceptible to the subjectivity of defining binding residues. We therefore suggest that this new simple score is a potentially more robust method for future evaluations of protein-ligand binding site predictions. AVAILABILITY: http://www.reading.ac.uk/bioinf/downloads/.
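The limitation of the MCC described in this abstract can be reproduced with a small Python sketch that uses the standard MCC definition on sets of residue numbers. The residue numbering and the two example predictions are hypothetical; the BDT formula itself is not given in the abstract, so it is not implemented here.

```python
import math


def mcc(predicted, observed, all_residues):
    """Matthews Correlation Coefficient for a binding site residue prediction."""
    predicted, observed = set(predicted), set(observed)
    tp = len(predicted & observed)   # predicted and observed
    fp = len(predicted - observed)   # predicted but not observed
    fn = len(observed - predicted)   # observed but missed
    tn = len(set(all_residues) - predicted - observed)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


residues = range(1, 101)          # hypothetical 100-residue protein
observed = {10, 11, 12, 13}       # observed binding residues
near_miss = {14, 15, 16, 17}      # wrong, but adjacent in sequence
far_miss = {80, 81, 82, 83}       # wrong and far away

# Both incorrect predictions receive exactly the same MCC, because the
# score counts set membership only and ignores 3D location.
same_score = mcc(near_miss, observed, residues) == mcc(far_miss, observed, residues)
```

A distance-aware score such as the BDT is intended to separate these two cases by rewarding the prediction that lies closer to the observed site.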


Subject(s)
Computational Biology/methods , Proteins/chemistry , Binding Sites , Databases, Protein , Protein Conformation , Sequence Analysis, Protein
18.
Bioinformatics ; 26(2): 182-8, 2010 Jan 15.
Article in English | MEDLINE | ID: mdl-19897565

ABSTRACT

MOTIVATION: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering- or consensus-based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however, they are often CPU-intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ, a novel MQAP that compares 3D models of proteins without the need for CPU-intensive structural alignments by utilizing the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead. RESULTS: The ModFOLDclustQ method is competitive with leading clustering-based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over five times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering-based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction. AVAILABILITY: The ModFOLDclustQ and ModFOLDclust2 methods are available to download from http://www.reading.ac.uk/bioinf/downloads/.
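The idea of comparing models without structural alignment can be sketched in Python as below. The abstract does not state the formula for the Q measure, so this assumes a commonly used form: a Gaussian-weighted comparison of intra-model Cα-Cα distances, with a width that grows as |i - j|^0.15 and with adjacent residues skipped. The function names and the simple mean-of-pairwise-Q clustering score are likewise assumptions, not the ModFOLDclustQ code.

```python
import math
from itertools import combinations


def q_score(coords_a, coords_b):
    """Alignment-free similarity of two models of the same sequence.

    coords_a, coords_b: equal-length lists of C-alpha (x, y, z) tuples in
    residue order. Only intra-model distances are compared, so no
    superposition is required.
    """
    n = len(coords_a)
    total, pairs = 0.0, 0
    for i, j in combinations(range(n), 2):
        if j - i < 2:
            continue  # skip sequence neighbours (assumed convention)
        da = math.dist(coords_a[i], coords_a[j])
        db = math.dist(coords_b[i], coords_b[j])
        sigma = (j - i) ** 0.15  # assumed sequence-separation-dependent width
        total += math.exp(-((da - db) ** 2) / (2 * sigma ** 2))
        pairs += 1
    return total / pairs if pairs else 0.0


def clustering_scores(models):
    """Score each model by its mean Q against every other model."""
    scores = []
    for k, model in enumerate(models):
        others = [q_score(model, m) for idx, m in enumerate(models) if idx != k]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores
```

Because each comparison needs only two distance matrices rather than an optimal superposition, the cost per model pair is a fixed double loop over residues, which is in keeping with the speed-up reported in the abstract.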


Subject(s)
Computational Biology/methods , Pattern Recognition, Automated/methods , Protein Conformation , Proteins/chemistry , Databases, Protein , Models, Molecular , Protein Folding