1.
Bioinformatics ; 28(9): 1209-15, 2012 May 01.
Article in English | MEDLINE | ID: mdl-22399676

ABSTRACT

MOTIVATION: Structural alignment methods are widely used to generate gold standard alignments for improving multiple sequence alignments and transferring functional annotations, as well as for assigning structural distances between proteins. However, the correctness of the alignments generated by these methods is difficult to assess objectively since little is known about the exact evolutionary history of most proteins. Since homology is an equivalence relation, an upper bound on alignment quality can be found by assessing the consistency of alignments. Measuring the consistency of current methods of structure alignment and determining the causes of inconsistencies can, therefore, provide information on the quality of current methods and suggest possibilities for further improvement.

RESULTS: We analyze the self-consistency of seven widely-used structural alignment methods (SAP, TM-align, Fr-TM-align, MAMMOTH, DALI, CE and FATCAT) on a diverse, non-redundant set of 1863 domains from the SCOP database and demonstrate that even for relatively similar proteins the degree of inconsistency of the alignments on a residue level is high (30%). We further show that levels of consistency vary substantially between methods, with two methods (SAP and Fr-TM-align) producing more consistent alignments than the rest. Inconsistency is found to be higher near gaps and for proteins of low structural complexity, as well as for helices. The ability of the methods to identify good structural alignments is also assessed using geometric measures, for which FATCAT (flexible mode) is found to be the best performer despite being highly inconsistent. We conclude that there is substantial scope for improving the consistency of structural alignment methods.

CONTACT: msadows@nimr.mrc.ac.uk

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
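
The consistency argument in this abstract can be illustrated with a minimal sketch. If homology is an equivalence relation, then a residue of protein A aligned to a residue of B, which is in turn aligned to a residue of C, should be aligned to that same residue of C in the direct A-C alignment. The code below counts how often this holds for one triple of proteins. It is only an illustration under stated assumptions: the abstract does not define the measure the authors used, and the name `triplet_consistency`, the dictionary representation of an alignment, and the toy data are all hypothetical.

```python
from typing import Dict

# Hypothetical representation: residue index in the first protein
# mapped to the aligned residue index in the second protein.
Alignment = Dict[int, int]


def triplet_consistency(ab: Alignment, bc: Alignment, ac: Alignment) -> float:
    """Fraction of A residues whose chained mapping A->B->C
    agrees with the direct A->C alignment."""
    chained = 0
    agree = 0
    for i, j in ab.items():
        if j not in bc:
            continue  # residue j of B is unaligned in B-C, so nothing to compare
        chained += 1
        if ac.get(i) == bc[j]:
            agree += 1
    if chained == 0:
        raise ValueError("no residues can be chained through the intermediate protein")
    return agree / chained


# Toy alignments (made-up indices, not data from the paper).
ab = {0: 0, 1: 1, 2: 2, 3: 4}
bc = {0: 1, 1: 2, 2: 3, 4: 5}
ac = {0: 1, 1: 2, 2: 4, 3: 5}

score = triplet_consistency(ab, bc, ac)
print(f"consistency = {score:.2f}")
```

In this toy case three of the four chained residues agree with the direct alignment; the residue at index 2 of A is mapped to index 3 of C through B but to index 4 directly, which is the kind of residue-level inconsistency the abstract refers to.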


Subjects
Proteins/chemistry; Proteins/genetics; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Structural Homology, Protein; Protein Structure, Secondary
2.
J Struct Biol ; 172(3): 244-52, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20691788

ABSTRACT

Existing protein structure classifications group proteins by overall structural similarity at the highest level and by evolutionary relationships at the lowest level, deriving higher-level groups by pairwise structure comparison. For this to be successful, large changes in structure must be relatively rare in evolution, and proteins with no detectable evolutionary relationship must not converge on similar global chain conformations, since such convergence creates conflicts between structural and evolutionary consistency. Analysis of global structural changes using core topological descriptions for 4261 domains from classes C and D of the SCOP database, together with new measures of topological distance and consistency of classification, showed that the topological consistency of SCOP folds is highly variable, with some folds having no consistent description and significant overlaps between groups, including some members of separate folds with identical topological descriptions. Topological clustering shows that including sufficient indels to allow family members to be joined would also require joining several distinct folds. We conclude that evolutionary changes in the global topology of protein domains are the root cause of many difficulties for present approaches to structure classification using pairwise comparison. As a resolution, we propose that a purely structural classification should be created using an approach similar to that adopted by the Gene Ontology, in which proteins are assigned labels describing structure.


Subjects
Proteins/chemistry; Models, Theoretical; Protein Folding
3.
Proteins ; 69(3): 476-85, 2007 Nov 15.
Article in English | MEDLINE | ID: mdl-17623860

ABSTRACT

Comparative modeling is presently the most accurate method of protein structure prediction. Previous experiments have shown the selection of the correct template to be of paramount importance to the quality of the final model. We have derived a set of 732 targets for which a choice of ten or more templates exists with 30-80% sequence identity and used this set to compare a number of possible methods for template selection: BLAST, PSI-BLAST, profile-profile alignment, HHpred HMM-HMM comparison, global sequence alignment, and the use of a model quality assessment program (MQAP). In addition, we have investigated the question of whether any structurally defined subset of the sequence could be used to predict template quality better than overall sequence similarity. We find that template selection by BLAST is sufficient in 75% of cases but that there are examples in which improvement (global RMSD 0.5 Å or more) could be made. No significant improvement is found for any of the more sophisticated sequence-based methods of template selection at high sequence identities. A subset of 118 targets extending to the lowest levels of sequence similarity was examined and the HHpred and MQAP methods were found to improve ranking when available templates had 35-40% maximum sequence identity. Structurally defined subsets in general are found to be less discriminative than overall sequence similarity, with the coil residue subset performing equivalently to sequence similarity. Finally, we demonstrate that if models are built and model quality is assessed in combination with the sequence-template sequence similarity, an extra 7% of "best" models can be found.


Subjects
Models, Molecular; Protein Conformation; Protein Structure, Tertiary; Sequence Analysis, Protein
4.
Proteins ; 61 Suppl 7: 143-151, 2005.
Article in English | MEDLINE | ID: mdl-16187356

ABSTRACT

A number of new and newly improved methods for predicting protein structure developed by the Jones-University College London group were used to make predictions for the CASP6 experiment. Structures were predicted with a combination of fold recognition methods (mGenTHREADER, nFOLD, and THREADER) and a substantially enhanced version of FRAGFOLD, our fragment assembly method. Attempts at automatic domain parsing were made using DomPred and DomSSEA, which are based on a secondary structure parsing algorithm and, additionally for DomPred, a simple local sequence alignment scoring function. Disorder prediction was carried out using a new SVM-based version of DISOPRED. Attempts were also made at domain docking and "microdomain" folding in order to build complete chain models for some targets.


Subjects
Computational Biology/methods; Proteomics/methods; Algorithms; Computer Simulation; Computers; Databases, Protein; Dimerization; Humans; Models, Molecular; Protein Conformation; Protein Folding; Protein Structure, Secondary; Protein Structure, Tertiary; Reproducibility of Results; Sequence Alignment; Software
5.
Biosystems ; 81(3): 247-54, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16076522

ABSTRACT

Several stratagems are used in protein bioinformatics for the classification of proteins based on sequence, structure or function. We explore the concept of a minimal signature embedded in a sequence that defines the likely position of a protein in a classification. Specifically, we address the derivation of sparse profiles for the G-protein coupled receptor (GPCR) clan of integral membrane proteins. We present an evolutionary algorithm (EA) for the derivation of sparse profiles (signatures) without the need to supply a multiple alignment. We also apply an evolution strategy (ES) to the problem of pattern and profile refinement. Patterns were derived for the GPCR 'superfamily' and GPCR families 1-3 individually from starting populations of randomly generated signatures, using a database of integral membrane protein sequences and an objective function using a modified receiver operator characteristic (ROC) statistic. The signature derived for the family 1 GPCR sequences was shown to perform very well in a stringent cross-validation test, detecting 76% of unseen GPCR sequences at 5% error. Application of the ES refinement method to a signature developed by a previously described method [Sadowski, M.I., Parish, J.H., 2003. Automated generation and refinement of protein signatures: case study with G-protein coupled receptors. Bioinformatics 19, 727-734] resulted in a 6% increase of coverage for 5% error as measured in the validation test. We note that there might be a limit to this or any classification of proteins based on patterns or schemata.
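
A figure such as "76% of unseen GPCR sequences at 5% error" can be read as coverage at a fixed error level. The sketch below shows one way to compute such a number from signature match scores. It assumes that "error" means the fraction of non-GPCR sequences accepted, which the abstract does not state, and it does not reproduce the modified ROC statistic used as the objective function. The function name `coverage_at_error` and the scores are hypothetical.

```python
from typing import Sequence


def coverage_at_error(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    max_error: float = 0.05,
) -> float:
    """Fraction of positives scoring strictly above a threshold chosen so
    that at most `max_error` of the negatives also score above it."""
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    if not 0.0 <= max_error < 1.0:
        raise ValueError("max_error must be in [0, 1)")
    # Number of negatives that may be accepted (false positives).
    allowed = int(max_error * len(neg_scores))
    ranked_neg = sorted(neg_scores, reverse=True)
    # The first negative that must be rejected sets the threshold.
    threshold = ranked_neg[allowed]
    hits = sum(1 for s in pos_scores if s > threshold)
    return hits / len(pos_scores)


# Made-up scores: 20 negatives, so a 5% error level tolerates one of them.
negatives = list(range(20))      # scores 0..19
positives = [25, 19, 18, 10]

print(coverage_at_error(positives, negatives))
```

With these toy scores the threshold falls at 18, so two of the four positives are counted and the coverage is 0.5. Using a strict comparison means tied negatives are never accepted beyond the allowed count.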


Subjects
Algorithms; Computational Biology/methods; Pattern Recognition, Automated/methods; Receptors, G-Protein-Coupled/genetics; Sequence Analysis, Protein/methods; Amino Acid Motifs/genetics; Databases, Genetic; Receptors, G-Protein-Coupled/classification; Reproducibility of Results
6.
Curr Opin Struct Biol ; 19(3): 357-62, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19406632

ABSTRACT

An incomplete understanding of protein sequence/structure/function relationships causes many difficulties for prediction methods. The highly complex nature of these relationships is a consequence of the interplay between physics and evolution that has been studied using a wide array of experimental and theoretical techniques. We review recent findings relating to conservation of sequence, structure and function and discuss their use in developing improved prediction methods.


Subjects
Proteins/chemistry; Proteins/metabolism; Amino Acid Sequence; Humans; Models, Molecular; Molecular Sequence Data; Protein Conformation