
Structural diversity of domain superfamilies in the CATH database.

Reeves, Gabrielle A; Dallman, Timothy J; Redfern, Oliver C; Akpor, Adrian; Orengo, Christine A.

J Mol Biol ; 360(3): 725-41, 2006 Jul 14.

Article in English | MEDLINE | ID: mdl-16780872

ABSTRACT

The CATH database of domain structures has been used to explore the structural variation of homologous domains in 294 well populated domain structure superfamilies, each containing at least three sequence diverse relatives. Our analyses confirm some previously detected trends relating sequence divergence to structural variation but for a much larger dataset and in some superfamilies the new data reveal exceptional structural variation. Use of a new algorithm (2DSEC) to analyse variability in secondary structure compositions across a superfamily sheds new light on how structures evolve. 2DSEC detects inserted secondary structures that embellish the core of conserved secondary structures found throughout the superfamily. Analysis showed that for 56% of highly populated superfamilies (>9 sequence diverse relatives), there are twofold or more increases in the numbers of secondary structures in some relatives. In some families fivefold increases occur, sometimes modifying the fold of the domain. Manual inspection of secondary structure insertions or embellishments in 48 particularly variable superfamilies revealed that although these insertions were usually discontiguous in the sequence they were often co-located in 3D resulting in a larger structural motif that often modified the geometry of the active site or the surface conformation promoting diverse domain partnerships and protein interactions. These observations, supported by automatic analysis of all well populated CATH families, suggest that accretion of small secondary structure insertions may provide a simple mechanism for evolving new functions in diverse relatives. Some layered domain architectures (e.g. mainly-beta and alpha-beta sandwiches) that recur highly in the genomes more frequently exploit these types of embellishments to modify function. In these architectures, aggregation occurs most often at the edges, top or bottom of the beta-sheets. Information on structural variability across domain superfamilies has been made available through the CATH Dictionary of Homologous Structures (DHS).
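
The "twofold or more" criterion mentioned in the abstract can be illustrated with a small sketch. This is not the 2DSEC algorithm, which the abstract does not describe in detail; it only shows one plausible way to flag a superfamily once secondary structure counts per relative are known. The function names, the use of the smallest relative as a stand-in for the conserved core, and the example counts are all assumptions made for illustration.

```python
def fold_increase(sse_counts):
    """Ratio of the largest to the smallest secondary structure count
    among the relatives of one superfamily.

    Assumption: the smallest relative approximates the conserved core.
    """
    smallest = min(sse_counts)
    if smallest <= 0:
        raise ValueError("each relative needs at least one secondary structure")
    return max(sse_counts) / smallest


def has_embellishment(sse_counts, threshold=2.0):
    """True if some relative has at least `threshold` times as many
    secondary structures as the smallest relative."""
    return fold_increase(sse_counts) >= threshold


# Hypothetical counts, not taken from CATH.
example = {
    "superfamily_A": [5, 6, 5, 11],
    "superfamily_B": [8, 9, 8, 10],
}
flags = {name: has_embellishment(counts) for name, counts in example.items()}
```

With these made-up counts, only the first superfamily would be flagged, since one of its relatives has more than twice as many secondary structures as the smallest one.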

Subjects

Databases, Protein; Protein Structure, Tertiary; Proteins/chemistry; Proteins/classification; Amino Acid Sequence; Azurin/chemistry; Carbohydrates/chemistry; Conserved Sequence; Galectins/chemistry; Laccase/chemistry; Mutation/genetics; Protein Structure, Secondary; Structural Homology, Protein

The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis.

Pearl, Frances; Todd, Annabel; Sillitoe, Ian; Dibley, Mark; Redfern, Oliver; Lewis, Tony; Bennett, Christopher; Marsden, Russell; Grant, Alistair; Lee, David; Akpor, Adrian; Maibaum, Michael; Harrison, Andrew; Dallman, Timothy; Reeves, Gabrielle; Diboun, Ilhem; Addou, Sarah; Lise, Stefano; Johnston, Caroline; Sillero, Antonio; Thornton, Janet; Orengo, Christine.

Nucleic Acids Res ; 33(Database issue): D247-51, 2005 Jan 01.

Article in English | MEDLINE | ID: mdl-15608188

ABSTRACT

The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 43,229 domains classified into 1467 superfamilies and 5107 sequence families. Each structural family is expanded with sequence relatives from GenBank and completed genomes, using a variety of efficient sequence search protocols and reliable thresholds. This extended CATH protein family database contains 616,470 domain sequences classified into 23,876 sequence families. This results in the significant expansion of the CATH HMM model library to include models built from the CATH sequence relatives, giving a 10% increase in coverage for detecting remote homologues. An improved Dictionary of Homologous superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/) containing specific sequence, structural and functional information for each superfamily in CATH considerably assists manual validation of homologues. Information on sequence relatives in CATH superfamilies, GenBank and completed genomes is presented in the CATH associated DHS and Gene3D resources. Domain partnership information can be obtained from Gene3D (http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/). A new CATH server has been implemented (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl) providing automatic classification of newly determined sequences and structures using a suite of rapid sequence and structure comparison methods. The statistical significance of matches is assessed and links are provided to the putative superfamily or fold group to which the query sequence or structure is assigned.
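
To put the figures quoted in the abstract in proportion, the average family sizes can be worked out directly. The sketch below uses only the counts stated above; the variable names are illustrative, and the averages are simple ratios that say nothing about how unevenly the domains are distributed across families.

```python
# Counts quoted in the abstract.
structural_domains = 43_229
superfamilies = 1_467
structural_sequence_families = 5_107

extended_domain_sequences = 616_470
extended_sequence_families = 23_876

# Simple means; real family sizes are likely to vary widely around these values.
domains_per_superfamily = structural_domains / superfamilies
domains_per_structural_family = structural_domains / structural_sequence_families
sequences_per_extended_family = extended_domain_sequences / extended_sequence_families

print(f"Domains per superfamily: {domains_per_superfamily:.1f}")
print(f"Domains per sequence family: {domains_per_structural_family:.1f}")
print(f"Sequences per extended sequence family: {sequences_per_extended_family:.1f}")
```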

Subjects

Databases, Nucleic Acid; Databases, Protein; Genomics; Protein Structure, Tertiary; Proteins/classification; Sequence Analysis, Protein; Databases, Protein/statistics & numerical data; Internet; Proteins/genetics; Sequence Homology, Amino Acid; Systems Integration; User-Computer Interface
