Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Resultados 1 - 2 de 2
Filtrar
Más filtros

Banco de datos
Tipo del documento
Publication year range
1.
Nucleic Acids Res ; 33(Web Server issue): W126-9, 2005 Jul 01.
Artículo en Inglés | MEDLINE | ID: mdl-15980440

RESUMEN

PROtein Domain Organization and Comparison (PRODOC) comprises several programs that enable convenient comparison of proteins as a sequence of domains. The in-built dataset currently consists of approximately 698 000 proteins from 192 organisms with complete genomic data, and all the SWISSPROT proteins obtained from the Pfam database. All the entries in PRODOC are represented as a sequence of functional domains, assigned using hidden Markov models, instead of as a sequence of amino acids. On average 69% of the proteins in the proteomes and 49% of the residues are covered by functional domain assignments. Software tools allow the user to query the dataset with a sequence of domains and identify proteins with the same or a jumbled or circularly permuted arrangement of domains. As it is proposed that proteins with jumbled or the same domain sequences have similar functions, this search tool is useful in assigning the overall function of a multi-domain protein. Unique features of PRODOC include the generation of alignments between multi-domain proteins on the basis of the sequence of domains and in-built information on distantly related domain families forming superfamilies. It is also possible using PRODOC to identify domain sharing and gene fusion events across organisms. An exhaustive genome-genome comparison tool in PRODOC also enables the detection of successive domain sharing and domain fusion events across two organisms. The tool permits the identification of gene clusters involved in similar biological processes in two closely related organisms. The URL for PRODOC is http://hodgkin.mbu.iisc.ernet.in/~prodoc.


Asunto(s)
Estructura Terciaria de Proteína , Programas Informáticos , Bases de Datos de Proteínas , Internet , Proteínas/clasificación , Proteínas/genética , Análisis de Secuencia de Proteína
2.
Nucleic Acids Res ; 30(1): 289-93, 2002 Jan 01.
Artículo en Inglés | MEDLINE | ID: mdl-11752317

RESUMEN

Members of a superfamily of proteins could result from divergent evolution of homologues with insignificant similarity in the amino acid sequences. A superfamily relationship is detected commonly after the three-dimensional structures of the proteins are determined using X-ray analysis or NMR. The SUPFAM database described here relates two homologous protein families in a multiple sequence alignment database of either known or unknown structure. The present release (1.1), which is the first version of the SUPFAM database, has been derived by analysing Pfam, which is one of the commonly used databases of multiple sequence alignments of homologous proteins. The first step in establishing SUPFAM is to relate Pfam families with the families in PALI, which is an alignment database of homologous proteins of known structure that is derived largely from SCOP. The second step involves relating Pfam families which could not be associated reliably with a protein superfamily of known structure. The profile matching procedure, IMPALA, has been used in these steps. The first step resulted in identification of 1280 Pfam families (out of 2697, i.e. 47%) which are related, either by close homologous connection to a SCOP family or by distant relationship to a SCOP family, potentially forming new superfamily connections. Using the profiles of 1417 Pfam families with apparently no structural information, an all-against-all comparison involving a sequence-profile match using IMPALA resulted in clustering of 67 homologous protein families of Pfam into 28 potential new superfamilies. Expansion of groups of related proteins of yet unknown structural information, as proposed in SUPFAM, should help in identifying 'priority proteins' for structure determination in structural genomics initiatives to expand the coverage of structural information in the protein sequence space. For example, we could assign 858 distinct Pfam domains in 2203 of the gene products in the genome of Mycobacterium tubercolosis. Fifty-one of these Pfam families of unknown structure could be clustered into 17 potentially new superfamilies forming good targets for structural genomics. SUPFAM database can be accessed at http://pauling.mbu.iisc.ernet.in/~supfam.


Asunto(s)
Bases de Datos de Proteínas , Genoma , Estructura Terciaria de Proteína , Proteínas/química , Proteínas/genética , Animales , Predicción , Genoma Bacteriano , Imagenología Tridimensional , Almacenamiento y Recuperación de la Información , Internet , Mycobacterium tuberculosis/genética , Proteínas/fisiología , Alineación de Secuencia , Homología de Secuencia de Aminoácido , Relación Estructura-Actividad
SELECCIÓN DE REFERENCIAS
Detalles de la búsqueda