Dynamics of domain coverage of the protein sequence universe.

Rekapalli, Bhanu; Wuichet, Kristin; Peterson, Gregory D; Zhulin, Igor B

Affiliation

Rekapalli B; Joint Institute for Computational Sciences, Oak Ridge National Laboratory - University of Tennessee, Oak Ridge, TN 37831, USA.

BMC Genomics ; 13: 634, 2012 Nov 16.

Article in English | MEDLINE | ID: mdl-23157439

ABSTRACT

BACKGROUND: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its "dark matter". RESULTS: Here we suggest that the true size of "dark matter" is much larger than stated by current definitions. We propose an approach to reducing the size of "dark matter" by identifying and subtracting regions in protein sequences that are not likely to contain any domain. CONCLUSIONS: Recent improvements in computational domain modeling result in a decrease, albeit a slow one, in the relative size of "dark matter"; however, its absolute size increases substantially with the growth of sequence data.

Subjects

Computational Biology/methods; Databases, Protein; Proteins/chemistry; Humans; Protein Structure, Tertiary; Proteins/metabolism

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins / Computational Biology / Databases, Protein Limits: Humans Language: En Journal: BMC Genomics Journal subject: GENETICS Year of publication: 2012 Document type: Article Country of affiliation: United States Country of publication: United Kingdom
