CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure.

Janda, Jan-Oliver; Busch, Markus; Kück, Fabian; Porfenenko, Mikhail; Merkl, Rainer

Affiliation

Janda JO; Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040 Regensburg, Germany. Rainer.Merkl@biologie.uni-regensburg.de

BMC Bioinformatics ; 13: 55, 2012 Apr 05.

Article in En | MEDLINE | ID: mdl-22480135

ABSTRACT

BACKGROUND:

One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally important sites. However, on the basis of sequence information alone, many functionally and structurally important sites are hard to distinguish, and consequently a large number of incorrectly predicted functional sites has to be expected. We therefore set out to design a new classifier that differentiates between functionally and structurally important sites and to assess its performance on representative datasets.

RESULTS:

We have implemented CLIPS-1D, which predicts a role in catalysis, ligand-binding, or protein structure for residue-positions in a mutually exclusive manner. By analyzing a multiple sequence alignment, the algorithm scores conservation as well as abundance of residues at individual sites and in their local neighborhood, and categorizes each site by means of a multiclass support vector machine. A cross-validation confirmed that residue-positions involved in catalysis were identified with state-of-the-art quality; the mean MCC-value was 0.34. For structurally important sites, prediction quality was considerably higher (mean MCC = 0.67). For ligand-binding sites, prediction quality was lower (mean MCC = 0.12), because binding sites and structurally important residue-positions share conservation and abundance values, which makes their separation difficult. We show that classification success varies for residues in a class-specific manner. This is why our algorithm computes residue-specific p-values, which allow for the statistical assessment of each individual prediction. CLIPS-1D is available as a Web service at http://www-bioinf.uni-regensburg.de/.
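
The MCC values quoted above are Matthews correlation coefficients. As a minimal sketch of how such a per-class value can be computed, the following Python snippet evaluates the standard MCC formula for one class against all others. The function names `mcc` and `one_vs_rest_mcc`, the label strings, and the convention of returning 0.0 for an undefined denominator are assumptions made for illustration; the abstract does not state how the authors aggregated counts across proteins or cross-validation folds.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (an assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def one_vs_rest_mcc(true_labels, predicted_labels, target):
    """MCC for one class of a multiclass prediction, treating every
    other class as the negative class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == target and truth == target:
            tp += 1
        elif pred == target:
            fp += 1
        elif truth == target:
            fn += 1
        else:
            tn += 1
    return mcc(tp, tn, fp, fn)


if __name__ == "__main__":
    # Hypothetical labels for six residue-positions (not data from the paper).
    truth = ["catalytic", "ligand", "structure", "none", "structure", "none"]
    preds = ["catalytic", "structure", "structure", "none", "structure", "none"]
    for cls in ("catalytic", "ligand", "structure"):
        print(cls, round(one_vs_rest_mcc(truth, preds, cls), 3))
```

An MCC of 1 indicates perfect agreement, 0 indicates no better than chance, and -1 indicates complete disagreement, which is why the measure is commonly used when the positive class (for example, catalytic residues) is much rarer than the negative class.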

CONCLUSIONS:

CLIPS-1D is a classifier whose prediction quality has been determined separately for catalytic sites, ligand-binding sites, and structurally important sites. It generates hypotheses about residue-positions important for a set of homologous proteins and focuses on conservation and abundance signals. Thus, the algorithm can be applied in cases where function cannot be transferred from well-characterized proteins by means of sequence comparison.
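
To illustrate what conservation and abundance signals derived from a multiple sequence alignment can look like, here is a minimal Python sketch. It is not the CLIPS-1D scoring scheme, which the abstract does not specify: the entropy-based conservation score, the definition of abundance as the frequency of the query residue in its column, the window radius, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_scores(msa, col, query_index=0):
    """Return (conservation, abundance) for one alignment column.

    conservation: 1 minus the normalized Shannon entropy of the column
                  (assumed measure; gaps and unknown symbols are ignored).
    abundance:    fraction of sequences sharing the query's residue.
    """
    column = [seq[col] for seq in msa if seq[col] in AMINO_ACIDS]
    if not column:
        return 0.0, 0.0
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    conservation = 1.0 - entropy / math.log2(len(AMINO_ACIDS))
    abundance = counts.get(msa[query_index][col], 0) / n
    return conservation, abundance


def neighborhood_conservation(msa, col, radius=1):
    """Mean conservation of the columns within `radius` of `col`,
    excluding `col` itself (a simple stand-in for a local neighborhood)."""
    length = len(msa[0])
    neighbors = [c for c in range(max(0, col - radius),
                                  min(length, col + radius + 1)) if c != col]
    if not neighbors:
        return 0.0
    return sum(column_scores(msa, c)[0] for c in neighbors) / len(neighbors)


if __name__ == "__main__":
    # Toy alignment of four equally long sequences (hypothetical data).
    msa = ["ACDK", "ACEK", "ACDR", "AGD-"]
    for col in range(len(msa[0])):
        cons, abun = column_scores(msa, col)
        print(col, round(cons, 3), round(abun, 3),
              round(neighborhood_conservation(msa, col), 3))
```

Feature vectors of this kind, one per residue-position, are the sort of input a multiclass support vector machine could be trained on; the features actually used by CLIPS-1D are described in the full article.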

Subjects

Algorithms; Sequence Alignment/methods; Support Vector Machine; Binding Sites; Catalysis; Glycerophosphates/metabolism; Indole-3-Glycerol-Phosphate Synthase/chemistry; Indole-3-Glycerol-Phosphate Synthase/metabolism; Internet; Ligands; Models, Molecular; Proteins/chemistry; Proteins/metabolism; Sulfolobus solfataricus/enzymology

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Sequence Alignment / Support Vector Machine Type of study: Prognostic_studies Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2012 Document type: Article Country of affiliation: Germany
