Predicting protein-protein interactions from sequence using correlation coefficient and high-quality interaction dataset.

Shi, Ming-Guang; Xia, Jun-Feng; Li, Xue-Ling; Huang, De-Shuang

Affiliation

Shi MG; Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, 230031 Hefei, China.

Amino Acids; 38(3): 891-9, 2010 Mar.

Article in En | MEDLINE | ID: mdl-19387790

ABSTRACT

Identifying protein-protein interactions (PPIs) is critical for understanding the cellular function of proteins and the machinery of a proteome. PPI data derived from high-throughput technologies are often incomplete and noisy. Therefore, it is important to develop both computational methods and high-quality interaction datasets for predicting PPIs. A sequence-based method is proposed that combines a correlation coefficient (CC) transformation with a support vector machine (SVM). The CC transformation not only accounts for the neighboring effect within a protein sequence but also describes the level of correlation between two protein sequences. A gold standard positives (interacting) dataset, MIPS Core, and a gold standard negatives (non-interacting) dataset, GO-NEG, of the yeast Saccharomyces cerevisiae were mined to evaluate the method objectively and attenuate bias. The SVM model combined with the CC transformation yielded the best performance, with an accuracy of 87.94% on the gold standard positives and gold standard negatives datasets. The MATLAB source code and the datasets are available on request from smgsmg@mail.ustc.edu.cn.
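
The abstract does not define the CC transformation, so the following Python sketch is only one plausible reading of its "neighboring effect" component, not the authors' MATLAB implementation. It assumes each residue is mapped to a single physicochemical value (the Kyte-Doolittle hydrophobicity scale is used here purely for illustration) and that the Pearson correlation between a sequence and a lagged copy of itself is taken for a few lags. The names `pearson`, `lagged_cc`, `pair_features` and the parameter `max_lag` are hypothetical. The between-sequence correlation mentioned in the abstract is not attempted here.

```python
from math import sqrt

# Kyte-Doolittle hydrophobicity values, used only as an illustrative descriptor.
# Assumption: the abstract does not say which descriptors the authors used.
HYDROPHOBICITY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_cc(sequence, max_lag=3, scale=HYDROPHOBICITY):
    """Correlate residue values with the values `lag` positions downstream.

    Returns one coefficient per lag in 1..max_lag, so proteins of any length
    map to a fixed-length vector. Raises KeyError on non-standard residues.
    """
    values = [scale[aa] for aa in sequence]
    return [pearson(values[:-lag], values[lag:]) for lag in range(1, max_lag + 1)]


def pair_features(seq_a, seq_b, max_lag=3):
    """Concatenate both proteins' descriptors into one vector for a classifier."""
    return lagged_cc(seq_a, max_lag) + lagged_cc(seq_b, max_lag)
```

Under these assumptions, each protein pair becomes a fixed-length vector (here 2 × `max_lag` values) that could be passed to any SVM implementation, with pairs from MIPS Core labeled as positives and pairs from GO-NEG as negatives.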

Subjects

Amino Acids/chemistry; Protein Interaction Mapping; Proteome/chemistry; Proteome/metabolism; Saccharomyces cerevisiae Proteins/chemistry; Saccharomyces cerevisiae Proteins/metabolism; Algorithms; Amino Acid Sequence; Artificial Intelligence; Bacterial Proteins/chemistry; Bacterial Proteins/metabolism; Computational Biology/methods; Data Mining; Databases, Protein; Helicobacter pylori; Models, Biological; Protein Binding; Proteomics/methods

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Proteome / Saccharomyces cerevisiae Proteins / Protein Interaction Mapping / Amino Acids | Type of study: Evaluation_studies / Prognostic_studies / Risk_factors_studies | Language: En | Journal: Amino Acids | Journal subject: BIOCHEMISTRY | Year of publication: 2010 | Document type: Article | Country of affiliation: China