Identification of residue pairing in interacting β-strands from a predicted residue contact map.
BMC Bioinformatics; 19(1): 146, 2018 04 19.
Article in English | MEDLINE | ID: mdl-29673311
ABSTRACT
BACKGROUND:
Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors. However, information on residue pairing in β-strands can still be extracted from a noisy contact map, owing to the characteristic contact patterns of β-β interactions. This information may benefit the tertiary structure prediction of mainly β proteins. In this work, we propose a novel ridge-detection-based β-β contact predictor to identify residue pairing in β-strands from any predicted residue contact map.
RESULTS:
Our algorithm, RDb2C, adopts ridge detection, a well-developed technique in computer image processing, to capture consecutive residue contacts, and then uses a novel multi-stage random forest framework to integrate the ridge information with additional features for prediction. Starting from the contact map predicted by CCMpred, RDb2C outperforms all state-of-the-art methods on two conventional test sets of β proteins (BetaSheet916 and BetaSheet1452), achieving F1-scores of ~62% and ~76% at the residue level and strand level, respectively. Taking the prediction of the more advanced RaptorX-Contact as input, RDb2C achieves higher performance, with F1-scores reaching ~76% and ~86% at the residue level and strand level, respectively. In a test of structural modeling using the top 1L predicted contacts as constraints, for 61 mainly β proteins, the average TM-score is 0.442 when using the raw RaptorX-Contact prediction, but increases to 0.506 when using the prediction improved by RDb2C.
CONCLUSION:
Our method can significantly improve the prediction of β-β contacts from any predicted residue contact map. The prediction results of our algorithm can be applied directly to facilitate practical structure prediction of mainly β proteins.
AVAILABILITY:
All source data and code are available at http://166.111.152.91/Downloads.html or on GitHub at https://github.com/wzmao/RDb2C .
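The abstract does not say which ridge detector RDb2C uses, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a common image-processing formulation: smooth the contact map, compute the Hessian at each pixel, and treat a strongly negative eigenvalue as evidence of a bright ridge. The function names (`toy_contact_map`, `gaussian_smooth`, `ridge_strength`) and the toy map are hypothetical and chosen purely for illustration.

```python
import numpy as np


def toy_contact_map(n=30):
    """Hypothetical toy map: one antiparallel strand pair shows up as an
    anti-diagonal streak of high contact probabilities."""
    cmap = np.zeros((n, n))
    for i in range(5, 13):
        cmap[i, n - i] = 0.9
        cmap[n - i, i] = 0.9  # contact maps are symmetric
    return cmap


def gaussian_smooth(image, sigma=1.0):
    """Separable Gaussian blur implemented with NumPy only."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    # Blur along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def ridge_strength(contact_map, sigma=1.0):
    """Return a non-negative ridge score per pixel.

    Assumption: a bright ridge has strongly negative curvature across its
    axis, so the score is minus the smaller Hessian eigenvalue, clipped at 0.
    """
    smoothed = gaussian_smooth(np.asarray(contact_map, dtype=float), sigma)
    gy, gx = np.gradient(smoothed)   # first derivatives (axis 0, axis 1)
    hyy, hyx = np.gradient(gy)       # second derivatives
    hxy, hxx = np.gradient(gx)
    hxy = 0.5 * (hxy + hyx)          # symmetrize the mixed term
    half_trace = 0.5 * (hxx + hyy)
    root = np.sqrt((0.5 * (hxx - hyy)) ** 2 + hxy ** 2)
    lambda_min = half_trace - root   # smaller eigenvalue of the 2x2 Hessian
    return np.maximum(-lambda_min, 0.0)


if __name__ == "__main__":
    cmap = toy_contact_map()
    ridge = ridge_strength(cmap)
    print("score on the strand pair:", ridge[8, 22])
    print("score on the background: ", ridge[2, 2])
```

In a pipeline like the one the abstract describes, a score of this kind would be one input feature among others for the downstream classifier; how RDb2C actually derives and combines its ridge features is documented in the authors' repository rather than here.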
Full text: 1
Database: MEDLINE
Main subject: Proteins / Computational Biology / Amino Acids
Type of study: Diagnostic_studies / Prognostic_studies / Risk_factors_studies
Language: En
Year of publication: 2018
Document type: Article