The evolution of contact prediction: evidence that contact selection in statistical contact prediction is changing.
Bioinformatics; 36(6): 1750-1756, 2020 Mar 01.
Article in English | MEDLINE | ID: mdl-31693112
MOTIVATION: Over the last few years, the field of protein structure prediction has been transformed by increasingly accurate contact prediction software. These methods are based on the detection of coevolutionary relationships between residues from multiple sequence alignments (MSAs). However, despite speculation, there is little evidence of a link between contact prediction and the physico-chemical interactions which drive amino-acid coevolution. Furthermore, existing protocols predict only a fraction of all protein contacts and it is not clear why some contacts are favoured over others. Using a dataset of 863 protein domains, we assessed the physico-chemical interactions of contacts predicted by CCMpred, MetaPSICOV and DNCON2, as examples of direct coupling analysis, meta-prediction and deep learning.

RESULTS: We considered correctly predicted contacts and compared their properties against the protein contacts that were not predicted. Predicted contacts tend to form more bonds than non-predicted contacts, which suggests these contacts may be more important than contacts that were not predicted. Comparing the contacts predicted by each method, we found that MetaPSICOV and DNCON2 favour accuracy, whereas CCMpred detects contacts with more bonds. This suggests that the push for higher accuracy may lead to a loss of physico-chemically important contacts. These results underscore the connection between protein physico-chemistry and the coevolutionary couplings that can be derived from MSAs. This relationship is likely to be relevant to protein structure prediction and functional analysis of protein structure and may be key to understanding their utility for different problems in structural biology.

AVAILABILITY AND IMPLEMENTATION: We use publicly available databases. Our code is available for download at https://opig.stats.ox.ac.uk/.

SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
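The comparison described under RESULTS can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`split_contacts`, `mean_bonds`, `precision`), the data layout and the toy numbers are all assumptions made for illustration. It assumes each native contact is a residue pair with a precomputed bond count, splits the native contacts into correctly predicted and non-predicted sets, and compares the mean bond count of the two sets alongside the predictor's precision.

```python
from statistics import mean


def normalise(pair):
    """Order a residue pair so (i, j) and (j, i) denote the same contact."""
    return tuple(sorted(pair))


def split_contacts(native_bonds, predicted_pairs):
    """Partition native contacts into correctly predicted and non-predicted.

    native_bonds: dict mapping (i, j) residue pairs to a bond count
                  (hypothetical layout, not taken from the paper).
    predicted_pairs: iterable of (i, j) pairs returned by a predictor.
    """
    predicted = {normalise(p) for p in predicted_pairs}
    native = {normalise(p): n for p, n in native_bonds.items()}
    hit = {p: n for p, n in native.items() if p in predicted}
    missed = {p: n for p, n in native.items() if p not in predicted}
    return hit, missed


def mean_bonds(contacts):
    """Mean bond count per contact; NaN for an empty set."""
    return mean(contacts.values()) if contacts else float("nan")


def precision(native_bonds, predicted_pairs):
    """Fraction of predicted pairs that are native contacts."""
    predicted = {normalise(p) for p in predicted_pairs}
    native = {normalise(p) for p in native_bonds}
    return len(predicted & native) / len(predicted) if predicted else float("nan")


# Toy, made-up data: four native contacts with bond counts.
native_bonds = {(1, 10): 3, (2, 15): 0, (5, 20): 2, (7, 30): 1}
# Three predictions; (3, 40) is not a native contact (false positive).
predicted_pairs = [(10, 1), (5, 20), (3, 40)]

hit, missed = split_contacts(native_bonds, predicted_pairs)
print("mean bonds, predicted contacts:    ", mean_bonds(hit))
print("mean bonds, non-predicted contacts:", mean_bonds(missed))
print("precision:", precision(native_bonds, predicted_pairs))
```

Running the same comparison once per predictor would give, for each method, a precision value and a mean bond count for its correctly predicted contacts, which is the kind of trade-off the abstract reports between accuracy and bond-rich contacts.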
Full text: 1
Database: MEDLINE
Main subject: Computational Biology / Sequence Analysis, Protein
Language: En
Publication year: 2020
Document type: Article