A high-throughput predictive method for sequence-similar fold switchers.
Biopolymers
; 112(10): e23416, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33462801
Although most experimentally characterized proteins with similar sequences assume the same folds and perform similar functions, an increasing number of exceptions is emerging. One class of exceptions comprises sequence-similar fold switchers, whose secondary structures shift from α-helix ↔ β-sheet through a small number of mutations, a sequence insertion, or a deletion. Predictive methods for identifying sequence-similar fold switchers are desirable because some are associated with disease and/or can perform different functions in cells. Here, we use homology-based secondary structure predictions to identify sequence-similar fold switchers from their amino acid sequences alone. To do this, we predicted the secondary structures of sequence-similar fold switchers using three different homology-based secondary structure predictors: PSIPRED, JPred4, and SPIDER3. We found that α-helix ↔ β-strand prediction discrepancies from JPred4 discriminated between the different conformations of sequence-similar fold switchers with high statistical significance (P < 1.8 × 10⁻¹⁹). Thus, we used these discrepancies as a classifier and found that they can often robustly discriminate between sequence-similar fold switchers and sequence-similar proteins that maintain the same folds (Matthews Correlation Coefficient of 0.82). We found that JPred4 is a more robust predictor of sequence-similar fold switchers because of (a) the curated sequence database it uses to produce multiple sequence alignments and (b) its use of sequence profiles based on Hidden Markov Models. Our results indicate that inconsistencies between JPred4 secondary structure predictions can be used to identify some sequence-similar fold switchers from their sequences alone. Thus, the negative information from inconsistent secondary structure predictions can potentially be leveraged to identify sequence-similar fold switchers from the broad base of genomic sequences.
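To make the classification idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that two secondary structure predictions have already been obtained (for example from JPred4) and aligned position by position as strings over a three-state alphabet, with H for helix, E for strand, and any other character for coil. The function names and the `min_discrepancies` threshold are hypothetical; the abstract does not state the cutoff that was used. The Matthews Correlation Coefficient is computed from its standard confusion-matrix definition.

```python
from math import sqrt


def helix_strand_discrepancies(ss_a: str, ss_b: str) -> int:
    """Count aligned positions where one prediction is helix (H)
    and the other is strand (E). Coil and gap positions are ignored."""
    if len(ss_a) != len(ss_b):
        raise ValueError("predictions must be aligned to the same length")
    return sum(1 for a, b in zip(ss_a, ss_b) if {a, b} == {"H", "E"})


def is_fold_switcher(ss_a: str, ss_b: str, min_discrepancies: int = 5) -> bool:
    """Flag a sequence-similar pair as a putative fold switcher.
    The default threshold is an illustrative assumption, not a published value."""
    return helix_strand_discrepancies(ss_a, ss_b) >= min_discrepancies


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard Matthews Correlation Coefficient from confusion-matrix counts.
    Returns 0.0 when the denominator is zero, a common convention."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy aligned predictions for two sequence-similar variants (made-up strings).
    variant_1 = "CCHHHHHHHHCC"
    variant_2 = "CCEEEEEEHHCC"
    print(helix_strand_discrepancies(variant_1, variant_2))
    print(is_fold_switcher(variant_1, variant_2))
```

In practice the threshold would be chosen on a labeled set of fold-switching and non-switching pairs, and `matthews_corrcoef` could then be used to score the resulting classifier.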
Database: MEDLINE
Main subject: Proteins / Protein Folding
Study type: Prognostic_studies / Risk_factors_studies
Language: En
Journal: Biopolymers
Publication year: 2021
Document type: Article
Affiliation country: United States