Results 1 - 20 of 27
1.
Bioessays ; 45(9): e2300057, 2023 09.
Article in English | MEDLINE | ID: mdl-37431685

ABSTRACT

Fold-switching proteins, which remodel their secondary and tertiary structures in response to cellular stimuli, suggest a new view of protein fold space. For decades, experimental evidence has indicated that protein fold space is discrete: dissimilar folds are encoded by dissimilar amino acid sequences. Challenging this assumption, fold-switching proteins interconnect discrete groups of dissimilar protein folds, making protein fold space fluid. Three recent observations support the concept of fluid fold space: (1) some amino acid sequences interconvert between folds with distinct secondary structures, (2) some naturally occurring sequences have switched folds by stepwise mutation, and (3) fold switching is evolutionarily selected and likely confers advantage. These observations indicate that minor amino acid sequence modifications can transform protein structure and function. Consequently, proteomic structural and functional diversity may be expanded by alternative splicing, small nucleotide polymorphisms, post-translational modifications, and modified translation rates.


Subjects
Protein Folding; Proteomics; Models, Molecular; Proteins/metabolism; Amino Acid Sequence
2.
PLoS Genet ; 15(4): e1008092, 2019 04.
Article in English | MEDLINE | ID: mdl-31022184

ABSTRACT

Human leukocyte antigen (HLA) is a key genetic factor conferring risk of systemic lupus erythematosus (SLE), but precise independent localization of HLA effects is extremely challenging. As a result, the contributions of specific HLA alleles and amino-acid residues to the overall risk of SLE and to the risk of specific autoantibodies are far from completely understood. Here, we dissected (a) overall SLE association signals across HLA, (b) HLA-peptide interaction, and (c) residue-autoantibody association. Classical alleles, SNPs, and amino-acid residues of eight HLA genes were imputed across 4,915 SLE cases and 13,513 controls from Eastern Asia. We performed association followed by conditional analysis across HLA, assessing both overall SLE risk and risk of autoantibody production. DR15 alleles HLA-DRB1*15:01 (P = 1.4 × 10⁻²⁷, odds ratio (OR) = 1.57) and HLA-DQB1*06:02 (P = 7.4 × 10⁻²³, OR = 1.55) formed the most significant haplotype (OR = 2.33). Conditioned protein-residue signals were stronger than allele signals and mapped predominantly to HLA-DRB1 residue 13 (P = 2.2 × 10⁻⁷⁵) and its proxy position 11 (P = 1.1 × 10⁻⁶⁷), followed by HLA-DRB1-37 (P = 4.5 × 10⁻²⁴). After conditioning on HLA-DRB1, novel associations at HLA-A-70 (P = 1.4 × 10⁻⁸), HLA-DPB1-35 (P = 9.0 × 10⁻¹⁶), HLA-DQB1-37 (P = 2.7 × 10⁻¹⁴), and HLA-B-9 (P = 6.5 × 10⁻¹⁵) emerged. Together, these seven residues increased the proportion of explained heritability due to HLA to 2.6%. Risk residues for both overall disease and hallmark autoantibodies (i.e., nRNP: DRB1-11, P = 2.0 × 10⁻¹⁴; DRB1-13, P = 2.9 × 10⁻¹³; DRB1-30, P = 3.9 × 10⁻¹⁴) localized to the peptide-binding groove of HLA-DRB1. Enrichment for specific amino-acid characteristics in the peptide-binding groove correlated with overall SLE risk and with autoantibody presence. Risk residues had primarily negatively charged side-chains, in contrast with rheumatoid arthritis. We identified novel SLE signals in HLA Class I loci (HLA-A, HLA-B), and localized primary Class II signals to five residues in HLA-DRB1, HLA-DPB1, and HLA-DQB1. These findings provide insights about the mechanisms by which the risk residues interact with each other to produce autoantibodies and are involved in SLE pathophysiology.


Subjects
Amino Acid Sequence; Autoantibodies/immunology; Disease Susceptibility; Histocompatibility Antigens Class II/chemistry; Histocompatibility Antigens Class II/immunology; Histocompatibility Antigens Class I/chemistry; Histocompatibility Antigens Class I/immunology; Lupus Erythematosus, Systemic/etiology; Alleles; Amino Acid Substitution; Asian People; Female; Genetic Predisposition to Disease; Genetic Variation; Histocompatibility Antigens Class I/genetics; Histocompatibility Antigens Class II/genetics; Humans; Male; Odds Ratio; Polymorphism, Single Nucleotide
3.
Biopolymers ; 112(10): e23471, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34498740

ABSTRACT

Extant fold-switching proteins remodel their secondary structures and change their functions in response to cellular stimuli, regulating biological processes and affecting human health. Despite their biological importance, these proteins remain understudied. Predictive methods are needed to expedite the process of discovering and characterizing more of these shapeshifting proteins. Most previous approaches require a solved structure or all-atom simulations, greatly constraining their use. Here, we propose a high-throughput sequence-based method for predicting extant fold switchers that transition from α-helix in one conformation to β-strand in the other. This method leverages two previous observations: (a) α-helix ↔ β-strand prediction discrepancies from JPred4 are a robust predictor of fold switching, and (b) the fold-switching regions (FSRs) of some extant fold switchers have different secondary structure propensities when expressed by themselves (isolated FSRs) than when expressed within the context of their parent protein (contextualized FSRs). Combining these two observations, we ran JPred4 on 99 fold-switching proteins and found strong correspondence between predicted and experimentally observed α-helix ↔ β-strand discrepancies. To test the overall robustness of this finding, we randomly selected regions of proteins not expected to switch folds (single-fold proteins) and found significantly fewer predicted α-helix ↔ β-strand discrepancies. Combining these discrepancies with the overall percentage of predicted secondary structure, we developed a classifier to identify extant fold switchers (Matthews correlation coefficient of 0.71). Although this classifier had a high false-negative rate (7/17), its false-positive rate was very low (2/136), suggesting that it can be used to predict a subset of extant fold switchers from a multitude of available genomic sequences.
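
As an illustration of how a Matthews correlation coefficient (MCC) is obtained from a confusion matrix, the sketch below plugs in counts derived from the quoted error rates. It assumes, purely for illustration, that the 7/17 false negatives and the 2/136 false positives come from a single evaluation set; the abstract does not state this, and under that assumption the value comes out near 0.67 rather than the reported 0.71, so the published figure was presumably computed on a different split.

```python
import math

def matthews_cc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Assumed counts: 17 fold switchers (7 missed), 136 single folders (2 flagged).
mcc = matthews_cc(tp=10, fp=2, tn=134, fn=7)
print(round(mcc, 2))
```

Unlike plain accuracy, the MCC stays informative when the classes are as unbalanced as they are here (17 positives against 136 negatives).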


Subjects
Protein Folding; Proteins; Humans; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Structure, Secondary
4.
Biopolymers ; 112(10): e23416, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33462801

ABSTRACT

Although most experimentally characterized proteins with similar sequences assume the same folds and perform similar functions, an increasing number of exceptions is emerging. One class of exceptions comprises sequence-similar fold switchers, whose secondary structures shift between α-helix and β-sheet through a small number of mutations, a sequence insertion, or a deletion. Predictive methods for identifying sequence-similar fold switchers are desirable because some are associated with disease and/or can perform different functions in cells. Here, we use homology-based secondary structure predictions to identify sequence-similar fold switchers from their amino acid sequences alone. To do this, we predicted the secondary structures of sequence-similar fold switchers using three different homology-based secondary structure predictors: PSIPRED, JPred4, and SPIDER3. We found that α-helix ↔ β-strand prediction discrepancies from JPred4 discriminated between the different conformations of sequence-similar fold switchers with high statistical significance (P < 1.8 × 10⁻¹⁹). Thus, we used these discrepancies as a classifier and found that they can often robustly discriminate between sequence-similar fold switchers and sequence-similar proteins that maintain the same folds (Matthews Correlation Coefficient of 0.82). We found that JPred4 is a more robust predictor of sequence-similar fold switchers because of (a) the curated sequence database it uses to produce multiple sequence alignments and (b) its use of sequence profiles based on Hidden Markov Models. Our results indicate that inconsistencies between JPred4 secondary structure predictions can be used to identify some sequence-similar fold switchers from their sequences alone. Thus, the negative information from inconsistent secondary structure predictions can potentially be leveraged to identify sequence-similar fold switchers from the broad base of genomic sequences.
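
The idea of a helix-versus-strand prediction discrepancy can be made concrete with a short sketch. It is not the authors' code: the function name, the three-state alphabet (H = helix, E = strand, C = coil), and the two example strings are hypothetical, and the predictions are assumed to be already aligned position by position.

```python
def helix_strand_discrepancies(pred_a: str, pred_b: str) -> int:
    """Count aligned positions where one prediction is helix (H) and the other strand (E)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must be aligned to the same length")
    return sum(1 for a, b in zip(pred_a, pred_b) if {a, b} == {"H", "E"})

# Hypothetical three-state predictions for two sequence-similar variants.
variant_1 = "CCHHHHHHHHCCEEEECC"
variant_2 = "CCEEEEECCCCCEEEECC"
n = helix_strand_discrepancies(variant_1, variant_2)
print(n, n / len(variant_1))
```

Positions where helix or strand is swapped for coil are deliberately ignored, since only direct helix ↔ strand disagreements are treated as the signal of interest; a threshold on the count or fraction would then turn it into a classifier.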


Subjects
Protein Folding; Proteins; Amino Acid Sequence; Protein Structure, Secondary; Sequence Alignment
5.
Proc Natl Acad Sci U S A ; 115(23): 5968-5973, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29784778

ABSTRACT

A central tenet of biology is that globular proteins have a unique 3D structure under physiological conditions. Recent work has challenged this notion by demonstrating that some proteins switch folds, a process that involves remodeling of secondary structure in response to a few mutations (evolved fold switchers) or cellular stimuli (extant fold switchers). To date, extant fold switchers have been viewed as rare byproducts of evolution, but their frequency has been neither quantified nor estimated. By systematically and exhaustively searching the Protein Data Bank (PDB), we found ∼100 extant fold-switching proteins. Furthermore, we gathered multiple lines of evidence suggesting that these proteins are widespread in nature. Based on these lines of evidence, we hypothesized that the frequency of extant fold-switching proteins may be underrepresented by the structures in the PDB. Thus, we sought to identify other putative extant fold switchers with only one solved conformation. To do this, we identified two characteristic features of our ∼100 extant fold-switching proteins, incorrect secondary structure predictions and likely independent folding cooperativity, and searched the PDB for other proteins with similar features. Reassuringly, this method identified dozens of other proteins in the literature with indication of a structural change but only one solved conformation in the PDB. Thus, we used it to estimate that 0.5-4% of PDB proteins switch folds. These results demonstrate that extant fold-switching proteins are likely more common than the PDB reflects, which has implications for cell biology, genomics, and human health.


Subjects
Protein Folding; Protein Structure, Secondary; Proteins/chemistry; Proteins/metabolism; Databases, Protein; Genomics; Humans; Hydrophobic and Hydrophilic Interactions; Models, Molecular
6.
Biopolymers ; 112(10): e23478, 2021 10.
Article in English | MEDLINE | ID: mdl-34694634
7.
Biophys J ; 108(1): 154-62, 2015 Jan 06.
Article in English | MEDLINE | ID: mdl-25564862

ABSTRACT

Metamorphic proteins, including proteins with high levels of sequence identity but different folds, are exceptions to the long-standing rule-of-thumb that proteins with as little as 30% sequence identity adopt the same fold. Which topologies can be bridged by these highly identical sequences remains an open question. Here we bridge two 3-α-helix bundle proteins with two radically different folds. Using a straightforward approach, we engineered the sequences of one subdomain within maltose binding protein (MBP, α/β/α-sandwich) and another within outer surface protein A (OspA, β-sheet) to have high sequence identity (80 and 77%, respectively) with engineered variants of protein G (GA, 3-α-helix bundle). Circular dichroism and nuclear magnetic resonance spectra of all engineered variants demonstrate that they maintain their native conformations despite substantial sequence modification. Furthermore, the MBP variant (80% identical to GA) remained active. Thermodynamic analysis of numerous GA and MBP variants suggests that the key to our approach involved stabilizing the modified MBP and OspA subdomains via external interactions with neighboring substructures, indicating that subdomain interactions can stabilize alternative folds over a broad range of sequence variation. These findings suggest that it is possible to bridge one fold with many other topologies, which has implications for protein folding, evolution, and misfolding diseases.


Subjects
Antigens, Surface/chemistry; Bacterial Outer Membrane Proteins/chemistry; Bacterial Vaccines/chemistry; Lipoproteins/chemistry; Maltose-Binding Proteins/chemistry; Protein Folding; Antigens, Surface/genetics; Bacterial Outer Membrane Proteins/genetics; Bacterial Vaccines/genetics; Circular Dichroism; Lipoproteins/genetics; Maltose-Binding Proteins/genetics; Models, Molecular; Mutation; Nuclear Magnetic Resonance, Biomolecular; Protein Stability; Protein Structure, Secondary; Sequence Homology, Amino Acid; Thermodynamics
8.
Proc Natl Acad Sci U S A ; 109(24): 9420-5, 2012 Jun 12.
Article in English | MEDLINE | ID: mdl-22635268

ABSTRACT

Protein domains are conspicuous structural units in globular proteins, and their identification has been a topic of intense biochemical interest dating back to the earliest crystal structures. Numerous disparate domain identification algorithms have been proposed, all involving some combination of visual intuition and/or structure-based decomposition. Instead, we present a rigorous, thermodynamically-based approach that redefines domains as cooperative chain segments. In greater detail, most small proteins fold with high cooperativity, meaning that the equilibrium population is dominated by completely folded and completely unfolded molecules, with a negligible subpopulation of partially folded intermediates. Here, we redefine structural domains in thermodynamic terms as cooperative folding units, based on m-values, which measure the cooperativity of a protein or its substructures. In our analysis, a domain is equated to a contiguous segment of the folded protein whose m-value is largely unaffected when that segment is excised from its parent structure. Defined in this way, a domain is a self-contained cooperative unit; i.e., its cooperativity depends primarily upon intrasegment interactions, not intersegment interactions. Implementing this concept computationally, the domains in a large representative set of proteins were identified; all exhibit consistency with experimental findings. Specifically, our domain divisions correspond to the experimentally determined equilibrium folding intermediates in a set of nine proteins. The approach was also proofed against a representative set of 71 additional proteins, again with confirmatory results. Our reframed interpretation of a protein domain transforms an indeterminate structural phenomenon into a quantifiable molecular property grounded in solution thermodynamics.


Subjects
Proteins/chemistry; Thermodynamics; Algorithms; Models, Molecular; Protein Conformation
9.
Proc Natl Acad Sci U S A ; 108(1): 109-13, 2011 Jan 04.
Article in English | MEDLINE | ID: mdl-21148101

ABSTRACT

A protein backbone has two degrees of conformational freedom per residue, described by its ϕ,ψ-angles. Accordingly, the energy landscape of a blocked peptide unit can be mapped in two dimensions, as shown by Ramachandran, Sasisekharan, and Ramakrishnan almost half a century ago. With atoms approximated as hard spheres, the eponymous Ramachandran plot demonstrated that steric clashes alone eliminate 3/4 of ϕ,ψ-space, a result that has guided all subsequent work. Here, we show that adding hydrogen-bonding constraints to these steric criteria eliminates another substantial region of ϕ,ψ-space for a blocked peptide; for conformers within this region, an amide hydrogen is solvent-inaccessible, depriving it of a hydrogen-bonding partner. Yet, this "forbidden" region is well populated in folded proteins, which can provide longer-range intramolecular hydrogen-bond partners for these otherwise unsatisfied polar groups. Consequently, conformational space expands under folding conditions, a paradigm-shifting realization that prompts an experimentally verifiable conjecture about likely folding pathways.
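
For readers unfamiliar with how a ϕ or ψ value is obtained, the following sketch computes a signed dihedral angle from four atomic positions. It is a generic geometric illustration rather than anything taken from the paper: ϕ is conventionally measured over the atoms C(i-1), N(i), Cα(i), C(i) and ψ over N(i), Cα(i), C(i), N(i+1), while the function name and the test coordinates are made up.

```python
import math

def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def dot(a, b):
        return sum(a[i] * b[i] for i in range(3))

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)  # normals of the two planes
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the fourth point is rotated a quarter turn about the z axis.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(angle)
```

Applying this function to every residue of a structure yields the points that populate a Ramachandran plot.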


Subjects
Amides/chemistry; Models, Molecular; Protein Conformation; Protein Folding; Amides/metabolism; Databases, Protein; Hydrogen Bonding; Molecular Dynamics Simulation
10.
Curr Opin Struct Biol ; 86: 102807, 2024 06.
Article in English | MEDLINE | ID: mdl-38537533

ABSTRACT

In the last two decades, our existing notion that most foldable proteins have a unique native state has been challenged by the discovery of metamorphic proteins, which reversibly interconvert between multiple, sometimes highly dissimilar, native states. As the number of known metamorphic proteins increases, several computational and experimental strategies have emerged for gaining insights about their refolding processes and identifying unknown metamorphic proteins amongst the known proteome. In this review, we describe the current advances in biophysically and functionally ascertaining the structural interconversions of metamorphic proteins and how coevolution can be harnessed to identify novel metamorphic proteins from sequence information. We also discuss the challenges and ongoing efforts in using artificial intelligence-based protein structure prediction methods to discover metamorphic proteins and predict their corresponding three-dimensional structures.


Subjects
Protein Folding; Proteins; Proteins/chemistry; Proteins/metabolism; Protein Conformation; Models, Molecular; Humans; Artificial Intelligence
11.
bioRxiv ; 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38313252

ABSTRACT

Though typically associated with a single folded state, some globular proteins remodel their secondary and/or tertiary structures in response to cellular stimuli. AlphaFold2 [1] (AF2) readily generates one dominant protein structure for these fold-switching (a.k.a. metamorphic) proteins [2], but it often fails to predict their alternative experimentally observed structures [3,4]. Wayment-Steele et al. steered AF2 to predict alternative structures of a few metamorphic proteins using a method they call AF-cluster [5]. However, their paper lacks some essential controls needed to assess AF-cluster's reliability. We find that these controls show AF-cluster to be a poor predictor of metamorphic proteins. First, closer examination of the paper's results reveals that random sequence sampling outperforms sequence clustering, challenging the claim that AF-cluster works by "deconvolving conflicting sets of couplings." Further, we observe that AF-cluster mistakes some single-folding KaiB homologs for fold switchers, a critical flaw bound to mislead users. Finally, proper error analysis reveals that AF-cluster predicts many correct structures with low confidence and some experimentally unobserved conformations with confidences similar to experimentally observed ones. For these reasons, we suggest using ColabFold-based [6] random sequence sampling [7], augmented by other predictive approaches, as a more accurate and less computationally intense alternative to AF-cluster.
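
Random sequence sampling can be pictured as drawing shallow subsets of a multiple sequence alignment and running a structure prediction on each subset. The sketch below covers only the sampling step and is an illustration under stated assumptions, not the authors' pipeline: `sample_msa`, the subset size, the number of draws, and the toy alignment are all hypothetical, and the prediction itself (for example a ColabFold run) is left out.

```python
import random

def sample_msa(msa, n_seqs, n_samples, seed=0):
    """Return n_samples shallow alignments: the query plus n_seqs random homologs each."""
    query, homologs = msa[0], msa[1:]
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    k = min(n_seqs, len(homologs))
    return [[query] + rng.sample(homologs, k) for _ in range(n_samples)]

# Toy alignment; the first entry is treated as the query sequence.
toy_msa = ["MKTAYIAK", "MKSAYLAK", "MRTAYIGK", "MKTGYIAR", "MQTAYVAK"]
subsets = sample_msa(toy_msa, n_seqs=2, n_samples=3)
for subset in subsets:
    print(subset)
```

Keeping the query as the first row of every subset mirrors the usual convention that the target sequence heads the alignment; unlike clustering, the homologs in each draw are chosen without regard to sequence similarity.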

12.
Protein Sci ; 32(12): e4836, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37953705

ABSTRACT

The program SSDraw generates publication-quality protein secondary structure diagrams from three-dimensional protein structures. To depict relationships between secondary structure and other protein features, diagrams can be colored by conservation score, B-factor, or custom scoring. Diagrams of homologous proteins can be registered according to an input multiple sequence alignment. Linear visualization allows the user to stack registered diagrams, facilitating comparison of secondary structure and other properties among homologous proteins. SSDraw can be used to compare secondary structures of homologous proteins with both conserved and divergent folds. It can also generate one secondary structure diagram from an input protein structure of interest. The source code can be downloaded (https://github.com/ncbi/SSDraw) and run locally for rapid structure generation, while a Google Colab notebook allows easy use.


Subjects
Proteins; Software; Proteins/chemistry; Protein Structure, Secondary; Sequence Alignment
13.
bioRxiv ; 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37786684

ABSTRACT

The program SSDraw generates publication-quality protein secondary structure diagrams from three-dimensional protein structures. To depict relationships between secondary structure and other protein features, diagrams can be colored by conservation score, B-factor, or custom scoring. Diagrams of homologous proteins can be registered according to an input multiple sequence alignment. Linear visualization allows the user to stack registered diagrams, facilitating comparison of secondary structure and other properties among homologous proteins. SSDraw can be used to compare secondary structures of homologous proteins with both conserved and divergent folds. It can also generate one secondary structure diagram from an input protein structure of interest. The source code can be downloaded (https://github.com/ethanchen1301/SSDraw) and run locally for rapid structure generation, while a Google Colab notebook allows easy use.

14.
Nat Commun ; 14(1): 5478, 2023 09 06.
Article in English | MEDLINE | ID: mdl-37673981

ABSTRACT

Although most globular proteins fold into a single stable structure, an increasing number have been shown to remodel their secondary and tertiary structures in response to cellular stimuli. State-of-the-art algorithms predict that these fold-switching proteins adopt only one stable structure, missing their functionally critical alternative folds. Why these algorithms predict a single fold is unclear, but all of them infer protein structure from coevolved amino acid pairs. Here, we hypothesize that coevolutionary signatures are being missed. Suspecting that single-fold variants could be masking these signatures, we developed an approach, called Alternative Contact Enhancement (ACE), to search both highly diverse protein superfamilies (composed of single-fold and fold-switching variants) and protein subfamilies with more fold-switching variants. ACE successfully revealed coevolution of amino acid pairs uniquely corresponding to both conformations of 56/56 fold-switching proteins from distinct families. Then, we used ACE-derived contacts to (1) predict two experimentally consistent conformations of a candidate protein with unsolved structure and (2) develop a blind prediction pipeline for fold-switching proteins. The discovery of widespread dual-fold coevolution indicates that fold-switching sequences have been preserved by natural selection, implying that their functionalities provide evolutionary advantage and paving the way for predictions of diverse protein structures from single sequences.


Subjects
Amino Acids; Biological Evolution; Humans; Algorithms
15.
bioRxiv ; 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36789442

ABSTRACT

Although most globular proteins fold into a single stable structure [1], an increasing number have been shown to remodel their secondary and tertiary structures in response to cellular stimuli [2]. State-of-the-art algorithms [3-5] predict that these fold-switching proteins assume only one stable structure [6,7], missing their functionally critical alternative folds. Why these algorithms predict a single fold is unclear, but all of them infer protein structure from coevolved amino acid pairs. Here, we hypothesize that coevolutionary signatures are being missed. Suspecting that over-represented single-fold sequences may be masking these signatures, we developed an approach to search both highly diverse protein superfamilies (composed of single-fold and fold-switching variants) and protein subfamilies with more fold-switching variants. This approach successfully revealed coevolution of amino acid pairs uniquely corresponding to both conformations of 56/58 fold-switching proteins from distinct families. Then, using a set of coevolved amino acid pairs predicted by our approach, we successfully biased AlphaFold2 [5] to predict two experimentally consistent conformations of a candidate protein with unsolved structure. The discovery of widespread dual-fold coevolution indicates that fold-switching sequences have been preserved by natural selection, implying that their functionalities provide evolutionary advantage and paving the way for predictions of diverse protein structures from single sequences.

16.
Protein Sci ; 32(3): e4596, 2023 03.
Article in English | MEDLINE | ID: mdl-36782353

ABSTRACT

Though many folded proteins assume one stable structure that performs one function, a small-but-increasing number remodel their secondary and tertiary structures and change their functions in response to cellular stimuli. These fold-switching proteins regulate biological processes and are associated with autoimmune dysfunction, severe acute respiratory syndrome coronavirus-2 infection, and more. Despite their biological importance, it is difficult to computationally predict fold switching. With the aim of advancing computational prediction and experimental characterization of fold switchers, this review discusses several features that distinguish fold-switching proteins from their single-fold and intrinsically disordered counterparts. First, the isolated structures of fold switchers are less stable and more heterogeneous than single folders but more stable and less heterogeneous than intrinsically disordered proteins (IDPs). Second, the sequences of single fold, fold switching, and intrinsically disordered proteins can evolve at distinct rates. Third, proteins from these three classes are best predicted using different computational techniques. Finally, late-breaking results suggest that single folders, fold switchers, and IDPs have distinct patterns of residue-residue coevolution. The review closes by discussing high-throughput and medium-throughput experimental approaches that might be used to identify new fold-switching proteins.


Subjects
COVID-19; Intrinsically Disordered Proteins; Humans; Intrinsically Disordered Proteins/chemistry; Protein Folding; Models, Molecular
17.
Nat Commun ; 14(1): 3177, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37264049

ABSTRACT

Although homologous protein sequences are expected to adopt similar structures, some amino acid substitutions can interconvert α-helices and β-sheets. Such fold switching may have occurred over evolutionary history, but supporting evidence has been limited by (1) the abundance and diversity of sequenced genes, (2) the quantity of experimentally determined protein structures, and (3) the assumptions underlying the statistical methods used to infer homology. Here, we overcome these barriers by applying multiple statistical methods to a family of ~600,000 bacterial response regulator proteins. We find that their homologous DNA-binding subunits assume divergent structures: helix-turn-helix versus α-helix + β-sheet (winged helix). Phylogenetic analyses, ancestral sequence reconstruction, and AlphaFold2 models indicate that amino acid substitutions facilitated a switch from helix-turn-helix into winged helix. This structural transformation likely expanded DNA-binding specificity. Our approach uncovers an evolutionary pathway between two protein folds and provides a methodology to identify secondary structure switching in other protein families.


Subjects
Bacterial Proteins; DNA-Binding Proteins; DNA-Binding Proteins/metabolism; Phylogeny; Sequence Alignment; Sequence Homology, Amino Acid; Bacterial Proteins/metabolism; DNA/metabolism
18.
bioRxiv ; 2023 Nov 25.
Article in English | MEDLINE | ID: mdl-38076792

ABSTRACT

Though typically associated with a single folded state, globular proteins are dynamic and often assume alternative or transient structures important for their functions [1,2]. Wayment-Steele et al. steered ColabFold [3] to predict alternative structures of several proteins using a method they call AF-cluster [4]. They propose that AF-cluster "enables ColabFold to sample alternate states of known metamorphic proteins with high confidence" by first clustering multiple sequence alignments (MSAs) in a way that "deconvolves" coevolutionary information specific to different conformations and then using these clusters as input for ColabFold. Contrary to this Coevolution Assumption, clustered MSAs are not needed to make these predictions. Rather, these alternative structures can be predicted from single sequences and/or sequence similarity, indicating that coevolutionary information is unnecessary for predictive success and may not be used at all. These results suggest that AF-cluster's predictive scope is likely limited to sequences with distinct-yet-homologous structures within ColabFold's training set.

19.
bioRxiv ; 2023 Dec 13.
Article in English | MEDLINE | ID: mdl-38168383

ABSTRACT

Recent work suggests that AlphaFold2 (AF2), a deep learning-based model that can accurately infer protein structure from sequence, may discern important features of folded protein energy landscapes, defined by the diversity and frequency of different conformations in the folded state. Here, we test the limits of its predictive power on fold-switching proteins, which assume two structures with regions of distinct secondary and/or tertiary structure. Using several implementations of AF2, including two published enhanced sampling approaches, we generated >280,000 models of 93 fold-switching proteins whose experimentally determined conformations were likely in AF2's training set. Combining all models, AF2 predicted fold switching with a modest success rate of ~25%, indicating that it does not readily sample both experimentally characterized conformations of most fold switchers. Further, AF2's confidence metrics selected against models consistent with experimentally determined fold-switching conformations in favor of inconsistent models. Accordingly, these confidence metrics, though suggested to evaluate protein energetics reliably, did not discriminate between low and high energy states of fold-switching proteins. We then evaluated AF2's performance on seven fold-switching proteins outside of its training set, generating >159,000 models in total. Fold switching was accurately predicted in one of seven targets with moderate confidence. Further, AF2 demonstrated no ability to predict alternative conformations of two newly discovered targets without homologs in the set of 93 fold switchers. These results indicate that AF2 has more to learn about the underlying energetics of protein ensembles and highlight the need for further developments of methods that readily predict multiple protein conformations.

20.
Protein Sci ; 31(6): e4353, 2022 06.
Article in English | MEDLINE | ID: mdl-35634782

ABSTRACT

AlphaFold2 has revolutionized protein structure prediction by leveraging sequence information to rapidly model protein folds with atomic-level accuracy. Nevertheless, previous work has shown that these predictions tend to be inaccurate for structurally heterogeneous proteins. To systematically assess factors that contribute to this inaccuracy, we tested AlphaFold2's performance on 98 fold-switching proteins, which assume at least two distinct-yet-stable secondary and tertiary structures. Topological similarities were quantified between five predicted and two experimentally determined structures of each fold-switching protein. Overall, 94% of AlphaFold2 predictions captured one experimentally determined conformation but not the other. Despite these biased results, AlphaFold2's estimated confidences were moderate-to-high for 74% of fold-switching residues, a result that contrasts with overall low confidences for intrinsically disordered proteins, which are also structurally heterogeneous. To investigate factors contributing to this disparity, we quantified sequence variation within the multiple sequence alignments used to generate AlphaFold2's predictions of fold-switching and intrinsically disordered proteins. Unlike intrinsically disordered regions, whose sequence alignments show low conservation, fold-switching regions had conservation rates statistically similar to canonical single-fold proteins. Furthermore, intrinsically disordered regions had systematically lower prediction confidences than either fold-switching or single-fold proteins, regardless of sequence conservation. AlphaFold2's high prediction confidences for fold switchers indicate that it uses sophisticated pattern recognition to search for one most probable conformer rather than protein biophysics to model a protein's structural ensemble. Thus, it is not surprising that its predictions often fail for proteins whose properties are not fully apparent from solved protein structures. Our results emphasize the need to look at protein structure as an ensemble and suggest that systematic examination of fold-switching sequences may reveal propensities for multiple stable secondary and tertiary structures.


Subjects
Intrinsically Disordered Proteins; Sequence Alignment