ABSTRACT
Natural proteins are fragile entities, intrinsically sensitive to perturbations at the level of both their sequence and their immediate environment. Here, we highlight the diverse strategies available for engineering function through mutations that influence backbone conformational entropy and charge-charge interactions, and through mutations in loops and hinge regions, many of which are located far from the active site. It thus appears that there are potentially numerous ways to microscopically vary the identity of residues and the constituent interactions to tune function. Functional modulation could occur via changes in native-state stability, altered extents of thermodynamic coupling within the folded structure, redistributed dynamics, or modulation of the population of conformational substates. As these mechanisms are intrinsically linked, and given the pervasive long-range effects of mutations, it is crucial to consider the interaction network as a whole and fully map the native conformational landscape to place mutational effects in the context of allostery and protein evolution.
ABSTRACT
The relative magnitudes of noncovalent stabilization energies or the coupling free energies in folded proteins are anisotropically distributed, uniquely influencing folding and functional behaviors. In this regard, the fructose repressor (FruR) DBD belonging to the LacR repressor family harbors a three-residue insertion (KQY) between the canonical second and third helices. This sequence insertion promotes a strong Tyr-Tyr stacking interaction that is not observed in related homologues. Combining experiments with simulations, we show that the Tyr-Tyr stacking contributes to a decoupled unfolding due to the localization of a large part of the stabilization energy in this specific structural region. This leads to melting temperatures from different probes spanning nearly 10 K, while concomitantly stabilizing a partially structured intermediate state. Disruption of the aromatic stacking interaction via an alanine mutation promotes a molten-globular state whose native ensemble is replete with non-native interactions while displaying enhanced thermodynamic fluctuations and minimal calorimetric cooperativity. Surprisingly, the molten-globular variant of FruR DBD binds to the operator site on DNA with an affinity similar to that of the wild-type but with altered secondary-structure characteristics in the bound state, underscoring the chaperone-like role of DNA through its large negative electrostatic potential. FruR DBD thus appears to be at the verge of disorder as expected of an entropically destabilizing three-residue insertion but is rescued by the aromatic stacking interaction that distinctly dictates the finer details of stability, cooperativity, and binding.
Subject(s)
DNA, Protein Unfolding, Thermodynamics, DNA/chemistry, DNA/metabolism, Protein Binding, Molecular Dynamics Simulation, Protein Domains, Repressor Proteins/chemistry, Repressor Proteins/metabolism, Repressor Proteins/genetics, Binding Sites
ABSTRACT
TipA, a MerR family transcription factor from Streptomyces lividans, promotes antibiotic resistance by sequestering broad-spectrum thiopeptide-based antibiotics, thus counteracting their inhibitory effect on ribosomes. TipAS, a minimal binding motif which is expressed as an isoform of TipA, harbors a partially disordered N-terminal subdomain that folds upon binding multiple antibiotics. The extent and nature of the underlying molecular heterogeneity in TipAS that shapes its promiscuous folding-function landscape is an open question and is critical for understanding antibiotic-sequestration mechanisms. Here, combining equilibrium and time-resolved experiments, statistical modeling, and simulations, we show that the TipAS native ensemble exhibits a pre-equilibrium between binding-incompetent and binding-competent substates, with the fully folded state appearing only as an excited state under physiological conditions. The binding-competent state characterized by a partially structured N-terminal subdomain loses structure progressively in the physiological range of temperatures, swells on temperature increase, and displays slow conformational exchange across multiple conformations. Binding to the bactericidal antibiotic thiostrepton follows a combination of induced-fit and conformational-selection-like mechanisms, via partial binding and concomitant stabilization of the binding-competent substate. These ensemble features are evolutionarily conserved across orthologs from select bacteria that infect humans, underscoring the functional role of partial disorder in the native ensemble of antibiotic-sequestering proteins belonging to the MerR family.
Subject(s)
Anti-Bacterial Agents, Bacterial Proteins, Protein Folding, Anti-Bacterial Agents/metabolism, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Bacterial Proteins/metabolism, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Streptomyces lividans/metabolism, Streptomyces lividans/genetics, Protein Binding, Protein Conformation, Molecular Models, Transcription Factors/metabolism, Transcription Factors/chemistry
ABSTRACT
Environmentally regulated gene expression is critical for bacterial survival under stress conditions, including extremes in temperature, osmolarity and nutrient availability. Here, we dissect the thermo- and osmo-responsive behavior of the transcriptional repressor H-NS, an archetypal nucleoid-condensing sensory protein that is ubiquitous in enterobacteria infecting the mammalian gut. Through experiments and thermodynamic modeling, we show that H-NS exhibits osmolarity-, temperature- and concentration-dependent self-association, with a highly polydisperse native ensemble dominated by monomers, dimers, tetramers and octamers. The relative population of these oligomeric states is determined by an interplay between dimerization and higher-order oligomerization, which in turn drives a competition between weak homo- versus hetero-oligomerization of protein-protein and protein-DNA complexes. A phosphomimetic mutation, Y61E, fully eliminates higher-order self-assembly and preserves only dimerization while weakening DNA binding, highlighting that oligomerization is a prerequisite for strong DNA binding. We further demonstrate the presence of long-distance thermodynamic connectivity between dimerization and oligomerization sites on H-NS, which influences the binding of the co-repressor Cnu and switches the DNA binding mode of the hetero-oligomeric H-NS:Cnu complex. Our work thus uncovers important organizational principles in H-NS, including a multi-layered thermodynamic control, and provides a molecular framework broadly applicable to other thermo-osmo sensory proteins that employ similar mechanisms to regulate gene expression.
Subject(s)
Bacterial Proteins, DNA-Binding Proteins, Enterobacteriaceae, Bacterial Proteins/metabolism, DNA/genetics, DNA/metabolism, DNA-Binding Proteins/metabolism, Enterobacteriaceae/metabolism, Temperature, Transcription Factors/metabolism
ABSTRACT
The extent and molecular basis of interdomain communication in multidomain proteins, central to understanding allostery and function, remain open questions. One simple evolutionary strategy could involve the selection of either conflicting or favorable electrostatic interactions across the interface of two closely spaced domains to tune the magnitude of interdomain connectivity. Here, we study a bilobed domain FF34 from the eukaryotic p190A RhoGAP protein to explore one such design principle that mediates interdomain communication. We find that while the individual structural units in wild-type FF34 are marginally coupled, they exhibit distinct intrinsic stabilities and low cooperativity, manifesting as slow folding. The FF3-FF4 interface harbors a frustrated network of highly conserved electrostatic interactions (a charge troika) that promotes the population of multiple, decoupled, and non-native structural modes on a rugged native landscape. Perturbing this network via a charge-reversal mutation not only enhances stability and cooperativity but also dampens the fluctuations globally and speeds up the folding rate by at least an order of magnitude. Our work highlights how a conserved but nonoptimal network of interfacial electrostatic interactions shapes the native ensemble of a bilobed protein, a feature that could be exploited in designing molecular systems with long-range connectivity and enhanced cooperativity.
ABSTRACT
Paralogous proteins confer enhanced fitness to organisms via complex sequence-conformation codes that shape functional divergence, specialization, or promiscuity. Here, we dissect the underlying mechanism of promiscuous binding versus partial subfunctionalization in paralogues by studying structurally identical acyl-CoA binding proteins (ACBPs) from Plasmodium falciparum that serve as promising drug targets due to their high expression during the protozoan proliferative phase. Combining spectroscopic measurements, solution NMR, SPR, and simulations on two of the paralogues, A16 and A749, we show that minor sequence differences shape nearly every local and global conformational feature. A749 displays a broader and heterogeneous native ensemble, weaker thermodynamic coupling and cooperativity, enhanced fluctuations, and a larger binding pocket volume compared to A16. Site-specific tryptophan probes signal a graded reduction in the sampling of substates in the holo form, which is particularly apparent in A749. The paralogues exhibit a spectrum of binding affinities to different acyl-CoAs with A749, the more promiscuous and hence the likely ancestor, binding 1000-fold stronger to lauroyl-CoA under physiological conditions. We thus demonstrate how minor sequence changes modulate the extent of long-range interactions and dynamics, effectively contributing to the molecular evolution of contrasting functional repertoires in paralogues.
Subject(s)
Diazepam Binding Inhibitor, Proteins, Diazepam Binding Inhibitor/genetics, Diazepam Binding Inhibitor/chemistry, Diazepam Binding Inhibitor/metabolism, Proteins/metabolism, Molecular Conformation, Acyl Coenzyme A/metabolism, Plasmodium falciparum/genetics, Plasmodium falciparum/metabolism
ABSTRACT
Over 40% of eukaryotic proteomes and 15% of bacterial proteomes are predicted to be intrinsically disordered based on their amino acid sequence. Intrinsically disordered proteins (IDPs) exist as heterogeneous ensembles of interconverting conformations and pose a challenge to the structure-function paradigm by apparently functioning without possessing stable structural elements. IDPs play a prominent role in biological processes involving extensive intermolecular interaction networks and their inherently dynamic nature facilitates their promiscuous interaction with multiple structurally diverse partner molecules. NMR spectroscopy has made pivotal contributions to our understanding of IDPs because of its unique ability to characterize heterogeneity at atomic resolution. NMR methods such as Chemical Exchange Saturation Transfer (CEST) and relaxation dispersion have enabled the detection of 'invisible' excited states in biomolecules which are transiently and sparsely populated, yet central for function. Here, we develop a ¹Hα CEST pulse sequence which overcomes the resonance overlap problem in the ¹Hα-¹³Cα plane of IDPs by taking advantage of the superior resolution in the ¹H-¹⁵N correlation spectrum. In this sequence, magnetization is transferred after ¹H CEST using a triple resonance coherence transfer pathway from ¹Hα(i) to ¹HN(i+1), during which the ¹⁵N (t₁) and ¹HN (t₂) are frequency labelled. This approach is integrated with spin state-selective CEST for eliminating spurious dips in CEST profiles resulting from dipolar cross-relaxation. We apply this sequence to determine the excited state ¹Hα chemical shifts of the intrinsically disordered DNA binding domain (CytRN) of the bacterial cytidine repressor (CytR), which transiently acquires a functional globally folded conformation. The structure of the excited state, calculated using ¹Hα chemical shifts in conjunction with other excited state NMR restraints, is a three-helix bundle incorporating a helix-turn-helix motif that is vital for binding DNA.
Subject(s)
Intrinsically Disordered Proteins, Proteome, Amino Acid Sequence, Cytidine, Eukaryota
ABSTRACT
A longstanding goal in the field of intrinsically disordered proteins (IDPs) is to characterize their structural heterogeneity and pinpoint the role of this heterogeneity in IDP function. Here, we use multinuclear chemical exchange saturation transfer (CEST) nuclear magnetic resonance to determine the structure of a thermally accessible globally folded excited state in equilibrium with the intrinsically disordered native ensemble of the bacterial transcriptional regulator cytidine repressor (CytR). We further provide evidence from double resonance CEST experiments that the excited state, which structurally resembles the DNA-bound form of CytR, recognizes DNA by means of a "folding-before-binding" conformational selection pathway. The disorder-to-order regulatory switch in DNA recognition by natively disordered CytR therefore operates through a dynamical variant of the lock-and-key mechanism where the structurally complementary conformation is transiently accessed via thermal fluctuations.
Subject(s)
Intrinsically Disordered Proteins, Intrinsically Disordered Proteins/chemistry, Protein Folding, Protein Binding, Magnetic Resonance Spectroscopy, DNA/chemistry, Protein Conformation
ABSTRACT
G-protein-coupled receptors (GPCRs) are ubiquitous integral membrane proteins involved in diverse cellular signaling processes. Here, we carry out a large-scale ensemble thermodynamic study of 45 ligand-free GPCRs employing a structure-based statistical mechanical framework. We find that multiple partially structured states co-exist in the GPCR native ensemble, with the TM helices 1, 6 and 7 displaying varied folding status and shaping the conformational landscape. Strongly coupled residues are anisotropically distributed, accounting for only 13% of the residues, illustrating that a large number of residues are inherently dynamic. Active-state GPCRs are characterized by reduced conformational heterogeneity with altered coupling patterns distributed throughout the structural scaffold. In silico alanine-scanning mutagenesis reveals that extra- and intra-cellular faces of GPCRs are coupled thermodynamically, highlighting an exquisite structural specialization and the fluid nature of the intramolecular interaction network. The ensemble-based perturbation methodology presented here lays the foundation for understanding allosteric mechanisms and the effects of disease-causing mutations in GPCRs.
Subject(s)
G-Protein-Coupled Receptors, Signal Transduction, Molecular Models, G-Protein-Coupled Receptors/metabolism, Protein Secondary Structure, Ligands, Thermodynamics, Protein Conformation
ABSTRACT
The mutations G170R and I244T are the most common causes of primary hyperoxaluria type I (PH1). These mutations cause the misfolding of the AGT protein in the minor allele AGT-LM that contains the P11L polymorphism, which may affect the folding of the N-terminal segment (NTT-AGT). The NTT-AGT is phosphorylated at T9, although the role of this event in PH1 is unknown. In this work, phosphorylation of T9 was mimicked by introducing the T9E mutation in the NTT-AGT peptide and the full-length protein. The NTT-AGT conformational landscape was studied by circular dichroism, NMR, and statistical mechanical methods. Functional and stability effects on the full-length AGT protein were characterized by spectroscopic methods. The T9E and P11L mutations together reshaped the conformational landscape of the isolated NTT-AGT peptide by stabilizing ordered conformations. In the context of the full-length AGT protein, the T9E mutation had no effect on the overall AGT function or conformation, but enhanced aggregation of the minor allele (LM) protein and synergized with the mutations G170R and I244T. Our findings indicate that phosphorylation of T9 may affect the conformation of the NTT-AGT and synergize with PH1-causing mutations to promote aggregation in a genotype-specific manner. Phosphorylation should be considered a novel regulatory mechanism in PH1 pathogenesis.
Subject(s)
Genetic Polymorphism, Protein Aggregates, Humans, Phosphorylation, Mutation, Genotype, Transaminases/metabolism
ABSTRACT
Phosphoglycerate kinase has been a model for the stability, folding cooperativity and catalysis of a two-domain protein. The human isoform 1 (hPGK1) is associated with cancer development and rare genetic diseases that affect several of its features. To investigate how mutations affect the hPGK1 folding landscape and interaction networks, we introduced mutations at a buried site in the N-terminal domain (F25 mutants) that either created cavities (F25L, F25V, F25A), enhanced conformational entropy (F25G) or introduced structural strain (F25W), and evaluated their effects using experimental biophysical and theoretical methods. All F25 mutants folded well, but thermal and chemical denaturation analyses showed reduced unfolding cooperativity, reduced kinetic stability and altered activation energetics. These alterations correlated well with the structural perturbation caused by mutations in the N-terminal domain and the destabilization caused in the interdomain interface as revealed by H/D exchange under native conditions. Importantly, experimental and theoretical analyses showed that these effects are significant even when the perturbation is mild and local. Our approach will be useful to establish the molecular basis of hPGK1 genotype-phenotype correlations due to phosphorylation events and single amino acid substitutions associated with disease.
Subject(s)
Phosphoglycerate Kinase/metabolism, Protein Folding, Humans, Hydrophobic and Hydrophilic Interactions, Kinetics, Phosphoglycerate Kinase/genetics, Protein Denaturation, Thermodynamics
ABSTRACT
Mutational effects in globular proteins exhibit an exponential-like decreasing dependence on distance from the mutated site, suggestive of long-range modulation of structural-thermodynamic features. Here, we extract the physical origins of this pattern by employing a statistical-mechanical model to construct conformational ensembles of three archetypal proteins. Through large-scale in silico alanine-scanning mutagenesis, we show that inter-residue differential coupling free energies, which are characteristic ensemble thermodynamic properties, follow a similar exponential distance dependence with the effects felt until ~15-20 Å from the mutated site. From the perspective of an ensemble-averaged structure, this feature arises via long-range reorganization of the interaction network on mutations which is more significant for charged residues compared to hydrophobic residues. Our work highlights how subtle alterations in the microscopic distribution of states manifest as a macroscopic distance dependence, the physical origins of mutation-induced dynamic allostery, and the necessity to consider the global intra-protein interaction network to understand mutational outcomes.
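The exponential distance dependence described in this abstract can be illustrated with a short sketch. It assumes a single-exponential form, effect(r) = A·exp(-r/d_c), and fits it by linear regression on the logarithm of the effect. The functional form, the function name `fit_exponential_decay`, and the numbers below are illustrative assumptions, not the model, code, or data of the study.

```python
import numpy as np


def fit_exponential_decay(distances, effects):
    """Fit effects ~ A * exp(-distances / d_c) via a straight-line fit to log(effects).

    Returns (A, d_c). Assumes all effects are strictly positive.
    """
    distances = np.asarray(distances, dtype=float)
    effects = np.asarray(effects, dtype=float)
    # log(effect) = log(A) - r / d_c, so the slope is -1/d_c and the intercept is log(A)
    slope, intercept = np.polyfit(distances, np.log(effects), 1)
    return float(np.exp(intercept)), float(-1.0 / slope)


# Synthetic, noise-free illustration (hypothetical values, NOT data from the study)
r = np.linspace(4.0, 20.0, 9)            # distance from the mutated site, in angstroms
effect = 2.0 * np.exp(-r / 5.0)          # hypothetical magnitude of the mutational effect
amplitude, decay_length = fit_exponential_decay(r, effect)
```

With real data, noise at long distances makes the log-linear fit sensitive to small values, so a weighted or nonlinear fit may be preferable.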
ABSTRACT
The intrinsically disordered DNA-binding domain of cytidine repressor (CytR-DBD) folds in the presence of target DNA and regulates the expression of multiple genes in E. coli. To explore the conformational rearrangements in the unbound state and the target recognition mechanisms of CytR-DBD, we carried out single-molecule Förster resonance energy transfer (smFRET) measurements. The smFRET data of CytR-DBD in the absence of DNA show one major and one minor population assignable to an expanded unfolded state and a compact folded state, respectively. The population of the folded state increases and decreases upon titration with salt and denaturant, respectively, in an apparent two-state manner. The peak FRET efficiencies of both the unfolded and folded states change continuously with denaturant concentration, demonstrating the intrinsic flexibility of the DNA-binding domain and the deviation from a strict two-state transition. Remarkably, the CytR-DBD exhibits a compact structure when bound to both the specific and nonspecific DNA; however, the peak FRET efficiencies of the two structures are slightly but consistently different. The observed conformational heterogeneity highlights the potential structural changes required for CytR to bind variably spaced operator sequences.
Subject(s)
Escherichia coli Proteins, Escherichia coli, DNA/metabolism, Escherichia coli/genetics, Escherichia coli Proteins/chemistry, Fluorescence Resonance Energy Transfer, Repressor Proteins/chemistry, Fluorescence Spectrometry
ABSTRACT
Allosterism is a common phenomenon in protein biochemistry that allows rapid regulation of protein stability, dynamics and function. However, the mechanisms by which allosterism occurs (by mutations or post-translational modifications (PTMs)) may be complex, particularly due to long-range propagation of the perturbation across protein structures. In this work, we have investigated allosteric communication in the multifunctional, cancer-related and antioxidant protein NQO1 by mutating several fully buried leucine residues (L7, L10 and L30) to smaller residues (V, A and G) at sites in the N-terminal domain. In almost all cases, mutated residues were not close to the FAD or the active site. Mutations L→G strongly compromised conformational stability and solubility, and L30A and L30V also notably decreased solubility. The mutation L10A, closer to the FAD binding site, severely decreased FAD binding affinity (≈20 fold vs. WT) through long-range and context-dependent effects. Using a combination of experimental and computational analyses, we show that most of the effects are found in the apo state of the protein, in contrast to other common polymorphisms and PTMs previously characterized in NQO1. The integrated study presented here is a first step towards a detailed structural-functional mapping of the mutational landscape of NQO1, a multifunctional and redox signaling protein of high biomedical relevance.
ABSTRACT
We investigate the conformational properties of the intrinsically disordered DNA-binding domain of CytR in the presence of the polymeric crowder polyethylene glycol (PEG). Integrating circular dichroism, nuclear magnetic resonance, and single-molecule Förster resonance energy transfer measurements, we demonstrate that disordered CytR populates a well-folded minor conformation in its native ensemble, while the unfolded ensemble collapses and folds with an increase in crowder density independent of the crowder size. Employing a statistical-mechanical model, the effective reduction in the accessible conformational space of a residue in the unfolded state is estimated to be 10% at 300 mg/mL PEG8000, relative to dilute conditions. The experimentally consistent PEG-temperature phase diagram thus constructed reveals that entropic effects can stabilize disordered CytR by 10 kJ mol⁻¹, driving the equilibrium toward folded conformations under physiological conditions. Our work highlights the malleable conformational landscape of CytR, the presence of a folded conformation in the disordered ensemble, and proposes a scaling relation for quantifying excluded volume effects on protein stability.
Subject(s)
Protein Folding, Proteins, Circular Dichroism, Entropy, Molecular Conformation, Protein Conformation
ABSTRACT
The functioning of proteins is intimately tied to their fluctuations in the native ensemble. The structural-energetic features that determine fluctuation amplitudes and hence the shape of the underlying landscape, which in turn determine the magnitude of the functional output, are often confounded by multiple variables. Here, we employ the FF1 domain from human p190A RhoGAP protein as a model system to uncover the molecular basis for phosphorylation of a buried tyrosine, which is crucial to the transcriptional activity associated with transcription factor TFII-I. Combining spectroscopy, calorimetry, statistical-mechanical modeling, molecular simulations, and in vitro phosphorylation assays, we show that the FF1 domain samples a diverse array of conformations in its native ensemble, some of which are phosphorylation-competent. Upon eliminating unfavorable charge-charge interactions through a single charge-reversal (K53E) or charge-neutralizing (K53Q) mutation, we observe proportionately lower phosphorylation extents due to the altered structural coupling, damped equilibrium fluctuations, and a more compact native ensemble. We thus establish a conformational selection mechanism for phosphorylation in the FF1 domain with K53 acting as a "gatekeeper", modulating the solvent exposure of the buried tyrosine. Our work demonstrates the role of unfavorable charge-charge interactions in governing functional events through the modulation of native ensemble characteristics, a feature that could be prevalent in ordered protein domains.
ABSTRACT
Mutational perturbations of protein structures, i.e., phi-value analysis, are commonly employed to probe the extent of involvement of a particular residue in the rate-determining step(s) of folding. This generally involves the measurement of folding thermodynamic parameters and kinetic rate constants for the wild-type and mutant proteins. While computational approaches have been reasonably successful in understanding and predicting the effect of mutations on folding thermodynamics, it has been challenging to explore the same on kinetics due to confounding structural, energetic, and dynamic factors. Accordingly, the frequent observation of fractional phi-values (mean of ~0.3) has resisted a precise and consistent interpretation. Here, we describe how to construct, parameterize, and employ a simple one-dimensional free energy surface model that is grounded in the basic tenets of the energy landscape theory to predict and simulate the effect of mutations on folding kinetics. As a proof of principle, we simulate one-dimensional free energy profiles of 806 mutations from 24 different proteins employing just the experimental destabilization as input, reproduce the relative unfolding activation free energies with a correlation of 0.91, and show that the mean phi-value of 0.3 essentially corresponds to the extent of stabilization energy gained at the barrier top while folding.
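For readers unfamiliar with the quantity discussed in this abstract, the sketch below computes a folding phi-value using the conventional textbook definition: the mutation-induced change in folding activation free energy divided by the equilibrium destabilization. It is not the one-dimensional free energy surface model of the paper. The function names, the default temperature, and the example numbers are hypothetical, and the sign convention (positive values mean destabilization) is an assumption stated in the comments.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def activation_ddg(k_wt, k_mut, temperature=298.0):
    """Change in activation free energy (kJ/mol): -RT ln(k_mut / k_wt).

    Positive when the mutant rate constant is smaller than the wild-type one.
    """
    if k_wt <= 0 or k_mut <= 0:
        raise ValueError("rate constants must be positive")
    return -R_KJ * temperature * math.log(k_mut / k_wt)


def phi_value(k_fold_wt, k_fold_mut, ddg_eq, temperature=298.0):
    """Folding phi-value: activation free energy change over equilibrium destabilization.

    ddg_eq is the equilibrium destabilization in kJ/mol (assumed convention:
    positive when the mutant is less stable than the wild type).
    """
    if ddg_eq == 0:
        raise ValueError("phi is undefined when the mutation does not change stability")
    return activation_ddg(k_fold_wt, k_fold_mut, temperature) / ddg_eq


# Hypothetical example: the mutation halves the folding rate and destabilizes by 5 kJ/mol
phi = phi_value(k_fold_wt=100.0, k_fold_mut=50.0, ddg_eq=5.0)
```

A value near 0 indicates that the mutated site contributes little stabilization at the barrier top, and a value near 1 indicates a native-like contribution. Experimental phi-values become unreliable when the equilibrium destabilization is small.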
Subject(s)
Protein Folding, Kinetics, Mutation, Proteins/genetics, Thermodynamics
ABSTRACT
Protein sequences and structures evolve by satisfying varied physical and biochemical constraints. This multi-level selection is enabled not just by the patterning of amino acids on the sequence, but also via coupling between residues in the native structure. Here, we employ an energetically detailed statistical mechanical model with millions of microstates to extract such long-range structural correlations, i.e. thermodynamic coupling free energies, from a diverse family of protein structures. We find that despite the intricate and anisotropic distribution of coupling patterns, the majority of residues (>70%) are only marginally coupled, contributing to functional motions and catalysis. Physical origins of 'sectors', determinants of native ensemble heterogeneity in extant, ancient and designed proteins, and the basis for allostery emerge naturally from coupling free energies. The statistical framework highlights how evolutionary selection and optimization occur at the level of the global interaction network for a given protein fold, impacting folding, function, and allosteric outputs.
ABSTRACT
Obligate symbionts typically exhibit high evolutionary rates. Consequently, their proteins may differ considerably from their modern and ancestral homologs in terms of both sequence and properties, thus providing excellent models to study protein evolution. Also, obligate symbionts are challenging to culture in the lab and proteins from uncultured organisms must be produced in heterologous hosts using recombinant DNA technology. Obligate symbionts thus replicate a fundamental scenario of metagenomics studies aimed at the functional characterization and biotechnological exploitation of proteins from the bacteria in soil. Here, we use the thioredoxin from Candidatus Photodesmus katoptron, an uncultured symbiont of flashlight fish, to explore evolutionary and engineering aspects of protein folding in heterologous hosts. The symbiont protein is a standard thioredoxin in terms of 3D-structure, stability and redox activity. However, its folding outside the original host is severely impaired, as shown by a very slow refolding in vitro and an inefficient expression in E. coli that leads mostly to insoluble protein. By contrast, resurrected Precambrian thioredoxins express efficiently in E. coli, plausibly reflecting an ancient adaptation to unassisted folding. We have used a statistical-mechanical model of the folding landscape to guide back-to-ancestor engineering of the symbiont protein. Remarkably, we find that the efficiency of heterologous expression correlates with the in vitro (i.e., unassisted) folding rate and that the ancestral expression efficiency can be achieved with only 1-2 back-to-ancestor replacements. These results demonstrate a minimal-perturbation, sequence-engineering approach to rescue inefficient heterologous expression which may potentially be useful in metagenomics efforts targeting recent adaptations.
Subject(s)
Bacterial Proteins/biosynthesis, Fishes/microbiology, Protein Folding, Recombinant Proteins/biosynthesis, Vibrionaceae/metabolism, Animals, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Escherichia coli/metabolism, Metagenomics, Protein Engineering, Recombinant Proteins/chemistry, Recombinant Proteins/genetics, Symbiosis, Thioredoxins/biosynthesis, Thioredoxins/chemistry, Vibrionaceae/genetics
ABSTRACT
Single-domain proteins fold via diverse mechanisms, emphasizing the intricate relationship between energetics and structure, which is a direct consequence of functional constraints and demands imposed at the level of sequence. On the other hand, elucidating the interplay between folding mechanisms and function is challenging in large proteins, given the inherent shortcomings in identifying metastable states experimentally and the sampling limitations associated with computational methods. Here, we show that free energy profiles and surfaces of large systems (>150 residues), as predicted by a statistical mechanical model, display a wide array of folding mechanisms with ubiquitous folding intermediates and heterogeneous native ensembles. Importantly, residues around the ligand binding or enzyme active site display a larger tendency to partially unfold, and this manifests as intermediates or excited states along the folding coordinate in ligand binding domains, transcription repressors, and representative enzymes from all six classes, including the SARS-CoV-2 receptor binding domain (RBD) of the spike protein and the protease Mpro. It thus appears that it is relatively easier to distill the imprints of function on the folding landscape of larger proteins than on that of smaller systems. We discuss how an understanding of energetic-entropic features in ordered proteins can pinpoint specific avenues through which folding mechanisms, populations of partially structured states and function can be engineered.