ABSTRACT
As proposed here, β-turns play an essential role in protein self-assembly. This compact, four-residue motif affects protein conformation dramatically by reversing the overall chain direction. Turns are the "hinges" in globular proteins. This new proposal broadens a previous hypothesis that globular proteins solve the folding problem in part by filtering conformers with unsatisfied backbone hydrogen bonds, thereby preorganizing the folding population. Recapitulating that hypothesis: unsatisfied conformers would be dramatically destabilizing, shifting the U(nfolded) ⇌ N(ative) equilibrium far to the left. If even a single backbone polar group is satisfied by solvent when unfolded but buried and unsatisfied when folded, that energy penalty alone, approximately +5 kcal/mol, would rival almost the entire free energy of protein stabilization at room temperature. Consequently, globular proteins are built on scaffolds of hydrogen-bonded α-helices and/or strands of β-sheet, motifs that can be extended indefinitely, with intra-segment hydrogen bond partners for their backbone polar groups and without steric clash. Scaffolds foster a protein-wide hydrogen-bonded network, and, of thermodynamic necessity, they self-assemble cooperatively. Unlike elements of repetitive secondary structure, α-helices and β-sheet, a four-residue β-turn has only a single hydrogen bond (from i + 3 → i), not a cooperatively formed assembly of hydrogen bonds. As such, turns can form autonomously and are poised to initiate assembly of scaffold elements by bringing them together in an orientation and registration that promotes cooperative "zipping". The overall effect of this self-assembly mechanism is to induce substantial preorganization in the thermodynamically accessible folding population and, concomitantly, to reduce the folding entropy.
ABSTRACT
It has been a long-standing conviction that a protein's native fold is selected from a vast number of conformers by the optimal constellation of enthalpically favorable interactions. In marked contrast, this Perspective introduces a different mechanism, one that emphasizes conformational entropy as the principal organizer in protein folding while proposing that the conventional view is incomplete. This mechanism stems from the realization that hydrogen bond satisfaction is a thermodynamic necessity. In particular, a backbone hydrogen bond may add little to the stability of the native state, but a completely unsatisfied backbone hydrogen bond would be dramatically destabilizing, shifting the U(nfolded) ⇌ N(ative) equilibrium far to the left. If even a single backbone polar group is satisfied by solvent when unfolded but buried and unsatisfied when folded, that energy penalty alone, approximately +5 kcal/mol, would rival almost the entire free energy of protein stabilization, typically between -5 and -15 kcal/mol under physiological conditions. Consequently, upon folding, buried backbone polar groups must form hydrogen bonds, and they do so by assembling scaffolds of α-helices and/or strands of β-sheet, the only conformers in which, with rare exception, hydrogen bond donors and acceptors are exactly balanced. In addition, only a few thousand viable scaffold topologies are possible for a typical protein domain. This thermodynamic imperative winnows the folding population by culling conformers with unsatisfied hydrogen bonds, thereby reducing the entropy cost of folding. Importantly, conformational restrictions imposed by backbone···backbone hydrogen bonding in the scaffold are sequence-independent, enabling mutation, and thus evolution, without sacrificing the structure.
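The arithmetic behind the "+5 kcal/mol" argument can be sketched with the standard two-state relation K = exp(-ΔG/RT). The snippet below is only an illustration, not the authors' calculation: the function names are hypothetical, and the baseline stability of -10 kcal/mol is an assumed value chosen from within the -5 to -15 kcal/mol range quoted in the abstract.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g_kcal, temp_k=298.15):
    """K = [N]/[U] for the two-state equilibrium U <=> N."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


def fraction_native(delta_g_kcal, temp_k=298.15):
    """Fraction of molecules in the native state for a given folding free energy."""
    k = equilibrium_constant(delta_g_kcal, temp_k)
    return k / (1.0 + k)


if __name__ == "__main__":
    stability = -10.0  # assumed illustrative folding free energy, kcal/mol
    penalty = 5.0      # penalty per unsatisfied backbone polar group (from the abstract)
    for n_unsatisfied in range(4):
        dg = stability + n_unsatisfied * penalty
        print(n_unsatisfied, dg, fraction_native(dg))
```

Under these assumptions, two unsatisfied groups cancel the assumed stability entirely (half the population is native), and a third leaves the native state as a minor species.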
Subject(s)
Protein Folding , Proteins/chemistry , Thermodynamics , Hydrogen Bonding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs
ABSTRACT
The Ramachandran plot for backbone φ,ψ-angles in a blocked monopeptide has played a central role in understanding protein structure. Curiously, a similar analysis for side chain χ-angles has been comparatively neglected. Instead, efforts have focused on compiling various types of side chain libraries extracted from proteins of known structure. Departing from this trend, the following analysis presents backbone-based maps of side chains in blocked monopeptides. As in the original φ,ψ-plot, these maps are derived solely from hard-sphere steric repulsion. Remarkably, the side chain biases exhibit marked similarities to corresponding biases seen in high-resolution protein structures. Consequently, some of the entropic cost for side chain localization in proteins is prepaid prior to the onset of folding events because conformational bias is built into the chain at the covalent level. Furthermore, side chain conformations are seen to experience fewer steric restrictions for backbone conformations in either the α or β basins, those map regions where repetitive φ,ψ-angles result in α-helices or strands of β-sheet, respectively. Here, these α and β basins are entropically favored for steric reasons alone; a blocked monopeptide is too short to accommodate the peptide hydrogen bonds that stabilize repetitive secondary structure. Thus, despite differing energetics, α/β-basins are favored for both monopeptides and repetitive secondary structure, underpinning an energetically unfrustrated compatibility between these two levels of protein structure.
Subject(s)
Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Proteins/chemistry , Entropy , Hydrogen Bonding , Molecular Dynamics Simulation , Peptides/chemistry , Protein Conformation
ABSTRACT
How hydrophobicity (HY) drives protein folding is studied. The 1971 Nozaki-Tanford method of measuring HY is modified to use gases as solutes, not crystals, and this makes the method easy to use. Alkanes are found to be much more hydrophobic than rare gases, and the two different kinds of HY are termed intrinsic (rare gases) and extrinsic (alkanes). The HY values of rare gases are proportional to solvent-accessible surface area (ASA), whereas the HY values of alkanes depend on special hydration shells. Earlier work showed that hydration shells produce the hydration energetics of alkanes. Evidence is given here that the transfer energetics of alkanes to cyclohexane [Wolfenden R, Lewis CA, Jr, Yuan Y, Carter CW, Jr (2015) Proc Natl Acad Sci USA 112(24):7484-7488] measure the release of these shells. Alkane shells are stabilized importantly by van der Waals interactions between alkane carbon and water oxygen atoms. Thus, rare gases cannot form this type of shell. The very short (approximately picoseconds) lifetime of the van der Waals interaction probably explains why NMR efforts to detect alkane hydration shells have failed. The close similarity between the sizes of the opposing energetics for forming or releasing alkane shells confirms the presence of these shells on alkanes and supports Kauzmann's 1959 mechanism of protein folding. A space-filling model is given for the hydration shells on linear alkanes. The model reproduces the n values of Jorgensen et al. [Jorgensen WL, Gao J, Ravimohan C (1985) J Phys Chem 89:3470-3473] for the number of waters in alkane hydration shells.
Subject(s)
Alkanes/chemistry , Gases/chemistry , Hydrophobic and Hydrophilic Interactions , Protein Folding , Algorithms , Models, Chemical , Solvents/chemistry , Thermodynamics
ABSTRACT
Pauling's mastery of peptide stereochemistry, based on small molecule crystal structures and the theory of chemical bonding, led to his realization that the peptide unit is planar and then to the Pauling-Corey-Branson model of the α-helix. Similarly, contemporary protein structure refinement is based on experimentally determined diffraction data together with stereochemical restraints. However, even an X-ray structure at ultra-high resolution is still an under-determined model in which the linkage among refinement parameters is complex. Consequently, restrictions imposed on any given parameter can affect the entire structure. Here, we examine recent studies of high resolution protein X-ray structures, where substantial distortions of the peptide plane are found to be commonplace. Planarity is assessed by the ω-angle, a dihedral angle determined by the peptide bond (C-N) and its flanking covalent neighbors; for an ideally planar trans peptide, ω = 180°. By using a freely available refinement package, Phenix [Afonine et al. (2012) Acta Cryst. D, 68:352-367], we demonstrate that tightening default restrictions on the ω-angle can significantly reduce apparent deviations from peptide unit planarity without consequent reduction in reported evaluation metrics (e.g., R-factors). To be clear, our result does not show that substantial non-planarity is absent, only that an equivalent alternative model is possible. Resolving this disparity will ultimately require improved understanding of the deformation energy. Meanwhile, we urge inclusion of ω-angle statistics in new structure reports in order to focus critical attention on the usual practice of assigning default values to ω-angle constraints during structure refinement.
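As an illustration of how the ω-angle is measured, the sketch below computes a dihedral angle from four atomic positions, which for ω would be Cα(i), C(i), N(i+1), and Cα(i+1). It is a generic geometry routine, not code from Phenix or from the study; the function names and the coordinates in the usage example are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, in (-180, 180], about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def deviation_from_trans(omega_deg):
    """Signed deviation of omega from the ideal trans value of 180 degrees."""
    return (omega_deg % 360.0) - 180.0


if __name__ == "__main__":
    # Hypothetical, perfectly planar trans arrangement (not real atomic coordinates).
    ca_i, c_i, n_next, ca_next = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0)
    omega = dihedral(ca_i, c_i, n_next, ca_next)
    print(omega, deviation_from_trans(omega))
```

Collecting `deviation_from_trans` over all peptide units in a structure would give the kind of ω-angle statistics the abstract urges authors to report.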
Subject(s)
Models, Molecular , Peptides/chemistry , Protein Structure, Secondary , Proteins/chemistry , Computational Biology/methods , Crystallography, X-Ray , Databases, Protein , Reproducibility of Results , Thermodynamics
ABSTRACT
Protein domains are conspicuous structural units in globular proteins, and their identification has been a topic of intense biochemical interest dating back to the earliest crystal structures. Numerous disparate domain identification algorithms have been proposed, all involving some combination of visual intuition and/or structure-based decomposition. Instead, we present a rigorous, thermodynamically-based approach that redefines domains as cooperative chain segments. In greater detail, most small proteins fold with high cooperativity, meaning that the equilibrium population is dominated by completely folded and completely unfolded molecules, with a negligible subpopulation of partially folded intermediates. Here, we redefine structural domains in thermodynamic terms as cooperative folding units, based on m-values, which measure the cooperativity of a protein or its substructures. In our analysis, a domain is equated to a contiguous segment of the folded protein whose m-value is largely unaffected when that segment is excised from its parent structure. Defined in this way, a domain is a self-contained cooperative unit; i.e., its cooperativity depends primarily upon intrasegment interactions, not intersegment interactions. Implementing this concept computationally, the domains in a large representative set of proteins were identified; all exhibit consistency with experimental findings. Specifically, our domain divisions correspond to the experimentally determined equilibrium folding intermediates in a set of nine proteins. The approach was also proofed against a representative set of 71 additional proteins, again with confirmatory results. Our reframed interpretation of a protein domain transforms an indeterminate structural phenomenon into a quantifiable molecular property grounded in solution thermodynamics.
Subject(s)
Proteins/chemistry , Thermodynamics , Algorithms , Models, Molecular , Protein Conformation
ABSTRACT
A protein backbone has two degrees of conformational freedom per residue, described by its φ,ψ-angles. Accordingly, the energy landscape of a blocked peptide unit can be mapped in two dimensions, as shown by Ramachandran, Sasisekharan, and Ramakrishnan almost half a century ago. With atoms approximated as hard spheres, the eponymous Ramachandran plot demonstrated that steric clashes alone eliminate 3/4 of φ,ψ-space, a result that has guided all subsequent work. Here, we show that adding hydrogen-bonding constraints to these steric criteria eliminates another substantial region of φ,ψ-space for a blocked peptide; for conformers within this region, an amide hydrogen is solvent-inaccessible, depriving it of a hydrogen-bonding partner. Yet, this "forbidden" region is well populated in folded proteins, which can provide longer-range intramolecular hydrogen-bond partners for these otherwise unsatisfied polar groups. Consequently, conformational space expands under folding conditions, a paradigm-shifting realization that prompts an experimentally verifiable conjecture about likely folding pathways.
Subject(s)
Amides/chemistry , Models, Molecular , Protein Conformation , Protein Folding , Amides/metabolism , Databases, Protein , Hydrogen Bonding , Molecular Dynamics Simulation
ABSTRACT
We present a physically rigorous method to calculate solvent-dependent accessible surface areas (ASAs) of amino acid residues in unfolded proteins. ASA values will be larger in a good solvent, where solute-solvent interactions dominate and promote chain extension. Conversely, they will be smaller in a poor solvent, where solute-solute interactions dominate and promote chain collapse. In the method described here, these solvent-dependent effects are modeled by Boltzmann-weighting a simulated ensemble for solvent quality, good or poor. Solvent quality is parameterized as intramolecular hydrogen bond strength, using a "hydrogen bond dial" that can be varied from "off" to "high" (i.e., from 0 to -6 kcal/mol per hydrogen bond). When plotted as a function of hydrogen bond strength, the Boltzmann-weighted distribution of conformers describes a sigmoidal curve, with a transition midpoint near 1.5 kcal/mol per hydrogen bond. ASA tables for the 20 residues are provided under good solvent conditions and at this transition midpoint. For the backbone, these midpoint ASA values are found to be in good agreement with the earlier estimate of unfolded state ASA given by the mean of Creamer's upper and lower bounds [Creamer TP, et al. (1997) Biochemistry 36:2832-2835], a gratifying result in that cosolvents of experimental interest, such as urea (good solvent) and trimethylamine N-oxide (poor solvent), are known to affect the backbone predominantly. Unanticipated results from our simulations predict that a significant population of three-residue, hydrogen-bonded turns (inverse gamma-turns) will be detectable in blocked polyalanyl heptamers in poor solvent, an experimentally verifiable conjecture.
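The "hydrogen bond dial" can be pictured with a toy Boltzmann-weighting calculation. The sketch below is not the published method: the ensemble, its ASA values, the function name, and the choice RT ≈ 0.593 kcal/mol are all assumptions made for illustration. Each conformer is weighted by exp(-n·ε/RT), where n is its number of intramolecular hydrogen bonds and ε (zero or negative) is the dial setting.

```python
import math

RT = 0.593  # kcal/mol near 298 K (assumed temperature)

# Hypothetical toy ensemble: (number of intramolecular H-bonds, ASA in square angstroms).
TOY_ENSEMBLE = [(0, 120.0), (0, 115.0), (1, 100.0), (2, 85.0), (3, 70.0)]


def boltzmann_mean_asa(ensemble, hb_energy):
    """Boltzmann-weighted mean ASA for a given per-hydrogen-bond energy (kcal/mol)."""
    weights = [math.exp(-(n_hb * hb_energy) / RT) for n_hb, _ in ensemble]
    partition = sum(weights)
    return sum(w * asa for w, (_, asa) in zip(weights, ensemble)) / partition


if __name__ == "__main__":
    # Sweep the dial from "off" (0) to "high" (-6 kcal/mol per hydrogen bond).
    for eps in (0.0, -0.5, -1.0, -1.5, -2.0, -4.0, -6.0):
        print(eps, round(boltzmann_mean_asa(TOY_ENSEMBLE, eps), 2))
```

With the dial off, every conformer counts equally and the mean ASA is the plain average (good solvent); as the dial is turned up, hydrogen-bonded, more compact conformers dominate and the mean ASA falls (poor solvent).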
Subject(s)
Models, Molecular , Oligopeptides/chemistry , Protein Denaturation/drug effects , Solvents/pharmacology , Hydrogen Bonding , Models, Chemical , Peptides , Protein Conformation/drug effects , Protein Structure, Secondary , Solubility
ABSTRACT
This Perspective is intended to raise questions about the conventional interpretation of protein folding. According to the conventional interpretation, developed over many decades, a protein population can visit a vast number of conformations under unfolding conditions, but a single dominant native population emerges under folding conditions. Accordingly, folding comes with a substantial loss of conformational entropy. How is this price paid? The conventional answer is that favorable interactions between and among the side chains can compensate for entropy loss, and moreover, these interactions are responsible for the structural particulars of the native conformation. Challenging this interpretation, the Perspective introduces a proposal that high energy (i.e., unfavorable) excluding interactions winnow the accessible population substantially under physical-chemical conditions that favor folding. Both steric clash and unsatisfied hydrogen bond donors and acceptors are classified as excluding interactions, so called because conformers with such disfavored interactions will be largely excluded from the thermodynamic population. Both excluding interactions and solvent factors that induce compactness are somewhat nonspecific, yet together they promote substantial chain organization. Moreover, proteins are built on a backbone scaffold consisting of α-helices and strands of β-sheet, where the number of hydrogen bond donors and acceptors is exactly balanced. These repetitive secondary structural elements are the only two conformers that can be both completely hydrogen-bond satisfied and extended indefinitely without encountering a steric clash. Consequently, the number of fundamental folds is limited to no more than ~10,000 for a protein domain. Once excluding interactions are taken into account, the issue of "frustration" is largely eliminated and the Levinthal paradox is resolved.
Putting the "bottom line" at the top: it is likely that hydrogen-bond satisfaction represents a largely under-appreciated parameter in protein folding models.
Subject(s)
Protein Conformation , Protein Folding , Proteins , Entropy , Hydrogen Bonding , Models, Molecular , Proteins/chemistry , Proteins/metabolism , Thermodynamics
ABSTRACT
The native state structures of globular proteins are stable and well packed, indicating that self-interactions are favored over protein-solvent interactions under folding conditions. We use this as a guiding principle to derive the geometry of the building blocks of protein structures (α helices and strands assembled into β sheets) with no adjustable parameters, no amino acid sequence information, and no chemistry. There is an almost perfect fit between the dictates of mathematics and physics and the rules of quantum chemistry. Protein evolution is facilitated by sequence-independent platforms, which can elaborate sequence-dependent functional diversity. Our work highlights the vital role of discreteness in life and may have implications for the creation of artificial life and on the nature of life elsewhere in the cosmos.
Subject(s)
Physics , Proteins , Amino Acid Sequence , Biology , Protein Conformation , Protein Conformation, alpha-Helical , Protein Folding
ABSTRACT
New experimental results show that either gain or loss of close packing can be observed as a discrete step in protein folding or unfolding reactions. This finding poses a significant challenge to the conventional two-state model of protein folding. Results of interest involve dry molten globule (DMG) intermediates, an expanded form of the protein that lacks appreciable solvent. When an unfolding protein expands to the DMG state, side chains unlock and gain conformational entropy, while liquid-like van der Waals interactions persist. Four unrelated proteins are now known to form DMGs as the first step of unfolding, suggesting that such an intermediate may well be commonplace in both folding and unfolding. Data from the literature show that peptide amide protons are protected in the DMG, indicating that backbone structure is intact despite loss of side-chain close packing. Other complementary evidence shows that secondary structure formation provides a major source of compaction during folding. In our model, the major free-energy barrier separating unfolded from native states usually occurs during the transition between the unfolded state and the DMG. The absence of close packing at this barrier provides an explanation for why phi-values, derived from a Brønsted-Leffler plot, depend primarily on structure at the mutational site and not on specific side-chain interactions. The conventional two-state folding model breaks down when there are DMG intermediates, a realization that has major implications for future experimental work on the mechanism of protein folding.
Subject(s)
Protein Conformation , Protein Structure, Secondary , Protein Unfolding , Proteins/chemistry , Entropy , Models, Molecular , Protein Folding , Solvents/chemistry
ABSTRACT
Understanding the process of protein folding has been recognized as an important challenge for >70 years. It is, quintessentially, a thermodynamic problem and, arguably, thermodynamics is our most powerful discipline for understanding biological systems. Yet, despite all this, we still lack predictive understanding of protein folding. Is something missing from this picture?
Subject(s)
Protein Folding , Proteins/chemistry , Algorithms , Hydrogen Bonding , Models, Molecular , Peptides/chemistry , Protein Conformation , Protein Structure, Secondary , Protein Structure, Tertiary , Thermodynamics , Water/chemistry
ABSTRACT
We have been analyzing the extent to which protein secondary structure determines protein tertiary structure in simple protein folds. An earlier paper demonstrated that three-dimensional structure can be obtained successfully using only highly approximate backbone torsion angles for every residue. Here, the initial information is further diluted by introducing a realistic degree of experimental uncertainty into this process. In particular, we tackle the practical problem of determining three-dimensional structure solely from backbone chemical shifts, which can be measured directly by NMR and are known to be correlated with a protein's backbone torsion angles. Extending our previous algorithm to incorporate these experimentally determined data, clusters of structures compatible with the experimentally determined chemical shifts were generated by fragment assembly Monte Carlo. The cluster that corresponds to the native conformation was then identified based on four energy terms: steric clash, solvent-squeezing, hydrogen-bonding, and hydrophobic contact. Currently, the method has been applied successfully to five small proteins with simple topology. Although still under development, this approach offers promise for high-throughput NMR structure determination.
Subject(s)
Monte Carlo Method , Nuclear Magnetic Resonance, Biomolecular , Protein Conformation , Hydrogen Bonding , Models, Molecular
ABSTRACT
Globular proteins are assemblies of alpha-helices and beta-strands, interconnected by reverse turns and longer loops. Most short turns can be classified readily into a limited repertoire of discrete backbone conformations, but the physical-chemical determinants of these distinct conformational basins remain an open question. We investigated this question by exhaustive analysis of all backbone conformations accessible to short chain segments bracketed by either an alpha-helix or a beta-strand (i.e., alpha-segment-alpha, beta-segment-beta, alpha-segment-beta, and beta-segment-alpha) in a nine-state model. We find that each of these four secondary structure environments imposes its own unique steric and hydrogen-bonding constraints on the intervening segment, resulting in a limited repertoire of conformations. In greater detail, an exhaustive set of conformations was generated for short backbone segments having reverse-turn chain topology and bracketed between elements of secondary structure. This set was filtered, and only clash-free, hydrogen-bond-satisfied conformers having reverse-turn topology were retained. The filtered set includes authentic turn conformations, observed in proteins of known structure, but little else. In particular, over 99% of the alternative conformations failed to satisfy at least one criterion and were excluded from the filtered set. Furthermore, almost all of the remaining alternative conformations have close tolerances that would be too tight to accommodate side chains longer than a single beta-carbon. These results provide a molecular explanation for the observation that reverse turns between elements of regular secondary structure can be classified into a small number of discrete conformations.
Subject(s)
Protein Structure, Secondary , Computer Simulation , Databases, Protein , Hydrogen Bonding , Models, Molecular , Protein Folding , Proteins/chemistry
ABSTRACT
Genes composed of tandem repetitive sequence motifs are abundant in nature and are enriched in eukaryotes. To investigate repeat protein gene formation mechanisms, we have conducted a large-scale analysis of their introns and exons. We find that a wide variety of repeat motifs exhibit a striking conservation of intron position and phase, and are composed of exons that encode one or two complete repeats. These results suggest a simple model of repeat protein gene formation from local duplications. This model is corroborated by amino acid sequence similarity patterns among neighboring repeats from various repeat protein genes. The distribution of one- and two-repeat exons indicates that intron-facilitated repeat motif duplication, in which the start and end points of duplication are located in consecutive intronic regions, significantly exceeds intron-independent duplication. These results suggest that introns have contributed to the greater abundance of repeat protein genes in eukaryotic versus prokaryotic organisms, a conclusion that is supported by taxonomic analysis.
Subject(s)
Gene Duplication , Introns/genetics , Proteins/genetics , Repetitive Sequences, Nucleic Acid/genetics , Animals , Conserved Sequence/genetics , DNA/genetics , Exons/genetics , Models, Genetic , Models, Molecular
ABSTRACT
Using a test set of 13 small, compact proteins, we demonstrate that a remarkably simple protocol can capture native topology from secondary structure information alone, in the absence of long-range interactions. It has been a long-standing open question whether such information is sufficient to determine a protein's fold. Indeed, even the far simpler problem of reconstructing the three-dimensional structure of a protein from its exact backbone torsion angles has remained a difficult challenge owing to the small, but cumulative, deviations from ideality in backbone planarity, which, if ignored, cause large errors in structure. As a familiar example, a small change in an elbow angle causes a large displacement at the end of your arm; the longer the arm, the larger the displacement. Here, correct secondary structure assignments (alpha-helix, beta-strand, beta-turn, polyproline II, coil) were used to constrain polypeptide backbone chains devoid of side chains, and the most stable folded conformations were determined, using Monte Carlo simulation. Just three terms were used to assess stability: molecular compaction, steric exclusion, and hydrogen bonding. For nine of the 13 proteins, this protocol restricts the main chain to a surprisingly small number of energetically favorable topologies, with the native one prominent among them.
Subject(s)
Protein Folding , Protein Structure, Secondary , Computer Simulation , Hydrogen Bonding , Nerve Tissue Proteins/chemistry , Protein Conformation
ABSTRACT
The magnitude of protein conformational space is over-estimated by the traditional random-coil model, in which local steric restrictions arise exclusively from interactions between adjacent chain neighbors. Using a five-state model, we assessed the extent to which steric hindrance and hydrogen bond satisfaction, energetically significant factors, impose additional conformational restrictions on polypeptide chains, beyond adjacent residues. Steric hindrance is repulsive: the distance of closest approach between any two atoms cannot be less than the sum of their van der Waals radii. Hydrogen bond satisfaction is attractive: polar backbone atoms must form hydrogen bonds, either intramolecularly or to solvent water. To gauge the impact of these two factors on the magnitude of conformational space, we systematically enumerated and classified the disfavored conformations that restrict short polyalanyl backbone chains. Applying such restrictions to longer chains, we derived a scaling law to estimate conformational restriction as a function of chain length. Disfavored conformations predicted by the model were tested against experimentally determined structures in the coil library, a non-helix, non-strand subset of the PDB. These disfavored conformations are usually absent from the coil library, and exceptions can be uniformly rationalized.
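The steric criterion stated above (no two atoms closer than the sum of their van der Waals radii) can be sketched as a hard-sphere clash test. This is only a minimal illustration, not the study's code: the radii are the commonly tabulated Bondi values and may differ from those the authors used, and the function name, the `excluded` argument for covalently bonded pairs, and the coordinates are hypothetical.

```python
import math
from itertools import combinations

# Bondi van der Waals radii in angstroms (assumed; the study may use other values).
VDW_RADIUS = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def find_clashes(atoms, excluded=frozenset()):
    """Return index pairs whose separation is less than the sum of their vdW radii.

    atoms    -- list of (element, (x, y, z)) tuples
    excluded -- set of frozenset({i, j}) pairs to skip, e.g. covalently bonded atoms
    """
    clashes = []
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(enumerate(atoms), 2):
        if frozenset((i, j)) in excluded:
            continue
        if math.dist(xyz_i, xyz_j) < VDW_RADIUS[el_i] + VDW_RADIUS[el_j]:
            clashes.append((i, j))
    return clashes


if __name__ == "__main__":
    # Hypothetical coordinates: atoms 0 and 1 overlap, atom 2 is well separated.
    atoms = [("C", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0)), ("N", (0.0, 8.0, 0.0))]
    print(find_clashes(atoms))
```

A conformer would be classed as sterically disfavored if `find_clashes` returns any pair after bonded neighbors are excluded; a real implementation would typically also exclude atoms separated by two or three bonds.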
Subject(s)
Computer Simulation , Peptide Fragments/chemistry , Protein Conformation , Protein Folding , Proteins/chemistry , Water/chemistry , Alanine/chemistry , Alanine/metabolism , Hydrogen Bonding , Models, Chemical , Peptide Fragments/metabolism , Proteins/metabolism , Solvents , Stereoisomerism , Thermodynamics , Water/metabolism
ABSTRACT
RNABase is a unified database of all three-dimensional structures containing RNA deposited in either the Protein Data Bank (PDB) or Nucleic Acid Data Base (NDB). For each structure, RNABase contains a brief summary as well as annotation of conformational parameters, identification of possible model errors, Ramachandran-style conformational maps and classification of ribonucleotides into conformers. These same analyses can also be performed on structures submitted by users. To facilitate access, structures are automatically placed into a variety of functional and structural categories, including: ribozymes, pseudoknots, etc. RNABase can be freely accessed on the web at http://www.rnabase.org. We are committed to maintaining this database indefinitely.