ABSTRACT
Cytochrome P450cam (CYP101A1) catalyzes the hydroxylation of d-camphor by molecular oxygen. The enzyme-catalyzed hydroxylation exhibits a high degree of regioselectivity and stereoselectivity, with a single major product, d-5-exo-hydroxycamphor, suggesting that the substrate is oriented to facilitate this specificity. In previous work, we used an elastic network model and perturbation response scanning to show that normal deformation modes of the enzyme structure are highly responsive not only to the presence of a substrate but also to the substrate orientation. This work examines the effects of mutations near the active site on substrate localization and orientation. The investigated mutations were designed to promote a change in substrate orientation and/or location that might give rise to different hydroxylation products, while maintaining the same carbon and oxygen atom balances as in the wild type (WT) enzyme. Computational experiments and parallel in vitro site-directed mutations of CYP101A1 were used to examine reaction products and enzyme activity. 1H-15N TROSY-HSQC correlation maps were used to compare the computational results with detectable perturbations in the enzyme structure and dynamics. We found that all of the mutant enzymes retained the same regio- and stereospecificity of hydroxylation as the WT enzyme, with varying degrees of efficiency, which suggests that large portions of the enzyme have been subjected to evolutionary pressure to arrive at the appropriate sequence-structure combination for efficient 5-exo hydroxylation of camphor.
Subject(s)
Camphor 5-Monooxygenase, Camphor, Camphor/chemistry, Camphor 5-Monooxygenase/chemistry, Catalytic Domain, Cytochrome P-450 Enzyme System/metabolism, Hydroxylation, Mutation, Oxygen, Substrate Specificity
ABSTRACT
It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.
Subject(s)
Molecular Evolution, Proteins/genetics, Computational Biology/methods, Proteins/chemistry, Proteins/metabolism
ABSTRACT
Recent studies proposed that enzyme-active sites induce evolutionary constraints at long distances. The physical origin of such long-range evolutionary coupling is unknown. Here, I use a recent biophysical model of evolution to study the relationship between physical and evolutionary couplings on a diverse data set of monomeric enzymes. I show that evolutionary coupling is not universally long-range. Rather, range varies widely among enzymes, from 2 to 20 Å. Furthermore, the evolutionary coupling range of an enzyme does not inform on the underlying physical coupling, which is short range for all enzymes. Rather, evolutionary coupling range is determined by functional selection pressure.
Subject(s)
Enzymes, Molecular Evolution, Catalytic Domain, Enzymes/genetics, Enzymes/metabolism
ABSTRACT
The rate of evolution varies among sites within proteins. In enzymes, two rate gradients are observed: rate decreases with increasing local packing and it increases with increasing distance from catalytic residues. The rate-packing gradient would be mainly due to stability constraints and is well reproduced by biophysical models with selection for protein stability. However, stability constraints are unlikely to account for the rate-distance gradient. Here, to explore the mechanistic underpinnings of the rate gradients observed in enzymes, I propose a stability-activity model of enzyme evolution, MSA. This model is based on a two-dimensional fitness function that depends on stability, quantified by ΔG, the enzyme's folding free energy, and activity, quantified by ΔG*, the activation energy barrier of the enzymatic reaction. I test MSA on a diverse data set of enzymes, comparing it with two simpler models: MS, which depends only on ΔG, and MA, which depends only on ΔG*. I found that MSA clearly outperforms both MS and MA and it accounts for both the rate-packing and rate-distance gradients. Thus, MSA captures the distribution of stability and activity constraints within enzymes, explaining the resulting patterns of rate variation among sites.
Subject(s)
Enzyme Activation/genetics, Enzymes/genetics, Molecular Evolution, Genetic Models, Protein Stability, Enzymes/metabolism, Genetic Fitness
ABSTRACT
Functional residues in proteins tend to be highly conserved over evolutionary time. However, to what extent functional sites impose evolutionary constraints on nearby or even more distant residues is not known. Here, we report pervasive conservation gradients toward catalytic residues in a dataset of 524 distinct enzymes: evolutionary conservation decreases approximately linearly with increasing distance to the nearest catalytic residue in the protein structure. This trend encompasses, on average, 80% of the residues in any enzyme, and it is independent of known structural constraints on protein evolution such as residue packing or solvent accessibility. Further, the trend exists in both monomeric and multimeric enzymes and irrespective of enzyme size and/or location of the active site in the enzyme structure. By contrast, sites in protein-protein interfaces, unlike catalytic residues, are only weakly conserved and induce only minor rate gradients. In aggregate, these observations show that functional sites, and in particular catalytic residues, induce long-range evolutionary constraints in enzymes.
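The distance variable behind such a conservation gradient can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes that each residue is represented by its Cα coordinates and that distance is measured between Cα atoms, and the function name `nearest_catalytic_distance` and the toy coordinates are hypothetical.

```python
import math

def nearest_catalytic_distance(ca_coords, catalytic_idx):
    """Return, for every residue, the distance (in Å) from its C-alpha
    to the closest C-alpha of a catalytic residue.

    ca_coords     -- list of (x, y, z) tuples, one per residue (assumed input)
    catalytic_idx -- indices into ca_coords marking catalytic residues
    """
    if not catalytic_idx:
        raise ValueError("at least one catalytic residue is required")
    catalytic = [ca_coords[i] for i in catalytic_idx]
    return [min(math.dist(p, q) for q in catalytic) for p in ca_coords]

# Toy example: four residues on a line, 3.8 Å apart; residue 0 is catalytic.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
distances = nearest_catalytic_distance(coords, catalytic_idx=[0])
```

Plotting per-site conservation against `distances` would then show whether the trend is roughly linear for a given enzyme.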
Subject(s)
Enzymes/chemistry, Enzymes/metabolism, Molecular Evolution, Amino Acid Sequence, Catalytic Domain, Conserved Sequence, Protein Databases, Protein Conformation, Protein Interaction Domains and Motifs
ABSTRACT
Protein sequences evolve under selection pressures imposed by functional and biophysical requirements, resulting in site-dependent rates of amino acid substitution. Relative solvent accessibility (RSA) and local packing density (LPD) have emerged as the best candidates to quantify structural constraint. Recent research assumes that RSA is the main determinant of sequence divergence. However, it is not yet clear which is the best predictor of substitution rates. To address this issue, we compared RSA and LPD with site-specific rates of evolution for a diverse data set of enzymes. In contrast with recent studies, we found that LPD measures correlate better than RSA with evolutionary rate. Moreover, the independent contribution of RSA is minor. Taking into account that LPD is related to backbone flexibility, we put forward the possibility that the rate of evolution of a site is determined by the ease with which the backbone deforms to accommodate mutations.
Subject(s)
Enzymes/chemistry, Molecular Evolution, Structure-Activity Relationship, Amino Acid Substitution, Mutation, Protein Conformation, Solvents
ABSTRACT
Evolutionary-rate variation among sites within proteins depends on functional and biophysical properties that constrain protein evolution. It is generally accepted that proteins must be able to fold stably in order to function. However, the relationship between stability constraints and among-sites rate variation is not well understood. Here, we present a biophysical model that links the thermodynamic stability changes due to mutations at sites in proteins ([Formula: see text]) to the rate at which mutations accumulate at those sites over evolutionary time. We find that such a 'stability model' generally performs well, displaying correlations between predicted and empirically observed rates of up to 0.75 for some proteins. We further find that our model has predictive power comparable to that of an alternative, recently proposed 'stress model' that explains evolutionary-rate variation among sites in terms of the excess energy needed for mutants to adopt the correct active structure ([Formula: see text]). The two models make distinct predictions, though: for some proteins the stability model outperforms the stress model, and for others the reverse holds. We conclude that both stability and stress constrain site-specific sequence evolution in proteins.
Subject(s)
Amino Acid Sequence, Molecular Evolution, Mutation, Genetic Models, Thermodynamics
ABSTRACT
BACKGROUND: Protein sites evolve at different rates due to functional and biophysical constraints. It is usually considered that the main structural determinant of a site's rate of evolution is its Relative Solvent Accessibility (RSA). However, a recent comparative study has shown that the main structural determinant is the site's Local Packing Density (LPD). LPD is related to dynamical flexibility, which has also been shown to correlate with sequence variability. Our purpose is to investigate the mechanism that connects a site's LPD with its rate of evolution. RESULTS: We consider two models: an empirical Flexibility Model and a mechanistic Stress Model. The Flexibility Model postulates a linear increase of site-specific rate of evolution with dynamical flexibility. The Stress Model, introduced here, models mutations as random perturbations of the protein's potential energy landscape, for which we use simple Elastic Network Models (ENMs). To account for natural selection we assume a single active conformation and use basic statistical physics to derive a linear relationship between site-specific evolutionary rates and the local stress of the mutant's active conformation. We compare both models on a large and diverse dataset of enzymes. In a protein-by-protein study we found that the Stress Model outperforms the Flexibility Model for most proteins. Pooling all proteins together we show that the Stress Model is strongly supported by the total weight of evidence. Moreover, it accounts for the observed nonlinear dependence of sequence variability on flexibility. Finally, when mutational stress is controlled for, there is very little remaining correlation between sequence variability and dynamical flexibility. CONCLUSIONS: We developed a mechanistic Stress Model of evolution according to which the rate of evolution of a site is predicted to depend linearly on the local mutational stress of the active conformation. Such local stress is proportional to LPD, so that this model explains the relationship between LPD and evolutionary rate. Moreover, the model also accounts for the nonlinear dependence between evolutionary rate and dynamical flexibility.
Subject(s)
Molecular Evolution, Proteins/genetics, Mechanical Stress, Biological Evolution, Genetic Models, Pliability
ABSTRACT
MOTIVATION: The function of a protein depends not only on its structure but also on its dynamics. This is at the basis of a large body of experimental and theoretical work on protein dynamics. Further insight into the dynamics-function relationship can be gained by studying the evolutionary divergence of protein motions. To investigate this, we need appropriate comparative dynamics methods. The most used dynamical similarity score is the correlation between the root mean square fluctuations (RMSF) of aligned residues. Despite its usefulness, RMSF is in general less evolutionarily conserved than the native structure. A fundamental issue is whether RMSF is not as conserved as structure because dynamics is less conserved or because RMSF is not the best property to use to study its conservation. RESULTS: We performed a systematic assessment of several scores that quantify the (dis)similarity between protein fluctuation patterns. We show that the best scores perform as well as or better than structural dissimilarity, as assessed by their consistency with the SCOP classification. We conclude that to uncover the full extent of the evolutionary conservation of protein fluctuation patterns, it is important to measure the directions of fluctuations and their correlations between sites. CONTACT: Nathalie.Reuter@mbi.uib.no SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Online.
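The baseline score mentioned in this abstract, the correlation between the RMSF values of aligned residues, can be sketched as follows. This is a hedged illustration only: the function names, the representation of the alignment as index pairs, and the toy profiles are assumptions, and the sketch covers only the simple RMSF baseline, not the direction-aware scores that the abstract recommends.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)

def rmsf_similarity(rmsf_a, rmsf_b, alignment):
    """Correlate the RMSF profiles of two proteins over aligned residues.

    alignment -- list of (i, j) pairs: residue i of protein A is aligned
                 with residue j of protein B (gapped positions are omitted).
    """
    xa = [rmsf_a[i] for i, _ in alignment]
    xb = [rmsf_b[j] for _, j in alignment]
    return pearson(xa, xb)

# Toy profiles: protein B has one extra, unaligned residue at the end.
score = rmsf_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 9.0],
                        alignment=[(0, 0), (1, 1), (2, 2)])
```

Because RMSF is a scalar per residue, a score of this kind ignores the directions of fluctuations and their correlations between sites, which is the limitation the abstract points to.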
Subject(s)
Computational Biology/methods, Protein Databases, Proteins/chemistry, Protein Tertiary Structure
ABSTRACT
Studying the effect of perturbations on protein structure is a basic approach in protein research. Important problems, such as predicting pathological mutations and understanding patterns of structural evolution, have been addressed by computational simulations that model mutations using forces and predict the resulting deformations. In single mutation-response scanning simulations, a sensitivity matrix is obtained by averaging deformations over point mutations. In double mutation-response scanning simulations, a compensation matrix is obtained by minimizing deformations over pairs of mutations. These very useful simulation-based methods may be too slow to deal with large proteins, protein complexes, or large protein databases. To address this issue, I derived analytical closed formulas to calculate the sensitivity and compensation matrices directly, without simulations. Here, I present these derivations and show that the resulting analytical methods are much faster than their simulation counterparts.
ABSTRACT
It was recently found that the lowest-energy collective normal modes dominate the evolutionary divergence of protein structures. This was attributed to a presumed functional importance of such motions, i.e., to natural selection. In contrast to this selectionist explanation, we proposed that the observed behavior could be just the expected physical response of proteins to random mutations. This proposal was based on the success of a linearly forced elastic network model (LFENM) of mutational effects on structure to account for the observed pattern of structural divergence. Here, to further test the mutational explanation and the LFENM, we analyze the structural differences observed not only in homologous (globin-like) proteins but also in unselected experimentally engineered myoglobin mutants and in wild-type variants subject to other perturbations such as ligand-binding and pH changes. We show that the lowest normal modes dominate structural change in all the cases considered and that the LFENM reproduces this behavior quantitatively. The collective nature of the lowest normal modes results in global conformational changes that depend little on the exact nature or location of the perturbation. Significantly, the evolutionarily conserved structural core matches the regions observed to be more robust with respect to mutations, so that the core would be more conserved even under unselected random mutations. In a word, the observed patterns of structural variation can be seen as the natural response of proteins to perturbations and can be adequately modeled using the LFENM, which serves as a common framework to relate a priori different phenomena.
Subject(s)
Globins/chemistry, Globins/genetics, Animals, Computer Simulation, Protein Databases, Molecular Evolution, Genetic Models, Mutation, Myoglobin/chemistry, Myoglobin/genetics, Protein Conformation, Whales
ABSTRACT
Protein structures do not evolve uniformly; rather, the degree of structure divergence varies among sites. The resulting site-dependent structure divergence patterns emerge from a process that involves mutation and selection, both of which may, in principle, influence the emergent pattern. In contrast with sequence divergence patterns, which are known to be mainly determined by selection, the relative contributions of mutation and selection to structure divergence patterns are unclear. Here, studying 6 protein families with a mechanistic biophysical model of protein evolution, we untangle the effects of mutation and selection. We found that even in the absence of selection, structure divergence varies from site to site because the mutational sensitivity is not uniform. Selection scales the profile, increasing its amplitude, without changing its shape. This scaling effect follows from the similarity between mutational sensitivity and sequence variability profiles.
ABSTRACT
The aim of the present work is to study the evolutionary divergence of vibrational protein dynamics. To this end, we used the Gaussian Network Model to perform a systematic analysis of normal mode conservation on a large dataset of proteins classified into homologous sets of family pairs and superfamily pairs. We found that the lowest, most collective normal modes are the most conserved ones. More precisely, there is, on average, a linear correlation between normal mode conservation and mode collectivity. These results imply that the previously observed conservation of backbone flexibility (B-factor) profiles is due to the conservation of the most collective modes, which contribute the most to such profiles. We discuss the possible roles of normal mode robustness and natural selection in the determination of the observed behavior. Finally, we draw some practical implications for dynamics-based protein alignment and classification and discuss possible caveats of the present approach.
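Two ingredients of such an analysis can be sketched briefly: the connectivity (Kirchhoff) matrix on which a Gaussian Network Model is built, and a collectivity measure for a mode. The sketch rests on assumptions that the abstract does not state: a 7.0 Å Cα cutoff (a common but arbitrary choice) and the entropy-based collectivity often attributed to Brüschweiler. The function names are hypothetical, and the diagonalization step that yields the modes is omitted.

```python
import math

def kirchhoff(ca_coords, cutoff=7.0):
    """Build the GNM connectivity matrix: -1 for residue pairs whose
    C-alpha atoms lie within `cutoff` Å, contact counts on the diagonal."""
    n = len(ca_coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma

def collectivity(mode):
    """Entropy-based collectivity of a mode (assumed definition):
    exp(-sum p_i ln p_i) / N, with p_i the normalized squared amplitudes.
    It equals 1 when all residues move equally and 1/N when only one moves."""
    norm2 = sum(u * u for u in mode)
    if norm2 == 0.0:
        raise ValueError("mode vector must be non-zero")
    p = [u * u / norm2 for u in mode]
    entropy = -sum(q * math.log(q) for q in p if q > 0.0)
    return math.exp(entropy) / len(mode)

# Toy chain of four residues spaced 3.8 Å apart along one axis.
chain = [(3.8 * k, 0.0, 0.0) for k in range(4)]
gamma = kirchhoff(chain)
uniform = collectivity([1.0, 1.0, 1.0, 1.0])
localized = collectivity([1.0, 0.0, 0.0, 0.0])
```

In a full analysis the modes would be the eigenvectors of `gamma` obtained with a linear-algebra library, and `collectivity` would be evaluated for each of them.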
Subject(s)
Molecular Evolution, Molecular Models, Proteins/genetics, Sequence Alignment, Proteins/classification, Sequence Alignment/methods
ABSTRACT
Noncovalent interactions and physicochemical properties of amino acids are important topics in biochemistry courses. Here, we present a computational laboratory in which the capacity of each of the 20 amino acids to maintain different noncovalent interactions is used to investigate the stabilizing forces in a set of proteins from organisms adapted to different environments. Using protein sequence and structure information, it is possible to evaluate the noncovalent contributions to the stabilization of a given protein fold. As a case study, we use the protein lumazine synthase from three different organisms adapted to live at different temperatures: one psychrophilic (optimal growth temperature, 0-20 °C), one mesophilic (optimal growth temperature, 20-50 °C), and one thermophilic (optimal growth temperature, 80-110 °C). We found that this computational laboratory for biochemistry and molecular biology courses enhances students' understanding of the noncovalent interactions of amino acids and of how these interactions are involved in protein stability.
ABSTRACT
The integration of molecular evolution and protein biophysics is an emerging theme that steadily gained importance during the last 15 years, significantly advancing both fields. The central integrative concept is the stability of the native state, although non-native conformations are increasingly recognized to play a major role, concerning, for example, aggregation, folding kinetics, or functional dynamics. Besides molecular requirements on fitness, the stability of native and alternative conformations is modulated by a variety of factors, including population size, selective pressure on the replicative system, which determines mutation rates and biases, and epistatic effects. We discuss some of the recent advances, open questions, and integrating views in protein evolution, in light of the many underlying trade-offs, correlations, and dichotomies.
Subject(s)
Biophysical Phenomena, Molecular Evolution, Proteins/metabolism, Mutation, Protein Stability, Proteins/chemistry, Proteins/genetics
ABSTRACT
For decades, rates of protein evolution have been interpreted in terms of the vague concept of functional importance. Slowly evolving proteins or sites within proteins were assumed to be more functionally important and thus subject to stronger selection pressure. More recently, biophysical models of protein evolution, which combine evolutionary theory with protein biophysics, have completely revolutionized our view of the forces that shape sequence divergence. Slowly evolving proteins have been found to evolve slowly because of selection against toxic misfolding and misinteractions, linking their rate of evolution primarily to their abundance. Similarly, most slowly evolving sites in proteins are not directly involved in function, but mutating these sites has a large impact on protein structure and stability. In this article, we review the studies in the emerging field of biophysical protein evolution that have shaped our current understanding of sequence divergence patterns. We also propose future research directions to develop this nascent field.
Subject(s)
Molecular Evolution, Proteins/chemistry, Proteins/genetics, Biophysical Phenomena, Biophysics, Genetic Fitness, Humans, Mutation, Protein Folding, Protein Stability, Thermodynamics
ABSTRACT
We have obtained AMBER94 force-field parameters for the TTQ cofactor of the enzyme methylamine dehydrogenase (MADH). This enzyme catalyzes the oxidation of methylamine to produce formaldehyde and ammonia. In the rate-determining step of the catalyzed reaction, a proton is transferred from the methyl group of the substrate to residue Asp76. We used the new parameters to perform molecular dynamics simulations of MADH in order to characterize the dynamics of the active site prior to the proton-transfer step. We found that only one of the oxygen atoms of Asp76 can act as an acceptor of the proton. The other oxygen interacts with Thr122 via a strong hydrogen bond. In contrast, because of the rotation of the methyl group of the substrate, the three methyl hydrogen atoms are alternately in position to be transferred. The distance that the proton has to travel presents a broad distribution with a peak between 1.0 and 1.1 Å and reaches values as short as 0.8 Å. The fluctuation of the distance between the donor and the acceptor has its largest frequency component at 50 cm⁻¹, but the spectrum presents a rich structure between 10 and 400 cm⁻¹. The more important peaks appear below 250 cm⁻¹.
Subject(s)
Oxidoreductases Acting on CH-NH Group Donors/chemistry, Binding Sites, Hydrogen Bonding, Molecular Models
ABSTRACT
BACKGROUND: Protein structure research often deals with the comparison of two or more structures of the same protein, for instance when handling alternative structure models for the same protein, point mutants, molecule movements, structure predictions, etc. Often the difference between structures is small, restricted to a local neighborhood, and buried in structural "noise" due to trivial differences resulting from experimental artifacts. In such cases, whole-structure comparisons by means of structure superposition may be unsatisfactory, and researchers have to perform a tedious process of manually superposing different segments individually and/or using different frames of reference, chosen roughly by educated guessing. RESULTS: We have developed an algorithm to compare local structural differences between alternative structures of the same protein. We have implemented the algorithm in a computer program that performs the numerical evaluation and allows visual inspection of the results of the structure comparison. We have tested the algorithm on different kinds of model systems. Here we present the algorithm and some results to illustrate its characteristics. CONCLUSION: This program may provide insight into the local structural changes produced in a protein structure by different interactions or modifications. It is convenient for the general user and can be applied to standard or specific tasks in protein structure research.
Subject(s)
Algorithms, Molecular Models, Proteins/chemistry, Data Display, Structural Models, Sequence Alignment
ABSTRACT
The Structurally Constrained Protein Evolution (SCPE) model simulates protein evolution by introducing random mutations into the evolving sequences and selecting them against too much structural perturbation. Given a single protein structure, the SCPE model can be used to obtain a whole set of site-dependent amino acid substitution matrices. The set of SCPE substitution matrices for a given protein family can be seen as an independent-sites model of evolution for that family. Thus, these matrices can be compared with other substitution-matrix-based models of evolution. So far, SCPE has been tested only on left-handed parallel beta helix (LbetaH) proteins. Here, we address the question of generality by assessing the SCPE model on representatives of the four main classes of folds: alpha, beta, alpha+beta, and alpha/beta. We compare with other models using the likelihood ratio test with parametric bootstrapping. We show that SCPE performs better than the popular JTT model for all cases considered. Furthermore, by considering the relative contributions of mutation and selection, we found that the key to the success of the SCPE model is the selection step.
Subject(s)
Molecular Evolution, Genetic Models, Proteins/genetics, Algorithms, Protein Databases, Mutation, Phylogeny, Protein Folding, Proteins/chemistry, Genetic Selection
ABSTRACT
In protein evolution, due to functional and biophysical constraints, the rates of amino acid substitution differ from site to site. Among the best predictors of site-specific rates are solvent accessibility and packing density. The packing density measure that best correlates with rates is the weighted contact number (WCN), the sum of inverse square distances between a site's Cα and the Cα atoms of the other sites. According to a mechanistic stress model proposed recently, rates are determined by packing because mutating packed sites stresses and destabilizes the protein's active conformation. While WCN is a measure of Cα packing, mutations replace side chains. Here, we consider whether a site's evolutionary divergence is constrained by main-chain packing or side-chain packing. To address this issue, we extended the stress theory to model side chains explicitly. The theory predicts that rates should depend solely on side-chain contact density. We tested this prediction on a data set of structurally and functionally diverse monomeric enzymes. We compared side-chain contact density with main-chain contact density measures and with relative solvent accessibility (RSA). We found that side-chain contact density is the best predictor of rate variation among sites (it explains 39.2% of the variation). Moreover, the independent contributions of main-chain contact density measures and RSA are negligible. Thus, as predicted by the stress theory, site-specific evolutionary rates are determined by side-chain packing.
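The weighted contact number defined in this abstract, the sum over all other sites of the inverse squared distance, can be computed directly from coordinates. The sketch below is illustrative only: the function name and the toy coordinates are hypothetical, and the input is assumed to be one representative point per site (for example, the Cα atom).

```python
def weighted_contact_number(coords):
    """Return WCN_i = sum over j != i of 1 / r_ij**2 for every site.

    coords -- list of (x, y, z) tuples, one representative point per site
              (assumed here to be the C-alpha position).
    """
    wcn = []
    for i, p in enumerate(coords):
        total = 0.0
        for j, q in enumerate(coords):
            if i != j:
                dist_sq = sum((a - b) ** 2 for a, b in zip(p, q))
                total += 1.0 / dist_sq
        wcn.append(total)
    return wcn

# Toy example: three sites on a line at unit spacing.
# The middle site has the most neighbors nearby, hence the highest WCN.
wcn = weighted_contact_number([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
```

The same sum could in principle be evaluated over a point representing each side chain rather than the Cα; which representative point the authors used for side-chain contact density is not stated in this abstract.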