ABSTRACT
Enzymes have been shaped by evolution over billions of years to catalyse the chemical reactions that support life on Earth. Whether dispersed in the literature or organised in online databases, knowledge about enzymes can be structured along distinct dimensions, related either to their nature as biological macromolecules, such as their sequence and structure, or to their chemical functions, such as the catalytic site, kinetics, mechanism, and overall reaction. The evolution of enzymes can only be understood when each of these dimensions is considered. In addition, many of the properties of enzymes only make sense in the light of evolution. We start this review by outlining the main paradigms of enzyme evolution, including gene duplication and divergence, convergent evolution, and evolution by recombination of domains. In the second part, we give an overview of the current collective knowledge about enzymes, as organised by different types of data and collected in several databases. We also highlight some increasingly powerful computational tools that can be used to close gaps in understanding, in particular for types of data that require laborious experimental protocols. We believe that recent advances in protein structure prediction will be a powerful catalyst for the prediction of binding, mechanism, and ultimately, chemical reactions. A comprehensive mapping of enzyme function and evolution may be attainable in the near future.
Subject(s)
Computational Biology, Enzymes, Proteins, Catalysis, Catalytic Domain, Enzymes/genetics, Enzymes/metabolism, Molecular Evolution, Proteins/genetics
ABSTRACT
Over the years, hundreds of enzyme reaction mechanisms have been studied using experimental and simulation methods. This rich literature on biological catalysis is now ripe for use as the foundation of new knowledge-based approaches to investigate enzyme mechanisms. Here, we present a tool able to automatically infer mechanistic paths for a given three-dimensional active site and enzyme reaction, based on a set of catalytic rules compiled from the Mechanism and Catalytic Site Atlas, a database of enzyme mechanisms. EzMechanism (pronounced as 'Easy' Mechanism) is available to everyone through a web user interface. When studying a mechanism, EzMechanism facilitates and improves the generation of hypotheses, by making sure that relevant information is considered, as derived from the literature on both related and unrelated enzymes. We validated EzMechanism on a set of 62 enzymes and have identified paths for further improvement, including the need for additional and more generic catalytic rules.
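The idea of inferring a mechanistic path from a set of catalytic rules can be illustrated with a toy breadth-first search. This is only a sketch under strong assumptions: states are plain sets of string labels, the two rules and the `find_path` helper are invented for illustration, and nothing here reflects EzMechanism's actual rule format or search algorithm.

```python
from collections import deque

# Hypothetical rules: (name, labels required, labels removed, labels added).
RULES = [
    ("proton_transfer",
     {"His:base", "Ser:OH"}, {"His:base", "Ser:OH"}, {"His:H+", "Ser:O-"}),
    ("nucleophilic_attack",
     {"Ser:O-", "substrate:C=O"}, {"Ser:O-", "substrate:C=O"},
     {"tetrahedral_intermediate"}),
]


def find_path(start, goal_label, rules=RULES, max_steps=10):
    """Breadth-first search for the shortest rule sequence yielding goal_label."""
    start = frozenset(start)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal_label in state:
            return path
        if len(path) >= max_steps:
            continue
        for name, required, removed, added in rules:
            if required <= state:  # rule applies only if all required labels exist
                new_state = frozenset((state - removed) | added)
                if new_state not in seen:
                    seen.add(new_state)
                    queue.append((new_state, path + [name]))
    return None  # no rule sequence reaches the goal


path = find_path({"His:base", "Ser:OH", "substrate:C=O"},
                 "tetrahedral_intermediate")
print(path)
```

In this toy setting the search returns the two rule names in order of application; a real rule set would need to encode atoms, bonds, and charges rather than opaque labels.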
ABSTRACT
Enzyme catalysis is governed by a limited toolkit of residues and organic or inorganic co-factors. Therefore, it is expected that recurring residue arrangements will be found across the enzyme space, which perform a defined catalytic function, are structurally similar and occur in unrelated enzymes. Leveraging the integrated information in the Mechanism and Catalytic Site Atlas (M-CSA) (enzyme structure, sequence, catalytic residue annotations, catalysed reaction, detailed mechanism description), 3D templates were derived to represent compact groups of catalytic residues. A fuzzy template-template search allowed us to identify those recurring motifs, whether conserved or convergent, which we define as the "modules of enzyme catalysis". We show that a large fraction of these modules facilitate binding of metal ions, co-factors and substrates, and are frequently the result of convergent evolution. A smaller number of convergent modules perform a well-defined catalytic role, such as the variants of the catalytic triad (i.e. Ser-His-Asp/Cys-His-Asp) and the saccharide-cleaving Asp/Glu triad. It is also shown that enzymes whose functions have diverged during evolution preserve regions of their active site unaltered, as shown by modules performing similar or identical steps of the catalytic mechanism. We have compiled a comprehensive library of catalytic modules that characterise a broad spectrum of enzymes. These modules can be used as templates in enzyme design and for better understanding catalysis in 3D.
ABSTRACT
The drug discovery process involves designing compounds to selectively interact with their targets. The majority of therapeutic targets for low molecular weight (small molecule) drugs are proteins. The outstanding accuracy with which recent artificial intelligence methods predict the three-dimensional structure of proteins has made protein targets more accessible to the drug design process. Here, we present our perspective on the significance of accurate protein structure prediction for the various stages of the small molecule drug discovery life cycle, focusing on current capabilities and assessing how further evolution of such predictive procedures can have a more decisive impact on the discovery of new medicines.
Subject(s)
Artificial Intelligence, Drug Discovery, Drug Discovery/methods, Drug Design, Proteins/chemistry
ABSTRACT
Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods for protein structure predictions have reached the accuracy of experimentally determined models. Although this has been independently verified, the implementation of these methods across structural-biology applications remains to be tested. Here, we evaluate the use of AlphaFold2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modeling of interactions; and modeling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modeled when compared with homology modeling, identifying structural features rarely seen in the Protein Data Bank. AF2-based predictions of protein disorder and complexes surpass dedicated tools, and AF2 models can be used across diverse applications equally well compared with experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life-science research.
Subject(s)
Computational Biology, Furylfuramide, Computational Biology/methods, Binding Sites, Proteins/chemistry, Protein Databases, Protein Conformation
ABSTRACT
Conformational variation in catalytic residues can be captured as alternative snapshots in enzyme crystal structures. Addressing the question of whether active site flexibility is an intrinsic and essential property of enzymes for catalysis, we present a comprehensive study on the 3D variation of active sites of 925 enzyme families, using explicit catalytic residue annotations from the Mechanism and Catalytic Site Atlas and structural data from the Protein Data Bank. Through weighted pairwise superposition of the functional atoms of active sites, we captured structural variability at single-residue level and examined the geometrical changes as ligands bind or as mutations occur. We demonstrate that catalytic centres of enzymes can be inherently rigid or flexible to various degrees according to the function they perform, and structural variability most often involves a subset of the catalytic residues, usually those not directly involved in the formation or cleavage of bonds. Moreover, data suggest that 2/3 of active sites are flexible, and in half of those, flexibility is only observed in the side chain. The goal of this work is to characterise our current knowledge of the extent of flexibility at the heart of catalysis and ultimately place our findings in the context of the evolution of catalysis as enzymes evolve new functions and bind different substrates.
Subject(s)
Biocatalysis, Catalytic Domain, Enzymes, Protein Databases, Enzymes/chemistry, Ligands
ABSTRACT
Enzyme reactions take place in the active site through a series of catalytic steps, which are collectively termed the enzyme mechanism. The catalytic step is thereby the individual unit to consider for the purposes of building new enzyme mechanisms - i.e. through the mix and match of individual catalytic steps, new enzyme mechanisms and reactions can be conceived. In the case of natural evolution, it has been shown that new enzyme functions have emerged through the tweaking of existing mechanisms by the addition, removal, or modification of some catalytic steps, while maintaining other steps of the mechanism intact. Recently, we have extracted and codified the information on the catalytic steps of hundreds of enzymes in a machine-readable way, with the aim of automating this kind of evolutionary analysis. In this paper, we illustrate how these data, which we called the "rules of enzyme catalysis", can be used to identify similar catalytic steps across enzymes that differ in their overall function and/or structural folds. A discussion on a set of three enzymes that share part of their mechanism is used as an exemplar to illustrate how this approach can reveal divergent and convergent evolution of enzymes at the mechanistic level. Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-022-01022-9.
ABSTRACT
MOTIVATION: The discovery of protein-ligand-binding sites is a major step for elucidating protein function and for investigating new functional roles. Detecting protein-ligand-binding sites experimentally is time-consuming and expensive. Thus, a variety of in silico methods to detect and predict binding sites have been proposed, as they can be scalable, fast and low-cost. RESULTS: We propose Graph-based Residue neighborhood Strategy to Predict binding sites (GRaSP), a novel residue-centric and scalable method to predict ligand-binding site residues. It is based on a supervised learning strategy that models the residue environment as a graph at the atomic level. Results show that GRaSP made compatible or superior predictions when compared with methods described in the literature. GRaSP outperformed six other residue-centric methods, including the one considered as state-of-the-art. Also, our method achieved better results than the method from the CAMEO independent assessment. GRaSP ranked second when compared with five state-of-the-art pocket-centric methods, which we consider a significant result, as it was not devised to predict pockets. Finally, our method proved scalable, as it took 10-20 s on average to predict the binding site for a protein complex, whereas the state-of-the-art residue-centric method takes 2-5 h on average. AVAILABILITY AND IMPLEMENTATION: The source code and datasets are available at https://github.com/charles-abreu/GRaSP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
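Modelling a residue environment as a graph at the atomic level can be sketched as follows. This is a minimal illustration, not GRaSP's implementation: the 4.0 Å distance cutoff, the tuple layout of `atoms`, and both function names are assumptions made here, and the feature extraction and supervised learning steps are omitted.

```python
import math
from itertools import combinations


def build_atom_graph(atoms, cutoff=4.0):
    """Return an adjacency dict linking atoms within `cutoff` angstroms.

    `atoms` is a list of (atom_id, residue_id, (x, y, z)) tuples
    (a hypothetical layout chosen for this sketch).
    """
    graph = {atom_id: set() for atom_id, _, _ in atoms}
    for (id_a, _, xyz_a), (id_b, _, xyz_b) in combinations(atoms, 2):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            graph[id_a].add(id_b)
            graph[id_b].add(id_a)
    return graph


def residue_neighbourhood(atoms, graph, residue_id):
    """Residues with at least one atom adjacent to an atom of `residue_id`."""
    owner = {atom_id: res for atom_id, res, _ in atoms}
    neighbours = set()
    for atom_id, res, _ in atoms:
        if res == residue_id:
            neighbours.update(owner[n] for n in graph[atom_id])
    neighbours.discard(residue_id)  # a residue is not its own neighbour
    return neighbours


# Toy coordinates, invented for illustration only.
atoms = [
    ("A1", "SER10", (0.0, 0.0, 0.0)),
    ("A2", "SER10", (1.5, 0.0, 0.0)),
    ("A3", "HIS57", (3.5, 0.0, 0.0)),
    ("A4", "ASP99", (12.0, 0.0, 0.0)),
]
graph = build_atom_graph(atoms)
print(residue_neighbourhood(atoms, graph, "SER10"))
```

The all-pairs loop is quadratic in the number of atoms; a spatial index would be the usual replacement for whole-protein inputs.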
Subject(s)
Proteins, Software, Binding Sites, Hand Strength, Ligands
ABSTRACT
Pseudoenzymes are proteins that are evolutionarily related to enzymes but lack relevant catalytic activity. They have usually evolved from enzymatic ancestors that have lost their catalytic activities. The loss of catalytic function is one extreme amongst the other evolutionary changes that can occur to enzymes, like the changing of substrate specificity or the reaction catalysed. However, events involving loss of catalytic function remain poorly characterised, except for some notable examples, like the pseudokinases. In this review, we aim to analyse current knowledge related to pseudoenzymes across a large number of enzyme families. This aims to be a review of the data available in biological databases, rather than a more traditional literature review. In particular, we use UniProtKB as the source for functional annotation and M-CSA (Mechanism and Catalytic Site Atlas) for information on the catalytic residues of enzymes. We show that explicit annotation of lack of activity is not exhaustive in UniProtKB and that a protocol using lack of catalytic annotation as an indication for lack of function can be an adequate alternative, after some corrections. After identifying pseudoenzymes related to enzymes in M-CSA, we were able to comment on their prevalence across enzyme families, and on the correlation between lack of catalytic function and the mutation of catalytic residues. These analyses challenge two common ideas in the emerging literature: that pseudoenzymes are ubiquitous across enzyme families and that mutations in the catalytic residues of enzyme homologues are always a good indication of lack of activity.
Subject(s)
Enzymes, Knowledge Bases, Molecular Sequence Annotation/methods, Proteins/analysis, Proteins/metabolism, Humans, Molecular Sequence Annotation/standards
ABSTRACT
The catalytic residues of an enzyme comprise the amino acids located in the active center responsible for accelerating the enzyme-catalyzed reaction. These residues lower the activation energy of reactions by performing several catalytic functions. Decades of enzymology research have established general themes regarding the roles of specific residues in these catalytic reactions, but it has been more difficult to explore these roles in a more systematic way. Here, we review the data on the catalytic residues of 648 enzymes, as annotated in the Mechanism and Catalytic Site Atlas (M-CSA), and compare our results with those in previous studies. We structured this analysis around three key properties of the catalytic residues: amino acid type, catalytic function, and sequence conservation in homologous proteins. As expected, we observed that catalysis is mostly accomplished by a small set of residues performing a limited number of catalytic functions. Catalytic residues are typically highly conserved, but to a smaller degree in homologues that perform different reactions or are nonenzymes (pseudoenzymes). Cross-analysis yielded further insights, revealing which residues perform particular functions and how often. We obtained more detailed specificity rules for certain functions by identifying the chemical group upon which the residue acts. Finally, we show the mutation tolerance of the catalytic residues based on their roles. The characterization of the catalytic residues, their functions, and conservation, as presented here, is key to understanding the impact of mutations in evolution, disease, and enzyme design. The tools developed for this analysis are available at the M-CSA website and allow for user-specific analysis of the same data.
Subject(s)
Amino Acids/chemistry, Catalytic Domain, Enzymes/chemistry, Amino Acid Sequence, Amino Acids/metabolism, Animals, Biocatalysis, Conserved Sequence, Protein Databases, Enzymes/metabolism, Humans
ABSTRACT
Transform-MinER (Transforming Molecules in Enzyme Reactions) is a web application facilitating the exploration of chemical biosynthetic space, guiding the user toward promising start points for enzyme design projects or directed evolution experiments. Two types of search are possible: Molecule Search allows a user to submit a source substrate enabling Transform-MinER to search for enzyme reactions acting on similar substrates, whereas Path Search additionally allows a user to submit a target molecule enabling Transform-MinER to search for a path of enzyme reactions acting on similar substrates to link source and target. Transform-MinER searches for potential reaction centers in the source substrate and uses chemoinformatic fingerprints to identify those that are situated in molecular environments similar to native counterparts, prioritizing steps that move closer to the target using reactions most similar to native in its exploration of search space. The ligand-based methodology behind Transform-MinER is presented, and its performance is validated yielding 90% success rates: first, on a data set of native pathways from the KEGG database, and second, on a data set of de novo enzyme reactions.
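Comparing molecular environments by fingerprint can be sketched with a set-based similarity score. The abstract does not say which metric or fingerprint Transform-MinER uses; the Tanimoto coefficient below is a common choice adopted here as an assumption, and the fingerprints (sets of on-bit indices), reaction names, and function names are all invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_by_similarity(query_fp, candidates):
    """Sort (name, fingerprint) pairs from most to least similar to the query."""
    return sorted(candidates,
                  key=lambda item: tanimoto(query_fp, item[1]),
                  reverse=True)


# Hypothetical fingerprints for a query reaction centre and three native ones.
query = {1, 4, 7, 9}
native_centres = [
    ("reaction_A", {1, 4, 7, 8}),
    ("reaction_B", {2, 3}),
    ("reaction_C", {1, 4, 7, 9}),
]

for name, fp in rank_by_similarity(query, native_centres):
    print(name, round(tanimoto(query, fp), 2))
```

Ranking candidates this way is what lets a search prioritise steps whose reaction centres most resemble native ones; a path search would apply the same scoring at each step towards the target.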
Subject(s)
Cheminformatics/methods, Data Mining/methods, Pharmaceutical Technology/methods, Aldehyde-Lyases/chemistry, Algorithms, Biocatalysis, Chemical Databases, Ligands, Sitagliptin Phosphate/chemical synthesis, Software, Substrate Specificity, Synthetic Biology/methods, Transaminases/chemistry
ABSTRACT
MOTIVATION: Cofactors are essential for many enzyme reactions. The Protein Data Bank (PDB) contains >67 000 entries containing enzyme structures, many with bound cofactor or cofactor-like molecules. This work aims to identify and categorize these small molecules in the PDB and make it easier to find them. RESULTS: The Protein Data Bank in Europe (PDBe; pdbe.org) has implemented a pipeline to identify enzyme cofactor and cofactor-like molecules, which are now part of the PDBe weekly release process. AVAILABILITY AND IMPLEMENTATION: Information is made available on the individual PDBe entry pages at pdbe.org and programmatically through the PDBe REST API (pdbe.org/api). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Protein Databases, Coenzymes, Europe, Protein Conformation
ABSTRACT
Motivation: One goal of synthetic biology is to make new enzymes to generate new products, but identifying the starting enzymes for further investigation is often elusive and relies on expert knowledge, intensive literature searching and trial and error. Results: We present Transform Molecules in Enzyme Reactions, an online computational tool that transforms query substrate molecules into products using enzyme reactions. The most similar native enzyme reactions for each transformation are found, highlighting those that may be of most interest for enzyme design and directed evolution approaches. Availability and implementation: https://www.ebi.ac.uk/thornton-srv/transform-miner.
Subject(s)
Enzymes/analysis, Software
ABSTRACT
There are numerous applications that use the structures of protein-ligand complexes from the PDB, such as 3D pharmacophore identification, virtual screening, and fragment-based drug design. The structures underlying these applications are potentially much more informative if they contain biologically relevant bound ligands, with high similarity to the cognate ligands. We present a study of ligand-enzyme complexes that compares the similarity of bound and cognate ligands, enabling the best matches to be identified. We calculate the molecular similarity scores using a method called PARITY (proportion of atoms residing in identical topology), which can conveniently be combined to give a similarity score for all cognate reactants or products in the reaction. Thus, we generate a rank-ordered list of related PDB structures, according to the biological similarity of the ligands bound in the structures.
Subject(s)
Acetylcholine/chemistry, Acetylcholinesterase/chemistry, Biosimilar Pharmaceuticals/chemistry, Uroporphyrinogen III Synthetase/chemistry, Uroporphyrinogens/chemistry, Acetylcholine/metabolism, Acetylcholinesterase/metabolism, Binding Sites, Biosimilar Pharmaceuticals/metabolism, Humans, Ligands, Molecular Docking Simulation, Protein Binding, Substrate Specificity, Uroporphyrinogen III Synthetase/metabolism, Uroporphyrinogens/metabolism
ABSTRACT
A set of low molecular weight compounds containing a hydroxyethylamine (HEA) core structure with different prime side alkyl-substituted 4,5,6,7-tetrahydrobenzazoles and one 4,5,6,7-tetrahydropyridinoazole was synthesized. Striking differences in potency were observed in the BACE-1 enzymatic and cellular assays depending on the nature of the heteroatoms in the bicyclic ring, from the weakly active compound 4 to inhibitor 6, which displayed BACE-1 IC50 values of 44 nM (enzyme assay) and 65 nM (cell-based assay).
Subject(s)
Amyloid Precursor Protein Secretases/antagonists & inhibitors, Aspartic Acid Endopeptidases/antagonists & inhibitors, Azoles/chemical synthesis, Benzoxazoles/chemical synthesis, Drug Design, Enzyme Inhibitors/chemical synthesis, Ethylamines/chemical synthesis, Pyridines/chemical synthesis, Animals, Azoles/chemistry, Azoles/pharmacology, Benzoxazoles/chemistry, Benzoxazoles/pharmacology, Catalytic Domain, X-Ray Crystallography, Enzyme Activation/drug effects, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, Ethylamines/chemistry, Ethylamines/pharmacology, Humans, Inhibitory Concentration 50, Male, Mice, Inbred C57BL Mice, Molecular Models, Molecular Structure, Pyridines/chemistry, Pyridines/pharmacology
ABSTRACT
Two types of P1-P3-linked macrocyclic renin inhibitors containing the hydroxyethylene isostere (HE) scaffold just outside the macrocyclic ring have been synthesized. An aromatic or aliphatic substituent (P3sp) was introduced in the macrocyclic ring, aimed at the S3 subpocket (S3sp), in order to optimize potency. A 5-6-fold improvement in both the Ki and the human plasma renin activity (HPRA) IC50 was observed when moving from the starting linear peptidomimetic compound 1 to the most potent macrocycle 42 (Ki = 3.3 nM and HPRA IC50 = 7 nM). Truncation of the prime side of 42 led to an 8-10-fold loss of inhibitory activity in macrocycle 43 (Ki = 34 nM and HPRA IC50 = 56 nM). All macrocycles were epimeric mixtures with regard to the P3sp substituent, and X-ray crystallographic data of the representative renin-macrocycle 43 complex showed that only the S-isomer buried the substituent into the S3sp. Inhibitory selectivity over cathepsin D (Cat-D) and BACE-1 was also investigated for all the macrocycles and showed that truncation of the prime side increased selectivity of inhibition in favor of renin.
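The fold change between macrocycles 42 and 43 can be checked directly from the values reported in the abstract. The snippet below is a small arithmetic sketch; the `fold_change` helper is a name introduced here for illustration.

```python
def fold_change(reference_nm, other_nm):
    """Ratio of two potency values in the same units; >1 means `other` is weaker."""
    return other_nm / reference_nm


# Values reported in the abstract for macrocycles 42 and 43 (nM).
ki_42, ki_43 = 3.3, 34.0
ic50_42, ic50_43 = 7.0, 56.0

print(round(fold_change(ki_42, ki_43), 1))      # Ki ratio, 43 vs 42
print(round(fold_change(ic50_42, ic50_43), 1))  # HPRA IC50 ratio, 43 vs 42
```

The two ratios come to roughly 10 for Ki and 8 for the HPRA IC50, in line with the 8-10-fold loss quoted for truncation of the prime side.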
Subject(s)
Macrocyclic Compounds/chemistry, Protease Inhibitors/chemical synthesis, Renin/antagonists & inhibitors, Aspartic Acid Endopeptidases/antagonists & inhibitors, Aspartic Acid Endopeptidases/metabolism, Binding Sites, Cathepsin D/antagonists & inhibitors, Cathepsin D/metabolism, X-Ray Crystallography, Drug Design, Humans, Macrocyclic Compounds/chemical synthesis, Macrocyclic Compounds/pharmacology, Protease Inhibitors/chemistry, Protease Inhibitors/pharmacology, Renin/metabolism
ABSTRACT
Highly potent BACE-1 protease inhibitors have been developed from inhibitors containing a hydroxyethylene (HE) core displaying an aryloxymethyl or benzyloxymethyl P1 side chain and a methoxy P1' side chain. The target molecules were synthesized in good overall yields from chiral carbohydrate starting materials. The inhibitors show high BACE-1 potency and good selectivity against cathepsin D; the most potent inhibitor has a BACE-1 Ki << 1 nM and displays >1000-fold selectivity over cathepsin D.
Subject(s)
Amyloid Precursor Protein Secretases/antagonists & inhibitors, Aspartic Acid Endopeptidases/antagonists & inhibitors, Ethylenes/chemical synthesis, Amyloid Precursor Protein Secretases/chemistry, Aspartic Acid Endopeptidases/chemistry, Cathepsin D/antagonists & inhibitors, X-Ray Crystallography, Drug Design, Ethylenes/chemistry, Ethylenes/pharmacology, Humans, Hydrogen Bonding, Molecular Models, Stereoisomerism, Structure-Activity Relationship
ABSTRACT
We herein describe the design and synthesis of a series of BACE-1 inhibitors incorporating a P1-substituted hydroxyethylene transition state isostere. The synthetic route, starting from commercially available carbohydrates, yielded a pivotal lactone intermediate with excellent stereochemical control, which could subsequently be diversified at the P1 position. The final inhibitors were optimized using three different amines to provide the residues in the P2'-P3' position and three different acids affording the residues in the P2-P3 position. In addition, we report on the stereochemical preference of the P1'-methyl substituent in the synthesized inhibitors. All inhibitors were evaluated in an in vitro BACE-1 assay, where the most potent inhibitor, 34-(R), exhibited a BACE-1 IC50 value of 3.1 nM.
Subject(s)
Amyloid Precursor Protein Secretases/antagonists & inhibitors, Aspartic Acid Endopeptidases/antagonists & inhibitors, Enzyme Inhibitors/chemical synthesis, Ethylenes/chemistry, Cell Line, X-Ray Crystallography, Drug Design, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, Humans, Inhibitory Concentration 50, Molecular Structure, Stereoisomerism, Structure-Activity Relationship
ABSTRACT
Several BACE-1 inhibitors with low nanomolar activities, encompassing a statine-based core structure with phenyloxymethyl and benzyloxymethyl residues in the P1 position, are presented. The novel P1 modification, introduced to allow facile exploration of the S1 binding pocket of BACE-1, delivered highly promising inhibitors.