ABSTRACT
De novo enzyme design has sought to introduce active sites and substrate-binding pockets that are predicted to catalyse a reaction of interest into geometrically compatible native scaffolds [1,2], but has been limited by a lack of suitable protein structures and the complexity of native protein sequence-structure relationships. Here we describe a deep-learning-based 'family-wide hallucination' approach that generates large numbers of idealized protein structures containing diverse pocket shapes and designed sequences that encode them. We use these scaffolds to design artificial luciferases that selectively catalyse the oxidative chemiluminescence of the synthetic luciferin substrates diphenylterazine [3] and 2-deoxycoelenterazine. The designed active sites position an arginine guanidinium group adjacent to an anion that develops during the reaction in a binding pocket with high shape complementarity. For both luciferin substrates, we obtain designed luciferases with high selectivity; the most active of these is a small (13.9 kDa) and thermostable (with a melting temperature higher than 95 °C) enzyme that has a catalytic efficiency on diphenylterazine (kcat/Km = 10^6 M^-1 s^-1) comparable to that of native luciferases, but a much higher substrate specificity. The creation of highly active and specific biocatalysts from scratch with broad applications in biomedicine is a key milestone for computational enzyme design, and our approach should enable generation of a wide range of luciferases and other enzymes.
Subject(s)
Deep Learning, Luciferases, Biocatalysis, Catalytic Domain, Enzyme Stability, Hot Temperature, Luciferases/chemistry, Luciferases/metabolism, Luciferins/metabolism, Luminescence, Oxidation-Reduction, Substrate Specificity
ABSTRACT
Natural proteins are highly optimized for function but are often difficult to produce at a scale suitable for biotechnological applications due to poor expression in heterologous systems, limited solubility, and sensitivity to temperature. Thus, a general method that improves the physical properties of native proteins while maintaining function could have wide utility for protein-based technologies. Here, we show that the deep neural network ProteinMPNN, together with evolutionary and structural information, provides a route to increasing protein expression, stability, and function. For both myoglobin and tobacco etch virus (TEV) protease, we generated designs with improved expression, elevated melting temperatures, and improved function. For TEV protease, we identified multiple designs with improved catalytic activity as compared to the parent sequence and previously reported TEV variants. Our approach should be broadly useful for improving the expression, stability, and function of biotechnologically important proteins.
Subject(s)
Endopeptidases, Temperature, Endopeptidases/metabolism, Recombinant Fusion Proteins
ABSTRACT
Although much is known about protein folding in buffers, it remains unclear how the cellular protein homeostasis network functions as a system to partition client proteins between folded and functional, soluble and misfolded, and aggregated conformations. Herein, we develop small molecule folding probes that specifically react with the folded and functional fraction of the protein of interest, enabling fluorescence-based quantification of this fraction in cell lysate at a time point of interest. Importantly, these probes minimally perturb a protein's folding equilibria within cells during and after cell lysis, because sufficient cellular chaperone/chaperonin holdase activity is created by rapid ATP depletion during cell lysis. The folding probe strategy and the faithful quantification of a particular protein's functional fraction are exemplified with retroaldolase, a de novo designed enzyme, and transthyretin, a nonenzyme protein. Our findings challenge the often invoked assumption that the soluble fraction of a client protein is fully folded in the cell. Moreover, our results reveal that the partitioning of destabilized retroaldolase and transthyretin mutants between the aforementioned conformational states is strongly influenced by cytosolic proteostasis network perturbations. Overall, our results suggest that applying a chemical folding probe strategy to other client proteins offers opportunities to reveal how the proteostasis network functions as a system to regulate the folding and function of individual client proteins in vivo.
Subject(s)
Escherichia coli Proteins/metabolism, Fluorescent Dyes, Protein Folding, Adenosine Triphosphate/metabolism, Escherichia coli/metabolism
ABSTRACT
Enzyme-based tags attached to a protein-of-interest (POI) that react with a small molecule, rendering the conjugate fluorescent, are very useful for studying the POI in living cells. These tags are typically based on endogenous enzymes, so protein engineering is required to ensure that the small-molecule probe does not react with the endogenous enzyme in the cell of interest. Here we demonstrate that de novo-designed enzymes can be used as tags to attach to POIs. The inherent bioorthogonality of the de novo-designed enzyme-small-molecule probe reaction circumvents the need for protein engineering, since these enzyme activities are not present in living organisms. Herein, we transform a family of de novo-designed retroaldolases into variable-molecular-weight tags exhibiting fluorescence imaging, reporter, and electrophoresis applications that are regulated by tailored, reactive small-molecule fluorophores.
Subject(s)
Aldehyde-Lyases/chemistry, Fluorescent Dyes/chemistry, Optical Imaging, HEK293 Cells, HeLa Cells, Humans, Molecular Models, Molecular Probes/chemistry, Protein Engineering
ABSTRACT
The ability to redesign enzymes to catalyze noncognate chemical transformations would have wide-ranging applications. We developed a computational method for repurposing the reactivity of metalloenzyme active site functional groups to catalyze new reactions. Using this method, we engineered a zinc-containing mouse adenosine deaminase to catalyze the hydrolysis of a model organophosphate with a catalytic efficiency (kcat/Km) of ~10^4 M^-1 s^-1 after directed evolution. In the high-resolution crystal structure of the enzyme, all but one of the designed residues adopt the designed conformation. The designed enzyme efficiently catalyzes the hydrolysis of the R_P isomer of a coumarinyl analog of the nerve agent cyclosarin, and it shows marked substrate selectivity for coumarinyl leaving groups. Computational redesign of native enzyme active sites complements directed evolution methods and offers a general approach for exploring their untapped catalytic potential for new reactivities.
Subject(s)
Adenosine Deaminase/metabolism, Computer Simulation, Computer-Aided Design, Metalloproteins/metabolism, Organophosphorus Compounds/metabolism, Zinc/chemistry, Adenosine Deaminase/chemistry, Animals, Biocatalysis, Catalytic Domain, Computational Biology, Hydrolysis, Metalloproteins/chemistry, Mice, Molecular Models, Molecular Conformation, Organophosphorus Compounds/chemistry, Zinc/metabolism
ABSTRACT
Modeling the conformational heterogeneity of protein-small molecule systems is an outstanding challenge. We reasoned that while residue level descriptions of biomolecules are efficient for de novo structure prediction, for probing heterogeneity of interactions with small molecules in the folded state an entirely atomic level description could have advantages in speed and generality. We developed a graph neural network called ChemNet trained to recapitulate correct atomic positions from partially corrupted input structures from the Cambridge Structural Database and the Protein Data Bank; the nodes of the graph are the atoms in the system. ChemNet accurately generates structures of diverse organic small molecules given knowledge of their atom composition and bonding, and given a description of the larger protein context, and builds up structures of small molecules and protein side chains for protein-small molecule docking. Because ChemNet is rapid and stochastic, ensembles of predictions can be readily generated to map conformational heterogeneity. In enzyme design efforts described here and elsewhere, we find that using ChemNet to assess the accuracy and pre-organization of the designed active sites results in higher success rates and higher activities; we obtain a preorganized retroaldolase with a kcat/KM of 11,000 M^-1 min^-1, considerably higher than any pre-deep learning design for this reaction. We anticipate that ChemNet will be widely useful for rapidly generating conformational ensembles of small molecule and small molecule-protein systems, and for designing higher activity preorganized enzymes.
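Catalytic efficiencies in this listing are quoted in different time units, which makes direct comparison awkward. As a minimal sketch, the retroaldolase value reported above (11,000 M^-1 min^-1) can be converted to M^-1 s^-1 as follows; the function name is illustrative and does not come from the paper.

```python
def per_min_to_per_s(kcat_over_km_per_min: float) -> float:
    """Convert a catalytic efficiency from M^-1 min^-1 to M^-1 s^-1."""
    return kcat_over_km_per_min / 60.0


# Value quoted in the abstract above, in M^-1 min^-1.
chemnet_retroaldolase = 11000.0
print(round(per_min_to_per_s(chemnet_retroaldolase), 1))  # 183.3
```

On this scale the value is roughly 183 M^-1 s^-1.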
ABSTRACT
The RGD (Arg-Gly-Asp)-binding integrins αvβ6 and αvβ8 are clinically validated cancer and fibrosis targets of considerable therapeutic importance. Compounds that can discriminate between the two closely related integrin proteins and other RGD integrins, stabilize specific conformational states, and have sufficient stability enabling tissue restricted administration could have considerable therapeutic utility. Existing small molecules and antibody inhibitors do not have all of these properties, and hence there is a need for new approaches. Here we describe a method for computationally designing hyperstable RGD-containing miniproteins that are highly selective for a single RGD integrin heterodimer and conformational state, and use this strategy to design inhibitors of αvβ6 and αvβ8 with high selectivity. The αvβ6 and αvβ8 inhibitors have picomolar affinities for their targets, and >1000-fold selectivity over other RGD integrins. CryoEM structures are within 0.6-0.7 Å root-mean-square deviation (RMSD) to the computational design models; the designed αvβ6 inhibitor and native ligand stabilize the open conformation in contrast to the therapeutic anti-αvβ6 antibody BG00011 that stabilizes the bent-closed conformation and caused on-target toxicity in patients with lung fibrosis, and the αvβ8 inhibitor maintains the constitutively fixed extended-closed αvβ8 conformation. In a mouse model of bleomycin-induced lung fibrosis, the αvβ6 inhibitor potently reduced fibrotic burden and improved overall lung mechanics when delivered via oropharyngeal administration mimicking inhalation, demonstrating the therapeutic potential of de novo designed integrin binding proteins with high selectivity.
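The RMSD figure quoted above measures how far matched atoms in two structures sit from each other on average. The sketch below shows the basic calculation under two stated assumptions: the structures are already superimposed, and atoms are listed in the same order. The coordinates are toy values, not data from the study, and the abstract does not say which atoms were compared.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed and that atoms are
    listed in the same order; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (hypothetical): every atom is shifted 0.6 Å in y.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
observed = [(0.0, 0.6, 0.0), (1.5, 0.6, 0.0), (3.0, 0.6, 0.0)]
print(rmsd(model, observed))
```

In practice an optimal superposition step would normally precede this calculation; it is omitted here to keep the example short.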
ABSTRACT
The RGD (Arg-Gly-Asp)-binding integrins αvβ6 and αvβ8 are clinically validated cancer and fibrosis targets of considerable therapeutic importance. Compounds that can discriminate between homologous αvβ6 and αvβ8 and other RGD integrins, stabilize specific conformational states, and have high thermal stability could have considerable therapeutic utility. Existing small molecule and antibody inhibitors do not have all these properties, and hence new approaches are needed. Here we describe a generalized method for computationally designing RGD-containing miniproteins selective for a single RGD integrin heterodimer and conformational state. We design hyperstable, selective αvβ6 and αvβ8 inhibitors that bind with picomolar affinity. CryoEM structures of the designed inhibitor-integrin complexes are very close to the computational design models, and show that the inhibitors stabilize specific conformational states of the αvβ6 and the αvβ8 integrins. In a lung fibrosis mouse model, the αvβ6 inhibitor potently reduced fibrotic burden and improved overall lung mechanics, demonstrating the therapeutic potential of de novo designed integrin binding proteins with high selectivity.
Subject(s)
Integrins, Pulmonary Fibrosis, Animals, Mice, Cell Membrane, Cryoelectron Microscopy, Animal Disease Models
ABSTRACT
While native scaffolds offer a large diversity of shapes and topologies for enzyme engineering, their often unpredictable behavior in response to sequence modification makes de novo generated scaffolds an exciting alternative. Here we explore the customization of the backbone and sequence of a de novo designed eight-stranded β-barrel protein to create catalysts for a retro-aldolase model reaction. We show that active and specific catalysts can be designed in this fold and use directed evolution to further optimize activity and stereoselectivity. Our results support previous suggestions that different folds have different inherent amenability to evolution and this property could account, in part, for the distribution of natural enzymes among different folds.
Subject(s)
Protein Engineering, Proteins, Proteins/genetics, Protein Engineering/methods
ABSTRACT
Computational design of new active sites has generally proceeded by geometrically defining interactions between the reaction transition state(s) and surrounding side-chain functional groups which maximize transition-state stabilization, and then searching for sites in protein scaffolds where the specified side-chain-transition-state interactions can be realized. A limitation of this approach is that the interactions between the side chains themselves are not constrained. An extensive connected hydrogen bond network involving the catalytic residues was observed in a designed retroaldolase following directed evolution. Such connected networks could increase catalytic activity by preorganizing active site residues in catalytically competent orientations, and enabling concerted interactions between side chains during catalysis, for example, proton shuffling. We developed a method for designing active sites in which the catalytic side chains, in addition to making interactions with the transition state, are also involved in extensive hydrogen bond networks. Because of the added constraint of hydrogen-bond connectivity between the catalytic side chains, to find solutions, a wider range of interactions between these side chains and the transition state must be considered. Our new method starts from a ChemDraw-like two-dimensional representation of the transition state with hydrogen-bond donors, acceptors, and covalent interaction sites indicated. It then systematically enumerates all placements of side-chain functional groups that make the indicated interactions with the transition state and are fully connected in a single hydrogen-bond network. The RosettaMatch method can then be used to identify realizations of these fully-connected active sites in protein scaffolds. The method generates many fully-connected active site solutions for a set of model reactions that are promising starting points for the design of fully-preorganized enzyme catalysts.
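The connectivity constraint described above can be pictured as a graph test: side chains are nodes, hydrogen bonds are edges, and a candidate active site passes only if all side chains fall in one connected component. The sketch below illustrates that test alone. It is not the authors' enumeration method or the RosettaMatch API; the residue labels are hypothetical, and reading "fully connected" as "one connected component" is an assumption.

```python
from collections import deque


def is_single_network(groups, hbonds):
    """Return True if all groups are linked, directly or indirectly, by hydrogen bonds.

    groups: iterable of hashable labels (hypothetical side-chain names).
    hbonds: iterable of (donor, acceptor) pairs, treated as undirected edges;
            both members of each pair must appear in `groups`.
    """
    groups = list(groups)
    if not groups:
        return False
    neighbors = {g: set() for g in groups}
    for donor, acceptor in hbonds:
        neighbors[donor].add(acceptor)
        neighbors[acceptor].add(donor)
    # Breadth-first search from an arbitrary starting group.
    seen = {groups[0]}
    queue = deque([groups[0]])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(groups)


side_chains = ["Lys", "Tyr", "Asp"]
connected = [("Tyr", "Lys"), ("Asp", "Tyr")]
split = [("Tyr", "Lys")]  # Asp makes no hydrogen bond to the others
print(is_single_network(side_chains, connected))  # True
print(is_single_network(side_chains, split))      # False
```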
Subject(s)
Computer Neural Networks, Proteins/metabolism, Binding Sites, Biocatalysis, Protein Databases, Hydrogen Bonding, Molecular Models, Proteins/chemistry
ABSTRACT
We describe the design of an optical switch in the chaperonin GroEL that is opened and closed by its ATP- and cochaperonin GroES-driven conformational changes. The switch, based on a fluorophore and a quencher, is engineered into the single-ring variant of the chaperone, and shows dramatic modulation of its fluorescent intensity in response to the transition of the protein between its allosteric states. It, therefore, forms a sensitive probe for the dynamics of the allosteric transitions of this machine, both in the bulk and in single molecules.
Subject(s)
Bacterial Proteins/chemistry, Bacterial Proteins/metabolism, Chaperonin 60/chemistry, Chaperonin 60/metabolism, Escherichia coli/metabolism, Fluorescent Dyes/metabolism, Optics and Photonics, Fluorescence, Molecular Models, Protein Conformation
ABSTRACT
Designed retroaldolases have utilized a nucleophilic lysine to promote carbon-carbon bond cleavage of β-hydroxy-ketones via a covalent Schiff base intermediate. Previous computational designs have incorporated a water molecule to facilitate formation and breakdown of the carbinolamine intermediate to give the Schiff base and to function as a general acid/base. Here we investigate an alternative active-site design in which the catalytic water molecule was replaced by the side chain of a glutamic acid. Five out of seven designs expressed solubly and exhibited catalytic efficiencies similar to previously designed retroaldolases for the conversion of 4-hydroxy-4-(6-methoxy-2-naphthyl)-2-butanone to 6-methoxy-2-naphthaldehyde and acetone. After one round of site-directed saturation mutagenesis, improved variants of the two best designs, RA114 and RA117, exhibited among the highest kcat (>10^-3 s^-1) and kcat/KM (11-25 M^-1 s^-1) values observed for retroaldolase designs prior to comprehensive directed evolution. In both cases, the >10^5-fold rate accelerations that were achieved are within 1-3 orders of magnitude of the rate enhancements reported for the best catalysts for related reactions, including catalytic antibodies (kcat/kuncat = 10^6 to 10^8) and an extensively evolved computational design (kcat/kuncat > 10^7). The catalytic sites, revealed by X-ray structures of optimized versions of the two active designs, are in close agreement with the design models except for the catalytic lysine in RA114. We further improved the variants by computational remodeling of the loops and yeast display selection for reactivity of the catalytic lysine with a diketone probe, obtaining an additional order of magnitude enhancement in activity with both approaches.
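The "within 1-3 orders of magnitude" comparison above follows from the quoted rate accelerations. As a small worked check, the sketch below takes the lower bound for the designs (10^5) and the range quoted for catalytic antibodies (10^6 to 10^8) and computes the base-10 gap. The function name is illustrative, and treating the ">10^5" bound as exactly 10^5 is a simplifying assumption.

```python
import math


def orders_of_magnitude_gap(reference: float, value: float) -> float:
    """Base-10 log of reference/value: the number of orders of magnitude between them."""
    return math.log10(reference / value)


design_acceleration = 1e5    # lower bound quoted for the RA114/RA117 variants
antibody_range = (1e6, 1e8)  # kcat/kuncat range quoted for catalytic antibodies

gaps = [orders_of_magnitude_gap(ref, design_acceleration) for ref in antibody_range]
print(gaps)
```

The two gaps come out at about 1 and 3, matching the range stated in the abstract.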
Subject(s)
Fructose-Bisphosphate Aldolase/chemistry, Fructose-Bisphosphate Aldolase/metabolism, Protein Engineering, Acetone/metabolism, Aldehydes/metabolism, Butanones/metabolism, Catalytic Domain, X-Ray Crystallography, DNA Mutational Analysis, Fructose-Bisphosphate Aldolase/genetics, Gene Expression, Kinetics, Molecular Models, Nabumetone, Naphthalenes/metabolism, Protein Conformation, Recombinant Proteins/chemistry, Recombinant Proteins/genetics, Recombinant Proteins/metabolism
ABSTRACT
There has been recent success in designing enzymes for simple chemical reactions using a two-step protocol. In the first step, a geometric matching algorithm is used to identify naturally occurring protein scaffolds at which predefined idealized active sites can be realized. In the second step, the residues surrounding the transition state model are optimized to increase transition state binding affinity and to bolster the primary catalytic side chains. To improve the design methodology, we investigated how the set of solutions identified by the design calculations relates to the overall set of solutions for two different chemical reactions. Using a TIM barrel scaffold in which catalytically active Kemp eliminase and retroaldolase designs were obtained previously, we carried out activity screens of random libraries made to be compositionally similar to active designs. A small number of active catalysts were found in screens of 10³ variants for each of the two reactions, which differ from the computational designs in that they reuse charged residues already present in the native scaffold. The results suggest that computational design considerably increases the frequency of catalyst generation for active sites involving newly introduced catalytic residues, highlighting the importance of interaction cooperativity in enzyme active sites.
Subject(s)
Aldehyde-Lyases/chemistry, Computational Biology/methods, Lyases/chemistry, Protein Engineering/methods, Aldehyde-Lyases/genetics, Aldehyde-Lyases/metabolism, Algorithms, Amino Acid Sequence, Biocatalysis, Catalytic Domain, Lyases/genetics, Lyases/metabolism, Molecular Models, Molecular Sequence Data, Mutation, Protein Conformation, Solubility
ABSTRACT
In nature, the evolution of new protein functions is driven not only by side-chain substitutions (point mutations), but also by backbone modifications (insertions and deletions). The current laboratory diversification methods, however, are largely limited to point mutations. Of particular interest are short insertions-by-duplication that are frequent in nature but cannot be introduced in vitro in a library format (i.e. in random locations and lengths). Here, we describe a new procedure that allows the generation of tandem repeats of random fragments of the target gene via rolling-circle amplification, and the concurrent incorporation of these repeats into the target gene. This procedure, dubbed tandem repeat insertion, or TRINS, results in a library of genes carrying insertions-by-duplication of variable lengths (3-150 bp) at random positions. This diversification pattern allows sampling of sequence space regions that are not readily accessible by other protocols. We demonstrate this method by constructing three different gene libraries, and by selecting insertion variants of TEM-1 β-lactamase.
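To make the resulting sequence pattern concrete, the sketch below builds an in silico library of insertions-by-duplication. It models only the outcome described above (a fragment repeated in tandem at a random position, 3-150 bp long), not the rolling-circle amplification protocol itself. The function names, uniform sampling, and the choice of one extra copy per variant are assumptions for illustration.

```python
import random


def insert_tandem_duplication(seq: str, start: int, length: int) -> str:
    """Duplicate seq[start:start+length] in tandem, right after the original copy."""
    if length <= 0 or start < 0 or start + length > len(seq):
        raise ValueError("fragment must lie within the sequence")
    end = start + length
    return seq[:end] + seq[start:end] + seq[end:]


def random_duplication_library(seq, size, min_len=3, max_len=150, seed=0):
    """Build `size` variants, each with one duplication of random position and length."""
    if len(seq) < min_len:
        raise ValueError("sequence is shorter than the minimum fragment length")
    rng = random.Random(seed)  # seeded so the toy library is reproducible
    longest = min(max_len, len(seq))
    variants = []
    for _ in range(size):
        length = rng.randint(min_len, longest)
        start = rng.randint(0, len(seq) - length)
        variants.append(insert_tandem_duplication(seq, start, length))
    return variants


# Duplicating "GCC" (positions 2-4) yields two adjacent copies of the fragment.
print(insert_tandem_duplication("ATGCCGTA", 2, 3))  # ATGCCGCCGTA
```

Each variant is longer than the parent by exactly the length of the duplicated fragment, which is one simple way to check a simulated library.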
Subject(s)
DNA/genetics, Directed Molecular Evolution/methods, Gene Library, Insertional Mutagenesis/methods, Tandem Repeat Sequences, beta-Lactamases/genetics, Molecular Cloning/methods, Circular DNA/genetics, Humans, Polymerase Chain Reaction
ABSTRACT
The chaperonin GroEL assists protein folding by undergoing ATP-induced conformational changes that are concerted within each of its two back-to-back stacked rings. Here we examined whether concerted allosteric switching gives rise to all-or-none release and folding of domains in a chimeric fluorescent protein substrate, CyPet-YPet. Using this substrate, it was possible to determine the folding yield of each domain from its intrinsic fluorescence and that of the entire chimera by measuring Förster resonance energy transfer between the two domains. Hence, it was possible to determine whether release of one domain is accompanied by release of the other domain (concerted mechanism), or whether their release is not coupled. Our results show that the chimera's release tends to be concerted when folding is assisted by a wild-type GroEL variant, but not when assisted by the F44W/D155A mutant that undergoes a sequential allosteric switch. A connection between the allosteric mechanism of this molecular machine and its biological function in assisting folding is thus established.
Subject(s)
Adenosine Triphosphate/metabolism, Chaperonin 60/chemistry, Chaperonin 60/metabolism, Escherichia coli Proteins/chemistry, Escherichia coli Proteins/metabolism, Adenosine Triphosphate/chemistry, Allosteric Regulation, Chaperonin 60/genetics, Escherichia coli Proteins/genetics, Fluorescence Resonance Energy Transfer, Fluorescent Dyes/chemistry, Fluorescent Dyes/metabolism, Protein Folding, Tertiary Protein Structure, Recombinant Fusion Proteins/chemistry, Recombinant Fusion Proteins/genetics, Recombinant Fusion Proteins/metabolism
ABSTRACT
The double-ring chaperonin GroEL mediates protein folding, in conjunction with its helper protein GroES, by undergoing ATP-induced conformational changes that are concerted within each heptameric ring. Here we have examined whether the concerted nature of these transitions is responsible for protein substrate release in an all-or-none manner. Two chimeric substrates were designed, each with two different reporter activities that were recovered after denaturation in GroES-dependent and independent fashions, respectively. The refolding of the chimeras was monitored in the presence of GroEL variants that undergo ATP-induced intraring conformational changes that are either sequential (F44W/D155A) or concerted (F44W). Our results show that release of a protein substrate from GroEL in a domain-by-domain fashion is favored when the intraring allosteric transitions of GroEL are sequential and not concerted.