ABSTRACT
General approaches for designing sequence-specific peptide-binding proteins would have wide utility in proteomics and synthetic biology. However, designing peptide-binding proteins is challenging, as most peptides do not have defined structures in isolation, and hydrogen bonds must be made to the buried polar groups in the peptide backbone1-3. Here, inspired by natural and re-engineered protein-peptide systems4-11, we set out to design proteins made out of repeating units that bind peptides with repeating sequences, with a one-to-one correspondence between the repeat units of the protein and those of the peptide. We use geometric hashing to identify protein backbones and peptide-docking arrangements that are compatible with bidentate hydrogen bonds between the side chains of the protein and the peptide backbone12. The remainder of the protein sequence is then optimized for folding and peptide binding. We design repeat proteins to bind to six different tripeptide-repeat sequences in polyproline II conformations. The proteins are hyperstable and bind to four to six tandem repeats of their tripeptide targets with nanomolar to picomolar affinities in vitro and in living cells. Crystal structures reveal repeating interactions between protein and peptide as designed, including ladders of hydrogen bonds from protein side chains to peptide backbones. By redesigning the binding interfaces of individual repeat units, specificity can be achieved for non-repeating peptide sequences and for disordered regions of native proteins.
Subject(s)
Peptides , Protein Engineering , Proteins , Amino Acid Sequence , Models, Molecular , Peptides/chemistry , Peptides/metabolism , Proteins/chemistry , Proteins/metabolism , Protein Engineering/methods , Hydrogen Bonding , Protein Binding , Protein Folding , Protein Conformation
ABSTRACT
There has been considerable recent progress in designing new proteins using deep-learning methods1-9. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models10,11 have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.
Subject(s)
Deep Learning , Proteins , Catalytic Domain , Cryoelectron Microscopy , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Hemagglutinin Glycoproteins, Influenza Virus/ultrastructure , Protein Binding , Proteins/chemistry , Proteins/metabolism , Proteins/ultrastructure
ABSTRACT
The design of proteins that bind to a specific site on the surface of a target protein using no information other than the three-dimensional structure of the target remains a challenge1-5. Here we describe a general solution to this problem that starts with a broad exploration of the vast space of possible binding modes to a selected region of a protein surface, and then intensifies the search in the vicinity of the most promising binding modes. We demonstrate the broad applicability of this approach through the de novo design of binding proteins to 12 diverse protein targets with different shapes and surface properties. Biophysical characterization shows that the binders, which are all smaller than 65 amino acids, are hyperstable and, following experimental optimization, bind their targets with nanomolar to picomolar affinities. We succeeded in solving crystal structures of five of the binder-target complexes, and all five closely match the corresponding computational design models. Experimental data on nearly half a million computational designs and hundreds of thousands of point mutants provide detailed feedback on the strengths and limitations of the method and of our current understanding of protein-protein interactions, and should guide improvements of both. Our approach enables the targeted design of binders to sites of interest on a wide variety of proteins for therapeutic and diagnostic applications.
Subject(s)
Carrier Proteins , Proteins , Amino Acids/metabolism , Binding Sites , Carrier Proteins/metabolism , Protein Binding , Proteins/chemistry
ABSTRACT
Ordered two-dimensional arrays such as S-layers1,2 and designed analogues3-5 have intrigued bioengineers6,7, but with the exception of a single lattice formed with flexible linkers8, they are constituted from just one protein component. Materials composed of two components have considerable potential advantages for modulating assembly dynamics and incorporating more complex functionality9-12. Here we describe a computational method to generate co-assembling binary layers by designing rigid interfaces between pairs of dihedral protein building blocks, and use it to design a p6m lattice. The designed array components are soluble at millimolar concentrations, but when combined at nanomolar concentrations, they rapidly assemble into nearly crystalline micrometre-scale arrays nearly identical to the computational design model in vitro and in cells without the need for a two-dimensional support. Because the material is designed from the ground up, the components can be readily functionalized and their symmetry reconfigured, enabling formation of ligand arrays with distinguishable surfaces, which we demonstrate can drive extensive receptor clustering, downstream protein recruitment and signalling. Using atomic force microscopy on supported bilayers and quantitative microscopy on living cells, we show that arrays assembled on membranes have component stoichiometry and structure similar to arrays formed in vitro, and that our material can therefore impose order onto fundamentally disordered substrates such as cell membranes. In contrast to previously characterized cell surface receptor binding assemblies such as antibodies and nanocages, which are rapidly endocytosed, we find that large arrays assembled at the cell surface suppress endocytosis in a tunable manner, with potential therapeutic relevance for extending receptor engagement and immune evasion. Our work provides a foundation for a synthetic cell biology in which multi-protein macroscale materials are designed to modulate cell responses and reshape synthetic and living systems.
Subject(s)
Drug Design , Protein Engineering , Proteins/chemical synthesis , Proteins/metabolism , 3T3 Cells , Animals , Cell Biology , Cell Survival , Computational Biology , Endocytosis , Escherichia coli/genetics , Escherichia coli/metabolism , In Vitro Techniques , Kinetics , Ligands , Mice , Microscopy, Atomic Force , Models, Molecular , Synthetic Biology
ABSTRACT
Computationally designed protein nanoparticles have recently emerged as a promising platform for the development of new vaccines and biologics. For many applications, secretion of designed nanoparticles from eukaryotic cells would be advantageous, but in practice, they often secrete poorly. Here we show that designed hydrophobic interfaces that drive nanoparticle assembly are often predicted to form cryptic transmembrane domains, suggesting that interaction with the membrane insertion machinery could limit efficient secretion. We develop a general computational protocol, the Degreaser, to design away cryptic transmembrane domains without sacrificing protein stability. The retroactive application of the Degreaser to previously designed nanoparticle components and nanoparticles considerably improves secretion, and modular integration of the Degreaser into design pipelines results in new nanoparticles that secrete as robustly as naturally occurring protein assemblies. Both the Degreaser protocol and the nanoparticles we describe may be broadly useful in biotechnological applications.
Subject(s)
Nanoparticles , Vaccines , Proteins , Nanoparticles/chemistry
ABSTRACT
Protein crystallization plays a central role in structural biology. Despite this, the process of crystallization remains poorly understood and highly empirical, with crystal contacts, lattice packing arrangements and space group preferences being largely unpredictable. Programming protein crystallization through precisely engineered side-chain-side-chain interactions across protein-protein interfaces is an outstanding challenge. Here we develop a general computational approach for designing three-dimensional protein crystals with prespecified lattice architectures at atomic accuracy that hierarchically constrains the overall number of degrees of freedom of the system. We design three pairs of oligomers that can be individually purified, and upon mixing, spontaneously self-assemble into >100 µm three-dimensional crystals. The structures of these crystals are nearly identical to the computational design models, closely corresponding in both overall architecture and the specific protein-protein interactions. The dimensions of the crystal unit cell can be systematically redesigned while retaining the space group symmetry and overall architecture, and the crystals are extremely porous and highly stable. Our approach enables the computational design of protein crystals with high accuracy, and the designed protein crystals, which have both structural and assembly information encoded in their primary sequences, provide a powerful platform for biological materials engineering.
Subject(s)
Proteins , Proteins/chemistry , Crystallization
ABSTRACT
Computationally designed multi-subunit assemblies have shown considerable promise for a variety of applications, including a new generation of potent vaccines. One of the major routes to such materials is rigid body sequence-independent docking of cyclic oligomers into architectures with point group or lattice symmetries. Current methods for docking and designing such assemblies are tailored to specific classes of symmetry and are difficult to modify for novel applications. Here we describe RPXDock, a fast, flexible, and modular software package for sequence-independent rigid-body protein docking across a wide range of symmetric architectures that is easily customizable for further development. RPXDock uses an efficient hierarchical search and a residue-pair transform (RPX) scoring method to rapidly search through multidimensional docking space. We describe the structure of the software, provide practical guidelines for its use, and describe the available functionalities including a variety of score functions and filtering tools that can be used to guide and refine docking results towards desired configurations.
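The hierarchical search mentioned in this abstract can be illustrated with a one-dimensional toy. The sketch below is not RPXDock code and does not use the RPX score: it assumes a hypothetical scalar `score` function on an interval, evaluates it on a coarse grid, keeps only the best few cells, and halves the cell width at each level. The names `hierarchical_search`, `coarse_cells`, `keep` and `levels` are invented for illustration. The real search runs over rigid-body transforms, whereas this only shows the coarse-to-fine idea.

```python
from typing import Callable, Tuple


def hierarchical_search(
    score: Callable[[float], float],
    lo: float,
    hi: float,
    coarse_cells: int = 8,
    keep: int = 2,
    levels: int = 6,
) -> Tuple[float, float]:
    """Maximize `score` on [lo, hi] by refining only the best-scoring cells."""
    width = (hi - lo) / coarse_cells
    # Each cell is represented by its left edge; it is scored at its center.
    cells = [lo + i * width for i in range(coarse_cells)]
    for _ in range(levels):
        ranked = sorted(cells, key=lambda left: score(left + width / 2), reverse=True)
        survivors = ranked[:keep]          # discard low-scoring regions early
        width /= 2                         # halve the resolution step
        cells = [left + j * width for left in survivors for j in (0, 1)]
    best_left = max(cells, key=lambda left: score(left + width / 2))
    best_x = best_left + width / 2
    return best_x, score(best_x)


# Toy objective with a single optimum at x = 0.3 (purely illustrative).
x, s = hierarchical_search(lambda v: -(v - 0.3) ** 2, 0.0, 1.0)
```

With these defaults the toy evaluates a few dozen points instead of the 512 a uniform grid at the final resolution would need. That saving is the motivation for hierarchical search in a multidimensional docking space.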
Subject(s)
Algorithms , Nanostructures , Protein Conformation , Proteins/chemistry , Software , Protein Binding , Molecular Docking Simulation
ABSTRACT
The regular arrangements of β-strands around a central axis in β-barrels and of α-helices in coiled coils contrast with the irregular tertiary structures of most globular proteins, and have fascinated structural biologists since they were first discovered. Simple parametric models have been used to design a wide range of α-helical coiled-coil structures, but to date there has been no success with β-barrels. Here we show that accurate de novo design of β-barrels requires considerable symmetry-breaking to achieve continuous hydrogen-bond connectivity and eliminate backbone strain. We then build ensembles of β-barrel backbone models with cavity shapes that match the fluorogenic compound DFHBI, and use a hierarchical grid-based search method to simultaneously optimize the rigid-body placement of DFHBI in these cavities and the identities of the surrounding amino acids to achieve high shape and chemical complementarity. The designs have high structural accuracy and bind and fluorescently activate DFHBI in vitro and in Escherichia coli, yeast and mammalian cells. This de novo design of small-molecule binding activity, using backbones custom-built to bind the ligand, should enable the design of increasingly sophisticated ligand-binding proteins, sensors and catalysts that are not limited by the backbone geometries available in known protein structures.
Subject(s)
Benzyl Compounds/chemistry , Fluorescence , Imidazolines/chemistry , Proteins/chemistry , Animals , Benzyl Compounds/analysis , COS Cells , Chlorocebus aethiops , Escherichia coli , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Hydrogen Bonding , Imidazolines/analysis , Ligands , Protein Binding , Protein Domains , Protein Folding , Protein Stability , Protein Structure, Secondary , Reproducibility of Results , Yeasts
ABSTRACT
The dodecahedron [corrected] is the largest of the Platonic solids, and icosahedral protein structures are widely used in biological systems for packaging and transport. There has been considerable interest in repurposing such structures for applications ranging from targeted delivery to multivalent immunogen presentation. The ability to design proteins that self-assemble into precisely specified, highly ordered icosahedral structures would open the door to a new generation of protein containers with properties custom-tailored to specific applications. Here we describe the computational design of a 25-nanometre icosahedral nanocage that self-assembles from trimeric protein building blocks. The designed protein was produced in Escherichia coli, and found by electron microscopy to assemble into a homogenous population of icosahedral particles nearly identical to the design model. The particles are stable in 6.7 molar guanidine hydrochloride at up to 80 degrees Celsius, and undergo extremely abrupt, but reversible, disassembly between 2 molar and 2.25 molar guanidinium thiocyanate. The dodecahedron [corrected] is robust to genetic fusions: one or two copies of green fluorescent protein (GFP) can be fused to each of the 60 subunits to create highly fluorescent 'standard candles' for use in light microscopy, and a designed protein pentamer can be placed in the centre of each of the 20 pentameric faces to modulate the size of the entrance/exit channels of the cage. Such robust and customizable nanocages should have considerable utility in targeted drug delivery, vaccine design and synthetic biology.
Subject(s)
Drug Design , Protein Multimerization , Protein Subunits/chemistry , Computer Simulation , Cryoelectron Microscopy , Escherichia coli/metabolism , Green Fluorescent Proteins/chemistry , Green Fluorescent Proteins/genetics , Models, Molecular , Nanostructures/chemistry , Nanostructures/ultrastructure , Protein Stability/drug effects , Protein Subunits/genetics , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics
ABSTRACT
Computational protein design provides the tools to expand the diversity of protein complexes beyond those found in nature. Understanding the rules that drive proteins to interact with each other enables the design of protein-protein interactions to generate specific protein assemblies. In this work, we designed protein-protein interfaces between dimers and trimers to generate dodecameric protein assemblies with dihedral point group symmetry. We subsequently analyzed the designed protein complexes by native MS. We show that the use of ion mobility MS in combination with surface-induced dissociation (SID) allows for the rapid determination of the stoichiometry and topology of designed complexes. The information collected along with the speed of data acquisition and processing make SID ion mobility MS well-suited to determine key structural features of designed protein complexes, thereby circumventing the requirement for more time- and sample-consuming structural biology approaches.
Subject(s)
Mass Spectrometry/methods , Multiprotein Complexes/chemistry , Avidin/chemistry , Lactoglobulins/chemistry , Models, Molecular , Multiprotein Complexes/metabolism , Prealbumin/chemistry , Protein Engineering/methods , Protein Interaction Domains and Motifs , Recombinant Proteins/chemistry
ABSTRACT
The self-assembly of proteins into highly ordered nanoscale architectures is a hallmark of biological systems. The sophisticated functions of these molecular machines have inspired the development of methods to engineer self-assembling protein nanostructures; however, the design of multi-component protein nanomaterials with high accuracy remains an outstanding challenge. Here we report a computational method for designing protein nanomaterials in which multiple copies of two distinct subunits co-assemble into a specific architecture. We use the method to design five 24-subunit cage-like protein nanomaterials in two distinct symmetric architectures and experimentally demonstrate that their structures are in close agreement with the computational design models. The accuracy of the method and the number and variety of two-component materials that it makes accessible suggest a route to the construction of functional protein nanomaterials tailored to specific applications.
Subject(s)
Nanostructures/chemistry , Proteins/chemistry , Computer Simulation , Crystallography, X-Ray , Drug Design , Models, Molecular , Nanostructures/ultrastructure , Protein Subunits/chemistry , Proteins/ultrastructure
ABSTRACT
Motivation: Binding-induced conformational changes challenge current computational docking algorithms by exponentially increasing the conformational space to be explored. To restrict this search to relevant space, some computational docking algorithms exploit the inherent flexibility of the protein monomers to simulate conformational selection from pre-generated ensembles. As the ensemble size expands with increased flexibility, these methods struggle with efficiency and high false positive rates. Results: Here, we develop and benchmark RosettaDock 4.0, which efficiently samples large conformational ensembles of flexible proteins and docks them using a novel, six-dimensional, coarse-grained score function. A strong discriminative ability allows an eight-fold higher enrichment of near-native candidate structures in the coarse-grained phase compared to RosettaDock 3.2. It adaptively samples 100 conformations each of the ligand and the receptor backbone while increasing computational time by only 20-80%. In local docking of a benchmark set of 88 proteins of varying degrees of flexibility, the expected success rate (defined as cases with ≥50% chance of achieving 3 near-native structures in the 5 top-ranked ones) for blind predictions after resampling is 77% for rigid complexes, 49% for moderately flexible complexes and 31% for highly flexible complexes. These success rates on flexible complexes are a substantial step forward from all existing methods. Additionally, for highly flexible proteins, we demonstrate that when a suitable conformer generation method exists, the method successfully docks the complex. Availability and implementation: As a part of the Rosetta software suite, RosettaDock 4.0 is available at https://www.rosettacommons.org to all non-commercial users for free and to commercial users for a fee. Supplementary information: Supplementary data are available at Bioinformatics online.
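The success criterion quoted in this abstract (a case counts as a success when there is a ≥50% chance of finding 3 near-native structures among the 5 top-ranked ones) can be written out as a short calculation. The sketch below is an illustration under assumptions, not the RosettaDock benchmark code: it assumes each target comes with several resampled rankings, each given as a list of booleans marking near-native candidates in rank order. The function names are hypothetical.

```python
from typing import Dict, List, Sequence


def n5(ranked_is_near_native: Sequence[bool]) -> int:
    """Count near-native structures among the five top-ranked candidates."""
    return sum(ranked_is_near_native[:5])


def is_expected_success(
    resamples: List[Sequence[bool]],
    min_hits: int = 3,
    min_fraction: float = 0.5,
) -> bool:
    """True if at least `min_fraction` of resamples have n5 >= `min_hits`."""
    if not resamples:
        raise ValueError("need at least one resampled ranking")
    hits = sum(1 for ranking in resamples if n5(ranking) >= min_hits)
    return hits / len(resamples) >= min_fraction


def expected_success_rate(targets: Dict[str, List[Sequence[bool]]]) -> float:
    """Fraction of targets that satisfy the success criterion."""
    outcomes = [is_expected_success(resamples) for resamples in targets.values()]
    return sum(outcomes) / len(outcomes)


# Made-up data for two hypothetical targets, not results from the paper.
toy = {
    "target_a": [
        [True, True, True, False, False],
        [True, False, True, True, False],
        [False, False, False, False, False],
        [True, True, False, False, False],
    ],
    "target_b": [[True, False, False, False, False]] * 3,
}
rate = expected_success_rate(toy)
```

In this toy data, `target_a` reaches 3 or more near-native structures in two of its four resamples and so meets the 50% threshold. `target_b` never does, so the overall rate is 0.5.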
Subject(s)
Proteins/metabolism , Algorithms , Ligands , Protein Binding , Protein Conformation , Proteins/chemistry , Software
ABSTRACT
Metal-chelating heteroaryl small molecules have found widespread use as building blocks for coordination-driven, self-assembling nanostructures. The metal-chelating noncanonical amino acid (2,2'-bipyridin-5yl)alanine (Bpy-ala) could, in principle, be used to nucleate specific metalloprotein assemblies if introduced into proteins such that one assembly had much lower free energy than all alternatives. Here we describe the use of the Rosetta computational methodology to design a self-assembling homotrimeric protein with [Fe(Bpy-ala)3]2+ complexes at the interface between monomers. X-ray crystallographic analysis of the homotrimer showed that the design process had near-atomic-level accuracy: The all-atom rmsd between the design model and crystal structure for the residues at the protein interface is ~1.4 Å. These results demonstrate that computational protein design together with genetically encoded noncanonical amino acids can be used to drive formation of precisely specified metal-mediated protein assemblies that could find use in a wide range of photophysical applications.
Subject(s)
Metalloproteins/chemistry , Protein Engineering/methods , Pyridines/chemistry , Amino Acids/chemistry , Cloning, Molecular , Computational Biology/methods , Computer Simulation , Crystallography, X-Ray , Metals/chemistry , Models, Molecular , Protein Conformation , Protein Interaction Mapping , Protein Multimerization , Software
ABSTRACT
The design of inducibly assembling protein nanomaterials is an outstanding challenge. Here, we describe the computational design of a protein filament formed from a monomeric subunit which binds a peptide ligand. The cryoEM structure of the micron-scale fibers is very close to the computational design model. The ligand acts as a tunable allosteric modulator: while not part of the fiber subunit-subunit interfaces, the assembly of the filament is dependent on ligand addition, with longer peptides having more extensive interaction surfaces with the monomer promoting more rapid growth. Seeded growth and capping experiments reveal that the filaments grow primarily from one end. Oligomers containing 12 copies of the peptide ligand nucleate fiber assembly from monomeric subunit and peptide mixtures at concentrations where assembly occurs very slowly, likely by generating critical local concentrations of monomer in the assembly-competent conformation. Following filament assembly, the peptide ligand can be exchanged with free peptide in solution, and it can be readily fused to any functional protein of interest, opening the door to a wide variety of tunable engineered materials.
ABSTRACT
Biological evolution has led to precise and dynamic nanostructures that reconfigure in response to pH and other environmental conditions. However, designing micrometre-scale protein nanostructures that are environmentally responsive remains a challenge. Here we describe the de novo design of pH-responsive protein filaments built from subunits containing six or nine buried histidine residues that assemble into micrometre-scale, well-ordered fibres at neutral pH. The cryogenic electron microscopy structure of an optimized design is nearly identical to the computational design model for both the subunit internal geometry and the subunit packing into the fibre. Electron, fluorescent and atomic force microscopy characterization reveal a sharp and reversible transition from assembled to disassembled fibres over 0.3 pH units, and rapid fibre disassembly in less than 1 s following a drop in pH. The midpoint of the transition can be tuned by modulating buried histidine-containing hydrogen bond networks. Computational protein design thus provides a route to creating unbound nanomaterials that rapidly respond to small pH changes.
Subject(s)
Histidine , Hydrogen-Ion Concentration , Histidine/chemistry , Proteins/chemistry , Nanostructures/chemistry , Models, Molecular , Hydrogen Bonding , Cryoelectron Microscopy
ABSTRACT
We describe a modular bond-centric approach to protein nanomaterial design inspired by the rich diversity of chemical structures that can be generated from the small number of atomic valencies and bonding interactions. We design protein building blocks with regular coordination geometries and bonding interactions that enable the assembly of a wide variety of closed and opened nanomaterials using simple geometrical principles. Experimental characterization confirms successful formation of more than twenty multi-component polyhedral protein cages, 2D arrays, and 3D protein lattices, with a high (10-50 %) success rate and electron microscopy data closely matching the corresponding design models. Because of the modularity, individual building blocks can assemble with different partners to generate distinct regular assemblies, resulting in an economy of parts and enabling the construction of reconfigurable systems.
ABSTRACT
Photosystem I is a highly efficient and potent light-induced reductase that is considered to be an appealing target for integration into hybrid solar fuel production systems. However, rapid transport of multiple electrons from the reducing end of photosystem I to downstream processes in vivo is limited by the diffusion of its native redox partner ferredoxin that is a single electron carrier. Here, we describe the design and construction of a faster electron transfer interface based on anchoring ferredoxin to the reducing end of photosystem I thereby confining the diffusion space of ferredoxin to the near vicinity of its photosystem I binding and reduction site. This was achieved by fusing ferredoxin to the PsaE subunit of photosystem I by a flexible peptide linker and reconstituting PSI in vitro with the new fusion protein. A computational algorithm was developed in order to determine the optimal linker length that will confine ferredoxin to the vicinity of photosystem I's reducing end without restricting the formation of electron transfer complexes. According to the calculation, we reconstituted photosystem I with three fusion proteins comprising PsaE and ferredoxin separated by linkers of different lengths, namely 14, 19, and 25 amino acids, and tested their effect on electron transfer rates from photosystem I to downstream processes. Indeed, we found a significant enhancement of light dependent NADPH synthesis using photosystems containing the PsaE-ferredoxin fusion proteins, equivalent to a ten-fold increase in soluble ferredoxin concentration. We propose that such a system could be used for other ferredoxin dependent redox reactions, such as the enzymatic production of hydrogen, a promising alternative fuel. As the system is comprised entirely of natural amino acids and biological cofactors, it could be integrated into the energy conversion apparatus of photosynthetic organisms by genetic engineering.
Subject(s)
Ferredoxins/metabolism , Photosystem I Protein Complex/genetics , Photosystem I Protein Complex/metabolism , Protein Engineering/methods , Electron Transport , Kinetics , Models, Molecular , NADP/metabolism , Photosystem I Protein Complex/chemistry , Protein Conformation , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Synechococcus/genetics
ABSTRACT
Despite remarkable advances in the assembly of highly structured coordination polymers and metal-organic frameworks, the rational design of such materials using more conformationally flexible organic ligands such as peptides remains challenging. In an effort to make the design of such materials fully programmable, we first developed a computational design method for generating metal-mediated 3D frameworks using rigid and symmetric peptide macrocycles with metal-coordinating sidechains. We solved the structures of six crystalline networks involving conformationally constrained 6 to 12 residue cyclic peptides with C2, C3, and S2 internal symmetry and three different types of metals (Zn2+, Co2+, or Cu2+) by single-crystal X-ray diffraction, which reveals how the peptide sequences, backbone symmetries, and metal coordination preferences drive the assembly of the resulting structures. In contrast to smaller ligands, these peptides associate through peptide-peptide interactions without full coordination of the metals, contrary to one of the assumptions underlying our computational design method. The cyclic peptides are the largest peptidic ligands reported to form crystalline coordination polymers with transition metals to date, and while more work is required to develop methods for fully programming their crystal structures, the combination of high chemical diversity with synthetic accessibility makes them attractive building blocks for engineering a broader set of new crystalline materials for use in applications such as sensing, asymmetric catalysis, and chiral separation.