ABSTRACT
A challenge for the design of protein-small-molecule recognition is that incorporation of cavities with size, shape, and composition suitable for specific recognition can considerably destabilize protein monomers. This challenge can be overcome through binding pockets formed at homo-oligomeric interfaces between folded monomers. Interfaces surrounding the central homo-oligomer symmetry axes necessarily have the same symmetry and so may not be well suited to binding asymmetric molecules. To enable general recognition of arbitrary asymmetric substrates and small molecules, we developed an approach to designing asymmetric interfaces at off-axis sites on homo-oligomers, analogous to those found in native homo-oligomeric proteins such as glutamine synthetase. We symmetrically dock curved helical repeat proteins such that they form pockets at the asymmetric interface of the oligomer, with sizes ranging from several angstroms, appropriate for binding a single ion, to more than 20 Å across. Of the 133 proteins tested, 84 had soluble expression in E. coli, 47 had correct oligomeric states in solution, 35 had small-angle X-ray scattering (SAXS) data largely consistent with design models, and 8 had negative-stain electron microscopy (nsEM) 2D class averages showing the structures coming together as designed. Both an X-ray crystal structure and a cryogenic electron microscopy (cryoEM) structure are close to the computational design models. Because these proteins are homo-oligomers, they can be readily built into higher-order structures such as nanocages, and their asymmetric pockets open rich possibilities for small-molecule binder design free from the constraints associated with monomer destabilization.
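The screening counts reported above can be read as a funnel. The following is a minimal sketch of that arithmetic, assuming each count is expressed as a fraction of the 133 proteins tested. The abstract does not state whether each later group is a strict subset of the earlier one, so no stage-to-stage rates are computed. The names `TESTED`, `stages`, and `fraction_of_tested` are illustrative and do not come from the source.

```python
# Counts taken directly from the abstract; everything else is illustrative.
TESTED = 133
stages = {
    "soluble expression in E. coli": 84,
    "correct oligomeric state in solution": 47,
    "SAXS largely consistent with design model": 35,
    "nsEM 2D class averages matching design": 8,
}


def fraction_of_tested(count: int, total: int = TESTED) -> float:
    """Return count as a percentage of all tested designs, to one decimal."""
    return round(100 * count / total, 1)


# Percentage of the 133 tested designs that passed each check.
rates = {name: fraction_of_tested(n) for name, n in stages.items()}

for name, pct in rates.items():
    print(f"{name}: {stages[name]}/{TESTED} = {pct}%")
```

On these counts, roughly 63% of tested designs expressed solubly, about 35% had the correct oligomeric state, about 26% had consistent SAXS data, and about 6% were supported by nsEM class averages.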
Subject(s)
Proteins; Escherichia coli/genetics; Glutamate-Ammonia Ligase; Proteins/chemistry; Scattering, Small Angle; X-Ray Diffraction
ABSTRACT
Understanding how proteins evolve under selective pressure is a longstanding challenge. The immensity of the search space has limited efforts to systematically evaluate the impact of multiple simultaneous mutations, so mutations have typically been assessed individually. However, epistasis, the way in which mutations interact, prevents accurate prediction of the effects of combinatorial mutations from measurements of individual mutations. Here, we use artificial intelligence to define the entire functional sequence landscape of a protein binding site in silico, an approach we call Complete Combinatorial Mutational Enumeration (CCME). By leveraging CCME, we are able to construct a comprehensive map of the evolutionary connectivity within this functional sequence landscape. As a proof of concept, we applied CCME to the ACE2 binding site of the SARS-CoV-2 spike protein receptor binding domain. We selected representative variants from across the functional sequence landscape for testing in the laboratory. We identified variants that retained the ability to bind ACE2 despite changes at over 40% of the evaluated residue positions, and these variants escape binding and neutralization by monoclonal antibodies. This work represents a crucial first step toward precise predictions of pathogen evolution, opening avenues for proactive mitigation.
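The idea of enumerating every combination of mutations and then mapping evolutionary connectivity can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' method: the alphabet, the three-site "binding site", and the `is_functional` rule are all hypothetical stand-ins for the AI predictor described in the abstract, and connectivity is assumed here to mean that two functional sequences differ by a single substitution.

```python
from collections import deque
from itertools import product

# Toy setup (assumptions): 4-letter alphabet, 3 sites, arbitrary wild type.
ALPHABET = "AVKE"
N_SITES = 3
WILD_TYPE = "AKE"


def is_functional(seq: str) -> bool:
    """Hypothetical stand-in for a learned binding predictor.

    Toy rule, not the paper's model: a variant counts as functional
    if it keeps at least one charged residue (K or E).
    """
    return any(res in "KE" for res in seq)


def enumerate_functional(alphabet: str, n_sites: int) -> set[str]:
    """Enumerate every sequence over the sites and keep the functional ones."""
    all_seqs = ("".join(s) for s in product(alphabet, repeat=n_sites))
    return {seq for seq in all_seqs if is_functional(seq)}


def single_mutant_neighbors(seq: str, alphabet: str):
    """Yield every sequence that differs from seq at exactly one site."""
    for i, res in enumerate(seq):
        for alt in alphabet:
            if alt != res:
                yield seq[:i] + alt + seq[i + 1:]


def mutational_distances(start: str, functional: set[str], alphabet: str) -> dict[str, int]:
    """Breadth-first search from start, stepping only through functional variants.

    Returns the minimum number of single substitutions needed to reach each
    functional sequence without passing through a non-functional one.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in single_mutant_neighbors(current, alphabet):
            if neighbor in functional and neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


functional = enumerate_functional(ALPHABET, N_SITES)
distances = mutational_distances(WILD_TYPE, functional, ALPHABET)
print(f"{len(functional)} functional variants, {len(distances)} reachable from {WILD_TYPE}")
```

Exhaustive enumeration grows as the alphabet size raised to the number of sites, so with 20 amino acids the space quickly becomes too large for laboratory measurement, which is why a fast computational predictor is needed in place of the toy rule.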