ABSTRACT
The knowledge of ligand binding hot spots and of the important interactions within such hot spots is crucial for the design of lead compounds in the early stages of structure-based drug discovery. The computational solvent mapping server FTMap can reliably identify binding hot spots as consensus clusters, free energy minima that bind a variety of organic probe molecules. However, in its current implementation, FTMap provides limited information on regions within the hot spots that tend to interact with specific pharmacophoric features of potential ligands. E-FTMap is a new server that expands on the original FTMap protocol. E-FTMap uses 119 organic probes, rather than the 16 in the original FTMap, to exhaustively map binding sites, and identifies pharmacophore features as atomic consensus sites where similar chemical groups bind. We validate E-FTMap against a set of 109 experimentally derived structures of fragment-lead pairs, finding that highly ranked pharmacophore features overlap with the corresponding atoms in both fragments and lead compounds. Additionally, comparisons of mapping results to ensembles of bound ligands reveal that pharmacophores generated with E-FTMap tend to sample highly conserved protein-ligand interactions. E-FTMap is available as a web server at https://eftmap.bu.edu.
Subject(s)
Drug Discovery , Pharmacophore , Ligands , Binding Sites , Drug Discovery/methods , Protein Binding
ABSTRACT
Antibodies are key proteins produced by the immune system to target pathogen proteins, termed antigens, via specific binding to surface regions called epitopes. Given an antigen and the sequence of an antibody, knowledge of the epitope is critical for the discovery and development of antibody-based therapeutics. In this work, we present a computational protocol that uses template-based modeling and docking to predict epitope residues. This protocol is implemented in three major steps. First, a template-based modeling approach is used to build the antibody structures. We tested several options, including generation of models using AlphaFold2. Second, each antibody model is docked to the antigen using the fast Fourier transform (FFT) based docking program PIPER. Attention is given to optimally selecting the docking energy parameters depending on the input data. In particular, the van der Waals energy terms are reduced for modeled antibodies relative to X-ray structures. Finally, a ranking of antigen surface residues is produced. The ranking relies on the docking results, that is, on how often a residue appears in the interfaces of the docking poses and on the energetic favorability of those poses. The method, called PIPER-Map, has been tested on a widely used antibody-antigen docking benchmark. The results show that PIPER-Map improves upon the existing epitope prediction methods. An interesting observation is that epitope prediction accuracy starting from antibody sequence alone does not significantly differ from that obtained starting from an unbound (i.e., separately crystallized) antibody structure.
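The final ranking step can be illustrated with a minimal sketch. It assumes each docking pose has already been reduced to an energy and the set of antigen residues in its interface. The Boltzmann-like weight, the `kt` parameter, and all function and variable names are illustrative assumptions, not the published PIPER-Map scoring function.

```python
import math
from collections import defaultdict

def rank_epitope_residues(poses, kt=1.0):
    """Rank antigen residues by energy-weighted interface frequency.

    poses: list of (energy, interface_residues) tuples; lower energy means
    a more favorable pose. The exponential weighting is an assumption.
    """
    if not poses:
        return []
    e_min = min(energy for energy, _ in poses)
    scores = defaultdict(float)
    total = 0.0
    for energy, residues in poses:
        weight = math.exp(-(energy - e_min) / kt)  # best pose gets weight 1
        total += weight
        for res in set(residues):  # count each residue once per pose
            scores[res] += weight
    # Normalize by the total weight, then sort from high to low score
    # (ties are broken alphabetically by residue label).
    ranked = [(res, s / total) for res, s in scores.items()]
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked

# Hypothetical poses: (docking energy, antigen residues in the interface).
poses = [
    (-10.0, ["A45", "A46", "A50"]),
    (-9.0, ["A45", "A46"]),
    (-5.0, ["B12"]),
]
ranking = rank_epitope_residues(poses, kt=1.0)
```

In this toy input, residues that recur in the low-energy poses rise to the top, while a residue seen only in a high-energy pose falls to the bottom.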
Subject(s)
Antibodies , Antigens , Epitopes/metabolism , Antibodies/chemistry , Antigens/chemistry , Molecular Dynamics Simulation , Proteins/chemistry , Protein Binding
ABSTRACT
The design of PROteolysis-TArgeting Chimeras (PROTACs) requires bringing an E3 ligase into proximity with a target protein to modulate the concentration of the latter through its ubiquitination and degradation. Here, we present a method for generating high-accuracy structural models of E3 ligase-PROTAC-target protein ternary complexes. The method relies on two computational innovations: (1) adding a "silent" convolution term to an efficient protein-protein docking program to eliminate protein poses that do not have acceptable linker conformations, and (2) clustering models of multiple PROTACs that use the same E3 ligase and target the same protein. Results show that the largest consensus clusters always have high predictive accuracy and that the ensemble of models can be used to predict the dissociation rate and cooperativity of the ternary complex, which relate to the degrading activity of the PROTAC. The method is demonstrated by applications to known PROTAC structures and a blind test involving PROTACs against BRAF mutant V600E. The results confirm that PROTACs function by stabilizing a favorable interaction between the E3 ligase and the target protein but do not necessarily exploit the most energetically favorable geometry for interaction between the proteins.
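The idea of discarding protein poses that no linker could bridge can be sketched as a simple distance filter. This is a post hoc stand-in chosen for illustration, not the "silent" convolution term evaluated inside the docking program. The anchor points, the length range, and all names are hypothetical.

```python
import math

def filter_poses_by_linker(poses, min_len, max_len):
    """Keep poses whose anchor-to-anchor distance a linker could span.

    poses: list of (pose_id, e3_anchor_xyz, target_anchor_xyz), where the
    anchors are the linker attachment points on the two bound warheads.
    min_len, max_len: assumed end-to-end distance range of the linker (Å).
    """
    kept = []
    for pose_id, e3_anchor, target_anchor in poses:
        d = math.dist(e3_anchor, target_anchor)
        if min_len <= d <= max_len:
            kept.append((pose_id, d))
    return kept

# Hypothetical docked poses with attachment-point coordinates in Å.
poses = [
    ("p1", (0.0, 0.0, 0.0), (6.0, 0.0, 0.0)),
    ("p2", (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
    ("p3", (0.0, 0.0, 0.0), (20.0, 0.0, 0.0)),
]
kept = filter_poses_by_linker(poses, min_len=4.0, max_len=10.0)
```

Here `p3` is rejected because its attachment points are farther apart than the assumed linker can reach.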
Subject(s)
Proteins , Ubiquitin-Protein Ligases , Proteolysis , Ubiquitin-Protein Ligases/metabolism , Proteins/metabolism , Ubiquitination
ABSTRACT
Molecular dynamics (MD) simulations of proteins reveal the existence of many transient surface pockets; however, the factors determining what small subset of these represent druggable or functionally relevant ligand binding sites, called "cryptic sites," are not understood. Here, we examine multiple X-ray structures for a set of proteins with validated cryptic sites, using the computational hot spot identification tool FTMap. The results show that cryptic sites in ligand-free structures generally have a strong binding energy hot spot very close by. As expected, regions around cryptic sites exhibit above-average flexibility, and close to 50% of the proteins studied here have unbound structures that could accommodate the ligand without clashes. Nevertheless, the strong hot spot neighboring each cryptic site is almost always exploited by the bound ligand, suggesting that binding may frequently involve an induced fit component. We additionally evaluated the structural basis for cryptic site formation, by comparing unbound to bound structures. Cryptic sites are most frequently occluded in the unbound structure by intrusion of loops (22.5%), side chains (19.4%), or in some cases entire helices (5.4%), but motions that create sites that are too open can also eliminate pockets (19.4%). The flexibility of cryptic sites frequently leads to missing side chains or loops (12%) that are particularly evident in low resolution crystal structures. An interesting observation is that cryptic sites formed solely by the movement of side chains, or of backbone segments with fewer than five residues, result only in low affinity binding sites with limited use for drug discovery.
Subject(s)
Proteins/chemistry , Binding Sites , Ligands , Molecular Dynamics Simulation , Protein Binding , Protein Conformation
ABSTRACT
Targets in the protein docking experiment CAPRI (Critical Assessment of Predicted Interactions) generally present new challenges and contribute to new developments in methodology. In rounds 38 to 45 of CAPRI, most targets could be effectively predicted using template-based methods. However, the server ClusPro required structures rather than sequences as input, and hence we had to generate and dock homology models. The available templates also provided distance restraints that were directly used as input to the server. We show here that such an approach has some advantages. Free docking with template-based restraints using ClusPro reproduced some interfaces suggested by weak or ambiguous templates while not reproducing others, resulting in correct server predicted models. More recently we developed the fully automated ClusPro TBM server that performs template-based modeling and thus can use sequences rather than structures of component proteins as input. The performance of the server, freely available for noncommercial use at https://tbm.cluspro.org, is demonstrated by predicting the protein-protein targets of rounds 38 to 45 of CAPRI.
Subject(s)
Molecular Docking Simulation , Peptides/chemistry , Proteins/chemistry , Software , Amino Acid Sequence , Benchmarking , Binding Sites , Humans , Ligands , Peptides/metabolism , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Protein Multimerization , Proteins/metabolism , Research Design , Structural Homology, Protein , Thermodynamics
ABSTRACT
Binding hot spots are regions of proteins that, due to their potentially high contribution to the binding free energy, have high propensity to bind small molecules. We present benchmark sets for testing computational methods for the identification of binding hot spots with emphasis on fragment-based ligand discovery. Each protein structure in the set binds a fragment, which is extended into larger ligands in other structures without substantial change in its binding mode. Structures of the same proteins without any bound ligand are also collected to form an unbound benchmark. We also discuss a set developed by Astex Pharmaceuticals for the validation of hot and warm spots for fragment binding. The set is based on the assumption that a fragment that occurs in diverse ligands in the same subpocket identifies a binding hot spot. Since this set includes only ligand-bound proteins, we added a set with unbound structures. All four sets were tested using FTMap, a computational analogue of fragment screening experiments to form a baseline for testing other prediction methods, and differences among the sets are discussed.
Subject(s)
Benchmarking , Proteins , Binding Sites , Ligands , Protein Binding , Proteins/metabolism
ABSTRACT
As a participant in the joint CASP13-CAPRI46 assessment, the ClusPro server debuted its new template-based modeling functionality. The addition of this feature, called ClusPro TBM, was motivated by the previous CASP-CAPRI assessments and by the proven ability of template-based methods to produce higher-quality models, provided templates are available. In prior assessments, ClusPro submissions consisted of models that were produced via free docking of pre-generated homology models. This method was successful in terms of the number of acceptable predictions across targets; however, analysis of results showed that purely template-based methods produced a substantially higher number of medium-quality models for targets for which there were good templates available. The addition of template-based modeling has expanded ClusPro's ability to produce higher accuracy predictions, primarily for homomeric but also for some heteromeric targets. Here we review the newest additions to the ClusPro web server and discuss examples of CASP-CAPRI targets that continue to drive further development. We also describe ongoing work not yet implemented in the server. This includes the development of methods to improve template-based models and the use of co-evolutionary information for data-assisted free docking.
Subject(s)
Computational Biology , Protein Conformation , Proteins/ultrastructure , Software , Algorithms , Binding Sites/genetics , Databases, Protein , Humans , Molecular Docking Simulation , Molecular Dynamics Simulation , Protein Interaction Mapping , Proteins/chemistry , Proteins/genetics , Structural Homology, Protein
ABSTRACT
We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and 2 higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.
Subject(s)
Computational Biology , Protein Conformation , Proteins/ultrastructure , Software , Algorithms , Binding Sites/genetics , Databases, Protein , Models, Molecular , Protein Binding/genetics , Protein Interaction Mapping , Proteins/chemistry , Proteins/genetics , Structural Homology, Protein
ABSTRACT
SUMMARY: We present an approach for the efficient docking of peptide motifs to their free receptor structures. Using a motif based search, we can retrieve structural fragments from the Protein Data Bank (PDB) that are very similar to the peptide's final, bound conformation. We use a Fast Fourier Transform (FFT) based docking method to quickly perform global rigid body docking of these fragments to the receptor. According to CAPRI peptide docking criteria, an acceptable conformation can often be found among the top-ranking predictions. AVAILABILITY AND IMPLEMENTATION: The method is available as part of the protein-protein docking server ClusPro at https://peptidock.cluspro.org/nousername.php. CONTACT: midas@laufercenter.org or oraf@ekmd.huji.ac.il. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
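The reason FFT-based docking is fast can be shown with a one-dimensional toy, assuming a score defined as the overlap of a receptor grid with a shifted ligand grid. Docking programs work on three-dimensional grids with several energy terms; the recursive FFT, the grids, and the function names below are illustrative assumptions only.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def correlate_direct(receptor, ligand):
    """score[t] = sum_i receptor[i] * ligand[(i - t) mod n], in O(n^2)."""
    n = len(receptor)
    return [sum(receptor[i] * ligand[(i - t) % n] for i in range(n))
            for t in range(n)]

def correlate_fft(receptor, ligand):
    """Same scores as correlate_direct, computed in O(n log n)."""
    n = len(receptor)
    fr, fl = fft(receptor), fft(ligand)
    # Correlation theorem for real inputs: multiply by the complex conjugate.
    product = [a * b.conjugate() for a, b in zip(fr, fl)]
    return [v.real / n for v in fft(product, inverse=True)]

# Toy grids: the ligand profile fits the receptor profile when shifted by 2.
receptor = [0, 0, 1, 2, 1, 0, 0, 0]
ligand = [1, 2, 1, 0, 0, 0, 0, 0]
direct = correlate_direct(receptor, ligand)
fast = correlate_fft(receptor, ligand)
best_shift = max(range(len(fast)), key=lambda t: fast[t])
```

One forward transform per grid and one inverse transform score every translation at once, which is what makes exhaustive rigid body sampling tractable.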
Subject(s)
Computational Biology/methods , Molecular Docking Simulation/methods , Protein Conformation , Protein Interaction Domains and Motifs , Software , Algorithms , Cyclins/chemistry , Cyclins/metabolism , Databases, Protein , Fourier Analysis , Peptides/chemistry , Peptides/metabolism
ABSTRACT
Fast Fourier transform (FFT) based approaches have been successful in application to modeling of relatively rigid protein-protein complexes. Recently, we have been able to adapt the FFT methodology to treatment of flexible protein-peptide interactions. Here, we report our latest attempt to expand the capabilities of the FFT approach to treatment of flexible protein-ligand interactions in application to the D3R PL-2016-1 challenge. Based on the D3R assessment, our FFT approach in conjunction with Monte Carlo minimization off-grid refinement was among the top performing methods in the challenge. The potential advantage of our method is its ability to globally sample the protein-ligand interaction landscape, which will be explored in further applications.
Subject(s)
17-alpha-Hydroxyprogesterone/pharmacology , Calcifediol/pharmacology , Fourier Analysis , Molecular Docking Simulation , Proteins/metabolism , 17-alpha-Hydroxyprogesterone/chemistry , Binding Sites , Calcifediol/chemistry , Computer-Aided Design , Drug Design , Humans , Ligands , Monte Carlo Method , Protein Binding , Proteins/chemistry
ABSTRACT
The heavily used protein-protein docking server ClusPro performs three computational steps as follows: (1) rigid body docking, (2) RMSD based clustering of the 1000 lowest energy structures, and (3) the removal of steric clashes by energy minimization. In response to challenges encountered in recent CAPRI targets, we added three new options to ClusPro. These are (1) accounting for small angle X-ray scattering data in docking; (2) considering pairwise interaction data as restraints; and (3) enabling discrimination between biological and crystallographic dimers. In addition, we have developed an extremely fast docking algorithm based on 5D rotational manifold FFT, and an algorithm for docking flexible peptides that include known sequence motifs. We feel that these developments will further improve the utility of ClusPro. However, CAPRI emphasized several shortcomings of the current server, including the problem of selecting the right energy parameters among the five options provided, and the problem of selecting the best models among the 10 generated for each parameter set. In addition, results convinced us that further development is needed for docking homology models. Finally, we discuss the difficulties we have encountered when attempting to develop a refinement algorithm that would be computationally efficient enough for inclusion in a heavily used server. Proteins 2017; 85:435-444. © 2016 Wiley Periodicals, Inc.
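Step (2), RMSD based clustering of the lowest energy structures, can be sketched as a greedy neighbor-count procedure over a precomputed pairwise RMSD matrix. The radius, the toy distances, and the function name are assumptions for illustration, not the server's actual settings or code.

```python
def greedy_cluster(dist, radius):
    """Greedy neighbor-count clustering over a pairwise distance matrix.

    dist[i][j] is the RMSD between structures i and j. Returns a list of
    (center_index, member_indices); cluster sizes are non-increasing.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        def neighbors(i):
            # Unassigned structures within the radius of i (includes i).
            return [j for j in remaining if dist[i][j] <= radius]
        # Center = structure with the most unassigned neighbors
        # (the lowest index wins ties).
        center = max(sorted(remaining), key=lambda i: len(neighbors(i)))
        members = sorted(neighbors(center))
        clusters.append((center, members))
        remaining -= set(members)
    return clusters

# Toy stand-in for RMSD: five structures placed on a line.
positions = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in positions] for a in positions]
clusters = greedy_cluster(dist, radius=1.5)
```

Ranking by cluster population rather than by the energy of a single pose is the design choice this sketch is meant to convey: a well-populated cluster indicates a broad low-energy region.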
Subject(s)
Algorithms , Computational Biology/methods , Molecular Docking Simulation/methods , Proteins/chemistry , Software , Water/chemistry , Benchmarking , Binding Sites , Cluster Analysis , Crystallography, X-Ray , Databases, Protein , Internet , Protein Binding , Protein Conformation , Protein Interaction Mapping , Protein Multimerization , Research Design , Structural Homology, Protein , Thermodynamics
ABSTRACT
The potential utility of synthetic macrocycles (MCs) as drugs, particularly against low-druggability targets such as protein-protein interactions, has been widely discussed. There is little information, however, to guide the design of MCs for good target protein-binding activity or bioavailability. To address this knowledge gap, we analyze the binding modes of a representative set of MC-protein complexes. The results, combined with consideration of the physicochemical properties of approved macrocyclic drugs, allow us to propose specific guidelines for the design of synthetic MC libraries with structural and physicochemical features likely to favor strong binding to protein targets as well as good bioavailability. We additionally provide evidence that large, natural product-derived MCs can bind targets that are not druggable by conventional, drug-like compounds, supporting the notion that natural product-inspired synthetic MCs can expand the number of proteins that are druggable by synthetic small molecules.
Subject(s)
Macrocyclic Compounds/metabolism , Protein Binding/drug effects , Binding Sites/drug effects , Biological Availability , Crystallography, X-Ray , Drug Design , Mass Spectrometry , Models, Molecular , Molecular Weight , Pharmaceutical Preparations/metabolism , Proteins/metabolism , Small Molecule Libraries
ABSTRACT
Initiation of transcription in human mitochondria involves two factors, TFAM and TFB2M, in addition to the mitochondrial RNA polymerase, POLRMT. We have investigated the organization of the human mitochondrial transcription initiation complex on the light-strand promoter (LSP) through solution X-ray scattering, electron microscopy (EM) and biochemical studies. Our EM results demonstrate a compact organization of the initiation complex, suggesting that protein-protein interactions might help mediate initiation. We demonstrate that, in the absence of DNA, only POLRMT and TFAM form a stable interaction, albeit one with low affinity. This is consistent with the expected transient nature of the interactions necessary for initiation and implies that the promoter DNA acts as a scaffold that enables formation of the full initiation complex. Docking of known crystal structures into our EM maps results in a model for transcriptional initiation that strongly correlates with new and existing biochemical observations. Our results reveal the organization of TFAM, POLRMT and TFB2M around the LSP and represent the first structural characterization of the entire mitochondrial transcriptional initiation complex.
Subject(s)
DNA-Binding Proteins/chemistry , DNA-Directed RNA Polymerases/chemistry , Methyltransferases/chemistry , Mitochondria/genetics , Mitochondrial Proteins/chemistry , Transcription Factors/chemistry , Transcription Initiation, Genetic , DNA-Binding Proteins/metabolism , DNA-Directed RNA Polymerases/metabolism , Humans , Mitochondrial Proteins/metabolism , Models, Molecular , Promoter Regions, Genetic , Transcription Factors/metabolism
ABSTRACT
Structures of the influenza A virus M2 proton channel in the open conformation have been determined by X-ray crystallography, and in the closed conformation by NMR. Whereas the X-ray structure shows a single inhibitor molecule in the middle of the channel, four inhibitor molecules bind the channel's outer surface in the NMR structure. In both structures, the strongest hot spots (i.e., regions that contribute substantially to the free energy of binding any potential ligand) lie inside the pore, and other hot spots are found at exterior locations. By considering all available models, we propose that the primary drug binding site is inside the pore, but that exterior binding occurs under appropriate conditions.
Subject(s)
Amantadine/metabolism , Influenza A virus/metabolism , Viral Matrix Proteins/antagonists & inhibitors , Viral Matrix Proteins/metabolism , Amantadine/chemistry , Amantadine/pharmacology , Crystallography, X-Ray , Influenza A virus/drug effects , Models, Molecular , Nuclear Magnetic Resonance, Biomolecular , Viral Matrix Proteins/chemistry
ABSTRACT
Computational solvent mapping finds binding hot spots, determines their druggability and provides information for drug design. While mapping of a ligand-bound structure yields more accurate results, usually the apo structure serves as the starting point in design. The FTFlex algorithm, implemented as a server, can modify an apo structure to yield mapping results that are similar to those of the respective bound structure. Thus, FTFlex is an extension of our FTMap server, which only considers rigid structures. FTFlex identifies flexible residues within the binding site and determines alternative conformations using a rotamer library. In cases where the mapping results of the apo structure were in poor agreement with those of the bound structure, FTFlex was able to yield a modified apo structure, which led to improved FTMap results. In cases where the mapping results of the apo and bound structures were in good agreement, no new structure was predicted. AVAILABILITY: FTFlex is freely available as a web-based server at http://ftflex.bu.edu/.
Subject(s)
Algorithms , Drug Design , Proteins/chemistry , Binding Sites , Ligands , Models, Molecular , Protein Conformation , Proteins/metabolism , Software , Solvents/chemistry
ABSTRACT
In screening a library of natural and synthetic products for eukaryotic translation modulators, we identified two natural products, isohymenialdisine and hymenialdisine, that exhibit stimulatory effects on translation. The characterization of these compounds led to the insight that mRNA used to program the translation extracts during high-throughput assay setup was leading to phosphorylation of eIF2α, a potent negative regulatory event that is mediated by one of four kinases. We identified double-stranded RNA-dependent protein kinase (PKR) as the eIF2α kinase that was being activated by exogenously added mRNA template. Characterization of the mode of action of isohymenialdisine revealed that it directly acts on PKR by inhibiting autophosphorylation, perturbs the PKR-eIF2α phosphorylation axis, and can be modeled into the PKR ATP binding site. Our results identify a source of "false positives" for high-throughput screen campaigns using translation extracts, raising a cautionary note for this type of screen.
Subject(s)
Protein Biosynthesis/drug effects , Small Molecule Libraries/pharmacology , Adenosine Triphosphate/metabolism , Azepines/pharmacology , Drug Evaluation, Preclinical , High-Throughput Screening Assays , Models, Molecular , Protein Conformation , Pyrroles/pharmacology , eIF-2 Kinase/chemistry , eIF-2 Kinase/metabolism
ABSTRACT
Many proteins of widely differing functionality and structure are capable of binding heparin and heparan sulfate. Since crystallizing protein-heparin complexes for structure determination is generally difficult, computational docking can be a useful approach for understanding specific interactions. Previous studies used programs originally developed for docking small molecules to well-defined pockets, rather than for docking polysaccharides to highly charged shallow crevices that usually bind heparin. We have extended the program PIPER and the automated protein-protein docking server ClusPro to heparin docking. Using a molecular mechanics energy function for scoring and the fast Fourier transform correlation approach, the method generates and evaluates close to a billion poses of a heparin tetrasaccharide probe. The docked structures are clustered using pairwise root-mean-square deviations as the distance measure. It was shown that clustering of heparin molecules close to each other but having different orientations and selecting the clusters with the highest protein-ligand contacts reliably predicts the heparin binding site. In addition, the centers of the five most populated clusters include structures close to the native orientation of the heparin. These structures can provide starting points for further refinement by methods that account for flexibility such as molecular dynamics. The heparin docking method is available as an advanced option of the ClusPro server at http://cluspro.bu.edu/ .
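The distance measure used for clustering, the pairwise root-mean-square deviation between docked poses, can be written out as a short sketch. It assumes both poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is applied. The function names and coordinates are illustrative.

```python
import math

def pose_rmsd(coords_a, coords_b):
    """RMSD between two poses given as equal-length lists of (x, y, z)."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))

def pairwise_rmsd(poses):
    """Symmetric matrix of RMSD values between all pairs of poses."""
    n = len(poses)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = pose_rmsd(poses[i], poses[j])
    return matrix

# Two hypothetical two-atom poses, coordinates in Å.
pose_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_2 = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
matrix = pairwise_rmsd([pose_1, pose_2])
```

The resulting matrix is the kind of input a distance-based clustering step would consume.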
Subject(s)
Heparin/metabolism , Molecular Docking Simulation , Proteins/chemistry , Proteins/metabolism , Binding Sites , Heparitin Sulfate/metabolism , Humans , Monte Carlo Method , Protein Conformation , Solvents/chemistry
ABSTRACT
Formaldehyde has long been recognized as a hazardous environmental agent highly reactive with DNA. Recently, it has been realized that, due to the activity of histone demethylation enzymes within the cell nucleus, formaldehyde is produced endogenously, in the direct vicinity of genomic DNA. Does this lead to extensive DNA damage? We address this question with the aid of a computational mapping method, analogous to X-ray and nuclear magnetic resonance techniques for observing weakly specific interactions of small organic compounds with a macromolecule in order to establish important functional sites. We concentrate on the leading reaction of formaldehyde with free bases: hydroxymethylation of cytosine amino groups. Our results show that in B-DNA, cytosine amino groups are totally inaccessible to formaldehyde attack. Then, we explore the effect of the recently discovered transient flipping of Watson-Crick (WC) pairs into Hoogsteen (HG) pairs (HG breathing). Our results show that HG base pair formation dramatically affects the accessibility to formaldehyde of cytosine amino nitrogens within WC base pairs adjacent to HG base pairs. The extensive literature on DNA interaction with formaldehyde is analyzed in light of the new findings. The obtained data emphasize the significance of DNA HG breathing.
Subject(s)
DNA, B-Form/chemistry , Formaldehyde/chemistry , Algorithms , Base Pairing , Binding Sites , Computational Biology , Cytosine/chemistry , Models, Molecular , Nitrogen/chemistry
ABSTRACT
Binding hot spots, protein sites with high binding affinity, can be identified using X-ray crystallography or NMR by screening libraries of small organic molecules that tend to cluster at such regions. FTMAP, a direct computational analog of the experimental screening approaches, globally samples the surface of a target protein using small organic molecules as probes, finds favorable positions, clusters the conformations and ranks the clusters on the basis of the average energy. The regions that bind several probe clusters predict the binding hot spots, in good agreement with experimental results. Small molecules discovered by fragment-based approaches to drug design also bind at the hot spot regions. To identify such molecules and their most likely bound positions, we extend the functionality of FTMAP (http://ftmap.bu.edu/param) to accept any small molecule as an additional probe. In its updated form, FTMAP identifies the hot spots based on a standard set of probes, and for each additional probe shows representative structures of nearby low energy clusters. This approach helps to predict bound poses of the user-selected molecules, detects if a compound is not likely to bind in the hot spot region, and provides input for the design of larger ligands.
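The treatment of an additional probe can be sketched as a filter and sort: for each hot spot, keep the probe's clusters whose centers lie nearby and list them from lowest to highest average energy. The distance cutoff, the `top_n` limit, and the data layout are assumptions made for this example, not documented FTMAP parameters.

```python
import math

def clusters_near_hot_spots(hot_spots, probe_clusters, cutoff=5.0, top_n=3):
    """For each hot spot, list nearby clusters of the extra probe by energy.

    hot_spots: dict mapping a hot spot name to its (x, y, z) center.
    probe_clusters: list of (cluster_id, (x, y, z), average_energy).
    cutoff (Å) and top_n are illustrative assumptions.
    """
    result = {}
    for name, center in hot_spots.items():
        nearby = [c for c in probe_clusters
                  if math.dist(center, c[1]) <= cutoff]
        nearby.sort(key=lambda c: c[2])  # lowest average energy first
        result[name] = [c[0] for c in nearby[:top_n]]
    return result

# Hypothetical hot spot centers and clusters of a user-supplied probe.
hot_spots = {"HS1": (0.0, 0.0, 0.0), "HS2": (20.0, 0.0, 0.0)}
probe_clusters = [
    ("c1", (1.0, 0.0, 0.0), -8.0),
    ("c2", (2.0, 2.0, 0.0), -9.5),
    ("c3", (30.0, 0.0, 0.0), -12.0),
    ("c4", (0.0, 4.0, 0.0), -7.0),
]
result = clusters_near_hot_spots(hot_spots, probe_clusters)
```

An empty list for a hot spot, as for `HS2` here, is the toy counterpart of concluding that the compound is unlikely to bind in that region.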
Subject(s)
Proteins/chemistry , Software , Algorithms , Binding Sites , Internet , Ligands , Molecular Probes/chemistry , Protein Binding , Thrombin/chemistry
ABSTRACT
Despite the growing number of examples of small-molecule inhibitors that disrupt protein-protein interactions (PPIs), the origin of druggability of such targets is poorly understood. To identify druggable sites in protein-protein interfaces we combine computational solvent mapping, which explores the protein surface using a variety of small "probe" molecules, with a conformer generator to account for side-chain flexibility. Applications to unliganded structures of 15 PPI target proteins show that the druggable sites comprise a cluster of binding hot spots, distinguishable from other regions of the protein due to their concave topology combined with a pattern of hydrophobic and polar functionality. This combination of properties confers on the hot spots a tendency to bind organic species possessing some polar groups decorating largely hydrophobic scaffolds. Thus, druggable sites at PPIs are not simply sites that are complementary to particular organic functionality, but rather possess a general tendency to bind organic compounds with a variety of structures, including key side chains of the partner protein. Results also highlight the importance of conformational adaptivity at the binding site to allow the hot spots to expand to accommodate a ligand of drug-like dimensions. The critical components of this adaptivity are largely local, involving primarily low energy side-chain motions within 6 Å of a hot spot. The structural and physicochemical signature of druggable sites at protein-protein interfaces is sufficiently robust to be detectable from the structure of the unliganded protein, even when substantial conformational adaptation is required for optimal ligand binding.