ABSTRACT
Predicting the activities of new compounds against biophysical or phenotypic assays based on the known activities of one or a few existing compounds is a common goal in early stage drug discovery. This problem can be cast as a "few-shot learning" challenge, and prior studies have developed few-shot learning methods to classify compounds as active versus inactive. However, the ability to go beyond classification and rank compounds by expected affinity is more valuable. We describe Few-Shot Compound Activity Prediction (FS-CAP), a novel neural architecture trained on a large bioactivity data set to predict compound activities against an assay outside the training set, based on only the activities of a few known compounds against the same assay. Our model aggregates encodings generated from the known compounds and their activities to capture assay information and uses a separate encoder for the new compound whose activity is to be predicted. The new method provides encouraging results relative to traditional chemical-similarity-based techniques as well as other state-of-the-art few-shot learning methods in tests on a variety of ligand-based drug discovery settings and data sets. The code for FS-CAP is available at https://github.com/Rose-STL-Lab/FS-CAP.
Subjects
Drug Discovery , Ligands , Drug Discovery/methods , Machine Learning , Neural Networks, Computer
ABSTRACT
Model systems are widely used in biology and chemistry to gain insight into more complex systems. In the field of computational chemistry, researchers use host-guest systems, relatively simple exemplars of noncovalent binding, to train and test the computational methods used in drug discovery. Indeed, host-guest systems have been developed to support the community-wide blinded SAMPL prediction challenges for over a decade. While seeking new host-guest systems for the recent SAMPL9 binding prediction challenge, which is the focus of the present PCCP Themed Collection, we identified phenothiazine as a privileged scaffold for guests of β-cyclodextrin (βCD) and its derivatives. Building on this observation, we used calorimetry and NMR spectroscopy to characterize the noncovalent association of native βCD and three methylated derivatives of βCD with five phenothiazine drugs. The strongest association observed, that of thioridazine and one of the methyl derivatives, exceeds the well-known high affinity of rimantadine with βCD. Intriguingly, however, methylation of βCD at the 3 position abolished detectable binding for all of the drugs studied. The dataset has a clear pattern of entropy-enthalpy compensation. The NMR data show that all of the drugs position at least one aromatic proton at the secondary face of the CD, and most also show evidence of deep penetration of the binding site. The results of this study were used in the SAMPL9 blinded binding affinity-prediction challenge, which is detailed in accompanying papers of the present Themed Collection. These data also identify the phenothiazines and, potentially, chemically similar drugs, such as the tricyclic antidepressants, as relatively potent binders of βCD, setting the stage for future SAMPL challenge datasets and for possible applications as drug reversal agents.
Subjects
Cyclodextrins , Cyclodextrins/chemistry , Phenothiazines , Binding Sites , Thermodynamics
ABSTRACT
Here, we present remarkable epoxyketone-based proteasome inhibitors with low nanomolar in vitro potency for blood-stage Plasmodium falciparum and low cytotoxicity for human cells. Our best compound has more than 2,000-fold greater selectivity for erythrocytic-stage P. falciparum over HepG2 and H460 cells, which is largely driven by the accommodation of the parasite proteasome for a D-amino acid in the P3 position and the preference for a difluorobenzyl group in the P1 position. We isolated the proteasome from P. falciparum cell extracts and determined that the best compound is 171-fold more potent at inhibiting the β5 subunit of P. falciparum proteasome when compared to the same subunit of the human constitutive proteasome. These compounds also significantly reduce parasitemia in a P. berghei mouse infection model and prolong survival of animals by an average of 6 days. The current epoxyketone inhibitors are ideal starting compounds for orally bioavailable anti-malarial drugs.
Subjects
Antimalarials , Plasmodium , Mice , Animals , Humans , Proteasome Inhibitors/chemistry , Proteasome Endopeptidase Complex/chemistry , Plasmodium falciparum , Antimalarials/pharmacology
ABSTRACT
Recently, we presented a strategy for packaging peptides as side-chains in high-density brush polymers. For this globular protein-like polymer (PLP) formulation, therapeutic peptides were shown to resist proteolytic degradation, enter cells efficiently and maintain biological function. In this paper, we establish the role charge plays in dictating the cellular uptake of these peptide formulations, finding that peptides with a net positive charge will enter cells when polymerized, while those formed from anionic or neutral peptides remain outside of cells. Given these findings, we explored whether cellular uptake could be selectively induced by a stimulus. In our design, a cationic peptide is appended to a sequence of charge-neutralizing anionic amino acids through stimuli-responsive cleavable linkers. As a proof-of-concept study, we tested this strategy with two different classes of stimuli, exogenous UV light and an enzyme (a matrix metalloproteinase) associated with the inflammatory response. The key finding is that these materials enter cells only when acted upon by the stimulus. This approach makes it possible to achieve delivery of the polymers, therapeutic peptides or an appended cargo into cells in response to an appropriate stimulus.
Subjects
Peptides , Polymers , Peptide Hydrolases , Polymerization , Proteins
ABSTRACT
One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies yield only slight improvement over the single-structure approach. Here, we claim that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.
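As an illustration of what a consensus strategy over ensemble docking scores can look like, the following minimal Python sketch collapses per-conformation scores into one score per ligand using either the best (lowest) score or the mean score. The function names, the ligand names, and the score values are hypothetical and are not taken from the study; the ML approach described above would instead feed each ligand's full row of scores to a classifier as a feature vector.

```python
from statistics import mean


def consensus_scores(score_table, strategy="min"):
    """Collapse per-conformation docking scores into one score per ligand.

    score_table maps a ligand name to a list of docking scores, one per
    receptor conformation. Lower scores are assumed to be better.
    """
    reducers = {"min": min, "mean": mean}
    reduce_scores = reducers[strategy]
    return {ligand: reduce_scores(scores) for ligand, scores in score_table.items()}


def rank_ligands(consensus):
    """Return ligand names sorted from best (lowest) to worst consensus score."""
    return sorted(consensus, key=consensus.get)


# Hypothetical scores for three ligands docked into three conformations.
scores = {
    "lig_a": [-9.1, -7.0, -6.5],
    "lig_b": [-8.0, -8.2, -7.9],
    "lig_c": [-5.0, -5.5, -6.0],
}

best_score_ranking = rank_ligands(consensus_scores(scores, "min"))
mean_score_ranking = rank_ligands(consensus_scores(scores, "mean"))
```

In this toy data the two strategies disagree on the top ligand, which mirrors the lack of agreement on how ensemble results should be combined.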
Subjects
Machine Learning , Proteins , Benchmarking , Ligands , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/metabolism
ABSTRACT
In the last two decades, a large number of machine-learning-based predictors for the activities of antimicrobial peptides (AMPs) have been proposed. These predictors differ from one another in the learning method and in the training and testing data sets used. Unfortunately, the training data sets present several drawbacks, such as a low representativeness regarding the experimentally validated AMP space, and duplicated peptide sequences between negative and positive data sets. These limitations give low confidence in using most of these approaches in prospective studies. To address these weaknesses, we propose novel modeling and assessing data sets from the largest experimentally validated nonredundant peptide data set reported to date. From these novel data sets, alignment-free quantitative sequence-activity models (AF-QSAMs) based on Random Forest are created to identify general AMPs and their antibacterial, antifungal, antiparasitic, and antiviral functional types. An applicability domain analysis is carried out to determine the reliability of the predictions obtained, which, to the best of our knowledge, is performed for the first time for AMP recognition. A benchmark is carried out between the models proposed and several models from the literature that are freely available in 13 programs (ClassAMP, iAMP-2L, ADAM, MLAMP, AMPScanner v2.0, AntiFP, AMPfun, PEPred-suite, AxPEP, CAMPR3, iAMPpred, APIN, and Meta-iAVP). The models proposed are those with the best performance in all of the endpoints modeled, while most of the methods from the literature have weak-to-random predictive agreements. The models proposed are also assessed through Y-scrambling and repeated k-fold cross-validation tests, demonstrating that the outcomes obtained by them are not given by chance. Three chemometric analyses also confirmed the relevance of the peptide descriptors used in the modeling.
Therefore, it can be concluded that the models built by fixing the drawbacks existing in the literature contribute to identifying antibacterial, antifungal, antiparasitic, and antiviral peptides with high effectiveness and reliability. Models are freely available via the AMPDiscover tool at https://biocom-ampdiscover.cicese.mx/.
Subjects
Machine Learning , Peptides , Humans , Pore Forming Cytotoxic Proteins , Prospective Studies , Reproducibility of Results
ABSTRACT
Water molecules can be found interacting with the surface and within cavities in proteins. However, water exchange between bulk and buried hydration sites can be slow compared to simulation timescales, thus leading to the inefficient sampling of the locations of water. This can pose problems for free energy calculations for computer-aided drug design. Here, we apply a hybrid method that combines nonequilibrium candidate Monte Carlo (NCMC) simulations and molecular dynamics (MD) to enhance sampling of water in specific areas of a system, such as the binding site of a protein. Our approach uses NCMC to gradually remove interactions between a selected water molecule and its environment, then translates the water to a new region, before turning the interactions back on. This approach of gradual removal of interactions, followed by a move and then reintroduction of interactions, allows the environment to relax in response to the proposed water translation, improving acceptance of moves and thereby accelerating water exchange and sampling. We validate this approach on several test systems including the ligand-bound MUP-1 and HSP90 proteins with buried crystallographic waters removed. We show that our BLUES (NCMC/MD) method enhances water sampling relative to normal MD when applied to these systems. Thus, this approach provides a strategy to improve water sampling in molecular simulations which may be useful in practical applications in drug discovery and biomolecular design.
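The acceptance step of such a move can be sketched with the standard NCMC criterion, in which a proposed move is accepted with probability min(1, exp(-w/kT)), where w is the protocol work accumulated while interactions are switched off, the water is translated, and interactions are switched back on. The Python sketch below assumes a symmetric proposal and work in kcal/mol; the function names are hypothetical and this is not the BLUES implementation, whose details are not given in the abstract.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def ncmc_accept_probability(protocol_work, temperature=300.0):
    """Acceptance probability min(1, exp(-w / kT)) for an NCMC move.

    protocol_work is the nonequilibrium work (kcal/mol) accumulated over the
    whole switching protocol. A symmetric proposal is assumed.
    """
    if protocol_work <= 0.0:
        # Favorable or zero work is always accepted; this also avoids overflow.
        return 1.0
    beta = 1.0 / (K_B * temperature)
    return math.exp(-beta * protocol_work)


def accept_move(protocol_work, temperature=300.0, rng=random):
    """Draw a uniform random number and decide whether to keep the move."""
    return rng.random() < ncmc_accept_probability(protocol_work, temperature)
```

Because the environment relaxes during the gradual switching, the accumulated work tends to be lower than the energy change of an instantaneous translation, which is why acceptance improves.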
Subjects
Proteins/chemistry , Binding Sites , Ligands , Molecular Dynamics Simulation , Monte Carlo Method , Protein Binding , Protein Conformation , Thermodynamics , Water
ABSTRACT
We analyze light-driven overcrowded alkene-based molecular motors, an intriguing class of small molecules that have the potential to generate MHz-scale rotation rates. The full rotation process is simulated at multiple scales by combining quantum surface-hopping molecular dynamics (MD) simulations for the photoisomerization step with classical MD simulations for the thermal helix inversion step. A Markov state analysis resolves conformational substates, their interconversion kinetics, and their roles in the motor's rotation process. Furthermore, motor performance metrics, including rotation rate and maximal power output, are computed to validate computations against experimental measurements and to inform future designs. Lastly, we find that to correctly model these motors, the force field must be optimized by fitting selected parameters to reference quantum mechanical energy surfaces. Overall, our simulations yield encouraging agreement with experimental observables such as rotation rates, and provide mechanistic insights that may help future designs.
ABSTRACT
Approaches for computing small molecule binding free energies based on molecular simulations are now regularly being employed by academic and industry practitioners to study receptor-ligand systems and prioritize the synthesis of small molecules for ligand design. Given the variety of methods and implementations available, it is natural to ask how the convergence rates and final predictions of these methods compare. In this study, we describe the concept and results for the SAMPL6 SAMPLing challenge, the first challenge from the SAMPL series focusing on the assessment of convergence properties and reproducibility of binding free energy methodologies. We provided parameter files, partial charges, and multiple initial geometries for two octa-acid (OA) and one cucurbit[8]uril (CB8) host-guest systems. Participants submitted binding free energy predictions as a function of the number of force and energy evaluations for seven different alchemical and physical-pathway (i.e., potential of mean force and weighted ensemble of trajectories) methodologies implemented with the GROMACS, AMBER, NAMD, or OpenMM simulation engines. To rank the methods, we developed an efficiency statistic based on bias and variance of the free energy estimates. For the two small OA binders, the free energy estimates computed with alchemical and potential of mean force approaches show relatively similar variance and bias as a function of the number of energy/force evaluations, with the attach-pull-release (APR), GROMACS expanded ensemble, and NAMD double decoupling submissions obtaining the greatest efficiency. The differences between the methods increase when analyzing the CB8-quinine system, where both the guest size and correlation times for system dynamics are greater. For this system, nonequilibrium switching (GROMACS/NS-DS/SB) obtained the overall highest efficiency. 
Surprisingly, the results suggest that specifying force field parameters and partial charges is insufficient to generally ensure reproducibility, and we observe differences between seemingly converged predictions ranging approximately from 0.3 to 1.0 kcal/mol, even with almost identical simulation parameters and system setup (e.g., Lennard-Jones cutoff, ionic composition). Further work will be required to completely identify the exact source of these discrepancies. Among the conclusions emerging from the data, we found that Hamiltonian replica exchange, while displaying very small variance, can be affected by a slowly decaying bias that depends on the initial population of the replicas; that bidirectional estimators are significantly more efficient than unidirectional estimators for nonequilibrium free energy calculations for the systems considered; and that the Berendsen barostat introduces non-negligible artifacts in expanded ensemble simulations.
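The bias and variance on which such an efficiency comparison rests can be illustrated with a short, generic Python sketch. It computes the bias, standard deviation, and root mean square error of replicate free energy estimates relative to a reference value. It is not the challenge's efficiency statistic, whose definition is not given in this abstract, and the function name and numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def estimator_quality(replicate_estimates, reference):
    """Return (bias, std, rmse) of replicate estimates against a reference.

    All values share one unit (e.g. kcal/mol). The population standard
    deviation is used, so rmse**2 == bias**2 + std**2 holds exactly.
    """
    bias = mean(replicate_estimates) - reference
    std = pstdev(replicate_estimates)
    rmse = math.sqrt(mean((x - reference) ** 2 for x in replicate_estimates))
    return bias, std, rmse


# Hypothetical binding free energies (kcal/mol) from four replicate runs.
estimates = [-6.0, -5.8, -6.2, -6.4]
bias, std, rmse = estimator_quality(estimates, reference=-6.0)
```

Tracking these quantities as a function of the number of energy and force evaluations shows whether a method's error is dominated by slow-decaying bias or by run-to-run variance.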
Subjects
Macrocyclic Compounds/chemistry , Proteins/chemistry , Solvents/chemistry , Thermodynamics , Bridged-Ring Compounds/chemistry , Entropy , Imidazoles/chemistry , Ligands , Physical Phenomena , Protein Binding , Quantum Theory
ABSTRACT
The Drug Design Data Resource (D3R) aims to identify best-practice methods for computer-aided drug design through blinded ligand pose prediction and affinity challenges. Herein, we report on the results of Grand Challenge 4 (GC4). GC4 focused on the proteins beta secretase 1 and Cathepsin S, and was run in an analogous manner to prior challenges. In Stage 1, participants' ability to predict the pose and affinity of BACE1 ligands was assessed. Following the completion of Stage 1, all BACE1 co-crystal structures were released, and Stage 2 tested affinity rankings with co-crystal structures. We provide an analysis of the results and discuss insights into the best-practice methods identified.
Subjects
Amyloid Precursor Protein Secretases/antagonists & inhibitors , Aspartic Acid Endopeptidases/antagonists & inhibitors , Drug Design , Enzyme Inhibitors/pharmacology , Small Molecule Libraries/pharmacology , Amyloid Precursor Protein Secretases/metabolism , Aspartic Acid Endopeptidases/metabolism , Enzyme Inhibitors/chemistry , Humans , Ligands , Machine Learning , Molecular Docking Simulation , Small Molecule Libraries/chemistry , Thermodynamics
ABSTRACT
Binding-site water is often displaced upon ligand recognition, but is commonly neglected in structure-based ligand discovery. Inhomogeneous solvation theory (IST) has become popular for treating this effect, but it has not been tested in controlled experiments at atomic resolution. To do so, we turned to a grid-based version of this method, GIST, readily implemented in molecular docking. Whereas the term only improves docking modestly in retrospective ligand enrichment, it could be added without disrupting performance. We thus turned to prospective docking of large libraries to investigate GIST's impact on ligand discovery, geometry, and water structure in a model cavity site well-suited to exploring these terms. Although top-ranked docked molecules with and without the GIST term often overlapped, many ligands were meaningfully prioritized or deprioritized; some of these were selected for testing. Experimentally, 13/14 molecules prioritized by GIST did bind, whereas none of the molecules that it deprioritized were observed to bind. Nine crystal complexes were determined. In six, the ligand geometry corresponded to that predicted by GIST, for one of these the pose without the GIST term was wrong, and three crystallographic poses differed from both predictions. Notably, in one structure, an ordered water molecule with a high GIST displacement penalty was observed to stay in place. Inclusion of this water-displacement term can substantially improve the hit rates and ligand geometries from docking screens, although the magnitude of its effects can be small and its impact in drug binding sites merits further controlled studies.
Subjects
Computational Biology/methods , Molecular Docking Simulation , Solutions/chemistry , Solvents/chemistry , Algorithms , Binding Sites , Crystallography, X-Ray , Kinetics , Ligands , Molecular Structure , Protein Binding , Protein Conformation , Thermodynamics , Water/chemistry
ABSTRACT
A number of enzymes reportedly exhibit enhanced diffusion in the presence of their substrates, with a Michaelis-Menten-like concentration dependence. Although no definite explanation of this phenomenon has emerged, a physical picture of enzyme self-propulsion using energy from the catalyzed reaction has been widely considered. Here, we present a kinematic and thermodynamic analysis of enzyme self-propulsion that is independent of any specific propulsion mechanism. Using this theory, along with biophysical data compiled for all enzymes so far shown to undergo enhanced diffusion, we show that the propulsion speed required to generate experimental levels of enhanced diffusion exceeds the speeds of well-known active biomolecules, such as myosin, by several orders of magnitude. Furthermore, the minimal power dissipation required to account for enzyme enhanced diffusion by self-propulsion markedly exceeds the chemical power available from enzyme-catalyzed reactions. Alternative explanations for the observation of enhanced enzyme diffusion therefore merit stronger consideration.
Subjects
Enzymes/metabolism , Models, Biological , Diffusion , Kinetics , Thermodynamics
ABSTRACT
We describe the design, synthesis, and antitumor activity of an 18 carbon α,ω-dicarboxylic acid monoconjugated via an ester linkage to paclitaxel (PTX). This 1,18-octadecanedioic acid-PTX (ODDA-PTX) prodrug readily forms a noncovalent complex with human serum albumin (HSA). Preservation of the terminal carboxylic acid moiety on ODDA-PTX enables binding to HSA in the same manner as native long-chain fatty acids (LCFAs), within hydrophobic pockets, maintaining favorable electrostatic contacts between the ω-carboxylate of ODDA-PTX and positively charged amino acid residues of the protein. This carrier strategy for small molecule drugs is based on naturally evolved interactions between LCFAs and HSA, demonstrated here for PTX. ODDA-PTX shows differentiated pharmacokinetics, higher maximum tolerated doses and increased efficacy in vivo in multiple subcutaneous murine xenograft models of human cancer, as compared to two FDA-approved clinical formulations, Cremophor EL-formulated paclitaxel (crPTX) and Abraxane (nanoparticle albumin-bound (nab)-paclitaxel).
Subjects
Antineoplastic Agents/pharmacology , Dicarboxylic Acids/pharmacology , Paclitaxel/pharmacology , Prodrugs/pharmacology , Serum Albumin, Human/chemistry , Stearic Acids/pharmacology , Animals , Antineoplastic Agents/chemical synthesis , Antineoplastic Agents/chemistry , Cell Line, Tumor , Cell Proliferation/drug effects , Dicarboxylic Acids/chemistry , Dose-Response Relationship, Drug , Humans , Mice , Mice, Nude , Models, Molecular , Molecular Structure , Neoplasms, Experimental/drug therapy , Neoplasms, Experimental/pathology , Paclitaxel/chemistry , Prodrugs/chemical synthesis , Prodrugs/chemistry , Stearic Acids/chemistry
ABSTRACT
The Drug Design Data Resource aims to test and advance the state of the art in protein-ligand modeling by holding community-wide blinded prediction challenges. Here, we report on our third major round, Grand Challenge 3 (GC3). Held 2017-2018, GC3 centered on the protein Cathepsin S and the kinases VEGFR2, JAK2, p38-α, TIE2, and ABL1, and included both pose-prediction and affinity-ranking components. GC3 was structured much like the prior challenges GC2015 and GC2. First, Stage 1 tested pose prediction and affinity ranking methods; then all available crystal structures were released, and Stage 2 tested only affinity rankings, now in the context of the available structures. Unique to GC3 was the addition of a Stage 1b self-docking subchallenge, in which the protein coordinates from all of the cocrystal structures used in the cross-docking challenge were released, and participants were asked to predict the pose of CatS ligands using these newly released structures. We provide an overview of the outcomes and discuss insights into trends and best practices.
Subjects
Cathepsins/chemistry , Molecular Docking Simulation/methods , Protein Kinase Inhibitors/chemistry , Protein Kinases/chemistry , Binding Sites , Computer-Aided Design , Crystallography, X-Ray , Databases, Protein , Drug Design , Ligands , Protein Binding , Protein Conformation , Thermodynamics
ABSTRACT
Molecular motors are thought to generate force and directional motion via nonequilibrium switching between energy surfaces. Because all enzymes can undergo such switching, we hypothesized that the ability to generate rotary motion and torque is not unique to highly adapted biological motor proteins but is instead a common feature of enzymes. We used molecular dynamics simulations to compute energy surfaces for hundreds of torsions in three enzymes (adenosine kinase, protein kinase A, and HIV-1 protease) and used these energy surfaces within a kinetic model that accounts for intersurface switching and intrasurface probability flows. When substrate is out of equilibrium with product, we find computed torsion rotation rates up to ∼140 cycles s⁻¹, with stall torques up to ∼2 kcal mol⁻¹ cycle⁻¹, and power outputs up to ∼50 kcal mol⁻¹ s⁻¹. We argue that these enzymes are instances of a general phenomenon of directional probability flows on asymmetric energy surfaces for systems out of equilibrium. Thus, we conjecture that cyclic probability fluxes, corresponding to rotations of torsions and higher-order collective variables, exist in any chiral molecule driven between states in a nonequilibrium manner; we call this the "Asymmetry-Directionality" conjecture. This is expected to apply as well to synthetic chiral molecules switched in a nonequilibrium manner between energy surfaces by light, redox chemistry, or catalysis.
Subjects
Molecular Dynamics Simulation , Adenosine Kinase/chemistry , Adenosine Kinase/metabolism , Cyclic AMP-Dependent Protein Kinases/chemistry , Cyclic AMP-Dependent Protein Kinases/metabolism , HIV Protease/chemistry , HIV Protease/metabolism , Movement , Protein Conformation , Thermodynamics
ABSTRACT
BACKGROUND: In theory, binding enthalpies directly obtained from calorimetry (such as ITC) and the temperature dependence of the binding free energy (van't Hoff method) should agree. However, previous studies have often found them to be discrepant. METHODS: Experimental binding enthalpies (both calorimetric and van't Hoff) are obtained for two host-guest pairs using ITC, and the discrepancy between the two enthalpies is examined. Modeling of artificial ITC data is also used to examine how different sources of error propagate to both types of binding enthalpies. RESULTS: For the host-guest pairs examined here, good agreement, to within about 0.4 kcal/mol, is obtained between the two enthalpies. Additionally, using artificial data, we find that different sources of error propagate to either enthalpy uniquely, with concentration error and heat error propagating primarily to calorimetric and van't Hoff enthalpies, respectively. CONCLUSIONS: With modern calorimeters, good agreement between van't Hoff and calorimetric enthalpies should be achievable, barring issues due to non-ideality or unanticipated measurement pathologies. Indeed, disagreement between the two can serve as a flag for error-prone datasets. A review of the underlying theory supports the expectation that these two quantities should be in agreement. GENERAL SIGNIFICANCE: We address and arguably resolve long-standing questions regarding the relationship between calorimetric and van't Hoff enthalpies. In addition, we show that comparison of these two quantities can be used as an internal consistency check of a calorimetry study.
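The van't Hoff estimate mentioned above follows from ln K = -ΔH/(RT) + ΔS/R, so the slope of ln K against 1/T equals -ΔH/R. The minimal Python sketch below fits that slope by ordinary least squares. It assumes a temperature-independent ΔH (zero heat capacity change), which is a simplification; the function name and the synthetic data are hypothetical and do not come from the study.

```python
import math

R = 0.0019872041  # gas constant in kcal/(mol*K)


def vant_hoff_enthalpy(temperatures, binding_constants):
    """Estimate the binding enthalpy (kcal/mol) from K measured at several T.

    Fits ln K versus 1/T by least squares; the slope equals -dH/R under the
    assumption that dH does not change with temperature.
    """
    x = [1.0 / t for t in temperatures]
    y = [math.log(k) for k in binding_constants]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return -R * slope


# Synthetic data generated from assumed dH = -5.0 kcal/mol and
# dS = 0.005 kcal/(mol*K), used only to check that the fit recovers dH.
TRUE_DH, TRUE_DS = -5.0, 0.005
temps = [288.15, 293.15, 298.15, 303.15, 308.15]
constants = [math.exp(-(TRUE_DH - t * TRUE_DS) / (R * t)) for t in temps]
estimated_dh = vant_hoff_enthalpy(temps, constants)
```

Comparing such an estimate with the directly measured calorimetric enthalpy gives the internal consistency check described in the abstract.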
Subjects
Calorimetry/methods , Thermodynamics , Algorithms , Amantadine/chemistry , Calorimetry/instrumentation , Energy Transfer , Hot Temperature , Kinetics , Rimantadine/chemistry , beta-Cyclodextrins/chemistry
ABSTRACT
Accurately predicting the binding affinities of small organic molecules to biological macromolecules can greatly accelerate drug discovery by reducing the number of compounds that must be synthesized to realize desired potency and selectivity goals. Unfortunately, the process of assessing the accuracy of current computational approaches to affinity prediction against binding data to biological macromolecules is frustrated by several challenges, such as slow conformational dynamics, multiple titratable groups, and the lack of high-quality blinded datasets. Over the last several SAMPL blind challenge exercises, host-guest systems have emerged as a practical and effective way to circumvent these challenges in assessing the predictive performance of current-generation quantitative modeling tools, while still providing systems capable of possessing tight binding affinities. Here, we present an overview of the SAMPL6 host-guest binding affinity prediction challenge, which featured three supramolecular hosts: octa-acid (OA), the closely related tetra-endo-methyl-octa-acid (TEMOA), and cucurbit[8]uril (CB8), along with 21 small organic guest molecules. A total of 119 entries were received from ten participating groups employing a variety of methods that spanned from electronic structure and movable type calculations in implicit solvent to alchemical and potential of mean force strategies using empirical force fields with explicit solvent models. While empirical models tended to obtain better performance than first-principle methods, it was not possible to identify a single approach that consistently provided superior results across all host-guest systems and statistical metrics. Moreover, the accuracy of the methodologies generally displayed a substantial dependence on the system considered, emphasizing the need for host diversity in blind evaluations. 
Several entries exploited previous experimental measurements of similar host-guest systems in an effort to improve their physics-based predictions via some manner of rudimentary machine learning; while this strategy succeeded in reducing systematic errors, it did not correspond to an improvement in statistical correlation. Comparison to previous rounds of the host-guest binding free energy challenge highlights an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement regarding root mean square error over the past several challenge rounds. The data suggest that further refinement of force field parameters, as well as improved treatment of chemical effects (e.g., buffer salt conditions, protonation states), may be required to further enhance predictive accuracy.
Subjects
Bridged-Ring Compounds/chemistry , Carboxylic Acids/chemistry , Imidazoles/chemistry , Macrocyclic Compounds/chemistry , Proteins/chemistry , Computer Simulation , Cycloparaffins/chemistry , Drug Design , Ligands , Molecular Structure , Protein Binding , Software , Thermodynamics
ABSTRACT
The Drug Design Data Resource (D3R) ran Grand Challenge 2 (GC2) from September 2016 through February 2017. This challenge was based on a dataset of structures and affinities for the nuclear receptor farnesoid X receptor (FXR), contributed by F. Hoffmann-La Roche. The dataset contained 102 IC50 values, spanning six orders of magnitude, and 36 high-resolution co-crystal structures with representatives of four major ligand classes. Strong global participation was evident, with 49 participants submitting 262 prediction submission packages in total. Procedurally, GC2 mimicked Grand Challenge 2015 (GC2015), with a Stage 1 subchallenge testing ligand pose prediction methods and ranking and scoring methods, and a Stage 2 subchallenge testing only ligand ranking and scoring methods after the release of all blinded co-crystal structures. Two smaller curated sets of 18 and 15 ligands were developed to test alchemical free energy methods. This overview summarizes all aspects of GC2, including the dataset details, challenge procedures, and participant results. We also consider implications for progress in the field, while highlighting methodological areas that merit continued development. Similar to GC2015, the outcome of GC2 underscores the pressing need for methods development in pose prediction, particularly for ligand scaffolds not currently represented in the Protein Data Bank ( http://www.pdb.org ), and in affinity ranking and scoring of bound ligands.
Subjects
Drug Design , Receptors, Cytoplasmic and Nuclear/metabolism , Computer-Aided Design , Databases, Protein , Humans , Inhibitory Concentration 50 , Ligands , Molecular Docking Simulation , Protein Binding , Receptors, Cytoplasmic and Nuclear/agonists , Receptors, Cytoplasmic and Nuclear/antagonists & inhibitors , Receptors, Cytoplasmic and Nuclear/chemistry , Software , Thermodynamics
ABSTRACT
Calorimetric studies of protein-ligand binding sometimes yield thermodynamic data that are difficult to understand. Today, molecular simulations can be used to seek insight into such calorimetric puzzles, and, when simulations and experiments diverge, the results can usefully motivate further improvements in computational methods. Here, we apply near-millisecond duration simulations to estimate the relative binding enthalpies of four peptidic ligands with the Grb2 SH2 domain. The ligands fall into matched pairs, where one member of each pair has an added bond that preorganizes the ligand for binding and thus may be expected to favor binding entropically, due to a smaller loss in configurational entropy. Calorimetric studies have shown that the constrained ligands do in fact bind the SH2 domain more tightly than the flexible ones, but, paradoxically, the improvement in affinity for the constrained ligands is enthalpic, rather than entropic. The present enthalpy calculations yield the opposite trend, as they suggest that the flexible ligands bind more exothermically. Additionally, the small relative binding enthalpies are found to be balances of large differences in the energies of structural components such as ligand and the binding site residues. As a consequence, the deviations from experiment in the relative binding enthalpies represent small differences between these large numbers and hence may be particularly susceptible to error, due, for example, to approximations in the force field. We also computed first-order estimates of changes in configurational entropy on binding. These too are, arguably, paradoxical, as they tend to favor binding of the flexible ligands. The paradox is explained in part by the fact that the more rigid constrained ligands reduce the entropy of binding site residues more than their flexible analogs do, at least in the simulations. 
This result offers a rather general counterargument to the expectation that preorganized ligands should be associated with more favorable binding entropies, other things being equal.
Subjects
GRB2 Adaptor Protein/chemistry , Oligopeptides/chemistry , Thermodynamics , Ligands , Molecular Dynamics Simulation , Principal Component Analysis , Protein Binding , Protein Conformation , Water/chemistry , src Homology Domains
ABSTRACT
BindingDB, www.bindingdb.org, is a publicly accessible database of experimental protein-small molecule interaction data. Its collection of over a million data entries derives primarily from scientific articles and, increasingly, US patents. BindingDB provides many ways to browse and search for data of interest, including an advanced search tool, which can cross searches of multiple query types, including text, chemical structure, protein sequence and numerical affinities. The PDB and PubMed provide links to data in BindingDB, and vice versa; and BindingDB provides links to pathway information, the ZINC catalog of available compounds, and other resources. The BindingDB website offers specialized tools that take advantage of its large data collection, including ones to generate hypotheses for the protein targets bound by a bioactive compound, and for the compounds bound by a new protein of known sequence; and virtual compound screening by maximal chemical similarity, binary kernel discrimination, and support vector machine methods. Specialized data sets are also available, such as binding data for hundreds of congeneric series of ligands, drawn from BindingDB and organized for use in validating drug design methods. BindingDB offers several forms of programmatic access, and comes with extensive background material and documentation. Here, we provide the first update of BindingDB since 2007, focusing on new and unique features and highlighting directions of importance to the field as a whole.