Abstract
Molecular docking has become an important component of the drug discovery process. Since docking was first developed in the 1980s, advances in computer hardware, together with the growing number and accessibility of small-molecule and protein structures, have contributed to the development of improved methods, making docking more popular in both industrial and academic settings. Over the years, the ways in which docking is used to assist the different tasks of drug discovery have changed. Although initially developed and used as a standalone method, docking is now mostly employed in combination with other computational approaches within integrated workflows. Despite its invaluable contribution to the drug discovery process, molecular docking is still far from perfect. In this chapter, we provide an introduction to molecular docking and to the different docking procedures, with a focus on several considerations and protocols, including protonation states, active-site waters, and consensus, that can greatly improve docking results.
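As an illustration of the consensus idea mentioned above, the following is a minimal sketch of one possible scheme, rank averaging across scoring functions. The abstract does not say which consensus protocol the chapter describes, so the function name `consensus_rank`, the "lower score is better" convention, and the example scores are all assumptions made for illustration.

```python
from statistics import mean


def consensus_rank(score_tables):
    """Order ligands by their mean rank across several scoring functions.

    score_tables: list of dicts mapping ligand id -> score. A LOWER score is
    assumed to be better (an assumption of this sketch, not of the source).
    """
    ranks = {}
    for table in score_tables:
        ordered = sorted(table, key=table.get)  # best (lowest) score first
        for position, ligand in enumerate(ordered, start=1):
            ranks.setdefault(ligand, []).append(position)
    # Keep only ligands scored by every function so mean ranks are comparable.
    complete = {lig: r for lig, r in ranks.items() if len(r) == len(score_tables)}
    # Ties in mean rank are broken alphabetically by ligand id.
    return sorted(complete, key=lambda lig: (mean(complete[lig]), lig))


# Made-up scores from two hypothetical scoring functions.
scores_a = {"lig1": -9.1, "lig2": -7.4, "lig3": -8.0}
scores_b = {"lig1": -6.0, "lig2": -6.5, "lig3": -7.2}
ranking = consensus_rank([scores_a, scores_b])
```

Rank averaging avoids having to put scores from different functions on a common scale, at the cost of discarding the size of the score differences.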
Subjects
Drug Discovery/methods, Molecular Docking Simulation, Proteins/chemistry, Proteins/metabolism, Protein Binding, Protein Conformation, Structure-Activity Relationship
Abstract
Ligand-based methods play a crucial role in virtual screening when the 3D structure of the target is not available. This paper discusses the results of a validation study of the CSD field-based ligand screener using a novel benchmarking data set containing 56 targets. The data set was created starting from the target UniProt IDs in a previously published data set (i.e., the AZ data set), by mining ChEMBL to find known active molecules for these targets and by using DUD-E to generate property-matched decoys of the identified actives. Several experiments were performed to assess the virtual screening performance of the new method. One of its strengths is that it can use an overlay of multiple flexible ligands as a query without the need to run several parallel calculations with one ligand at a time. Here, we discuss how changes to different parameter settings or adoption of different query models can influence the final performance compared to the performance when using the experimentally observed overlay of ligands. We have also generated the enrichment scores based on three external benchmark data sets to enable comparison with existing methods previously validated using these data sets. We present results for the standard DUD-E data set, the DUD-E+ data set, and the DUD_Lib_VS_1.0 data set, which was designed for ligand-based virtual screening validation and hence is more suitable for this type of method.
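The abstract refers to enrichment scores without saying which metric was used. As a sketch of one widely used option, the enrichment factor at a chosen fraction of the ranked list can be computed as follows. The function name and the toy active/decoy labels are assumptions for illustration, not data from the study.

```python
def enrichment_factor(ranked_labels, fraction):
    """Enrichment factor of a ranked screening list.

    ranked_labels: 1 for an active and 0 for a decoy, best-scored first.
    fraction: top fraction of the list to inspect (e.g. 0.01 for 1%).
    EF = (actives in top / size of top) / (all actives / all compounds).
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(fraction * n_total))
    hits_top = sum(ranked_labels[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


# Toy list: 100 compounds, 10 actives, 4 of them ranked in the top 10.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0] + [0] * 84 + [1] * 6
ef_10 = enrichment_factor(labels, 0.10)
```

An enrichment factor of 1 corresponds to random selection; values above 1 mean that actives are concentrated at the top of the list.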
Subjects
Benchmarking, Ligands
Abstract
We recently published an improved methodology for overlaying multiple flexible ligands and an extensive data set for validating pharmacophore programs. Here, we combine these two developments and present evidence of the effectiveness of the new overlay methodology at predicting correct superimpositions for systems with varying levels of complexity. The overlay program was able to generate correct predictions for 95%, 73%, and 39% of systems classified as easy, moderate, and hard, respectively.
Subjects
Databases, Pharmaceutical, Drug Discovery/methods, Casein Kinase II/metabolism, Models, Molecular, Molecular Conformation, Receptors, Mineralocorticoid/metabolism, Urokinase-Type Plasminogen Activator/metabolism
Abstract
The pharmacophore hypothesis plays a central role in both the design and optimization of drug-like ligands. Pharmacophore patterns are invoked to explain the binding affinity of ligands and to enable the design of chemically distinct scaffolds that show affinity for a protein target of interest. The importance of pharmacophores in rationalizing ligand affinity has led to numerous algorithms that seek to overlay ligands based on their pharmacophoric features. All such algorithms must be validated with respect to known ligand overlays, usually by extracting ligand overlay sets from the Protein Data Bank (PDB). This validation step raises the question of which known overlays to select and from which proteins. The large number of structures and protein families in the PDB makes it difficult to establish a definitive overlay set; as a result, validation studies have rarely employed the same data sets. We have therefore undertaken an exhaustive analysis of the RCSB PDB to identify 121 distinct ligand overlay sets. We have defined a robust protein overlay protocol, which is free from subjective interpretation over which residues to include, and we have analyzed each overlay set on the basis of whether it provides evidence for the pharmacophore hypothesis. Our final data set spans a broad range of structural types and degrees of difficulty and includes overlays that any algorithm should be able to reproduce, as well as some for which there is very weak evidence for a conserved pharmacophore at all. We provide this set in the hope that it will prove definitive, at least until the PDB is greatly enriched with further structures or with radically different protein folds and families. Upon publication, the data set will be available for free download from the Web site of the Cambridge Crystallographic Data Centre.
Subjects
Algorithms, Proteins/chemistry, Small Molecule Libraries/chemistry, Software, Animals, Bacteria/chemistry, Binding Sites, Databases, Protein, Drug Design, Drug Discovery, Humans, Internet, Ligands, Molecular Conformation, Molecular Docking Simulation, Plants/chemistry, Protein Binding, Quantitative Structure-Activity Relationship
Abstract
Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among the set of low-energy conformers generated. However, there is currently no general method to prioritise these likely target-bound conformations in the ensemble. In this work, we trained atomistic neural networks (AtNNs) on 3D information of generated conformers of a curated subset of PDBbind ligands to predict the ARMSD to their closest bioactive conformation, and evaluated the early enrichment of bioactive-like conformations when ranking conformers by AtNN prediction. AtNN ranking was compared with bioactivity-unaware baselines such as ascending Sage force field energy ranking, and a slower bioactivity-based baseline ranking by ascending Torsion Fingerprint Deviation to the Maximum Common Substructure to the most similar molecule in the training set (TFD2SimRefMCS). On test sets from random ligand splits of PDBbind, ranking conformers using ComENet, the AtNN encoding the most 3D information, leads to early enrichment of bioactive-like conformations with a median BEDROC of 0.29 ± 0.02, outperforming the best bioactivity-unaware Sage energy ranking baseline (median BEDROC of 0.18 ± 0.02), and performing on a par with the bioactivity-based TFD2SimRefMCS baseline (median BEDROC of 0.31 ± 0.02). The improved performance of the AtNN and TFD2SimRefMCS baseline is mostly observed on test set ligands that bind proteins similar to proteins observed in the training set. On a more challenging subset of flexible molecules, the bioactivity-unaware baselines showed median BEDROCs up to 0.02, while AtNNs and TFD2SimRefMCS showed median BEDROCs between 0.09 and 0.13. 
When performing rigid ligand re-docking of PDBbind ligands with GOLD using the 1% top-ranked conformers, ComENet-ranked conformers showed a higher successful docking rate than bioactivity-unaware baselines, with a rate of 0.48 ± 0.02 compared to 0.39 ± 0.02 for the CSD probability baseline. Similarly, in a pharmacophore searching experiment, selecting the 20% top-ranked conformers ranked by ComENet gave a higher hit rate than the baselines. Hence, the approach presented here uses AtNNs successfully to focus conformer ensembles towards bioactive-like conformations, representing an opportunity to reduce computational expense in virtual screening applications on known targets that require input conformations.
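The BEDROC values quoted above measure early enrichment in a ranked list. The following is a minimal sketch of the standard BEDROC definition (the robust initial enhancement, RIE, rescaled between its minimum and maximum), applied to a conformer ranking in which "active" means bioactive-like. The default `alpha=20.0` is an assumption of this sketch; the abstract does not state the value used in the study.

```python
import math


def bedroc(ranked_labels, alpha=20.0):
    """BEDROC of a ranked list (best-ranked first; truthy = bioactive-like).

    alpha controls how strongly early ranks are weighted; 20.0 is a common
    choice but an assumption here.
    """
    n_total = len(ranked_labels)
    n_actives = sum(1 for x in ranked_labels if x)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need a mix of actives and inactives")
    ratio = n_actives / n_total
    # Exponentially weighted sum over the 1-based ranks of the actives.
    sum_exp = sum(
        math.exp(-alpha * rank / n_total)
        for rank, is_active in enumerate(ranked_labels, start=1)
        if is_active
    )
    # Expected weight of a single active placed uniformly at random.
    random_weight = (1.0 / n_total) * (1.0 - math.exp(-alpha)) / (
        math.exp(alpha / n_total) - 1.0
    )
    rie = sum_exp / (n_actives * random_weight)
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


# Toy rankings of 100 conformers, 5 of which are bioactive-like.
best_case = [1] * 5 + [0] * 95
worst_case = [0] * 95 + [1] * 5
mixed_case = [1, 0, 1, 0, 0, 1] + [0] * 92 + [1, 1]
```

By construction the metric is 1 when all bioactive-like conformers are ranked first and 0 when they are all ranked last, which makes medians across ligands comparable.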
Abstract
BACKGROUND: Matrix metalloproteinases (MMPs) are well-known biological targets implicated in tumour progression, homeostatic regulation, innate immunity, impaired delivery of pro-apoptotic ligands, and the release and cleavage of cell-surface receptors. With this in mind, understanding the close relationships among diverse MMPs could provide a solid basis for designing new selective MMP inhibitors more quickly. In this regard, uncovering the underlying molecular reasons for similarity among MMPs is a key challenge. RESULTS: We describe a pairwise variant of the non-parametric chaotic map clustering (CMC) algorithm and its application to 104 X-ray MMP structures. In this analysis, electrostatic potentials are computed and used as input for the CMC algorithm. It was shown that differences between proteins reflect genuine variation of their electrostatic potentials. In addition, the analysis was extended to the protein primary structures and the molecular shapes of the MMP co-crystallised ligands. CONCLUSIONS: The CMC algorithm was shown to be a valuable tool in knowledge acquisition and transfer from MMP structures. Based on the variation of electrostatic potentials, CMC was successful in analysing the MMP target family landscape and different subsites. The first investigation yielded a rational interpretation of both domain organization and substrate specificity classifications. The second made it possible to distinguish the MMP classes, demonstrating the high specificity of the S1' pocket, and to detect both point mutations of ionisable residues and different side-chain conformations that likely account for induced-fit phenomena. In addition, CMC demonstrated a potential comparable to that of the popular UPGMA (Unweighted Pair Group Method with Arithmetic mean) method, which at present represents a standard clustering approach in bioinformatics.
Interestingly, CMC and UPGMA gave closely comparable outcomes, but CMC often produced more informative and more easily interpretable dendrograms. Finally, CMC was successful for standard pairwise analysis (i.e., Smith-Waterman algorithm) of protein sequences and was used to convincingly explain the complementarity existing between the molecular shapes of the co-crystallised ligand molecules and the accessible MMP void volumes.
Subjects
Crystallography, X-Ray, Matrix Metalloproteinases/chemistry, Algorithms, Cluster Analysis, Ligands, Matrix Metalloproteinases/metabolism, Models, Molecular, Protein Conformation, Substrate Specificity
Abstract
A series of 27 benzamidine inhibitors covering a wide range of biological activity and chemical diversity was analysed to derive a Linear Interaction Energy in Continuum Electrostatics (LIECE) model of thrombin inhibitory activity. The main interactions occurring at the thrombin binding site and the preferred binding conformations of inhibitors were explicitly biased by including in the LIECE model 10 compounds extracted from X-ray solved thrombin-inhibitor complexes available from the Protein Data Bank (PDB). Supported by robust statistics (r² = 0.698; q² = 0.662), the LIECE model was successful in predicting the inhibitory activity for about 76% of compounds (external r² ≥ 0.600) from a larger external test set encompassing 88 known thrombin inhibitors and, more importantly, in retrieving active compounds from a thrombin combinatorial library of 10240 mimetic chemical products, at high sensitivity and with better performance than docking and shape-based methods. The proposed LIECE model has the potential to drive the design of novel thrombin inhibitors with benzamidine and/or benzamidine-like chemical structures.
Subjects
Benzamidines/chemistry, Serine Proteinase Inhibitors/chemistry, Thrombin/antagonists & inhibitors, Benzamidines/pharmacology, Combinatorial Chemistry Techniques, Computer Simulation, Drug Design, Models, Molecular, Protein Binding/drug effects, Quantitative Structure-Activity Relationship, Serine Proteinase Inhibitors/pharmacology, Static Electricity, Thrombin/chemistry
Abstract
A multiobjective optimization algorithm was proposed for the automated integration of structure- and ligand-based molecular design. Driven by a genetic algorithm, the proposed approach enabled the detection of a number of trade-off QSAR models accounting simultaneously for two independent objectives. The first was biased toward the best regressions between docking scores and biological affinities; the second minimized the atom displacements from a properly established crystal-based binding topology. Based on the concept of dominance, equivalent 3D QSAR models profiled the Pareto frontier and were thus designated as nondominated solutions of the search space. K-means clustering was then applied to select a representative subset of the available trade-off models. These were subjected to GRID/GOLPE analyses to characterize quantitatively the molecular determinants of ligand binding affinity. More specifically, it was demonstrated that a) diverse binding conformations occurred depending on the ability of the ligand to make favourable contacts with different parts of the protein binding site; b) enzyme selectivity was better approached and interpreted by combining diverse equivalent models; and c) trade-off models were successful, and even better than docking-based virtual screening, in retrieving active hits at high sensitivity from a large pool of chemically similar decoys. The approach was tested on a large series, very well known to QSAR practitioners, of 3-amidinophenylalanine inhibitors of thrombin and trypsin, two serine proteases having rather different biological actions despite a high sequence similarity.
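The concept of dominance used to profile the Pareto frontier can be sketched in a few lines. In the example below, each candidate model is reduced to a pair of objective values that are both to be minimized; the pairing (one minus the regression r², mean atom displacement) and the numbers are illustrative assumptions, not values from the study.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    Both objective tuples are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the nondominated points, preserving input order."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical models as (1 - r2, mean atom displacement in angstroms).
models = [(0.20, 1.5), (0.30, 0.8), (0.25, 1.6), (0.50, 0.5), (0.60, 0.9)]
front = pareto_front(models)
```

No model on the front can be improved on one objective without getting worse on the other, which is why the abstract describes them as equivalent trade-off solutions.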
Subjects
Drug Design, Quantitative Structure-Activity Relationship, Models, Molecular, Phenylalanine/chemistry, Phenylalanine/metabolism, Phenylalanine/pharmacology, Protein Conformation, Substrate Specificity, Thrombin/antagonists & inhibitors, Thrombin/chemistry, Thrombin/metabolism, Trypsin/chemistry, Trypsin/metabolism, Trypsin Inhibitors/chemistry, Trypsin Inhibitors/metabolism, Trypsin Inhibitors/pharmacology
Abstract
The Cambridge Structural Database (CSD) is the worldwide resource for the dissemination of all published three-dimensional structures of small-molecule organic and metal-organic compounds. This paper briefly describes how this collection of crystal structures can be used en masse in the context of macromolecular crystallography. Examples highlight how the CSD and associated software aid protein-ligand complex validation, and show how the CSD could be further used in the generation of geometrical restraints for protein structure refinement.
Subjects
Proteins/chemistry, Small Molecule Libraries/chemistry, Crystallography, X-Ray, Databases, Chemical, Ligands, Models, Molecular, Molecular Conformation, Protein Conformation, Proteins/metabolism, Small Molecule Libraries/metabolism, Software
Abstract
An addendum to the Introduction of Cole et al. [(2017), Acta Cryst. D73, 234-239] is made to recognize the work of Bricogne, Smart and others in the development of methods to make use of Cambridge Structural Database data in protein structure solution.
Abstract
The Protein Data Bank (PDB) contains a wealth of data on nonbonded biomolecular interactions. If this information could be distilled down to nonbonded interaction potentials, these would have some key advantages over standard force fields. However, there are some important outstanding issues to address in order to do this successfully. This paper introduces the protein-ligand informatics "force field", PLIff, which begins to address these key challenges (https://bitbucket.org/AstexUK/pli). As a result of their knowledge-based nature, the next-generation nonbonded potentials that make up PLIff automatically capture a wide range of interaction types, including special interactions that are often poorly described by standard force fields. We illustrate how PLIff may be used in structure-based design applications, including interaction fields, fragment mapping, and protein-ligand docking. PLIff performs at least as well as state-of-the-art scoring functions in terms of pose prediction and ranking compounds in a virtual screening context.
Subjects
Computational Biology, Proteins/chemistry, Algorithms, Databases, Protein, Knowledge Bases, Ligands, Models, Molecular, Molecular Structure, Structure-Activity Relationship
Abstract
This analysis attempts to answer the question of whether similar molecules crystallize in a similar manner. An analysis of structures in the Cambridge Structural Database shows that the answer is yes: sometimes they do, particularly for single-component structures. However, one does need to define what is meant by similar in both cases. Building on this observation, we then demonstrate how this correlation between shape similarity and packing similarity can be used to generate potential lattices for molecules with no known crystal structure. Simple intermolecular interaction potentials can be used to minimize these potential lattices. Finally, we discuss the many limitations of this approach.
Abstract
We have analyzed the protein-binding pharmacophore of NAD and its close analogues in all protein-ligand structures available in the RCSB database as of February 2012; this analysis has then been used to assess the novelty of structures emerging after that date. We show that proteins have evolved diverse pharmacophore motifs for binding the adenine moiety, fewer, but still diverse, motifs for nicotinamide, and a very limited set of motifs for binding the pyrophosphate linker. Our exhaustive analysis includes a pharmacophore contact analysis for over 1900 protein-ligand structures containing NAD analogues; we have benchmarked this set of contacts against nearly 27 000 protein-ligand structures to demonstrate that the diversity of interactions seen with NAD is very similar to that seen for all other ligands. Hence, variation in binding motifs for NAD is not distinct from that observed for other ligands and they show significant variation across protein families.
Subjects
NADP/chemistry, NAD/analogs & derivatives, NAD/chemistry, Proteins/chemistry, Amino Acid Motifs, Databases, Protein, Ligands, Protein Binding, Protein Interaction Domains and Motifs
Abstract
Matrix metalloproteinases (MMPs) are attractive biological targets that play a key role in many physiopathological processes such as degradation of extracellular matrix proteins, release and cleavage of cell-surface receptors, tumour progression, homeostatic regulation and innate immunity. A series of 5-hydroxy, 5-substituted pyrimidine-2,4,6-triones were rationally designed, prepared and tested as inhibitors of gelatinases MMP-2 and MMP-9 and collagenase MMP-8. On the one hand, the presence of the 5-hydroxyl group, which is a typical feature of this class of compounds, ensured an attractive pharmacokinetic profile, while on the other, suitably substituted biaryl molecular fragments, attached to position 5 through a ketomethylene linker, guaranteed favourable interaction in the deep region of the S1' enzymatic subsite. This rational design led to the discovery of highly potent MMP inhibitors. In particular, biphenyl derivatives bearing COCH3 and OCF3 substituents at the para position inhibited gelatinases MMP-2 and MMP-9, with IC50 values as low as 30 nM and 21 nM, respectively, whereas the introduction at the same position of the bulkier SO2CH3 group afforded a potent collagenase MMP-8 inhibitor with an IC50 value of 66 nM. Molecular docking simulations allowed us to elucidate key interactions driving the binding of the top active compounds towards their preferred MMP target.
Subjects
Barbiturates/pharmacology, Drug Design, Enzyme Inhibitors/pharmacology, Matrix Metalloproteinase 2/metabolism, Matrix Metalloproteinase 9/metabolism, Barbiturates/chemical synthesis, Barbiturates/chemistry, Crystallography, X-Ray, Dose-Response Relationship, Drug, Enzyme Inhibitors/chemical synthesis, Enzyme Inhibitors/chemistry, Humans, Models, Molecular, Molecular Structure, Serum Albumin/chemistry, Structure-Activity Relationship
Abstract
INTRODUCTION: Drug discovery and development is a typical multi-objective problem and its successes or failures depend on the simultaneous control of numerous, often conflicting, molecular and pharmacological properties. Multi-objective optimization strategies represent a new approach to capture the occurrence of varying optimal solutions based on trade-offs among the objectives taken into account. In view of this, multi-objective optimization aims to discover a set of satisfactory compromises that may in turn be used to find the global optimal solution by optimizing numerous dependent properties simultaneously. AREAS COVERED: The authors review the potential of multi-objective strategies in a number of fields including: drug library design; substructure mining; the derivation of quantitative structure-activity relationship models; ranking of docking poses. The authors also discuss the potential of multi-objective strategies in controlling competing properties for absorption, distribution, metabolism and elimination/toxicity optimization. EXPERT OPINION: It is very clear to those who work in drug discovery and development that the success of rational drug design is largely dependent on the control of a number of, often conflicting, objectives. Therefore, multi-objective optimization methods, which have recently been introduced to the field of molecular discovery, represent the ultimate frontier in chemoinformatics. The widespread use of these multi-objective techniques has provided new opportunities in medicinal chemistry as seen through its use in a number of applications for chemoinformatics both within academia and the pharmaceutical industry.
Abstract
This paper addresses two questions of key interest to researchers working with protein-ligand docking methods: (i) Why is there such a large variation in docking performance between different test sets reported in the literature? (ii) Are fragments more difficult to dock than druglike compounds? To answer these, we construct a test set of in-house X-ray structures of protein-ligand complexes from drug discovery projects, half of which contain fragment ligands, the other half druglike ligands. We find that a key factor affecting docking performance is ligand efficiency (LE). High LE compounds are significantly easier to dock than low LE compounds, which we believe could explain the differences observed between test sets reported in the literature. There is no significant difference in docking performance between fragments and druglike compounds, but the reasons why dockings fail appear to be different.
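Ligand efficiency is not defined in the abstract. A common definition, assumed in the sketch below, is the binding free energy per heavy atom, with the free energy estimated from a dissociation constant as RT ln(Kd). The temperature of 298.15 K, the use of Kd rather than IC50, and the two example ligands are assumptions made for illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Ligand efficiency in kcal/mol per heavy atom.

    Uses delta_G = R * T * ln(Kd) and LE = -delta_G / heavy_atoms, a common
    definition that is assumed here rather than taken from the source.
    """
    if kd_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("Kd and heavy-atom count must be positive")
    delta_g = R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)
    return -delta_g / heavy_atoms


# Hypothetical ligands: a 12-heavy-atom fragment binding at 100 micromolar
# and a 35-heavy-atom druglike compound binding at 10 nanomolar.
fragment_le = ligand_efficiency(1e-4, 12)
druglike_le = ligand_efficiency(1e-8, 35)
```

Under this definition a weakly binding fragment can have a higher LE than a much more potent druglike compound, which is why LE and molecular size are treated as separate factors in the analysis above.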
Subjects
Ligands, Protein Binding, Proteins/chemistry, Binding Sites, Computer Simulation, Models, Molecular
Abstract
Matrix metalloproteinases (MMPs) are well-known biological targets implicated in tumour progression, homeostatic regulation, innate immunity, impaired delivery of pro-apoptotic ligands, and the release and cleavage of cell-surface receptors. Hence, the development of potent and selective inhibitors targeting these enzymes continues to be eagerly sought. In this paper, a number of alloxan-based compounds, initially conceived to target other therapeutically relevant enzymes, were rationally modified and successfully repurposed to inhibit MMP-2 (also named gelatinase A) in the nanomolar range. Importantly, the alloxan core makes its debut as a zinc-binding group, since it ensures a stable tetrahedral coordination of the catalytic zinc ion in concert with the three histidines of the HExxHxxGxxH metzincin signature motif, further stabilized by a hydrogen bond with the glutamate residue belonging to the same motif. Decorating the alloxan core with a biphenyl privileged structure made it possible to probe the deep S1' specificity pocket of MMP-2 and to relate the high affinity towards this enzyme to the possibility of forming a hydrogen bond network with the backbone of Leu116 and Asn147 and the side chains of Tyr144, Thr145 and Arg149 at the bottom of the pocket. The effect of even slight structural changes on the interaction at the S1' subsite of MMP-2, as well as the nature and strength of the binding, is elucidated via molecular dynamics simulations and free energy calculations. Among the compounds presented, the highest affinity (pIC50 = 7.06) is found for BAM, a compound that also exhibits selectivity (>20) towards MMP-2 compared to MMP-9, the other member of the gelatinases.
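For readers more used to IC50 values, the pIC50 quoted above can be converted with the standard definition pIC50 = -log10(IC50 in mol/L). The helper names below are illustrative, not part of any cited software.

```python
import math


def pic50_to_ic50_nm(pic50):
    """Convert pIC50 to IC50 in nanomolar (pIC50 = -log10 of IC50 in mol/L)."""
    return 10 ** (9 - pic50)


def ic50_nm_to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar back to pIC50."""
    return 9 - math.log10(ic50_nm)


# pIC50 reported for BAM against MMP-2 in the abstract above.
bam_ic50_nm = pic50_to_ic50_nm(7.06)
```

A pIC50 of 7.06 therefore corresponds to an IC50 of roughly 87 nM, consistent with the nanomolar range stated in the abstract.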
Subjects
Alloxan/metabolism, Alloxan/pharmacology, Matrix Metalloproteinase 2/metabolism, Matrix Metalloproteinase Inhibitors, Molecular Dynamics Simulation, Protease Inhibitors/metabolism, Protease Inhibitors/pharmacology, Alloxan/analogs & derivatives, Catalytic Domain, Matrix Metalloproteinase 2/chemistry, Protease Inhibitors/chemistry, Protein Binding, Thermodynamics
Abstract
A large series of substituted coumarins linked through an appropriate spacer to 3-hydroxy-N,N-dimethylanilino or 3-hydroxy-N,N,N-trialkylbenzaminium moieties were synthesized and evaluated as acetylcholinesterase (AChE) and butyrylcholinesterase (BChE) inhibitors. The highest AChE inhibitory potency in the 3-hydroxy-N,N-dimethylanilino series was observed with a 6,7-dimethoxy-3-substituted coumarin derivative, which, along with an outstanding affinity (IC50 = 0.236 nM), exhibits excellent AChE/BChE selectivity (SI > 300 000). Most of the synthesized 3-hydroxy-N,N,N-trialkylbenzaminium salts display an AChE affinity in the sub-nanomolar to picomolar range along with excellent AChE/BChE selectivities (SI values up to 138 333). The combined use of docking and molecular dynamics simulations permitted us to shed light on the observed structure-affinity and structure-selectivity relationships, to detect two possible alternative binding modes, and to assess the critical role of pi-pi stacking interactions in the AChE peripheral binding site.