Search | VHL Regional Portal

1.

It is theoretically possible to avoid misfolding into non-covalent lasso entanglements using small molecule drugs.

Jiang, Yang; Deane, Charlotte M; Morris, Garrett M; O'Brien, Edward P.

PLoS Comput Biol ; 20(3): e1011901, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38470915

ABSTRACT

A novel class of protein misfolding characterized by either the formation of non-native noncovalent lasso entanglements in the misfolded structure or loss of native entanglements has been predicted to exist and found circumstantial support through biochemical assays and limited-proteolysis mass spectrometry data. Here, we examine whether it is possible to design small molecule compounds that can bind to specific folding intermediates and thereby avoid these misfolded states in computer simulations under idealized conditions (perfect drug-binding specificity, zero promiscuity, and a smooth energy landscape). Studying two proteins, type III chloramphenicol acetyltransferase (CAT-III) and D-alanyl-D-alanine ligase B (DDLB), that were previously suggested to form soluble misfolded states through a mechanism involving a failure-to-form of native entanglements, we explore two different drug design strategies using coarse-grained structure-based models. The first strategy, in which the native entanglement is stabilized by drug binding, failed to decrease misfolding because it formed an alternative entanglement at a nearby region. The second strategy, in which a small molecule was designed to bind to a non-native tertiary structure and thereby destabilize the native entanglement, succeeded in decreasing misfolding and increasing the native state population. This strategy worked because destabilizing the entanglement loop provided more time for the threading segment to position itself correctly to be wrapped by the loop to form the native entanglement. Further, we computationally identified several FDA-approved drugs with the potential to bind these intermediate states and rescue misfolding in these proteins. This study suggests it is possible for small molecule drugs to prevent protein misfolding of this type.

Subject(s)

Protein Folding , Proteins , Proteins/chemistry , Computer Simulation , Software , Mass Spectrometry

2.

PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences.

Buttenschoen, Martin; Morris, Garrett M; Deane, Charlotte M.

Chem Sci ; 15(9): 3130-3139, 2024 Feb 28.

Article in English | MEDLINE | ID: mdl-38425520

ABSTRACT

The last few years have seen the development of numerous deep learning-based protein-ligand docking methods. They offer huge promise in terms of speed and accuracy. However, despite claims of state-of-the-art performance in terms of crystallographic root-mean-square deviation (RMSD), upon closer inspection, it has become apparent that they often produce physically implausible molecular structures. It is therefore not sufficient to evaluate these methods solely by RMSD to a native binding mode. It is vital, particularly for deep learning-based methods, that they are also evaluated on steric and energetic criteria. We present PoseBusters, a Python package that performs a series of standard quality checks using the well-established cheminformatics toolkit RDKit. The PoseBusters test suite validates chemical and geometric consistency of a ligand including its stereochemistry, and the physical plausibility of intra- and intermolecular measurements such as the planarity of aromatic rings, standard bond lengths, and protein-ligand clashes. Only methods that both pass these checks and predict native-like binding modes should be classed as having "state-of-the-art" performance. We use PoseBusters to compare five deep learning-based docking methods (DeepDock, DiffDock, EquiBind, TankBind, and Uni-Mol) and two well-established standard docking methods (AutoDock Vina and CCDC Gold) with and without an additional post-prediction energy minimisation step using a molecular mechanics force field. We show that both in terms of physical plausibility and the ability to generalise to examples that are distinct from the training data, no deep learning-based method yet outperforms classical docking tools. In addition, we find that molecular mechanics force fields contain docking-relevant physics missing from deep-learning methods. 
PoseBusters allows practitioners to assess docking and molecular generation methods and may inspire new inductive biases still required to improve deep learning-based methods, which will help drive the development of more accurate and more realistic predictions.
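
As a rough illustration of the kind of intermolecular plausibility test described in this abstract, the sketch below flags protein-ligand atom pairs that sit closer than a fraction of their summed van der Waals radii. It is not the PoseBusters implementation or API; the function name `find_clashes`, the radii table and the `tolerance` value of 0.75 are assumptions chosen for illustration only.

```python
import math

# Illustrative van der Waals radii in angstroms (assumed values, not taken
# from the paper or from PoseBusters).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "H": 1.20}


def find_clashes(ligand, protein, tolerance=0.75):
    """Return (ligand_index, protein_index, distance) for every atom pair
    closer than `tolerance` times the sum of the two van der Waals radii.

    Each atom is given as (element_symbol, (x, y, z)).
    """
    clashes = []
    for i, (elem_l, pos_l) in enumerate(ligand):
        for j, (elem_p, pos_p) in enumerate(protein):
            dist = math.dist(pos_l, pos_p)
            limit = tolerance * (VDW_RADII[elem_l] + VDW_RADII[elem_p])
            if dist < limit:
                clashes.append((i, j, dist))
    return clashes


# Toy pose: both ligand atoms overlap the protein nitrogen; the far carbon
# is well outside the clash limit.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
protein = [("N", (0.0, 1.0, 0.0)), ("C", (6.0, 0.0, 0.0))]
clashes = find_clashes(ligand, protein)
```

A pose would count as physically implausible under this toy criterion whenever `clashes` is non-empty; a real test suite would combine such a check with intramolecular geometry checks such as bond lengths and ring planarity.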

3.

Discovery and pharmacophoric characterization of chemokine network inhibitors using phage-display, saturation mutagenesis and computational modelling.

Vales, Serena; Kryukova, Jhanna; Chandra, Soumyanetra; Smagurauskaite, Gintare; Payne, Megan; Clark, Charlie J; Hafner, Katrin; Mburu, Philomena; Denisov, Stepan; Davies, Graham; Outeiral, Carlos; Deane, Charlotte M; Morris, Garrett M; Bhattacharya, Shoumo.

Nat Commun ; 14(1): 5763, 2023 09 16.

Article in English | MEDLINE | ID: mdl-37717048

ABSTRACT

CC and CXC-chemokines are the primary drivers of chemotaxis in inflammation, but chemokine network redundancy thwarts pharmacological intervention. Tick evasins promiscuously bind CC and CXC-chemokines, overcoming redundancy. Here we show that short peptides that promiscuously bind both chemokine classes can be identified from evasins by phage-display screening performed with multiple chemokines in parallel. We identify two conserved motifs within these peptides and show using saturation-mutagenesis phage-display and chemotaxis studies of an exemplar peptide that an anionic patch in the first motif and hydrophobic, aromatic and cysteine residues in the second are functionally necessary. AlphaFold2-Multimer modelling suggests that the peptide occludes distinct receptor-binding regions in CC and in CXC-chemokines, with the first and second motifs contributing ionic and hydrophobic interactions respectively. Our results indicate that peptides with broad-spectrum anti-chemokine activity and therapeutic potential may be identified from evasins, and the pharmacophore characterised by phage display, saturation mutagenesis and computational modelling.

Subject(s)

Bacteriophages , Chemokines , Chemical Phenomena , Computer Simulation , Mutagenesis

4.

Exploring QSAR models for activity-cliff prediction.

Dablander, Markus; Hanser, Thierry; Lambiotte, Renaud; Morris, Garrett M.

J Cheminform ; 15(1): 47, 2023 Apr 17.

Article in English | MEDLINE | ID: mdl-37069675

ABSTRACT

INTRODUCTION AND METHODOLOGY: Pairs of similar compounds that only differ by a small structural modification but exhibit a large difference in their binding affinity for a given target are known as activity cliffs (ACs). It has been hypothesised that QSAR models struggle to predict ACs and that ACs thus form a major source of prediction error. However, the AC-prediction power of modern QSAR methods and its quantitative relationship to general QSAR-prediction performance is still underexplored. We systematically construct nine distinct QSAR models by combining three molecular representation methods (extended-connectivity fingerprints, physicochemical-descriptor vectors and graph isomorphism networks) with three regression techniques (random forests, k-nearest neighbours and multilayer perceptrons); we then use each resulting model to classify pairs of similar compounds as ACs or non-ACs and to predict the activities of individual molecules in three case studies: dopamine receptor D2, factor Xa, and SARS-CoV-2 main protease. RESULTS AND CONCLUSIONS: Our results provide strong support for the hypothesis that indeed QSAR models frequently fail to predict ACs. We observe low AC-sensitivity amongst the evaluated models when the activities of both compounds are unknown, but a substantial increase in AC-sensitivity when the actual activity of one of the compounds is given. Graph isomorphism features are found to be competitive with or superior to classical molecular representations for AC-classification and can thus be employed as baseline AC-prediction models or simple compound-optimisation tools. For general QSAR-prediction, however, extended-connectivity fingerprints still consistently deliver the best performance amongst the tested input representations. A potential future pathway to improve QSAR-modelling performance might be the development of techniques to increase AC-sensitivity.
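
The notions of an activity cliff and of AC-sensitivity used in this abstract can be made concrete with a small sketch. The definition below (Tanimoto similarity on fingerprint bit sets plus a gap in pKi) and both cutoffs are assumptions for illustration; the abstract does not state the exact similarity criterion or thresholds used in the study.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def is_activity_cliff(fp_a, fp_b, pki_a, pki_b, sim_cutoff=0.9, gap_cutoff=2.0):
    """Label a compound pair as an activity cliff when the two structures are
    similar but their activities differ strongly (hypothetical cutoffs)."""
    similar = tanimoto(fp_a, fp_b) >= sim_cutoff
    large_gap = abs(pki_a - pki_b) >= gap_cutoff
    return similar and large_gap


def ac_sensitivity(true_labels, predicted_labels):
    """Fraction of true activity cliffs that the model also labels as cliffs."""
    predictions_for_true_acs = [p for t, p in zip(true_labels, predicted_labels) if t]
    if not predictions_for_true_acs:
        return 0.0
    return sum(predictions_for_true_acs) / len(predictions_for_true_acs)


# Two toy fingerprints sharing 19 of 21 distinct bits (similarity about 0.905).
fp_a = set(range(20))
fp_b = set(range(19)) | {99}
pair_is_cliff = is_activity_cliff(fp_a, fp_b, pki_a=8.5, pki_b=5.9)

# Only one of the three true cliffs is recovered by the toy predictions.
sensitivity = ac_sensitivity([True, True, False, True], [True, False, False, False])
```

With a regression model, the predicted label for a pair would come from applying `is_activity_cliff` to the two predicted activities, or to one predicted and one measured activity in the setting where one compound's activity is given.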

5.

Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review.

Meli, Rocco; Morris, Garrett M; Biggin, Philip C.

Front Bioinform ; 2, 2022 Jun 17.

Article in English | MEDLINE | ID: mdl-36187180

ABSTRACT

The rapid and accurate in silico prediction of protein-ligand binding free energies or binding affinities has the potential to transform drug discovery. In recent years, there has been a rapid growth of interest in deep learning methods for the prediction of protein-ligand binding affinities based on the structural information of protein-ligand complexes. These structure-based scoring functions often obtain better results than classical scoring functions when applied within their applicability domain. Here we review structure-based scoring functions for binding affinity prediction based on deep learning, focussing on different types of architectures, featurization strategies, data sets, methods for training and evaluation, and the role of explainable artificial intelligence in building useful models for real drug-discovery applications.

6.

Identification of Histone Peptide Binding Specificity and Small-Molecule Ligands for the TRIM33α and TRIM33β Bromodomains.

Sekirnik, Angelina R; Reynolds, Jessica K; See, Larissa; Bluck, Joseph P; Scorah, Amy R; Tallant, Cynthia; Lee, Bernadette; Leszczynska, Katarzyna B; Grimley, Rachel L; Storer, R Ian; Malattia, Marta; Crespillo, Sara; Caria, Sofia; Duclos, Stephanie; Hammond, Ester M; Knapp, Stefan; Morris, Garrett M; Duarte, Fernanda; Biggin, Philip C; Conway, Stuart J.

ACS Chem Biol ; 17(10): 2753-2768, 2022 10 21.

Article in English | MEDLINE | ID: mdl-36098557

ABSTRACT

TRIM33 is a member of the tripartite motif (TRIM) family of proteins, some of which possess E3 ligase activity and are involved in the ubiquitin-dependent degradation of proteins. Four of the TRIM family proteins, TRIM24 (TIF1α), TRIM28 (TIF1β), TRIM33 (TIF1γ) and TRIM66, contain C-terminal plant homeodomain (PHD) and bromodomain (BRD) modules, which bind to methylated lysine (KMen) and acetylated lysine (KAc), respectively. Here we investigate the differences between the two isoforms of TRIM33, TRIM33α and TRIM33β, using structural and biophysical approaches. We show that the N1039 residue, which is equivalent to N140 in BRD4(1) and which is conserved in most BRDs, has a different orientation in each isoform. In TRIM33β, this residue coordinates KAc, but this is not the case in TRIM33α. Despite these differences, both isoforms show similar affinities for H3(1-27)K18Ac, and bind preferentially to H3(1-27)K9Me3K18Ac. We used this information to develop an AlphaScreen assay, with which we have identified four new ligands for the TRIM33 PHD-BRD cassette. These findings provide fundamental new information regarding which histone marks are recognized by both isoforms of TRIM33 and suggest starting points for the development of chemical probes to investigate the cellular function of TRIM33.

Subject(s)

Histones , Transcription Factors , Transcription Factors/metabolism , Histones/metabolism , Nuclear Proteins/metabolism , Lysine/metabolism , Peptide T/metabolism , Ligands , DNA-Binding Proteins/metabolism , Ubiquitins/metabolism , Ubiquitin-Protein Ligases/metabolism

7.

Understanding the genetics of viral drug resistance by integrating clinical data and mining of the scientific literature.

Goto, An; Rodriguez-Esteban, Raul; Scharf, Sebastian H; Morris, Garrett M.

Sci Rep ; 12(1): 14476, 2022 08 25.

Article in English | MEDLINE | ID: mdl-36008431

ABSTRACT

Drug resistance caused by mutations is a public health threat for existing and emerging viral diseases. A wealth of evidence about these mutations and their clinically associated phenotypes is scattered across the literature, but a comprehensive perspective is usually lacking. This work aimed to produce a clinically relevant view for the case of Hepatitis B virus (HBV) mutations by combining a chronic HBV clinical study with a compendium of genetic mutations systematically gathered from the scientific literature. We enriched clinical mutation data by systematically mining 2,472,725 scientific articles from PubMed Central in order to gather information about the HBV mutational landscape. By performing this analysis, we were able to identify mutational hotspots for each HBV genotype (A-E) and gene (C, X, P, S), as well as the location of disulfide bonds associated with these mutations. Through a modelling study, we also identified a mutation position common in both the clinical data and the literature that is located at the binding pocket for a known anti-HBV drug, namely entecavir. The results of this novel approach show the potential of integrated analyses to assist in the development of new drugs for viral diseases that are more robust to resistance. Such analyses should be of particular interest due to the increasing importance of viral resistance in established and emerging viruses, such as for newly developed drugs against SARS-CoV-2.

Subject(s)

COVID-19 Drug Treatment , Hepatitis B, Chronic , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , DNA, Viral/genetics , Drug Resistance, Viral/genetics , Genotype , Hepatitis B virus/genetics , Humans , Mutation , SARS-CoV-2/genetics

8.

Characterization of the SARS-CoV-2 ExoN (nsp14ExoN-nsp10) complex: implications for its role in viral genome stability and inhibitor identification.

Baddock, Hannah T; Brolih, Sanja; Yosaatmadja, Yuliana; Ratnaweera, Malitha; Bielinski, Marcin; Swift, Lonnie P; Cruz-Migoni, Abimael; Fan, Haitian; Keown, Jeremy R; Walker, Alexander P; Morris, Garrett M; Grimes, Jonathan M; Fodor, Ervin; Schofield, Christopher J; Gileadi, Opher; McHugh, Peter J.

Nucleic Acids Res ; 50(3): 1484-1500, 2022 02 22.

Article in English | MEDLINE | ID: mdl-35037045

ABSTRACT

The SARS-CoV-2 coronavirus is the causal agent of the current global pandemic. SARS-CoV-2 belongs to an order, Nidovirales, with very large RNA genomes. It is proposed that the fidelity of coronavirus (CoV) genome replication is aided by an RNA nuclease complex, comprising the non-structural proteins 14 and 10 (nsp14-nsp10), an attractive target for antiviral inhibition. Our results validate reports that the SARS-CoV-2 nsp14-nsp10 complex has RNase activity. Detailed functional characterization reveals nsp14-nsp10 is a versatile nuclease capable of digesting a wide variety of RNA structures, including those with a blocked 3'-terminus. Consistent with a role in maintaining viral genome integrity during replication, we find that nsp14-nsp10 activity is enhanced by the viral RNA-dependent RNA polymerase complex (RdRp) consisting of nsp12-nsp7-nsp8 (nsp12-7-8) and demonstrate that this stimulation is mediated by nsp8. We propose that the role of nsp14-nsp10 in maintaining replication fidelity goes beyond classical proofreading by purging the nascent replicating RNA strand of a range of potentially replication-terminating aberrations. Using our developed assays, we identify drug and drug-like molecules that inhibit nsp14-nsp10, including the known SARS-CoV-2 major protease (Mpro) inhibitor ebselen and the HIV integrase inhibitor raltegravir, revealing the potential for multifunctional inhibitors in COVID-19 treatment.

Subject(s)

Antiviral Agents/pharmacology , Drug Evaluation, Preclinical , Exoribonucleases/metabolism , Genome, Viral/genetics , Genomic Instability , SARS-CoV-2/enzymology , SARS-CoV-2/genetics , Viral Nonstructural Proteins/metabolism , Viral Regulatory and Accessory Proteins/metabolism , Coronavirus RNA-Dependent RNA Polymerase/metabolism , Exoribonucleases/antagonists & inhibitors , Genome, Viral/drug effects , Genomic Instability/drug effects , Genomic Instability/genetics , HIV Integrase Inhibitors/pharmacology , Isoindoles/pharmacology , Multienzyme Complexes/antagonists & inhibitors , Multienzyme Complexes/metabolism , Organoselenium Compounds/pharmacology , RNA, Viral/biosynthesis , RNA, Viral/genetics , Raltegravir Potassium/pharmacology , SARS-CoV-2/drug effects , Viral Nonstructural Proteins/antagonists & inhibitors , Viral Regulatory and Accessory Proteins/antagonists & inhibitors , Virus Replication/drug effects , Virus Replication/genetics

9.

Learning from Docked Ligands: Ligand-Based Features Rescue Structure-Based Scoring Functions When Trained on Docked Poses.

Boyles, Fergus; Deane, Charlotte M; Morris, Garrett M.

J Chem Inf Model ; 62(22): 5329-5341, 2022 11 28.

Article in English | MEDLINE | ID: mdl-34469150

ABSTRACT

Machine learning scoring functions for protein-ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein-ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes. We explore how the use of docked rather than crystallographic poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses. We also present a new, freely available validation set, the Updated DUD-E Diverse Subset, for binding affinity prediction using data from DUD-E and ChEMBL. Despite strong performance on docked poses of the PDBbind Core Sets, we find that our hybrid scoring function sometimes generalizes poorly to a protein target not represented in the training set, demonstrating the need for improved scoring functions and additional validation benchmarks.

Subject(s)

Machine Learning , Proteins , Ligands , Protein Binding , Proteins/chemistry , Molecular Docking Simulation

10.

Discovery of SARS-CoV-2 M^pro peptide inhibitors from modelling substrate and ligand binding.

Chan, H T Henry; Moesser, Marc A; Walters, Rebecca K; Malla, Tika R; Twidale, Rebecca M; John, Tobias; Deeks, Helen M; Johnston-Wood, Tristan; Mikhailov, Victor; Sessions, Richard B; Dawson, William; Salah, Eidarus; Lukacik, Petra; Strain-Damerell, Claire; Owen, C David; Nakajima, Takahito; Swiderek, Katarzyna; Lodola, Alessio; Moliner, Vicent; Glowacki, David R; Spencer, James; Walsh, Martin A; Schofield, Christopher J; Genovese, Luigi; Shoemark, Deborah K; Mulholland, Adrian J; Duarte, Fernanda; Morris, Garrett M.

Chem Sci ; 12(41): 13686-13703, 2021 Oct 27.

Article in English | MEDLINE | ID: mdl-34760153

ABSTRACT

The main protease (Mpro) of SARS-CoV-2 is central to viral maturation and is a promising drug target, but little is known about structural aspects of how it binds to its 11 natural cleavage sites. We used biophysical and crystallographic data and an array of biomolecular simulation techniques, including automated docking, molecular dynamics (MD) and interactive MD in virtual reality, QM/MM, and linear-scaling DFT, to investigate the molecular features underlying recognition of the natural Mpro substrates. We extensively analysed the subsite interactions of modelled 11-residue cleavage site peptides, crystallographic ligands, and docked COVID Moonshot-designed covalent inhibitors. Our modelling studies reveal remarkable consistency in the hydrogen bonding patterns of the natural Mpro substrates, particularly on the N-terminal side of the scissile bond. They highlight the critical role of interactions beyond the immediate active site in recognition and catalysis, in particular plasticity at the S2 site. Building on our initial Mpro-substrate models, we used predictive saturation variation scanning (PreSaVS) to design peptides with improved affinity. Non-denaturing mass spectrometry and other biophysical analyses confirm these new and effective 'peptibitors' inhibit Mpro competitively. Our combined results provide new insights and highlight opportunities for the development of Mpro inhibitors as anti-COVID-19 drugs.

11.

Learning protein-ligand binding affinity with atomic environment vectors.

Meli, Rocco; Anighoro, Andrew; Bodkin, Mike J; Morris, Garrett M; Biggin, Philip C.

J Cheminform ; 13(1): 59, 2021 Aug 14.

Article in English | MEDLINE | ID: mdl-34391475

ABSTRACT

Scoring functions for the prediction of protein-ligand binding affinity have seen renewed interest in recent years when novel machine learning and deep learning methods started to consistently outperform classical scoring functions. Here we explore the use of atomic environment vectors (AEVs) and feed-forward neural networks, the building blocks of several neural network potentials, for the prediction of protein-ligand binding affinity. The AEV-based scoring function, which we term AEScore, is shown to perform as well as or better than other state-of-the-art scoring functions on binding affinity prediction, with an RMSE of 1.22 pK units and a Pearson's correlation coefficient of 0.83 for the CASF-2016 benchmark. However, AEScore does not perform as well in docking and virtual screening tasks, for which it has not been explicitly trained. Therefore, we show that the model can be combined with the classical scoring function AutoDock Vina in the context of Δ-learning, where corrections to the AutoDock Vina scoring function are learned instead of the protein-ligand binding affinity itself. Combined with AutoDock Vina, Δ-AEScore has an RMSE of 1.32 pK units and a Pearson's correlation coefficient of 0.80 on the CASF-2016 benchmark, while retaining the docking and screening power of the underlying classical scoring function.
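
The correction-learning idea described at the end of this abstract can be written compactly. The notation below is an assumption introduced for illustration (the abstract gives no formula): $S_{\mathrm{Vina}}(x)$ is the classical score of complex $x$ expressed in the same units as the experimental affinity $y$, and $\Delta_\theta$ is the neural-network correction with parameters $\theta$.

```latex
% Prediction: classical score plus a learned correction
\hat{y}(x) = S_{\mathrm{Vina}}(x) + \Delta_\theta(x)

% Training: fit the correction to the residual of the classical score
\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N}
    \bigl( y_i - S_{\mathrm{Vina}}(x_i) - \Delta_\theta(x_i) \bigr)^{2}
```

Because the network only has to model the residual $y_i - S_{\mathrm{Vina}}(x_i)$, the combined score inherits the ranking behaviour of the classical function wherever the learned correction is small, which is consistent with the retained docking and screening power reported above.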

12.

Understanding Conformational Entropy in Small Molecules.

Chan, Lucian; Morris, Garrett M; Hutchison, Geoffrey R.

J Chem Theory Comput ; 17(4): 2099-2106, 2021 Apr 13.

Article in English | MEDLINE | ID: mdl-33759518

ABSTRACT

The calculation of the entropy of flexible molecules can be challenging, since the number of possible conformers can grow exponentially with molecule size and many low-energy conformers may be thermally accessible. Different methods have been proposed to approximate the contribution of conformational entropy to the molecular standard entropy, including performing thermochemistry calculations with all possible stable conformations and developing empirical corrections from experimental data. We have performed conformer sampling on over 120,000 small molecules generating some 12 million conformers, to develop models to predict conformational entropy across a wide range of molecules. Using insight into the nature of conformational disorder, our cross-validated physically motivated statistical model gives a mean absolute error of ~4.8 J/mol·K or under 0.4 kcal/mol at 300 K. Beyond predicting molecular entropies and free energies, the model implies a high degree of correlation between torsions in most molecules, often assumed to be independent. While individual dihedral rotations may have low energetic barriers, the shape and chemical functionality of most molecules necessarily correlate their torsional degrees of freedom and hence restrict the number of low-energy conformations immensely. Our simple models capture these correlations and advance our understanding of small molecule conformational entropy.
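
The unit conversion quoted in this abstract (an entropy error of about 4.8 J/mol·K corresponding to under 0.4 kcal/mol at 300 K) and the textbook link between conformer populations and entropy can be checked with a short sketch. The functions below use the standard Boltzmann and Gibbs expressions; they are not the paper's statistical model, and all function names are assumptions.

```python
import math

R = 8.314462618        # gas constant in J/(mol K)
J_PER_KCAL = 4184.0    # joules per kilocalorie


def boltzmann_weights(rel_energies_kj_mol, temperature=300.0):
    """Normalised Boltzmann populations for conformer energies in kJ/mol."""
    rt_kj = R * temperature / 1000.0
    factors = [math.exp(-e / rt_kj) for e in rel_energies_kj_mol]
    total = sum(factors)
    return [f / total for f in factors]


def conformational_entropy(populations):
    """Gibbs entropy -R * sum(p ln p) in J/(mol K)."""
    return -R * sum(p * math.log(p) for p in populations if p > 0.0)


def entropy_error_as_free_energy(error_j_mol_k, temperature=300.0):
    """Convert an entropy error in J/(mol K) to T*dS in kcal/mol."""
    return error_j_mol_k * temperature / J_PER_KCAL


# 4.8 J/(mol K) * 300 K = 1440 J/mol, i.e. roughly 0.34 kcal/mol.
free_energy_error = entropy_error_as_free_energy(4.8)

# Two isoenergetic conformers are equally populated, giving S = R ln 2.
two_state_entropy = conformational_entropy(boltzmann_weights([0.0, 0.0]))
```

The first result confirms the "under 0.4 kcal/mol" figure; the second shows the idealised case of independent, equally populated conformers, which the abstract argues overestimates entropy when torsions are correlated.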

13.

Understanding Ring Puckering in Small Molecules and Cyclic Peptides.

Chan, Lucian; Hutchison, Geoffrey R; Morris, Garrett M.

J Chem Inf Model ; 61(2): 743-755, 2021 02 22.

Article in English | MEDLINE | ID: mdl-33544592

ABSTRACT

The geometry of a molecule plays a significant role in determining its physical and chemical properties. Despite its importance, there are relatively few studies on ring puckering and conformations, often focused on small cycloalkanes, 5- and 6-membered carbohydrate rings, and specific macrocycle families. We lack a general understanding of the puckering preferences of medium-sized rings and macrocycles. To address this, we provide an extensive conformational analysis of a diverse set of rings. We used Cremer-Pople puckering coordinates to study the trends of the ring conformation across a set of 140,000 diverse small molecules, including small rings, macrocycles, and cyclic peptides. By standardizing using key atoms, we show that the ring conformations can be classified into relatively few conformational clusters, based on their canonical forms. The number of such canonical clusters increases slowly with ring size. Ring puckering motions, especially pseudo-rotations, are generally restricted and differ between clusters. More importantly, we propose models to map puckering preferences to torsion space, which allows us to understand the inter-related changes in torsion angles during pseudo-rotation and other puckering motions. Beyond ring puckers, our models also explain the change in substituent orientation upon puckering. We also present a novel knowledge-based sampling method using the puckering preferences and coupled substituent motion to generate ring conformations efficiently. In summary, this work provides an improved understanding of general ring puckering preferences, which will in turn accelerate the identification of low-energy ring conformations for applications from polymeric materials to drug binding.

Subject(s)

Peptides, Cyclic , Molecular Conformation

14.

BOKEI: Bayesian optimization using knowledge of correlated torsions and expected improvement for conformer generation.

Chan, Lucian; Hutchison, Geoffrey R; Morris, Garrett M.

Phys Chem Chem Phys ; 22(9): 5211-5219, 2020 Mar 04.

Article in English | MEDLINE | ID: mdl-32091055

ABSTRACT

A key challenge in conformer sampling is finding low-energy conformations with a small number of energy evaluations. We recently demonstrated the Bayesian Optimization Algorithm (BOA) is an effective method for finding the lowest energy conformation of a small molecule. Our approach balances between exploitation and exploration, and is more efficient than exhaustive or random search methods. Here, we extend strategies used on proteins and oligopeptides (e.g. Ramachandran plots of secondary structure) and study correlated torsions in small molecules. We use bivariate von Mises distributions to capture correlations, and use them to constrain the search space. We validate the performance of our new method, Bayesian Optimization with Knowledge-based Expected Improvement (BOKEI), on a dataset consisting of 533 diverse small molecules, using (i) a force field (MMFF94); and (ii) a semi-empirical method (GFN2), as the objective function. We compare the search performance of BOKEI, BOA with Expected Improvement (BOA-EI), and a genetic algorithm (GA), using a fixed number of energy evaluations. In more than 60% of the cases examined, BOKEI finds lower energy conformations than global optimization with BOA-EI or GA. More importantly, we find correlated torsions in up to 15% of small molecules in larger data sets, up to 8 times more often than previously reported. The BOKEI patterns not only describe steric clashes, but also reflect favorable intramolecular interactions such as hydrogen bonds and π-π stacking. Increasing our understanding of the conformational preferences of molecules will help improve our ability to find low energy conformers efficiently, which will have impact in a wide range of computational modeling applications.
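
For readers unfamiliar with the expected-improvement criterion named in this abstract, the sketch below gives the standard closed form for a minimisation problem under a Gaussian surrogate posterior. It is the generic textbook acquisition function, not the knowledge-based variant used in BOKEI, which additionally uses the torsion-correlation distributions; the function name and the `xi` exploration margin are assumptions.

```python
import math


def expected_improvement(mu, sigma, best_energy, xi=0.0):
    """Expected improvement over `best_energy` when minimising energy.

    mu, sigma   -- surrogate posterior mean and standard deviation at a
                   candidate set of torsion angles
    best_energy -- lowest energy evaluated so far
    xi          -- optional exploration margin (hypothetical default)
    """
    improvement = best_energy - mu - xi
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + sigma * pdf


# A candidate predicted to match the current best still has positive EI
# because of its uncertainty; a certain, worse candidate has none.
ei_uncertain = expected_improvement(mu=0.0, sigma=1.0, best_energy=0.0)
ei_certain_worse = expected_improvement(mu=5.0, sigma=0.0, best_energy=3.0)
```

The first term rewards candidates whose predicted energy is below the current best (exploitation) and the second rewards uncertain regions (exploration), which is the balance the abstract refers to.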

15.

Learning from the ligand: using ligand-based features to improve binding affinity prediction.

Boyles, Fergus; Deane, Charlotte M; Morris, Garrett M.

Bioinformatics ; 36(3): 758-764, 2020 02 01.

Article in English | MEDLINE | ID: mdl-31598630

ABSTRACT

MOTIVATION: Machine learning scoring functions for protein-ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein-ligand complex, with limited information about the chemical or topological properties of the ligand itself. RESULTS: We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest (RF) combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.836, 0.780 and 0.821 on the PDBbind 2007, 2013 and 2016 core sets, respectively, compared to 0.790, 0.746 and 0.814 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, an RF using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets. AVAILABILITY AND IMPLEMENTATION: Data and code to reproduce all the results are freely available at http://opig.stats.ox.ac.uk/resources. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Machine Learning , Proteins , Ligands , Protein Binding

16.

Ligity: A Non-Superpositional, Knowledge-Based Approach to Virtual Screening.

Ebejer, Jean-Paul; Finn, Paul W; Wong, Wing Ki; Deane, Charlotte M; Morris, Garrett M.

J Chem Inf Model ; 59(6): 2600-2616, 2019 06 24.

Article in English | MEDLINE | ID: mdl-31117509

ABSTRACT

We present Ligity, a hybrid ligand-structure-based, non-superpositional method for virtual screening of large databases of small molecules. Ligity uses the relative spatial distribution of pharmacophoric interaction points (PIPs) derived from the conformations of small molecules. These are compared with the PIPs derived from key interaction features found in protein-ligand complexes and are used to prioritize likely binders. We investigated the effect of generating PIPs using the single lowest energy conformer versus an ensemble of conformers for each screened ligand, using different bin sizes for the distance between two features, utilizing triangular sets of pharmacophoric features (3-PIPs) versus chiral tetrahedral sets (4-PIPs), fusing data for targets with multiple protein-ligand complex structures, and applying different similarity measures. Ligity was benchmarked using the Directory of Useful Decoys-Enhanced (DUD-E). Optimal results were obtained using the tetrahedral PIPs derived from an ensemble of bound ligand conformers and a bin size of 1.5 Å, which are used as the default settings for Ligity. The high-throughput screening mode of Ligity, using only the lowest-energy conformer of each ligand, was used for benchmarking against the whole of the DUD-E, and a more resource-intensive, "information-rich" mode of Ligity, using a conformational ensemble of each ligand, was used for a representative subset of 10 targets. Against the full DUD-E database, mean area under the receiver operating characteristic curve (AUC) values ranged from 0.44 to 0.99, while for the representative subset they ranged from 0.61 to 0.86. Data fusion further improved Ligity's performance, with mean AUC values ranging from 0.64 to 0.95. Ligity is very efficient compared to a protein-ligand docking method such as AutoDock Vina: if the time taken for the precalculation of Ligity descriptors is included in the comparison, then Ligity is about 20 times faster than docking. 
A direct comparison of the virtual screening steps shows Ligity to be over 5000 times faster. Ligity highly ranks the lowest-energy conformers of DUD-E actives, in a statistically significant manner, behavior that is not observed for DUD-E decoys. Thus, our results suggest that active compounds tend to bind in relatively low-energy conformations compared to decoys. This may be because actives-and thus their lowest-energy conformations-have been optimized for conformational complementarity with their cognate binding sites.
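
The idea of comparing binned, non-superpositional sets of pharmacophoric points can be illustrated with a minimal sketch. This is not Ligity's implementation: it covers only the simpler triangular (3-PIP) case without chirality, and the feature type names, the function names, and the count-based Tanimoto similarity are all assumptions chosen for illustration. Only the 1.5 Å bin size is taken from the abstract.

```python
import math
from collections import Counter
from itertools import combinations, permutations

BIN_SIZE = 1.5  # distance bin width in Å, the default reported in the abstract


def distance_bin(a, b, bin_size=BIN_SIZE):
    """Integer bin index of the Euclidean distance between two 3D points."""
    return int(math.dist(a, b) // bin_size)


def triangle_key(triplet, bin_size=BIN_SIZE):
    """Order-independent key for three (feature_type, xyz) points.

    Every vertex ordering is enumerated and the smallest
    (types, binned side lengths) tuple is kept, so the key depends only on
    the triangle itself, not on input order or on any superposition.
    """
    candidates = []
    for (t0, p0), (t1, p1), (t2, p2) in permutations(triplet):
        sides = (
            distance_bin(p0, p1, bin_size),
            distance_bin(p1, p2, bin_size),
            distance_bin(p0, p2, bin_size),
        )
        candidates.append(((t0, t1, t2), sides))
    return min(candidates)


def triangle_descriptor(features, bin_size=BIN_SIZE):
    """Count the binned triangles formed by all triplets of features."""
    return Counter(triangle_key(t, bin_size) for t in combinations(features, 3))


def tanimoto(a, b):
    """Count-based Tanimoto similarity between two descriptors (assumed measure)."""
    keys = set(a) | set(b)
    shared = sum(min(a[k], b[k]) for k in keys)
    total = sum(max(a[k], b[k]) for k in keys)
    return shared / total if total else 0.0


# Hypothetical feature points: (feature type, coordinates in Å).
ligand = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("hydrophobe", (0.0, 4.0, 0.0)),
    ("aromatic", (0.0, 0.0, 6.5)),
]
ligand_descriptor = triangle_descriptor(ligand)
```

Because the descriptor stores only feature types and binned distances, two molecules can be compared with `tanimoto` without aligning their coordinates first, which is the sense in which such a comparison is non-superpositional.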

Subject(s)

Drug Design , Proteins/metabolism , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Algorithms , Binding Sites , Humans , Knowledge Bases , Ligands , Molecular Conformation , Molecular Docking Simulation , Proteins/chemistry , Thermodynamics

17.

Bayesian optimization for conformer generation.

Chan, Lucian; Hutchison, Geoffrey R; Morris, Garrett M.

J Cheminform ; 11(1): 32, 2019 May 21.

Article in English | MEDLINE | ID: mdl-31115707

ABSTRACT

Generating low-energy molecular conformers is a key task for many areas of computational chemistry, molecular modeling and cheminformatics. Most current conformer generation methods primarily focus on generating geometrically diverse conformers rather than finding the most probable or energetically lowest minima. Here, we present a new stochastic search method called the Bayesian optimization algorithm (BOA) for finding the lowest energy conformation of a given molecule. We compare BOA with uniform random search, and systematic search as implemented in Confab, to determine which method finds the lowest energy. Energetic difference, root-mean-square deviation, and torsion fingerprint deviation are used to quantify the performance of the conformer search algorithms. In general, we find BOA requires far fewer evaluations than systematic or uniform random search to find low-energy minima. For molecules with four or more rotatable bonds, Confab typically evaluates [Formula: see text] (median) conformers in its search, while BOA only requires [Formula: see text] energy evaluations to find top candidates. Despite evaluating fewer conformers, 20-40% of the time BOA finds lower-energy conformations than a systematic Confab search for molecules with four or more rotatable bonds.
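
To make the notion of counting energy evaluations concrete, the sketch below contrasts the two baseline strategies named in the abstract, systematic search and uniform random search, on a toy torsional energy. It does not implement BOA or Confab: the energy function, the 30° grid step, and all function names are assumptions made purely for illustration.

```python
import itertools
import math
import random


def toy_energy(torsions_deg):
    """Toy torsional energy: a threefold term plus a onefold term per torsion."""
    total = 0.0
    for angle in torsions_deg:
        theta = math.radians(angle)
        total += (1.0 + math.cos(3.0 * theta)) + 0.5 * (1.0 - math.cos(theta))
    return total


def systematic_search(n_torsions, step_deg=30):
    """Evaluate every combination of torsion angles on a regular grid."""
    grid = range(0, 360, step_deg)
    best = None
    evaluations = 0
    for combo in itertools.product(grid, repeat=n_torsions):
        energy = toy_energy(combo)
        evaluations += 1
        if best is None or energy < best[0]:
            best = (energy, combo)
    return best[0], best[1], evaluations


def uniform_random_search(n_torsions, n_evaluations, seed=0):
    """Evaluate torsion angles drawn uniformly at random from [0, 360)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_evaluations):
        combo = tuple(rng.uniform(0.0, 360.0) for _ in range(n_torsions))
        energy = toy_energy(combo)
        if best is None or energy < best[0]:
            best = (energy, combo)
    return best[0], best[1], n_evaluations


grid_energy, grid_torsions, grid_evals = systematic_search(n_torsions=2)
rand_energy, rand_torsions, rand_evals = uniform_random_search(2, n_evaluations=50)
```

The grid search cost grows as the number of grid points raised to the number of torsions, which is why a method that reaches comparable minima in fewer evaluations matters more as the number of rotatable bonds increases.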

18.

Exploration of piperidinols as potential antitubercular agents.

Abuhammad, Areej; Fullam, Elizabeth; Bhakta, Sanjib; Russell, Angela J; Morris, Garrett M; Finn, Paul W; Sim, Edith.

Molecules ; 19(10): 16274-90, 2014 Oct 10.

Article in English | MEDLINE | ID: mdl-25310152

ABSTRACT

Novel drugs to treat tuberculosis are required and the identification of potential targets is important. Piperidinols have been identified as potential antimycobacterial agents (MIC < 5 µg/mL), which also inhibit mycobacterial arylamine N-acetyltransferase (NAT), an enzyme essential for mycobacterial survival inside macrophages. The NAT inhibition involves a prodrug-like mechanism in which activation leads to the formation of bioactive phenyl vinyl ketone (PVK). The PVK fragment selectively forms an adduct with the cysteine residue in the active site. Time-dependent inhibition of the NAT enzyme from Mycobacterium marinum (M. marinum) demonstrates a covalent binding mechanism for all inhibitory piperidinol analogues. The structure-activity relationship highlights the importance of halide substitution on the piperidinol benzene ring. The structures of the NAT enzymes from M. marinum and M. tuberculosis, although 74% identical, have different residues in their active site clefts and allow the effects of amino acid substitutions to be assessed in understanding inhibitory potency. In addition, we have used the piperidinol 3-dimensional shape and electrostatic properties to identify two additional distinct chemical scaffolds as inhibitors of NAT. While one of the scaffolds has anti-tubercular activity, both inhibit NAT but through a non-covalent mechanism.

Subject(s)

Antitubercular Agents/chemistry , Antitubercular Agents/pharmacology , Piperidines/chemistry , Piperidines/pharmacology , Acetyltransferases/antagonists & inhibitors , Acetyltransferases/metabolism , Binding Sites , Humans , Molecular Conformation , Mycobacterium tuberculosis/drug effects , Mycobacterium tuberculosis/enzymology , Protein Binding

19.

Understanding the structural requirements for activators of the Kef bacterial potassium efflux system.

Healy, Jessica; Ekkerman, Silvia; Pliotas, Christos; Richard, Morgiane; Bartlett, Wendy; Grayer, Samuel C; Morris, Garrett M; Miller, Samantha; Booth, Ian R; Conway, Stuart J; Rasmussen, Tim.

Biochemistry ; 53(12): 1982-92, 2014 Apr 01.

Article in English | MEDLINE | ID: mdl-24601535

ABSTRACT

The potassium efflux system, Kef, protects bacteria against the detrimental effects of electrophilic compounds via acidification of the cytoplasm. Kef is inhibited by glutathione (GSH) but activated by glutathione-S-conjugates (GS-X) formed in the presence of electrophiles. GSH and GS-X bind to overlapping sites on Kef, which are located in a cytosolic regulatory domain. The central paradox of this activation mechanism is that GSH is abundant in cells (at concentrations of ~10-20 mM), and thus, activating ligands must possess a high differential over GSH in their affinity for Kef. To investigate the structural requirements for binding of a ligand to Kef, a novel fluorescent reporter ligand, S-{[5-(dimethylamino)naphthalen-1-yl]sulfonylaminopropyl} glutathione (DNGSH), was synthesized. By competition assays using DNGSH, complemented by direct binding assays and thermal shift measurements, we show that the well-characterized Kef activator, N-ethylsuccinimido-S-glutathione, has a 10-20-fold higher affinity for Kef than GSH. In contrast, another native ligand that is a poor activator, S-lactoylglutathione, exhibits a similar Kef affinity to GSH. Synthetic ligands were synthesized to contain either rigid or flexible structures and investigated as ligands for Kef. Compounds with rigid structures and high affinity activated Kef. In contrast, flexible ligands with similar binding affinities did not activate Kef. These data provide insight into the structural requirements for Kef gating, paving the way for the development of a screen for potential therapeutic lead compounds targeting the Kef system.

Subject(s)

Escherichia coli Proteins/chemistry , Glutathione/analogs & derivatives , Potassium-Hydrogen Antiporters/chemistry , Potassium/chemistry , Succinimides/chemistry , Biological Transport, Active/physiology , Escherichia coli Proteins/metabolism , Glutathione/chemistry , Glutathione/metabolism , Ion Channel Gating/physiology , Ligands , Potassium/metabolism , Potassium-Hydrogen Antiporters/metabolism , Protein Binding , Protein Structure, Secondary , Protein Structure, Tertiary , Shewanella/chemistry , Shewanella/metabolism , Succinimides/metabolism

20.

One Size Does Not Fit All: The Limits of Structure-Based Models in Drug Discovery.

Ross, Gregory A; Morris, Garrett M; Biggin, Philip C.

J Chem Theory Comput ; 9(9): 4266-4274, 2013 Sep 10.

Article in English | MEDLINE | ID: mdl-24124403

ABSTRACT

A major goal in computational chemistry has been to discover the set of rules that can accurately predict the binding affinity of any protein-drug complex, using only a single snapshot of its three-dimensional structure. Despite the continual development of structure-based models, predictive accuracy remains low, and the fundamental factors that inhibit the inference of all-encompassing rules have yet to be fully explored. Using statistical learning theory and information theory, here we prove that even the very best generalized structure-based model is inherently limited in its accuracy, and protein-specific models are always likely to be better. Our results refute the prevailing assumption that large data sets and advanced machine learning techniques will yield accurate, universally applicable models. We anticipate that the results will aid the development of more robust virtual screening strategies and scoring function error estimations.
