ABSTRACT
Machine learning models support computer-aided molecular design and compound optimization. However, the initial phases of drug discovery often face a scarcity of training data for these models. Meta-learning has emerged as a potentially promising strategy, harnessing the wealth of structure-activity data available for known targets to facilitate efficient few-shot model training for the specific target of interest. In this study, we assessed the effectiveness of two different meta-learning methods, namely model-agnostic meta-learning (MAML) and adaptive deep kernel fitting (ADKF), specifically in the regression setting. We investigated how factors such as dataset size and the similarity of training tasks impact predictability. The results indicate that ADKF significantly outperformed both MAML and a single-task baseline model on the inhibition data. However, the performance of ADKF varied across different test tasks. Our findings suggest that considerable enhancements in performance can be anticipated primarily when the task of interest is similar to the tasks incorporated in the meta-learning process.
Subject(s)
Machine Learning , Structure-Activity Relationship , Humans , Drug Discovery
ABSTRACT
G protein-coupled receptors (GPCRs) are important pharmaceutical targets for the treatment of a broad spectrum of diseases. Although there are structures of GPCRs in their active conformation with bound ligands and G proteins, the detailed molecular interplay between the receptors and their signaling partners remains challenging to decipher. To address this, we developed a high-sensitivity, high-throughput matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) method to interrogate the first stage of signal transduction. GPCR-G protein complex formation is detected as a proxy for the effect of ligands on GPCR conformation and on coupling selectivity. Over 70 ligand-GPCR-partner protein combinations were studied using as little as 1.25 pmol protein per sample. We determined the selectivity profile and binding affinities of three GPCRs (rhodopsin, beta-1 adrenergic receptor [β1AR], and angiotensin II type 1 receptor) to engineered Gα-proteins (mGs, mGo, mGi, and mGq) and nanobody 80 (Nb80). We found that GPCRs in the absence of ligand can bind mGo, and that the role of the G protein C terminus in GPCR recognition is receptor-specific. We exemplified our quantification method using β1AR and demonstrated the allosteric effect of Nb80 binding in assisting displacement of nadolol to isoprenaline. We also quantified complex formation with wild-type heterotrimeric Gαiβγ and β-arrestin-1 and showed that carvedilol induces an increase in coupling of β-arrestin-1 and Gαiβγ to β1AR. A normalization strategy allows us to quantitatively measure the binding affinities of GPCRs to partner proteins. We anticipate that this methodology will find broad use in screening and characterization of GPCR-targeting drugs.
Subject(s)
GTP-Binding Proteins/metabolism , Receptors, Opioid/metabolism , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Animals , Arrestin/genetics , Arrestin/metabolism , GTP-Binding Proteins/genetics , Gene Expression Regulation , HEK293 Cells , Humans , Ligands , Mice , Models, Molecular , Protein Binding , Protein Conformation , Receptors, Opioid/chemistry , Single-Chain Antibodies , Turkeys , beta-Arrestin 1/genetics , beta-Arrestin 1/metabolism
ABSTRACT
The endocannabinoid system (ECS) is a critical regulatory network composed of endogenous cannabinoids (eCBs), their synthesizing and degrading enzymes, and associated receptors. It is integral to maintaining homeostasis and orchestrating key functions within the central nervous and immune systems. Given its therapeutic significance, we have launched a series of drug discovery endeavors aimed at ECS targets, including peroxisome proliferator-activated receptors (PPARs), cannabinoid receptors types 1 (CB1R) and 2 (CB2R), and monoacylglycerol lipase (MAGL), addressing a wide array of medical needs. The pursuit of new therapeutic agents has been enhanced by the creation of specialized labeled chemical probes, which aid in target localization, mechanistic studies, assay development, and the establishment of biomarkers for target engagement. By fusing medicinal chemistry with chemical biology in a comprehensive, translational end-to-end drug discovery strategy, we have expedited the development of novel therapeutics. Additionally, this strategy promises to foster highly productive partnerships between industry and academia, as will be illustrated through various examples.
Subject(s)
Chemistry, Pharmaceutical , Drug Discovery , Endocannabinoids , Endocannabinoids/metabolism , Endocannabinoids/chemistry , Humans , Drug Industry , Monoacylglycerol Lipases/metabolism , Monoacylglycerol Lipases/antagonists & inhibitors , Drug Development , Academia
ABSTRACT
Chemical language models (CLMs) can be employed to design molecules with desired properties. CLMs generate new chemical structures in the form of textual representations, such as the simplified molecular input line entry system (SMILES) strings. However, the quality of these de novo generated molecules is difficult to assess a priori. In this study, we apply the perplexity metric to determine the degree to which the molecules generated by a CLM match the desired design objectives. This model-intrinsic score allows identifying and ranking the most promising molecular designs based on the probabilities learned by the CLM. Using perplexity to compare "greedy" (beam search) with "explorative" (multinomial sampling) methods for SMILES generation, certain advantages of multinomial sampling become apparent. Additionally, perplexity scoring is performed to identify undesired model biases introduced during model training and allows the development of a new ranking system to remove those undesired biases.
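As a rough illustration of the perplexity score mentioned above, the sketch below uses the standard definition (the exponential of the mean negative log-likelihood per token) and ranks a few SMILES strings by it. The per-token probabilities are made-up placeholders, not model output, and the exact scoring used in the study may differ from this minimal version.

```python
import math


def perplexity(token_probs):
    """Perplexity of one sequence from its per-token probabilities."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Mean negative log-likelihood per token, then exponentiate.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# Hypothetical per-token probabilities (one value per SMILES character);
# in practice these would be read from the CLM's output distribution.
designs = {
    "CCO": [0.9, 0.8, 0.9],
    "c1ccccc1": [0.5, 0.6, 0.4, 0.5, 0.6, 0.5, 0.4, 0.5],
    "CC(=O)N": [0.7, 0.7, 0.3, 0.6, 0.5, 0.6, 0.7],
}

# Lower perplexity means the model assigns the string a higher average likelihood.
ranked = sorted(designs, key=lambda smiles: perplexity(designs[smiles]))
for smiles in ranked:
    print(f"{smiles:10s} perplexity = {perplexity(designs[smiles]):.3f}")
```

Because the score is normalized by sequence length, strings of different lengths can be compared on the same scale.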
Subject(s)
Language , Models, Chemical , Probability
ABSTRACT
Many molecular design tasks benefit from fast and accurate calculations of quantum-mechanical (QM) properties. However, the computational cost of QM methods applied to drug-like molecules currently renders large-scale applications of quantum chemistry challenging. Aiming to mitigate this problem, we developed DelFTa, an open-source toolbox for the prediction of electronic properties of drug-like molecules at the density functional (DFT) level of theory, using Δ-machine-learning. Δ-Learning corrects the prediction error (Δ) of a fast but inaccurate property calculation. DelFTa employs state-of-the-art three-dimensional message-passing neural networks trained on a large dataset of QM properties. It provides access to a wide array of quantum observables on the molecular, atomic and bond levels by predicting approximations to DFT values from a low-cost semiempirical baseline. Δ-Learning outperformed its direct-learning counterpart for most of the considered QM endpoints. The results suggest that predictions for non-covalent intra- and intermolecular interactions can be extrapolated to larger biomolecular systems. The software is fully open-sourced and features documented command-line and Python APIs.
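The Δ-learning idea described above can be sketched in a few lines: instead of learning the target property directly, a model is fitted to the difference between an accurate reference value and a cheap baseline estimate, and the learned correction is added back at prediction time. The example below is a toy under stated assumptions: the data are synthetic, a one-dimensional least-squares line stands in for the neural network, and the names `fit_line` and `delta_predict` are hypothetical and not part of the DelFTa API.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic stand-ins: `baseline` plays the role of the fast semiempirical
# value, `reference` the role of the expensive DFT value.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
baseline = [2.0 * x for x in xs]
reference = [2.5 * x + 1.0 for x in xs]

# Delta-learning: fit the model to the residual, not to the reference itself.
residuals = [r - b for r, b in zip(reference, baseline)]
slope, intercept = fit_line(xs, residuals)


def delta_predict(x, baseline_value):
    """Baseline estimate plus the learned correction."""
    return baseline_value + slope * x + intercept


print(delta_predict(5.0, 2.0 * 5.0))
```

The residual is usually a smoother, smaller-magnitude target than the property itself, which is one common motivation for this setup.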
Subject(s)
Chemistry, Pharmaceutical , Quantum Theory , Machine Learning , Neural Networks, Computer , Software
ABSTRACT
COVID-19 has resulted in huge numbers of infections and deaths worldwide and brought the most severe disruptions to societies and economies since the Great Depression. Massive experimental and computational research effort to understand and characterize the disease and rapidly develop diagnostics, vaccines, and drugs has emerged in response to this devastating pandemic and more than 130 000 COVID-19-related research papers have been published in peer-reviewed journals or deposited in preprint servers. Much of the research effort has focused on the discovery of novel drug candidates or repurposing of existing drugs against COVID-19, and many such projects have been either exclusively computational or computer-aided experimental studies. Herein, we provide an expert overview of the key computational methods and their applications for the discovery of COVID-19 small-molecule therapeutics that have been reported in the research literature. We further outline that, after the first year of the COVID-19 pandemic, it appears that drug repurposing has not produced rapid and global solutions. However, several known drugs have been used in the clinic to cure COVID-19 patients, and a few repurposed drugs continue to be considered in clinical trials, along with several novel clinical candidates. We posit that truly impactful computational tools must deliver actionable, experimentally testable hypotheses enabling the discovery of novel drugs and drug combinations, and that open science and rapid sharing of research results are critical to accelerate the development of novel, much needed therapeutics for COVID-19.
Subject(s)
COVID-19 Drug Treatment , Computer Simulation , Drug Design , Drug Discovery/methods , Drug Repositioning , Antiviral Agents/therapeutic use , COVID-19/virology , Clinical Trials as Topic , Humans , Pandemics , SARS-CoV-2/drug effects
ABSTRACT
The computer-assisted design of new chemical entities has made a leap forward with the development of machine learning models for automated molecule generation. The overarching goal of this conceptual approach is to augment the creativity of medicinal chemists with a machine intelligence. In this Perspective we highlight prospective applications of "de novo" drug design and target prediction, aiming to generate natural product-inspired bioactive compounds from scratch. A virtual chemist transforms pharmacologically active natural products into new, easily synthesizable small molecules with desired properties and activity. Computational activity prediction and automated compound generation offer the possibility to systematically transfer the wealth of pharmaceutically active natural products to synthetic small molecule drug discovery. We present selected prospective examples and dare a forecast into the future of natural product-inspired drug discovery.
ABSTRACT
Fast and efficient handling of ligands and biological targets is required in bioaffinity screening based on native electrospray ionization mass spectrometry (ESI-MS). We use a prototype microfluidic autosampler, called the "gap sampler", to sequentially mix and electrospray individual small molecule ligands together with a target protein and compare the screening results with data from thermal shift assay and surface plasmon resonance. In a first round, all three techniques were used for a screening of 110 ligands against bovine carbonic anhydrase II, which resulted in five mutual hits and some false positives with ESI-MS presumably due to the high ligand concentration or interferences from dimethyl sulfoxide. In a second round, 33 compounds were screened in lower concentrations and in a less complex matrix, resulting in only true positives with ESI-MS. Within a cycle time of 30 s, dissociation constants were determined within an order of magnitude accuracy consuming only 5 pmol of ligand and less than 15 pmol of protein per screened compound. In a third round, dissociation constants of five compounds were accurately determined in a titration experiment. Thus, the gap sampler can rapidly and efficiently be used for high-throughput screening.
Subject(s)
Research , Spectrometry, Mass, Electrospray Ionization , Animals , Cattle
ABSTRACT
Graph neural networks are able to solve certain drug discovery tasks such as molecular property prediction and de novo molecule generation. However, these models are considered "black-box" and "hard-to-debug". This study aimed to improve modeling transparency for rational molecular design by applying the integrated gradients explainable artificial intelligence (XAI) approach for graph neural network models. Models were trained for predicting plasma protein binding, hERG channel inhibition, passive permeability, and cytochrome P450 inhibition. The proposed methodology highlighted molecular features and structural elements that are in agreement with known pharmacophore motifs, correctly identified property cliffs, and provided insights into unspecific ligand-target interactions. The developed XAI approach is fully open-sourced and can be used by practitioners to train new models on other clinically relevant endpoints.
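To make the integrated gradients idea more concrete, the following is a minimal sketch on a toy logistic model rather than a graph neural network. Each feature's attribution is its distance from a baseline input multiplied by the model gradient averaged along the straight path from baseline to input, here approximated with a midpoint Riemann sum. The weights, bias, inputs, and all-zeros baseline are arbitrary assumptions chosen for illustration only.

```python
import math

# Hypothetical model parameters; a trained network would take their place.
WEIGHTS = [1.5, -2.0, 0.5]
BIAS = 0.1


def model(x):
    """Toy logistic model standing in for a trained property predictor."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytical gradient of the logistic model with respect to its inputs."""
    y = model(x)
    return [y * (1.0 - y) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Midpoint-rule approximation of the integrated gradients path integral."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            avg_grad[i] += g / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]
attr = integrated_gradients(x, baseline)
print(attr, sum(attr), model(x) - model(baseline))
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline.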
Subject(s)
Artificial Intelligence , Neural Networks, Computer , Drug Discovery , Ligands
ABSTRACT
African and American trypanosomiases are estimated to affect several million people across the world, with effective treatments distinctly lacking. New, ideally oral, treatments with higher efficacy against these diseases are desperately needed. Peroxisomal import matrix (PEX) proteins represent a very interesting target for structure- and ligand-based drug design. The PEX5-PEX14 protein-protein interface in particular has been highlighted as a target, with inhibitors shown to disrupt essential cell processes in trypanosomes, leading to cell death. In this work, we present a drug development campaign that utilizes the synergy between structural biology, computer-aided drug design, and medicinal chemistry in the quest to discover and develop new potential compounds to treat trypanosomiasis by targeting the PEX14-PEX5 interaction. Using the structure of the known lead compounds discovered by Dawidowski et al. as the template for a chemically advanced template search (CATS) algorithm, we performed scaffold-hopping to obtain a new class of compounds with trypanocidal activity, based on 2,3,4,5-tetrahydrobenzo[f][1,4]oxazepines chemistry. The initial compounds obtained were taken forward to a first round of hit-to-lead optimization by synthesis of derivatives, which show activities in the range of low- to high-digit micromolar IC50 in the in vitro tests. The NMR measurements confirm binding to PEX14 in solution, while immunofluorescent microscopy indicates disruption of protein import into the glycosomes, indicating that the PEX14-PEX5 protein-protein interface was successfully disrupted. These studies result in development of a novel scaffold for future lead optimization, while ADME testing gives an indication of further areas of improvement in the path from lead molecules toward a new drug active against trypanosomes.
Subject(s)
Oxazepines , Trypanocidal Agents , Computer-Aided Design , Membrane Proteins/metabolism , Peroxisome-Targeting Signal 1 Receptor , Receptors, Cytoplasmic and Nuclear , Repressor Proteins/metabolism , Trypanocidal Agents/pharmacology
ABSTRACT
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Subject(s)
Artificial Intelligence , Drug Discovery/methods , Algorithms , Bayes Theorem , Drug Design , Machine Learning , Neural Networks, Computer , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology
ABSTRACT
Chemical language models enable de novo drug design without the requirement for explicit molecular construction rules. While such models have been applied to generate novel compounds with desired bioactivity, the actual prioritization and selection of the most promising computational designs remains challenging. Herein, we leveraged the probabilities learnt by chemical language models with the beam search algorithm as a model-intrinsic technique for automated molecule design and scoring. Prospective application of this method yielded novel inverse agonists of retinoic acid receptor-related orphan receptors (RORs). Each design was synthesizable in three reaction steps and presented low-micromolar to nanomolar potency towards RORγ. This model-intrinsic sampling technique eliminates the strict need for external compound scoring functions, thereby further extending the applicability of generative artificial intelligence to data-driven drug discovery.
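The beam search procedure referred to above can be sketched as follows. At each step, only the `width` most probable partial strings are kept, and every completed string carries its cumulative log-probability, which doubles as a model-intrinsic score. The function `next_token_probs` is a hypothetical stand-in for a trained chemical language model, with a hand-written four-token vocabulary in which "$" marks the end of a string; it is not the model from the study.

```python
import math


def next_token_probs(prefix):
    """Hypothetical next-token distribution; "$" terminates the string."""
    if not prefix:
        return {"C": 0.6, "N": 0.4}
    if prefix[-1] == "C":
        return {"C": 0.3, "O": 0.3, "$": 0.4}
    if prefix[-1] == "N":
        return {"C": 0.7, "$": 0.3}
    return {"$": 1.0}  # after "O" the toy model always stops


def beam_search(width, max_len=6):
    """Return finished strings with log-probabilities, best first."""
    beams = [("", 0.0)]  # (partial string, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            for token, p in next_token_probs(prefix).items():
                score = logp + math.log(p)
                if token == "$":
                    finished.append((prefix, score))
                else:
                    candidates.append((prefix + token, score))
        # Keep only the `width` most probable partial strings.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
        if not beams:
            break
    return sorted(finished, key=lambda c: c[1], reverse=True)


results = beam_search(width=2)
for smiles, logp in results[:3]:
    print(f"{smiles:6s} p = {math.exp(logp):.3f}")
```

Partial strings still open after `max_len` steps are simply discarded in this sketch; a practical implementation would need a policy for them, and would also check that each finished string is a valid molecule.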
Subject(s)
Automation , Biological Products/pharmacology , Drug Design , Receptors, Retinoic Acid/agonists , Algorithms , Biological Products/chemical synthesis , Biological Products/chemistry , Humans , Ligands , Molecular Structure
ABSTRACT
Naturally occurring membranolytic antimicrobial peptides (AMPs) are rarely cell-type selective and highly potent at the same time. Template-based peptide design can be used to generate AMPs with improved properties de novo. Following this approach, 18 linear peptides were obtained by computationally morphing the natural AMP Aurein 2.2d2 GLFDIVKKVVGALG into the synthetic model AMP KLLKLLKKLLKLLK. Eleven of the 18 chimeric designs inhibited the growth of Staphylococcus aureus, and six peptides were tested and found to be active against one resistant pathogenic strain or more. One of the peptides was broadly active against bacterial and fungal pathogens without exhibiting toxicity to certain human cell lines. Solution nuclear magnetic resonance and molecular dynamics simulation suggested an oblique-oriented membrane insertion mechanism of this helical de novo peptide. Temperature-resolved circular dichroism spectroscopy pointed to conformational flexibility as an essential feature of cell-type selective AMPs.
Subject(s)
Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/pharmacology , Antimicrobial Cationic Peptides/chemistry , Antimicrobial Cationic Peptides/pharmacology , Staphylococcus aureus/drug effects , Amino Acid Sequence , Drug Design , HEK293 Cells , Humans , Molecular Dynamics Simulation , Protein Conformation, alpha-Helical , Staphylococcal Infections/drug therapy , Staphylococcal Infections/microbiology , Staphylococcus aureus/growth & development
ABSTRACT
Several cationic amphiphilic drugs (CADs) have been found to inhibit cell entry of filoviruses and other enveloped viruses. Structurally unrelated CADs may have antiviral activity, yet the underlying common mechanism and structure-activity relationship are incompletely understood. We aimed to understand how widespread antiviral activity is among CADs and which structural and physico-chemical properties are linked to entry inhibition. We measured inhibition of Marburg virus pseudoparticle (MARVpp) cell entry by 45 heterogeneous and mostly FDA-approved CADs and cytotoxicity in EA.hy926 cells. We analyzed correlation of antiviral activity with four chemical properties: pKa, hydrophobicity (octanol/water partitioning coefficient; ClogP), molecular weight, and distance between the basic group and hydrophobic ring structures. Additionally, we quantified drug-induced phospholipidosis (DIPL) of a CAD subset by flow cytometry. Structurally similar compounds (derivatives) and those with similar chemical properties but unrelated structures (analogues) to those of strong inhibitors were obtained by two in silico similarity search approaches and tested for antiviral activity. Overall, 11 out of 45 (24%) CADs inhibited MARVpp by 40% or more. The strongest antiviral compounds were dronedarone, triparanol, and quinacrine. Structure-activity relationship studies revealed highly significant correlations between antiviral activity, hydrophobicity (ClogP > 4), and DIPL. Moreover, pKa and intramolecular distance between hydrophobic and hydrophilic moieties correlated with antiviral activity but to a lesser extent. We also showed that in contrast to analogues, derivatives had antiviral activity similar to that of the seed compound dronedarone. Overall, one-quarter of CADs inhibit MARVpp entry in vitro, and antiviral activity of CADs mostly relies on their hydrophobicity yet is promoted by the individual structure.
Subject(s)
Filoviridae , Marburgvirus , Pharmaceutical Preparations , Antiviral Agents/pharmacology , Virus Internalization
ABSTRACT
Deep convolutional neural networks (CNNs) are a method of choice for image recognition. Herein a hybrid CNN approach is presented for molecular pattern recognition in drug discovery. Using self-organizing map images of molecular pharmacophores as input, CNN models were trained to identify chemokine receptor CXCR4 modulators with high accuracy. This machine learning classifier identified first-in-class synthetic CXCR4 full agonists. The receptor-activating effects were confirmed by intracellular cAMP response and in a phenotypic spheroid invasion assay of medulloblastoma cell invasion. Additional macromolecular targets of the small molecules were predicted in silico and tested in vitro, revealing modulatory effects on dopamine receptors and CCR1. These results positively advocate the applicability of molecular image recognition by CNNs to ligand-based virtual compound screening, and demonstrate the complementarity of machine intelligence and human expert knowledge.
Subject(s)
Cell Movement , Deep Learning , Receptors, CXCR4/agonists , Receptors, CXCR4/antagonists & inhibitors , Cell Line, Tumor , Drug Design , Humans
ABSTRACT
Recurrent neural networks (RNNs) are able to generate de novo molecular designs using simplified molecular input line entry systems (SMILES) string representations of the chemical structure. RNN-based structure generation is usually performed unidirectionally, by growing SMILES strings from left to right. However, there is no natural start or end of a small molecule, and SMILES strings are intrinsically nonunivocal representations of molecular graphs. These properties motivate bidirectional structure generation. Here, bidirectional generative RNNs for SMILES-based molecule design are introduced. To this end, two established bidirectional methods were implemented, and a new method for SMILES string generation and data augmentation is introduced: the bidirectional molecule design by alternate learning (BIMODAL). These three bidirectional strategies were compared to the unidirectional forward RNN approach for SMILES string generation, in terms of the (i) novelty, (ii) scaffold diversity, and (iii) chemical-biological relevance of the computer-generated molecules. The results positively advocate bidirectional strategies for SMILES-based molecular de novo design, with BIMODAL showing superior results to the unidirectional forward RNN for most of the criteria in the tested conditions. The code of the methods and the pretrained models can be found at https://github.com/ETHmodlab/BIMODAL.
Subject(s)
Neural Networks, Computer
ABSTRACT
Drug discovery benefits from computational models aiding the identification of new chemical matter with bespoke properties. The field of de novo drug design has been particularly revitalized by adaptation of generative machine learning models from the field of natural language processing. These deep neural network models are trained on recognizing molecular structures and generate new molecular entities without relying on pre-determined sets of molecular building blocks and chemical transformations for virtual molecule construction. Implicit representation of chemical knowledge provides an alternative to formulating the molecular design task in terms of the established, explicit chemical vocabulary. Here, we review de novo molecular design approaches from the field of 'artificial intelligence', focusing on instances of deep generative models, and highlight the prospective application of long short-term memory models to hit and lead finding in medicinal chemistry.
Subject(s)
Memory, Short-Term , Drug Design , Machine Learning , Neural Networks, Computer , Prospective Studies
ABSTRACT
Medicinal chemistry and, in particular, drug design have often been perceived as more of an art than a science. The many unknowns of human disease and the sheer complexity of chemical space render decision making in medicinal chemistry exceptionally demanding. Computational models can assist the medicinal chemist in this endeavour. Provided here is an overview of recent examples of automated de novo molecular design, a discussion of the concepts and computational approaches involved, and the daring prediction of some of the possibilities and limitations of drug design using machine intelligence.
Subject(s)
Automation , Drug Design , Artificial Intelligence , Chemistry, Pharmaceutical , Humans
ABSTRACT
Short linear peptides can overcome certain limitations of small molecules for targeting protein-protein interactions (PPIs). Herein, the interaction between the human chemokine CCL19 with chemokine receptor CCR7 was investigated to obtain receptor-derived CCL19-binding peptides. After identifying a linear binding site of CCR7, five hexapeptides binding to CCL19 in the low micromolar to nanomolar range were designed, guided by pharmacophore and lipophilicity screening of computationally generated peptide libraries. The results corroborate the applicability of the computational approach and the chosen selection criteria to obtain short linear peptides mimicking a protein-protein interaction site.
Subject(s)
Chemokine CCL19/metabolism , Peptide Fragments/metabolism , Protein Interaction Domains and Motifs , Receptors, CCR7/metabolism , Binding Sites , Computer Simulation , Humans , Ligands , Peptide Library , Protein Binding , Signal Transduction
ABSTRACT
A computational technique based on a simulated molecular evolution protocol was employed for anticancer peptide (ACP) design. Starting from known ACPs, innovative bioactive peptides were automatically generated in computer-assisted design-synthesize-test cycles. This design algorithm offers a viable strategy for the generation of novel peptide sequences, without requiring a priori structure-activity knowledge. Sequence morphing and activity improvement were achieved through iterative amino acid variation and selection. Results show that not only the interaction of ACPs with the target membrane is important for their anticancer activity, but also the degree of peptide dimerization, which was corroborated by temperature profiling and electrospray mass spectrometry.