Results 1 - 20 of 51
1.
Bioorg Med Chem Lett ; 69: 128786, 2022 08 01.
Article in English | MEDLINE | ID: mdl-35569689

ABSTRACT

Contrary to expectation, N-aryl pyrrolidinones (and isosteric imidazolinones and oxazolinones) are more lipophilic and less soluble than the corresponding piperidinones (tetrahydropyrimidinones and oxazinones). Exploration of the basis for these results uncovered a subtle interplay of steric and electronic effects that result in different conformations for the two classes of compounds, which drive the observed effects.


Subject(s)
Pyrrolidinones , Molecular Conformation
2.
J Chem Inf Model ; 62(2): 340-349, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35018781

ABSTRACT

The conformational behavior of a small molecule free in solution is important to understand the free energy of binding to its target. This could be of special interest for proteolysis-targeting chimeras (PROTACs) due to their often flexible and lengthy linkers and the need to induce a ternary complex. Here, we report on the molecular dynamics (MD) simulations of two PROTACs, MZ1 and dBET6, revealing different linker conformational behaviors. The simulation of MZ1 in dimethyl sulfoxide (DMSO) agrees well with the nuclear magnetic resonance study, providing strong support for the relevance of our simulations. To further understand the role of linker plasticity in the formation of a ternary complex, the dissociation of the complex von Hippel-Lindau-MZ1-BRD4 is investigated in detail by steered simulations and is shown to follow a two-step pathway. Interestingly, in water both MZ1 and dBET6 display a tendency toward an intramolecular lipophilic interaction between the two warheads. The hydrophobic contact of the two warheads would prevent them from binding to their respective proteins and might have an effect on the efficacy of induced cellular protein degradation. However, conformations featuring this hydrophobic contact of the two warheads are calculated to be marginally more favorable.


Subject(s)
Nuclear Proteins , Ubiquitin-Protein Ligases , Nuclear Proteins/metabolism , Proteolysis , Transcription Factors/metabolism , Ubiquitin-Protein Ligases/chemistry , Ubiquitin-Protein Ligases/metabolism
3.
J Chem Inf Model ; 62(12): 2999-3007, 2022 06 27.
Article in English | MEDLINE | ID: mdl-35699524

ABSTRACT

Peptides are an important modality in drug discovery. While current peptide optimization focuses predominantly on the small number of natural and commercially available non-natural amino acids, the chemical spaces available for small molecule drug discovery are in the billions of molecules. In the present study, we describe the development of a large virtual library of readily synthesizable non-natural amino acids that can power the virtual screening protocols and aid in peptide optimization. To that end, we enumerated nearly 380 thousand amino acids and demonstrated their vast chemical diversity compared to the 20 natural and commercial residues. Furthermore, we selected a diverse ten thousand amino acid subset to validate our virtual screening workflow on the Keap1-Neh2 complex model system. Through in silico mutations of Neh2 peptide residues to those from the virtual library, our docking-based protocol identified a number of possible solutions with a significantly higher predicted affinity toward the Keap1 protein. This protocol demonstrates that the non-natural amino acid chemical space can be massively extended and virtually screened with a reasonable computational cost.


Subject(s)
Amino Acids , NF-E2-Related Factor 2 , Amino Acids/chemistry , Drug Discovery/methods , Kelch-Like ECH-Associated Protein 1 , Molecular Docking Simulation , Peptides/chemistry
4.
J Biol Chem ; 295(33): 11754-11763, 2020 08 14.
Article in English | MEDLINE | ID: mdl-32587091

ABSTRACT

The transcription factor NF-κB is a master regulator of the innate immune response and plays a central role in inflammatory diseases by mediating the expression of pro-inflammatory cytokines. Ubiquitination-triggered proteasomal degradation of DNA-bound NF-κB strongly limits the expression of its target genes. Conversely, USP7 (deubiquitinase ubiquitin-specific peptidase 7) opposes the activities of E3 ligases, stabilizes DNA-bound NF-κB, and thereby promotes NF-κB-mediated transcription. Using gene expression and synthetic peptide arrays on membrane support and overlay analyses, we found here that inhibiting USP7 increases NF-κB ubiquitination and degradation, prevents Toll-like receptor-induced pro-inflammatory cytokine expression, and represents an effective strategy for controlling inflammation. However, the broad regulatory roles of USP7 in cell death pathways, chromatin, and DNA damage responses limit the use of catalytic inhibitors of USP7 as anti-inflammatory agents. To this end, we identified an NF-κB-binding site in USP7, ubiquitin-like domain 2, that selectively mediates interactions of USP7 with NF-κB subunits but is dispensable for interactions with other proteins. Moreover, we found that the amino acids 757LDEL760 in USP7 critically contribute to the interaction with the p65 subunit of NF-κB. Our findings support the notion that USP7 activity could be potentially targeted in a substrate-selective manner through the development of noncatalytic inhibitors of this deubiquitinase to abrogate NF-κB activity.


Subject(s)
Transcription Factor RelA/metabolism , Ubiquitin-Specific Peptidase 7/metabolism , Ubiquitination , Animals , Cells, Cultured , Female , HEK293 Cells , Humans , Male , Mice, Inbred C57BL , Models, Molecular , Protein Domains , Protein Interaction Domains and Motifs , Proteolysis , Ubiquitin-Specific Peptidase 7/chemistry
5.
Nat Chem Biol ; 15(4): 348-357, 2019 04.
Article in English | MEDLINE | ID: mdl-30718815

ABSTRACT

We have discovered a class of PI3Kγ inhibitors exhibiting over 1,000-fold selectivity over PI3Kα and PI3Kβ. On the basis of X-ray crystallography, hydrogen-deuterium exchange-mass spectrometry and surface plasmon resonance experiments, we propose that the cyclopropylethyl moiety displaces the DFG motif of the enzyme away from the adenosine triphosphate binding site, inducing a large conformational change in both the kinase and helical domains of PI3Kγ. Site-directed mutagenesis explained how the conformational changes occur. Our results suggest that these cyclopropylethyl-substituted compounds selectively inhibit the active state of PI3Kγ, which is unique to these compounds and to the PI3Kγ isoform, explaining their excellent potency and unmatched isoform selectivity that were confirmed in cellular systems. This is the first example of a Class I PI3K inhibitor achieving its selectivity by affecting the DFG motif in a manner that bears similarity to DFG in/out for type II protein kinase inhibitors.


Subject(s)
Class Ib Phosphatidylinositol 3-Kinase/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Phosphoinositide-3 Kinase Inhibitors , Adenosine Triphosphatases , Binding Sites , Class I Phosphatidylinositol 3-Kinases/antagonists & inhibitors , Class I Phosphatidylinositol 3-Kinases/metabolism , Crystallography, X-Ray , Humans , Models, Molecular , Mutagenesis, Site-Directed , Phthalimides , Protein Binding , Protein Conformation , Protein Isoforms/physiology , Protein Kinase Inhibitors , Substrate Specificity
6.
J Chem Inf Model ; 61(7): 3667-3680, 2021 07 26.
Article in English | MEDLINE | ID: mdl-34156843

ABSTRACT

The glucocorticoid receptor (GR) is a nuclear receptor that controls critical biological processes by regulating the transcription of specific genes. There is a known allosteric cross-talk between the ligand and coregulator binding sites within the GR ligand-binding domain that is crucial for the control of the functional response. However, the molecular mechanisms underlying such an allosteric control remain elusive. Here, molecular dynamics (MD) simulations, bioinformatic analysis, and biophysical measurements are integrated to capture the structural and dynamic features of the allosteric cross-talk within the GR. We identified a network of evolutionarily conserved residues that enables the allosteric signal transduction, in agreement with experimental data. MD simulations clarify how such a network is dynamically interconnected and offer a mechanistic explanation of how different peptides affect the intensity of the allosteric signal. This study provides useful insights to elucidate the GR allosteric regulation, ultimately providing a foundation for designing novel drugs.


Subject(s)
Peptides , Receptors, Glucocorticoid , Allosteric Regulation , Allosteric Site , Binding Sites , Humans , Ligands , Protein Binding , Receptors, Glucocorticoid/metabolism
7.
Bioorg Med Chem Lett ; 30(4): 126953, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31932225

ABSTRACT

GPR81 is a novel drug target that is implicated in the control of glucose and lipid metabolism. The lack of potent GPR81 modulators suitable for in vivo studies has limited the pharmacological characterization of this lactate-sensing receptor. We performed a high-throughput screen (HTS) and identified a GPR81 agonist chemical series containing a central acyl urea scaffold linker. During SAR exploration, two additional new series were evolved, one containing cyclic acyl urea bioisosteres and another a central amide bond. These three series provide different selectivity and physicochemical properties suitable for in vivo studies.


Subject(s)
Receptors, G-Protein-Coupled/agonists , Urea/analogs & derivatives , Amides/chemistry , Amides/metabolism , High-Throughput Screening Assays , Humans , Molecular Conformation , Protein Binding , Receptors, G-Protein-Coupled/metabolism , Receptors, Ghrelin/agonists , Receptors, Ghrelin/metabolism , Structure-Activity Relationship , Urea/metabolism
8.
J Chem Inf Model ; 60(12): 5918-5922, 2020 12 28.
Article in English | MEDLINE | ID: mdl-33118816

ABSTRACT

In the past few years, we have witnessed a renaissance of the field of molecular de novo drug design. The advancements in deep learning and artificial intelligence (AI) have triggered an avalanche of ideas on how to translate such techniques to a variety of domains including the field of drug design. A range of architectures have been devised to find the optimal way of generating chemical compounds by using either graph- or string (SMILES)-based representations. With this application note, we aim to offer the community a production-ready tool for de novo design, called REINVENT. It can be effectively applied on drug discovery projects that are striving to resolve either exploration or exploitation problems while navigating the chemical space. It can facilitate the idea generation process by bringing to the researcher's attention the most promising compounds. REINVENT's code is publicly available at https://github.com/MolecularAI/Reinvent.


Subject(s)
Artificial Intelligence , Drug Design , Drug Discovery
9.
Nat Chem Biol ; 12(12): 1065-1074, 2016 12.
Article in English | MEDLINE | ID: mdl-27748751

ABSTRACT

Macrocycles are of increasing interest as chemical probes and drugs for intractable targets like protein-protein interactions, but the determinants of their cell permeability and oral absorption are poorly understood. To enable rational design of cell-permeable macrocycles, we generated an extensive data set under consistent experimental conditions for more than 200 non-peptidic, de novo-designed macrocycles from the Broad Institute's diversity-oriented screening collection. This revealed how specific functional groups, substituents and molecular properties impact cell permeability. Analysis of energy-minimized structures for stereo- and regioisomeric sets provided fundamental insight into how dynamic, intramolecular interactions in the 3D conformations of macrocycles may be linked to physicochemical properties and permeability. Combined use of quantitative structure-permeability modeling and the procedure for conformational analysis now, for the first time, provides chemists with a rational approach to design cell-permeable non-peptidic macrocycles with potential for oral absorption.


Subject(s)
Macrocyclic Compounds/chemistry , Macrocyclic Compounds/pharmacokinetics , Caco-2 Cells , Humans , Molecular Structure , Permeability , Stereoisomerism , Structure-Activity Relationship
10.
Biophys J ; 112(6): 1147-1156, 2017 Mar 28.
Article in English | MEDLINE | ID: mdl-28355542

ABSTRACT

In this study, we performed an extensive exploration of the ligand entry mechanism for members of the steroid nuclear hormone receptor family (androgen receptor, estrogen receptor α, glucocorticoid receptor, mineralocorticoid receptor, and progesterone receptor) and their endogenous ligands. The exploration revealed a shared entry path through the helix 3, 7, and 11 regions. Examination of the x-ray structures of the receptor-ligand complexes further showed two distinct folds of the helix 6-7 region, classified as "open" and "closed", which could potentially affect ligand binding. To improve sampling of the helix 6-7 loop, we incorporated motion modes based on principal component analysis of existing crystal structures of the receptors and applied them to the protein-ligand sampling. A detailed comparison with the anisotropic network model (an elastic network model) highlights the importance of flexibility in the entrance region. While the binding (interaction) energy of individual simulations can be used to score different ligands, extensive sampling further allows us to predict absolute binding free energies and analyze reaction kinetics using Markov state models and Perron-cluster cluster analysis, respectively. The predicted relative binding free energies for three ligands binding to the progesterone receptor are in very good agreement with experimental results and the Perron-cluster cluster analysis highlighted the importance of a peripheral binding site. Our analysis revealed that the flexibility of the helix 3, 7, and 11 regions represents the most important factor for ligand binding. Furthermore, the hydrophobicity of the ligand influences the transition between the peripheral and the active binding site.


Subject(s)
Monte Carlo Method , Movement , Receptors, Cytoplasmic and Nuclear/metabolism , Kinetics , Ligands , Markov Chains , Models, Molecular , Protein Binding , Protein Conformation, alpha-Helical , Receptors, Cytoplasmic and Nuclear/chemistry , Thermodynamics , X-Rays
11.
Bioorg Med Chem Lett ; 27(3): 679-687, 2017 02 01.
Article in English | MEDLINE | ID: mdl-28017532

ABSTRACT

A novel class of potent PI3Kδ inhibitors with >1000-fold selectivity against other class I PI3K isoforms is described. Optimization of the substituents on a triazole aminopyrazine scaffold, emerging from an in-house PI3Kα program, turned moderately selective PI3Kδ compounds into highly potent and selective PI3Kδ inhibitors. These efforts resulted in a series of aminopyrazines with PI3Kδ IC50 ≤ 1 nM in the enzyme assay, some of the most selective PI3Kδ inhibitors published to date, with a cell potency in a JeKo-cell assay of 20-120 nM.


Subject(s)
Class I Phosphatidylinositol 3-Kinases/antagonists & inhibitors , Enzyme Inhibitors/chemistry , Enzyme Inhibitors/pharmacology , Pyrazines/chemistry , Pyrazines/pharmacology , Animals , Binding Sites , Cell Line , Class I Phosphatidylinositol 3-Kinases/metabolism , Crystallography, X-Ray , Enzyme Activation/drug effects , Enzyme Inhibitors/metabolism , Half-Life , Humans , Inhibitory Concentration 50 , Molecular Dynamics Simulation , Protein Binding , Protein Isoforms/antagonists & inhibitors , Protein Isoforms/metabolism , Protein Structure, Tertiary , Pyrazines/metabolism , Rats , Structure-Activity Relationship , Triazoles/chemistry , Triazoles/pharmacology
12.
J Chem Inf Model ; 56(4): 774-87, 2016 04 25.
Article in English | MEDLINE | ID: mdl-26974351

ABSTRACT

Computer-aided drug design plays an important role in medicinal chemistry to obtain insights into molecular mechanisms and to prioritize design strategies. Although significant improvement has been made in structure-based design, it still remains a key challenge to accurately model and predict induced fit mechanisms. Most of the currently available techniques either do not provide sufficient protein conformational sampling or are too computationally demanding to fit an industrial setting. The current study presents a systematic and exhaustive investigation of predicting binding modes for a range of systems using PELE (Protein Energy Landscape Exploration), an efficient and fast protein-ligand sampling algorithm. The systems analyzed (cytochrome P, kinase, protease, and nuclear hormone receptor) exhibit different complexities of ligand-induced fit mechanisms and protein dynamics. The results are compared with results from classical molecular dynamics simulations and (induced fit) docking. This study shows that ligand-induced side chain rearrangements and small to medium backbone movements are captured well in PELE. Large secondary structure rearrangements, however, remain challenging for all employed techniques. Relevant binding modes (ligand heavy atom RMSD < 1.0 Å) can be obtained by the PELE method within a few hours of simulation, positioning PELE as a tool applicable for rapid drug design cycles.


Subject(s)
Computer-Aided Design , Drug Design , Humans , Ligands , Molecular Docking Simulation , Molecular Dynamics Simulation , Protein Binding , Protein Conformation
14.
Drug Discov Today ; 29(4): 103945, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38460568

ABSTRACT

Design-Make-Test-Analyse (DMTA) is the discovery cycle through which molecules are designed, synthesised, and assayed to produce data that in turn are analysed to inform the next iteration. The process is repeated until viable drug candidates are identified, often requiring many cycles before reaching a sweet spot. The advent of artificial intelligence (AI) and cloud computing presents an opportunity to innovate drug discovery to reduce the number of cycles needed to yield a candidate. Here, we present the Predictive Insight Platform (PIP), a cloud-native modelling platform developed at AstraZeneca. The impact of PIP in each step of DMTA, as well as its architecture, integration, and usage, are discussed and used to provide insights into the future of drug discovery.


Subject(s)
Artificial Intelligence , Biological Assay , Drug Discovery
15.
J Med Chem ; 66(12): 7730-7755, 2023 06 22.
Article in English | MEDLINE | ID: mdl-37285219

ABSTRACT

It is axiomatic in medicinal chemistry that optimization of the potency of a small molecule at a macromolecular target requires complementarity between the ligand and target. In order to minimize the conformational penalty on binding, both enthalpically and entropically, it is therefore preferred to have the ligand preorganized in the bound conformation. In this Perspective, we highlight the role of allylic strain in controlling conformational preferences. Allylic strain was originally described for carbon-based allylic systems, but the same principles apply to other types of structure with sp2 or pseudo-sp2 arrangements. These systems include benzylic (including heteroaryl methyl) positions, amides, N-aryl groups, aryl ethers, and nucleotides. We have derived torsion profiles from small molecule X-ray structures for these systems. Through multiple examples, we show how these effects have been applied in drug discovery and how they can be used prospectively to influence conformation in the design process.


Subject(s)
Chemistry, Pharmaceutical , Drug Discovery , Ligands , Molecular Conformation , Amides/chemistry
16.
J Cheminform ; 15(1): 75, 2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37649050

ABSTRACT

Siamese networks, representing a novel class of neural networks, consist of two identical subnetworks sharing weights but receiving different inputs. Here we present a similarity-based pairing method for generating compound pairs to train Siamese neural networks for regression tasks. In comparison with the conventional exhaustive pairing, it reduces the algorithm complexity from O(n²) to O(n). It also consistently results in better prediction performance on the three physicochemical datasets, using a multilayer perceptron with the circular fingerprint as a proof of concept. We further incorporate into a Siamese neural network the transformer-based Chemformer, which extracts task-specific features from the simplified molecular-input line-entry system representation of compounds. Additionally, we propose a means to measure the prediction uncertainty by utilizing the variance in predictions from a set of reference compounds. Our results demonstrate that high prediction accuracy correlates with high confidence. Finally, we investigate implications of the similarity property principle in machine learning.
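
Illustrative note on the pairing complexity mentioned in the abstract above. The abstract does not describe the pairing procedure, so the sketch below is an assumption: each compound is paired with its single most similar neighbour, which yields n training pairs, whereas exhaustive pairing yields n(n-1)/2. Fingerprints are represented as plain sets of "on" bit indices, Tanimoto similarity is used as the similarity measure, and all names and data are hypothetical. The naive neighbour search here still compares every compound with every other one; only the number of pairs passed on to training grows linearly.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0

def exhaustive_pairs(n):
    """All unordered pairs of n compounds: n * (n - 1) / 2 pairs."""
    return list(combinations(range(n), 2))

def similarity_pairs(fingerprints):
    """Pair each compound with its most similar other compound: n pairs."""
    pairs = []
    for i, fp in enumerate(fingerprints):
        candidates = [j for j in range(len(fingerprints)) if j != i]
        nearest = max(candidates, key=lambda j: tanimoto(fp, fingerprints[j]))
        pairs.append((i, nearest))
    return pairs

# Four hypothetical fingerprints forming two clusters of similar compounds.
fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8}, {7, 8, 9}]
all_pairs = exhaustive_pairs(len(fingerprints))
nearest_pairs = similarity_pairs(fingerprints)
```

With these four fingerprints, exhaustive pairing produces six pairs and the nearest-neighbour scheme produces four, one per compound.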

17.
Chem Sci ; 14(25): 7057-7067, 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37389247

ABSTRACT

Understanding allosteric regulation in biomolecules is of great interest to pharmaceutical research, and computational methods have emerged during the last decades to characterize allosteric coupling. However, the prediction of allosteric sites in a protein structure remains a challenging task. Here, we integrate local binding site information, coevolutionary information, and information on dynamic allostery into a structure-based three-parameter model to identify potentially hidden allosteric sites in ensembles of protein structures with orthosteric ligands. When tested on five allosteric proteins (LFA-1, p38-α, GR, MAT2A, and BCKDK), the model successfully ranked all known allosteric pockets in the top three positions. Finally, we identified a novel druggable site in MAT2A confirmed by X-ray crystallography and SPR and a hitherto unknown druggable allosteric site in BCKDK validated by biochemical and X-ray crystallography analyses. Our model can be applied in drug discovery to identify allosteric pockets.

18.
Chem Res Toxicol ; 25(10): 2236-52, 2012 Oct 15.
Article in English | MEDLINE | ID: mdl-22946514

ABSTRACT

The metabolism of aromatic and heteroaromatic amines (ArNH2) results in nitrenium ions (ArNH⁺) that modify nucleobases of DNA, primarily deoxyguanosine (dG), by forming dG-C8 adducts. The activated amine nitrogen in ArNH⁺ reacts with the C8 of dG, which gives rise to mutations in DNA. For the most mutagenic ArNH2, including the majority of known genotoxic carcinogens, the stability of ArNH⁺ is of intermediate magnitude. To understand the origin of this observation as well as the specificity of reactions of ArNH⁺ with guanines in DNA, we investigated the chemical reactivity of the metabolically activated forms of ArNH2, that is, ArNHOH and ArNHOAc, toward 9-methylguanine by DFT calculations. The chemical reactivity of these forms is determined by the rate constants of two consecutive reactions leading to cationic guanine intermediates. The formation of ArNH⁺ accelerates with resonance stabilization of ArNH⁺, whereas the formed ArNH⁺ reacts with guanine derivatives with the constant diffusion-limited rate until the reaction slows down when ArNH⁺ is about 20 kcal/mol more stable than PhNH⁺. At this point, ArNHOH and ArNHOAc show maximum reactivity. The lowest activation energy of the reaction of ArNH⁺ with 9-methylguanine corresponds to the charge-transfer π-stacked transition state (π-TS) that leads to the direct formation of the C8 intermediate. The predicted activation barriers of this reaction match the observed absolute rate constants for a number of ArNH⁺. We demonstrate that the mutagenic potency of ArNH2 correlates with the rate of formation and the chemical reactivity of the metabolically activated forms toward the C8 atom of dG. On the basis of geometric consideration of the π-TS complex made of genotoxic compounds with long aromatic systems, we propose that precovalent intercalation in DNA is not an essential step in the genotoxicity pathway of ArNH2. The mechanism-based reasoning suggests rational design strategies to avoid genotoxicity of ArNH2 primarily by preventing N-hydroxylation of ArNH2.


Subject(s)
Amines/metabolism , DNA Adducts/metabolism , DNA/metabolism , Guanine/analogs & derivatives , Hydrocarbons, Aromatic/metabolism , Mutagens/metabolism , Amines/chemistry , DNA/chemistry , DNA Adducts/chemistry , Guanine/chemistry , Guanine/metabolism , Hydrocarbons, Aromatic/chemistry , Models, Molecular , Mutagens/chemistry , Thermodynamics
19.
J Chem Inf Model ; 52(6): 1480-9, 2012 Jun 25.
Article in English | MEDLINE | ID: mdl-22639789

ABSTRACT

Patent specifications are one of many information sources needed to progress drug discovery projects. Understanding compound prior art and novelty checking, validation of biological assays, and identification of new starting points for chemical explorations are a few areas where patent analysis is an important component. Cheminformatics methods can be used to facilitate the identification of so-called key compounds in patent specifications. Such methods, relying on structural information extracted from documents by expert curation or text mining, can complement or in some cases replace the traditional manual approach of searching for clues in the text. This paper describes and compares three different methods for the automatic prediction of key compounds in patent specifications using structural information alone. For this data set, the cluster seed analysis described by Hattori et al. (Hattori, K.; Wakabayashi, H.; Tamaki, K. Predicting key example compounds in competitors' patent applications using structural information alone. J. Chem. Inf. Model. 2008, 48, 135-142) is superior in terms of prediction accuracy with 26 out of 48 drugs (54%) correctly predicted from their corresponding patents. Nevertheless, the two new methods, based on frequency of R-groups (FOG) and maximum common substructure (MCS) similarity measures, show significant advantages due to their inherent ability to visualize relevant structural features. The results of the FOG method can be enhanced by manual selection of the scaffolds used in the analysis. Finally, a successful example of applying FOG analysis for designing potent ATP-competitive AXL kinase inhibitors with improved properties is described.


Subject(s)
Drug Discovery , Molecular Structure , Patents as Topic
20.
J Cheminform ; 14(1): 18, 2022 Mar 28.
Article in English | MEDLINE | ID: mdl-35346368

ABSTRACT

Molecular optimization aims to improve the drug profile of a starting molecule. It is a fundamental problem in drug discovery but challenging due to (i) the requirement of simultaneous optimization of multiple properties and (ii) the large chemical space to explore. Recently, deep learning methods have been proposed to solve this task by mimicking the chemist's intuition in terms of matched molecular pairs (MMPs). Although the MMP approach is widely used by medicinal chemists, it offers limited capability in terms of exploring the space of structural modifications and therefore does not cover the complete space of solutions. Often more general transformations beyond the nature of MMPs are feasible and/or necessary, e.g. simultaneous modifications of the starting molecule at different places including the core scaffold. This study aims to provide a general methodology that offers more general structural modifications beyond MMPs. In particular, the same Transformer architecture is trained on different datasets. These datasets consist of a set of molecular pairs which reflect different types of transformations. Beyond MMP transformation, datasets reflecting general structural changes are constructed from ChEMBL based on two approaches: Tanimoto similarity (allows for multiple modifications) and scaffold matching (allows for multiple modifications but keeps the scaffold constant), respectively. We investigate how the model behavior can be altered by tailoring the dataset while using the same model architecture. Our results show that the models trained on differently prepared datasets transform a given starting molecule in a way that reflects the nature of the dataset used for training the model. These models could complement each other and unlock the capability for the chemists to pursue different options for improving a starting molecule.
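
Illustrative note on the two pair-construction approaches named in the abstract above (Tanimoto similarity and scaffold matching). This is a minimal sketch under stated assumptions and not the authors' pipeline: fingerprints are given as sets of on-bits, each molecule carries a precomputed scaffold string, and the similarity threshold of 0.5 is an arbitrary placeholder. The record layout, function names, and example molecules are hypothetical.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0

def build_pair_datasets(molecules, min_similarity=0.5):
    """Return two lists of molecule-id pairs:
    pairs whose Tanimoto similarity is at least min_similarity, and
    pairs that share the same scaffold string."""
    similar, same_scaffold = [], []
    for m1, m2 in combinations(molecules, 2):
        if tanimoto(m1["bits"], m2["bits"]) >= min_similarity:
            similar.append((m1["id"], m2["id"]))
        if m1["scaffold"] == m2["scaffold"]:
            same_scaffold.append((m1["id"], m2["id"]))
    return similar, same_scaffold

# Hypothetical records; bit sets and scaffold strings are made up for illustration.
molecules = [
    {"id": "A", "bits": {1, 2, 3, 4}, "scaffold": "c1ccccc1"},
    {"id": "B", "bits": {1, 2, 3, 5}, "scaffold": "c1ccccc1"},
    {"id": "C", "bits": {1, 9, 10, 11}, "scaffold": "c1ccccc1"},
    {"id": "D", "bits": {1, 2, 3, 4, 6}, "scaffold": "c1ccncc1"},
]
similar_pairs, scaffold_pairs = build_pair_datasets(molecules)
```

In this toy set the two criteria select different pairs: A and D are similar by fingerprint although their scaffolds differ, while A and C share a scaffold although their fingerprints overlap very little. This is the kind of difference in training data that the abstract says changes model behavior.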
