Results 1 - 20 of 43
1.
J Chem Inf Model ; 64(7): 2331-2344, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37642660

ABSTRACT

Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma data set of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate the predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point toward an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with a saturating return. Markedly higher improvements were observed for the pharmacokinetics and safety panel assay-based task subsets.


Subject(s)
Benchmarking , Quantitative Structure-Activity Relationship , Biological Assay , Machine Learning
2.
Chem Res Toxicol ; 36(8): 1238-1247, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37556769

ABSTRACT

Drug-induced liver injury (DILI) is an important safety concern and a major reason to remove a drug from the market. Recent advances in machine learning methods have led to a wide range of in silico DILI prediction models based on molecule chemical structures (fingerprints). Existing publicly available DILI data sets used for model building are based on the interpretation of drug labels or patient case reports, resulting in a typical binary clinical DILI annotation. We developed a novel phenotype-based annotation to process hepatotoxicity information extracted from repeated dose in vivo preclinical toxicology studies using INHAND annotation to provide a more informative and reliable data set for machine learning algorithms. This work resulted in a data set of 430 unique compounds covering diverse liver pathology findings, which were used to develop multiple DILI prediction models trained on the publicly available data (TG-GATEs) using the compound's fingerprint. We demonstrate that the DILI labels of the TG-GATEs compounds can be predicted well and show how the differences between TG-GATEs and the external test compounds (Johnson & Johnson) affect the model generalization performance.


Subject(s)
Chemical and Drug Induced Liver Injury , Drug-Related Side Effects and Adverse Reactions , Humans , Algorithms , Machine Learning , Computer Simulation
3.
Chem Res Toxicol ; 36(7): 1028-1036, 2023 07 17.
Article in English | MEDLINE | ID: mdl-37327474

ABSTRACT

The search for chemical hit material is a lengthy and increasingly expensive drug discovery process. To improve it, ligand-based quantitative structure-activity relationship models have been broadly applied to optimize primary and secondary compound properties. Although these models can be deployed as early as the stage of molecule design, they have a limited applicability domain: if the structures of interest differ substantially from the chemical space on which the model was trained, a reliable prediction will not be possible. Image-informed ligand-based models partly solve this shortcoming by focusing on the phenotype of a cell caused by small molecules, rather than on their structure. While this enables chemical diversity expansion, it limits the application to compounds physically available and imaged. Here, we employ an active learning approach to capitalize on both of these methods' strengths and boost the model performance of a mitochondrial toxicity assay (Glu/Gal). Specifically, we used a phenotypic Cell Painting screen to build a chemistry-independent model and adopted the results as the main factor in selecting compounds for experimental testing. With the additional Glu/Gal annotation for selected compounds we were able to dramatically improve the chemistry-informed ligand-based model with respect to the increased recognition of compounds from a 10% broader chemical space.


Subject(s)
Deep Learning , Quantitative Structure-Activity Relationship , Ligands , Drug Discovery/methods
4.
Cytometry A ; 95(3): 279-289, 2019 03.
Article in English | MEDLINE | ID: mdl-30536810

ABSTRACT

Daratumumab is a CD38-targeted human monoclonal antibody with direct anti-myeloma cell mechanisms of action. Flow cytometry in relapsed and/or refractory multiple myeloma (RRMM) patients treated with daratumumab revealed cytotoxic T-cell expansion and reduction of immune-suppressive populations, suggesting immune modulation as an additional mechanism of action. Here, we performed an in-depth analysis of the effects of daratumumab on immune-cell subpopulations using high-dimensional mass cytometry. Whole-blood and bone-marrow baseline and on-treatment samples from RRMM patients who participated in daratumumab monotherapy studies (SIRIUS and GEN501) were evaluated with high-throughput immunophenotyping. In daratumumab-treated patients, the intensity of CD38 marker expression decreased on many immune cells in SIRIUS whole-blood samples. Natural killer (NK) cells were depleted with daratumumab, with remaining NK cells showing increased CD69 and CD127, decreased CD45RA, and trends for increased CD25, CD27, and CD137 and decreased granzyme B. Immune-suppressive population depletion paralleled previous findings, and a newly observed reduction in CD38+ basophils was seen in patients who received monotherapy. After 2 months of daratumumab, the T-cell population in whole-blood samples from responders shifted to a CD8 prevalence with higher granzyme B positivity (P = 0.017), suggesting increased killing capacity and supporting monotherapy-induced CD8+ T-cell activation. High-throughput cytometry immune profiling confirms and builds upon previous flow cytometry data, including comparable CD38 marker intensity on plasma cells, NK cells, monocytes, and B/T cells. Interestingly, a shift toward cytolytic granzyme B+ T cells was also observed and supports adaptive responses in patients that may contribute to depth of response. © 2018 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.


Subject(s)
ADP-ribosyl Cyclase 1/immunology , Antibodies, Monoclonal/therapeutic use , Antineoplastic Agents/therapeutic use , Killer Cells, Natural/drug effects , Killer Cells, Natural/immunology , Multiple Myeloma/drug therapy , Multiple Myeloma/immunology , Antigens, Differentiation, T-Lymphocyte/metabolism , Basophils/cytology , Basophils/drug effects , Basophils/immunology , Bone Marrow Cells/cytology , Bone Marrow Cells/immunology , CD4-Positive T-Lymphocytes/cytology , CD4-Positive T-Lymphocytes/immunology , CD8-Positive T-Lymphocytes/cytology , CD8-Positive T-Lymphocytes/immunology , Flow Cytometry , Granzymes/metabolism , Humans , Immunophenotyping , Killer Cells, Natural/cytology , Multiple Myeloma/blood , Multiple Myeloma/metabolism , Recurrence
5.
J Hepatol ; 62(5): 1008-14, 2015 May.
Article in English | MEDLINE | ID: mdl-25445400

ABSTRACT

BACKGROUND & AIMS: Simeprevir is an oral hepatitis C virus (HCV) NS3/4A protease inhibitor approved for treatment of chronic HCV infection. Baseline NS3 polymorphisms in all patients and emerging mutations in patients who failed to achieve sustained virologic response (SVR) with simeprevir plus peginterferon/ribavirin (PR) in Phase IIb/III studies are described. METHODS: Baseline sequencing data were available for 2007 genotype 1 (GT1)-infected patients. Post-baseline data were available for 197/245 simeprevir-treated patients who did not achieve SVR. In vitro simeprevir susceptibility was assessed in a transient replicon assay as site-directed mutants or in chimeric replicons with patient-derived NS3 protease sequences. RESULTS: Baseline NS3 polymorphisms at positions associated with reduced in vitro susceptibility to simeprevir (43, 80, 122, 155, 156, and/or 168; EC50 fold change >2.0) were uncommon (1.3% [26/2007]), with the exception of Q80K, which confers ∼10-fold reduction in simeprevir activity in vitro (13.7% [274/2007]; GT1a 29.5% [269/911], GT1b 0.5% [5/1096]). Baseline Q80K had minor effect on initial response to simeprevir/PR, but resulted in lower SVR rates. Overall, 91.4% of simeprevir-treated patients [180/197] without SVR had emerging mutations at NS3 positions 80, 122, 155, and/or 168 at failure (mainly R155K in GT1a with and without Q80K, and D168V in GT1b), conferring high-level resistance in vitro (EC50 fold change >50). Emerging mutations were no longer detectable by population sequencing at study end in 50% [90/180] of patients (median follow-up 28.4 weeks). CONCLUSIONS: Simeprevir treatment failure was usually associated with emerging high-level resistance mutations, which became undetectable over time in half of the patients.
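The percentages in this abstract can be recomputed from the counts it quotes. The following is a minimal arithmetic check, not part of the original record; the dictionary labels are paraphrases chosen here for illustration, while every number is copied from the abstract.

```python
# (numerator, denominator, percentage as reported in the abstract)
# Labels are illustrative paraphrases; the numbers come from the abstract.
reported = {
    "reduced-susceptibility polymorphisms, Q80K excepted": (26, 2007, 1.3),
    "Q80K, all GT1": (274, 2007, 13.7),
    "Q80K, GT1a": (269, 911, 29.5),
    "Q80K, GT1b": (5, 1096, 0.5),
    "emerging mutations at failure": (180, 197, 91.4),
    "mutations undetectable at study end": (90, 180, 50.0),
}


def percent(numerator, denominator):
    """Return the percentage rounded to one decimal, as in the abstract."""
    return round(100.0 * numerator / denominator, 1)


recomputed = {label: percent(n, d) for label, (n, d, _) in reported.items()}

# Collect any entry whose recomputed value differs from the reported one.
mismatches = {
    label: (recomputed[label], rep)
    for label, (_, _, rep) in reported.items()
    if recomputed[label] != rep
}

# The GT1a and GT1b subgroups should add up to the GT1 totals.
subgroups_consistent = (269 + 5 == 274) and (911 + 1096 == 2007)

print(mismatches, subgroups_consistent)
```

All six reported percentages agree with the counts at one decimal place, and the genotype subgroups sum to the overall totals.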


Subject(s)
Hepacivirus , Hepatitis C, Chronic , Interferon-alpha/pharmacology , Polyethylene Glycols/pharmacology , Ribavirin/pharmacology , Simeprevir/pharmacology , Viral Nonstructural Proteins , Antiviral Agents/pharmacology , Double-Blind Method , Drug Resistance, Viral/genetics , Drug Therapy, Combination/methods , Female , Hepacivirus/drug effects , Hepacivirus/genetics , Hepatitis C, Chronic/drug therapy , Hepatitis C, Chronic/virology , Humans , Male , Middle Aged , Polymorphism, Genetic , Recombinant Proteins/pharmacology , Time Factors , Treatment Failure , Viral Nonstructural Proteins/antagonists & inhibitors , Viral Nonstructural Proteins/genetics
6.
Sci Data ; 11(1): 742, 2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38972891

ABSTRACT

We here introduce the Aquamarine (AQM) dataset, an extensive quantum-mechanical (QM) dataset that contains the structural and electronic information of 59,783 low- and high-energy conformers of 1,653 molecules with a total number of atoms ranging from 2 to 92 (mean: 50.9), and containing up to 54 (mean: 28.2) non-hydrogen atoms. To gain insights into the solvent effects as well as collective dispersion interactions for drug-like molecules, we have performed QM calculations supplemented with a treatment of many-body dispersion (MBD) interactions of structures and properties in the gas phase and implicit water. Thus, AQM contains over 40 global and local physicochemical properties (including ground-state and response properties) per conformer computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules, whereas PBE0+MBD with the modified Poisson-Boltzmann (MPB) model of water was used for solvated molecules. By addressing both molecule-solvent and dispersion interactions, the AQM dataset can serve as a challenging benchmark for state-of-the-art machine learning methods for property modeling and de novo generation of large (solvated) molecules with pharmaceutical and biological relevance.


Subject(s)
Quantum Theory , Solvents , Solvents/chemistry , Pharmaceutical Preparations/chemistry , Water/chemistry , Molecular Conformation
7.
Biochem J ; 443(1): 173-83, 2012 Apr 01.
Article in English | MEDLINE | ID: mdl-22242915

ABSTRACT

P-Rex1 is a GEF (guanine-nucleotide-exchange factor) for the small G-protein Rac that is activated by PIP3 (phosphatidylinositol 3,4,5-trisphosphate) and Gβγ subunits and inhibited by PKA (protein kinase A). In the present study we show that PP1α (protein phosphatase 1α) binds P-Rex1 through an RVxF-type docking motif. PP1α activates P-Rex1 directly in vitro, both independently of and additively to PIP3 and Gβγ. PP1α also substantially activates P-Rex1 in vivo, both in basal and PDGF (platelet-derived growth factor)- or LPA (lysophosphatidic acid)-stimulated cells. The phosphatase activity of PP1α is required for P-Rex1 activation. PP1β, a close homologue of PP1α, is also able to activate P-Rex1, but less effectively. PP1α stimulates P-Rex1-mediated Rac-dependent changes in endothelial cell morphology. MS analysis of wild-type P-Rex1 and a PP1α-binding-deficient mutant revealed that endogenous PP1α dephosphorylates P-Rex1 on at least three residues, Ser834, Ser1001 and Ser1165. Site-directed mutagenesis of Ser1165 to alanine caused activation of P-Rex1 to a similar degree as did PP1α, confirming Ser1165 as a dephosphorylation site important in regulating P-Rex1 Rac-GEF activity. In summary, we have identified a novel mechanism for direct activation of P-Rex1 through PP1α-dependent dephosphorylation.


Subject(s)
Guanine Nucleotide Exchange Factors/chemistry , Protein Phosphatase 1/chemistry , Amino Acid Motifs , Animals , Aorta/cytology , Cell Shape , Cells, Cultured , Endothelial Cells/drug effects , Endothelial Cells/metabolism , Endothelial Cells/physiology , Humans , Phosphorylation , Platelet-Derived Growth Factor/pharmacology , Platelet-Derived Growth Factor/physiology , Protein Binding , Protein Phosphatase 1/metabolism , Protein Structure, Tertiary , Rabbits , Swine , rac1 GTP-Binding Protein/metabolism
8.
J Antimicrob Chemother ; 67(10): 2327-37, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22723600

ABSTRACT

OBJECTIVES: Drug-resistant minority viral variants can pre-exist in the viral quasispecies of chronically infected hepatitis C virus (HCV) patients and can emerge gradually upon drug treatment. When heterogeneous clinical samples are tested for drug susceptibility in a chimeric replicon-based phenotyping assay, biphasic dose-response curves may be observed. The effect of drug-resistant minority viral variants on the biphasic phenotype of mixtures was assessed in detail. METHODS: Susceptibility of mutant/wild-type mixtures containing minorities of NS3 mutants with different replication capacities and susceptibilities to protease inhibitors were tested in a transient replicon assay. The contribution of both variants in the mixture to the overall replication level was described with an E(max) model. RESULTS: The 90% and 99% effective concentrations (EC(90) and EC(99), respectively) provide a more accurate measure of the susceptibility of the population than the determination of EC(50) values. Reduced susceptibility at the EC(50) level correlated with the replication capacity of the NS3 mutant in the mixture. Using replication-enhanced mutant/wild-type mixtures demonstrated that the relative difference between the replication capacity of the variants present in the mixture results in biphasic dose-response curves. Modelling revealed that in mixtures containing wild-type and resistant variants with low replication capacity, the contributions of the wild-type variants are higher than expected from the replication level of the replicons transfected alone. CONCLUSIONS: Differences in the replication capacity of variants present in HCV replicon-based phenotype assays can lead to biphasic dose-response curves. Using EC(90) or EC(99) values increases the sensitivity of the assay to minor variants.
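The abstract states only that the contributions of both variants were described with an Emax model; the exact parameterization is not given. The sketch below is therefore an assumption: each variant contributes its fraction times its replication capacity, reduced by a standard sigmoidal Emax inhibition term, and the mixture's effective concentrations are found by bisection. All parameter values and function names are hypothetical.

```python
import math


def remaining_replication(conc, ec50, hill=1.0):
    """Fraction of a variant's replication left at drug concentration `conc`
    under a sigmoidal Emax model with complete maximal inhibition."""
    return 1.0 - conc**hill / (ec50**hill + conc**hill)


def mixture_replication(conc, variants):
    """Total replication of a mixture; each variant is a tuple
    (fraction, replication_capacity, ec50). Assumed additive."""
    return sum(
        frac * cap * remaining_replication(conc, ec50)
        for frac, cap, ec50 in variants
    )


def effective_concentration(variants, level, low=1e-6, high=1e9):
    """Concentration that suppresses total replication by `level`
    (0.5 gives EC50, 0.9 gives EC90), via bisection on a log scale."""
    target = (1.0 - level) * mixture_replication(0.0, variants)
    for _ in range(200):
        mid = math.sqrt(low * high)
        if mixture_replication(mid, variants) > target:
            low = mid   # still too much replication: need more drug
        else:
            high = mid
    return math.sqrt(low * high)


# Hypothetical example: 10% of a mutant that is 100-fold less susceptible.
wild_type = [(1.0, 1.0, 1.0)]
mixture = [(0.9, 1.0, 1.0), (0.1, 1.0, 100.0)]

fold_change = {
    level: effective_concentration(mixture, level)
    / effective_concentration(wild_type, level)
    for level in (0.5, 0.9, 0.99)
}
print(fold_change)
```

Under these assumed parameters the fold change relative to wild type grows from EC50 to EC90 to EC99, which is consistent with the abstract's conclusion that the higher effective concentrations are more sensitive to minority variants.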


Subject(s)
Antiviral Agents/pharmacology , Drug Resistance, Viral , Hepacivirus/drug effects , Hepacivirus/physiology , Hepatitis C, Chronic/virology , Virus Replication , Dose-Response Relationship, Drug , Hepacivirus/isolation & purification , Humans , Microbial Sensitivity Tests/methods , Mutant Proteins/genetics , Protease Inhibitors/pharmacology , Replicon , Viral Nonstructural Proteins/genetics
9.
Nucleic Acids Res ; 38(18): 6135-47, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20484370

ABSTRACT

Lens epithelium-derived growth factor/p75 (LEDGF/p75) is a transcriptional coactivator involved in stress response, autoimmune disease, cancer and HIV replication. A fusion between the nuclear pore protein NUP98 and LEDGF/p75 has been found in human acute and chronic myeloid leukemia and association of LEDGF/p75 with mixed-lineage leukemia (MLL)/menin is critical for leukemic transformation. During lentiviral replication, LEDGF/p75 tethers the pre-integration complex to the host chromatin resulting in a bias of integration into active transcription units (TUs). The consensus function of LEDGF/p75 is tethering of cargos to chromatin. In this regard, we determined the LEDGF/p75 chromatin binding profile. To this purpose, we used DamID technology and focused on the highly annotated ENCODE (Encyclopedia of DNA Elements) regions. LEDGF/p75 primarily binds downstream of the transcription start site of active TUs in agreement with the enrichment of HIV-1 integration sites at these locations. We show that LEDGF/p75 binding is not restricted to stress response elements in the genome, and correlation analysis with more than 200 genomic features revealed an association with active chromatin markers, such as H3 and H4 acetylation, H3K4 monomethylation and RNA polymerase II binding. Interestingly, some associations did not correlate with HIV-1 integration indicating that not all LEDGF/p75 complexes on the chromosome are amenable to HIV-1 integration.


Subject(s)
Chromatin/genetics , Intercellular Signaling Peptides and Proteins/metabolism , Transcription, Genetic , Binding Sites , Cell Line , Chromatin/metabolism , DNA/chemistry , HIV-1/genetics , Humans , Intercellular Signaling Peptides and Proteins/genetics , Recombinant Fusion Proteins/metabolism , Transcription Initiation Site , Virus Integration
10.
Proc Natl Acad Sci U S A ; 106(44): 18533-8, 2009 Nov 03.
Article in English | MEDLINE | ID: mdl-19846779

ABSTRACT

Sarco(endo)plasmic reticulum Ca(2+) ATPase (SERCA) Ca(2+) transporters pump cytosolic Ca(2+) into the endoplasmic reticulum, maintaining a Ca(2+) gradient that controls vital cell functions ranging from proliferation to death. To meet the physiological demand of the cell, SERCA activity is regulated by adjusting the affinity for Ca(2+) ions. Of all SERCA isoforms, the housekeeping SERCA2b isoform displays the highest Ca(2+) affinity because of a unique C-terminal extension (2b-tail). Here, an extensive structure-function analysis of SERCA2b mutants and SERCA1a2b chimera revealed how the 2b-tail controls Ca(2+) affinity. Its transmembrane (TM) segment (TM11) and luminal extension functionally cooperate and interact with TM7/TM10 and luminal loops of SERCA2b, respectively. This stabilizes the Ca(2+)-bound E1 conformation and alters Ca(2+)-transport kinetics, which provides the rationale for the higher apparent Ca(2+) affinity. Based on our NMR structure of TM11 and guided by mutagenesis results, a structural model was developed for SERCA2b that supports the proposed 2b-tail mechanism and is reminiscent of the interaction between the alpha- and beta-subunits of Na(+),K(+)-ATPase. The 2b-tail interaction site may represent a novel target to increase the Ca(2+) affinity of malfunctioning SERCA2a in the failing heart to improve contractility.


Subject(s)
Calcium/metabolism , Sarcoplasmic Reticulum Calcium-Transporting ATPases/chemistry , Sarcoplasmic Reticulum Calcium-Transporting ATPases/metabolism , Amino Acid Sequence , Animals , Binding Sites , COS Cells , Chlorocebus aethiops , Enzyme Stability , Kinetics , Models, Molecular , Molecular Sequence Data , Protein Binding , Protein Structure, Secondary , Protein Structure, Tertiary , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Structure-Activity Relationship
11.
Mol Inform ; 41(4): e2100138, 2022 04.
Article in English | MEDLINE | ID: mdl-34726834

ABSTRACT

In this paper, we compare the most popular Atom-to-Atom Mapping (AAM) tools: ChemAxon,[1] Indigo,[2] RDTool,[3] NameRXN (NextMove),[4] and RXNMapper[5] which implement different AAM algorithms. An open-source RDTool program was optimized, and its modified version ("new RDTool") was considered together with several consensus mapping strategies. The Condensed Graph of Reaction approach was used to calculate chemical distances and develop the "AAM fixer" algorithm for an automated correction of erroneous mapping. The benchmarking calculations were performed on a Golden dataset containing 1851 manually mapped and curated reactions. The best performing RXNMapper program together with the AAM fixer was applied to map the USPTO database. The Golden dataset, mapped USPTO and optimized RDTool are available in the GitHub repository https://github.com/Laboratoire-de-Chemoinformatique.


Subject(s)
Benchmarking , Biochemical Phenomena , Algorithms , Databases, Factual
12.
Nat Rev Drug Discov ; 20(2): 145-159, 2021 02.
Article in English | MEDLINE | ID: mdl-33353986

ABSTRACT

Image-based profiling is a maturing strategy by which the rich information present in biological images is reduced to a multidimensional profile, a collection of extracted image-based features. These profiles can be mined for relevant patterns, revealing unexpected biological activity that is useful for many steps in the drug discovery process. Such applications include identifying disease-associated screenable phenotypes, understanding disease mechanisms and predicting a drug's activity, toxicity or mechanism of action. Several of these applications have been recently validated and have moved into production mode within academia and the pharmaceutical industry. Some of these have yielded disappointing results in practice but are now of renewed interest due to improved machine-learning strategies that better leverage image-based information. Although challenges remain, novel computational technologies such as deep learning and single-cell methods that better capture the biological information in images hold promise for accelerating drug discovery.


Subject(s)
Drug Discovery/methods , Drug Industry/methods , Image Processing, Computer-Assisted/methods , Machine Learning , Animals , Computational Biology/methods , Computational Biology/trends , Drug Discovery/trends , Drug Industry/trends , High-Throughput Screening Assays/methods , High-Throughput Screening Assays/trends , Humans , Image Processing, Computer-Assisted/trends , Machine Learning/trends
13.
Future Med Chem ; 13(19): 1639-1654, 2021 10.
Article in English | MEDLINE | ID: mdl-34528444

ABSTRACT

Background: Accurate prediction of absorption, distribution, metabolism and excretion (ADME) properties can facilitate the identification of promising drug candidates. Methodology & Results: The authors present the Janssen generic Target Product Profile (gTPP) model, which predicts 18 early ADME properties, employs a graph convolutional neural network algorithm and was trained on between 1000 and 10,000 internal data points per predicted parameter. gTPP demonstrated stronger predictive power than pretrained commercial ADME models and automatic model builders. Through a novel logging method, the authors report gTPP usage for more than 200 Janssen drug discovery scientists. Conclusion: The investigators successfully enabled the rapid and systematic implementation of predictive ML tools across a drug discovery pipeline in all therapeutic areas. This experience provides useful guidance for other large-scale AI/ML deployment efforts.


Subject(s)
Cytochrome P-450 Enzyme Inhibitors/pharmacology , Cytochrome P-450 Enzyme System/metabolism , Drug Development , Cytochrome P-450 Enzyme Inhibitors/chemistry , Humans , Models, Molecular
14.
Mol Inform ; 40(12): e2100119, 2021 12.
Article in English | MEDLINE | ID: mdl-34427989

ABSTRACT

The quality of experimental data for chemical reactions is a critical consideration for any reaction-driven study. However, the curation of reaction data has not been extensively discussed in the literature so far. Here, we suggest a four-step protocol that includes the curation of individual structures (reactants and products), chemical transformations, reaction conditions and endpoints. Its implementation in Python3 using the CGRTools toolkit has been used to clean three popular reaction databases: Reaxys, USPTO and Pistachio. The curated USPTO database is available in the GitHub repository (Laboratoire-de-Chemoinformatique/Reaction_Data_Cleaning).


Subject(s)
Data Curation , Databases, Factual , Reference Standards
15.
Leukemia ; 35(2): 573-584, 2021 02.
Article in English | MEDLINE | ID: mdl-32457357

ABSTRACT

CD38-targeted antibody, daratumumab, is approved for the treatment of multiple myeloma (MM). Phase 1/2 studies GEN501/SIRIUS revealed a novel immunomodulatory mechanism of action (MOA) of daratumumab that enhanced the immune response, reducing natural killer (NK) cells without affecting efficacy or safety. We further evaluated daratumumab's effects on immune cells in whole blood samples of relapsed/refractory MM patients from both treatment arms of the phase 3 POLLUX study (lenalidomide/dexamethasone [Rd] or daratumumab plus Rd [D-Rd]) at baseline (D-Rd, 40; Rd, 45) and after 2 months on treatment (D-Rd, 31; Rd, 33) using cytometry by time-of-flight. We confirmed previous reports of NK cell reduction with D-Rd. Persisting NK cells were phenotypically distinct, with increased expression of HLA-DR, CD69, CD127, and CD27. The proportion of T cells increased preferentially in deep responders to D-Rd, with a higher proportion of CD8+ versus CD4+ T cells. The expansion of CD8+ T cells correlated with clonality, indicating generation of adaptive immune response with D-Rd. D-Rd resulted in a higher proportion of effector memory T cells versus Rd. D-Rd reduced immunosuppressive CD38+ regulatory T cells. This study confirms daratumumab's immunomodulatory MOA in combination with immunomodulatory drugs and provides further insight into immune cell changes and activation status following daratumumab-based therapy.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers/analysis , Killer Cells, Natural/immunology , Multiple Myeloma/immunology , T-Lymphocytes, Regulatory/immunology , T-Lymphocytes/immunology , Antibodies, Monoclonal/administration & dosage , Dexamethasone/administration & dosage , Humans , Killer Cells, Natural/drug effects , Lenalidomide/administration & dosage , Multiple Myeloma/drug therapy , Multiple Myeloma/pathology , T-Lymphocytes/drug effects , T-Lymphocytes, Regulatory/drug effects
16.
J Cheminform ; 12(1): 26, 2020 Apr 19.
Article in English | MEDLINE | ID: mdl-33430964

ABSTRACT

Artificial intelligence (AI) is undergoing a revolution thanks to the breakthroughs of machine learning algorithms in computer vision, speech recognition, natural language processing and generative modelling. Recent works on publicly available pharmaceutical data showed that AI methods are highly promising for Drug Target prediction. However, the quality of public data might be different than that of industry data due to different labs reporting measurements, different measurement techniques, fewer samples and less diverse and specialized assays. As part of a European funded project (ExCAPE), that brought together expertise from pharmaceutical industry, machine learning, and high-performance computing, we investigated how well machine learning models obtained from public data can be transferred to internal pharmaceutical industry data. Our results show that machine learning models trained on public data can indeed maintain their predictive power to a large degree when applied to industry data. Moreover, we observed that deep learning derived machine learning models outperformed comparable models, which were trained by other machine learning algorithms, when applied to internal pharmaceutical company datasets. To our knowledge, this is the first large-scale study evaluating the potential of machine learning and especially deep learning directly at the level of industry-scale settings and moreover investigating the transferability of publicly learned target prediction models towards industrial bioactivity prediction pipelines.

17.
Sci Rep ; 10(1): 13262, 2020 08 06.
Article in English | MEDLINE | ID: mdl-32764586

ABSTRACT

Phenomic profiles are high-dimensional sets of readouts that can comprehensively capture the biological impact of chemical and genetic perturbations in cellular assay systems. Phenomic profiling of compound libraries can be used for compound target identification or mechanism of action (MoA) prediction and other applications in drug discovery. To devise an economical set of phenomic profiling assays, we assembled a library of 1,008 approved drugs and well-characterized tool compounds manually annotated to 218 unique MoAs, and we profiled each compound at four concentrations in live-cell, high-content imaging screens against a panel of 15 reporter cell lines, which expressed a diverse set of fluorescent organelle and pathway markers in three distinct cell lineages. For 41 of 83 testable MoAs, phenomic profiles accurately ranked the reference compounds (AUC-ROC ≥ 0.9). MoAs could be better resolved by screening compounds at multiple concentrations than by including replicates at a single concentration. Screening additional cell lineages and fluorescent markers increased the number of distinguishable MoAs but this effect quickly plateaued. There remains a substantial number of MoAs that were hard to distinguish from others under the current study's conditions. We discuss ways to close this gap, which will inform the design of future phenomic profiling efforts.
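The AUC-ROC threshold quoted in this abstract measures how well a score ranks the reference compounds of one MoA above all others. As a minimal sketch, assuming binary labels (1 for a compound annotated to the MoA, 0 otherwise) and hypothetical similarity scores, the AUC can be computed as the fraction of positive/negative pairs that are ranked correctly; this illustrates the metric only and is not the authors' pipeline.

```python
def auc_roc(scores, labels):
    """Rank-based AUC-ROC: probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count as half)."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four compounds; two belong to the MoA of interest.
perfect = auc_roc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
one_swap = auc_roc([0.9, 0.8, 0.7, 0.1], [1, 0, 1, 0])
print(perfect, one_swap)
```

In the first case every annotated compound outranks every other compound, so the AUC is 1.0; in the second, one of the four positive/negative pairs is misordered, giving 0.75.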


Subject(s)
Biological Products/pharmacology , Luminescent Proteins/genetics , Phenomics/methods , Small Molecule Libraries/pharmacology , A549 Cells , Cell Line , Drug Discovery , Gene Expression Regulation/drug effects , Hep G2 Cells , Humans , Luminescent Proteins/metabolism
18.
Chem Sci ; 9(24): 5441-5451, 2018 Jun 28.
Article in English | MEDLINE | ID: mdl-30155234

ABSTRACT

Deep learning is currently the most successful machine learning technique in a wide range of application areas and has recently been applied successfully in drug discovery research to predict potential drug targets and to screen for active molecules. However, due to (1) the lack of large-scale studies, (2) the compound series bias that is characteristic of drug discovery datasets and (3) the hyperparameter selection bias that comes with the high number of potential deep learning architectures, it remains unclear whether deep learning can indeed outperform existing computational methods in drug discovery tasks. We therefore assessed the performance of several deep learning methods on a large-scale drug discovery dataset and compared the results with those of other machine learning and target prediction methods. To avoid potential biases from hyperparameter selection or compound series, we used a nested cluster-cross-validation strategy. We found (1) that deep learning methods significantly outperform all competing methods and (2) that the predictive performance of deep learning is in many cases comparable to that of tests performed in wet labs (i.e., in vitro assays).
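The nested cluster-cross-validation mentioned in this abstract can be sketched as follows. The idea is that whole compound clusters, not individual compounds, are assigned to folds, so that similar compounds never straddle the train/test boundary, and that hyperparameters are chosen on an inner validation fold that is distinct from the outer test fold. The fold count, the greedy balancing rule and all function names below are assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def assign_clusters_to_folds(cluster_ids, n_folds):
    """Assign each cluster to exactly one fold, largest clusters first,
    always filling the currently smallest fold (greedy balancing)."""
    sizes = defaultdict(int)
    for cluster in cluster_ids:
        sizes[cluster] += 1
    fold_sizes = [0] * n_folds
    fold_of = {}
    for cluster, size in sorted(sizes.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        target = fold_sizes.index(min(fold_sizes))
        fold_of[cluster] = target
        fold_sizes[target] += size
    return fold_of


def nested_cluster_splits(cluster_ids, n_folds=3):
    """Yield (train, valid, test) index lists. The outer loop picks the test
    fold; the inner loop picks a validation fold for hyperparameter selection."""
    fold_of = assign_clusters_to_folds(cluster_ids, n_folds)
    sample_fold = [fold_of[c] for c in cluster_ids]
    for test_fold in range(n_folds):
        for valid_fold in range(n_folds):
            if valid_fold == test_fold:
                continue
            train = [i for i, f in enumerate(sample_fold) if f not in (test_fold, valid_fold)]
            valid = [i for i, f in enumerate(sample_fold) if f == valid_fold]
            test = [i for i, f in enumerate(sample_fold) if f == test_fold]
            yield train, valid, test


# Hypothetical cluster labels for nine compounds.
cluster_ids = ["a", "a", "b", "b", "b", "c", "d", "d", "e"]
splits = list(nested_cluster_splits(cluster_ids, n_folds=3))
print(len(splits))
```

In use, a model would be fitted on `train` for each hyperparameter setting, the best setting chosen on `valid`, and performance reported only on `test`, so that neither compound series nor hyperparameter selection leaks into the reported estimate.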

19.
Assay Drug Dev Technol ; 16(3): 162-176, 2018 04.
Article in English | MEDLINE | ID: mdl-29658791

ABSTRACT

By adding biological information, beyond the chemical properties and desired effect of a compound, uncharted compound areas and connections can be explored. In this study, we add transcriptional information for 31K compounds of Janssen's primary screening deck, using the HT L1000 platform and assess (a) the transcriptional connection score for generating compound similarities, (b) machine learning algorithms for generating target activity predictions, and (c) the scaffold hopping potential of the resulting hits. We demonstrate that the transcriptional connection score is best computed from the significant genes only and should be interpreted within its confidence interval for which we provide the statistics. These guidelines help to reduce noise, increase reproducibility, and enable the separation of specific and promiscuous compounds. The added value of machine learning is demonstrated for the NR3C1 and HSP90 targets. Support Vector Machine models yielded balanced accuracy values ≥80% when the expression values from DDIT4 & SERPINE1 and TMEM97 & SPR were used to predict the NR3C1 and HSP90 activity, respectively. Combining both models resulted in 22 new and confirmed HSP90-independent NR3C1 inhibitors, providing two scaffolds (i.e., pyrimidine and pyrazolo-pyrimidine), which could potentially be of interest in the treatment of depression (i.e., inhibiting the glucocorticoid receptor (i.e., NR3C1), while leaving its chaperone, HSP90, unaffected). As such, the initial hit rate increased by a factor of 300, because less, but more specific, chemistry could be screened, based on the upfront computed activity predictions.
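Balanced accuracy, the metric reported for the Support Vector Machine models above, is the mean of sensitivity and specificity, which makes it more informative than plain accuracy when active compounds are rare. The short sketch below uses hypothetical labels (1 = active, 0 = inactive) and is meant only to illustrate the metric.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on actives) and specificity (recall on inactives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical screen: 2 actives among 10 compounds, one active missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, score)
```

Here plain accuracy is 0.9 even though half of the actives are missed, whereas balanced accuracy is 0.75.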


Subject(s)
HSP90 Heat-Shock Proteins/genetics , High-Throughput Screening Assays , Pyrazoles/pharmacology , Pyrimidines/pharmacology , Receptors, Glucocorticoid/genetics , Transcriptome , HSP90 Heat-Shock Proteins/metabolism , Humans , Receptors, Glucocorticoid/metabolism , Support Vector Machine
20.
Cell Chem Biol ; 25(5): 611-618.e3, 2018 05 17.
Article in English | MEDLINE | ID: mdl-29503208

ABSTRACT

In both academia and the pharmaceutical industry, large-scale assays for drug discovery are expensive and often impractical, particularly for the increasingly important physiologically relevant model systems that require primary cells, organoids, whole organisms, or expensive or rare reagents. We hypothesized that data from a single high-throughput imaging assay can be repurposed to predict the biological activity of compounds in other assays, even those targeting alternate pathways or biological processes. Indeed, quantitative information extracted from a three-channel microscopy-based screen for glucocorticoid receptor translocation was able to predict assay-specific biological activity in two ongoing drug discovery projects. In these projects, repurposing increased hit rates by 50- to 250-fold over that of the initial project assays while increasing the chemical structure diversity of the hits. Our results suggest that data from high-content screens are a rich source of information that can be used to predict and replace customized biological assays.


Subject(s)
Drug Repositioning/methods , Image Processing, Computer-Assisted/methods , Machine Learning , Neural Networks, Computer , Antineoplastic Agents/pharmacology , Cell Line, Tumor , High-Throughput Screening Assays/methods , Humans , Neoplasms/drug therapy