1.
Chem Sci ; 14(19): 4997-5005, 2023 May 17.
Article in English | MEDLINE | ID: mdl-37206399

ABSTRACT

The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made publicly available. The first real-world dataset from the ELNs of a large pharmaceutical company is disclosed and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in chemical synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki-Miyaura and Buchwald-Hartwig reactions. However, training the AGNN on an ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions.

2.
J Med Chem ; 66(2): 1221-1238, 2023 01 26.
Article in English | MEDLINE | ID: mdl-36607408

ABSTRACT

Probing multiple proprietary pharmaceutical libraries in parallel via virtual screening allowed rapid expansion of the structure-activity relationship (SAR) around hit compounds with moderate efficacy against Trypanosoma cruzi, the causative agent of Chagas Disease. A potency-improving scaffold hop, followed by elaboration of the SAR via design guided by the output of the phenotypic virtual screening efforts, identified two promising hit compounds 54 and 85, which were profiled further in pharmacokinetic studies and in an in vivo model of T. cruzi infection. Compound 85 demonstrated clear reduction of parasitemia in the in vivo setting, confirming the interest in this series of 2-(pyridin-2-yl)quinazolines as potential anti-trypanosome treatments.


Subjects
Chagas Disease; Trypanocidal Agents; Trypanosoma cruzi; Humans; Chagas Disease/drug therapy; Quinazolines/pharmacology; Quinazolines/therapeutic use; Structure-Activity Relationship; Trypanocidal Agents/therapeutic use; Trypanocidal Agents/pharmacokinetics
3.
Chem Sci ; 13(41): 12087-12099, 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36349112

ABSTRACT

For the discovery of new candidate molecules in the pharmaceutical industry, library synthesis is a critical step, in which library size, diversity, and time to synthesise are fundamental. In this work we propose stopped-flow synthesis as an intermediate alternative to traditional batch and flow chemistry approaches, suited for small molecule pharmaceutical discovery. This method exploits the advantages of both techniques enabling automated experimentation with access to high pressures and temperatures; flexibility of reaction times, with minimal use of reagents (µmol scale per reaction). In this study, we integrate a stopped-flow reactor into a high-throughput continuous platform designed for the synthesis of combinatory libraries with at-line reaction analysis. This approach allowed ∼900 reactions to be conducted in an accelerated timeframe (192 hours). The stopped flow approach used ∼10% of the reactants and solvents compared to a fully continuous approach. This methodology demonstrates a significantly improved synthesis success rate of smaller libraries by simplifying the implementation of cross-reaction optimisation strategies. The experimental datasets were used to train a feed-forward neural network (FFNN) model providing a framework to guide further experiments, which showed good model predictability and success when tested against an external set with fewer experiments. As a result, this work demonstrates that combining experimental automation with machine learning strategies can deliver optimised analyses and enhanced predictions, enabling more efficient drug discovery investigations across the design, make, test and analysis (DMTA) cycle.

4.
Mol Inform ; 41(8): e2100294, 2022 08.
Article in English | MEDLINE | ID: mdl-35122702

ABSTRACT

We present machine learning models for predicting the chemical context for Buchwald-Hartwig coupling reactions, i.e., what chemicals to add to the reactants to give a productive reaction. Using reaction data from in-house electronic lab notebooks, we train two models: one based on single-label data and one based on multi-label data. Both models show excellent top-3 accuracy of approximately 90%, which suggests strong predictivity. Furthermore, there seems to be an advantage of including multi-label data because the multi-label model shows higher accuracy and better sensitivity for the individual contexts than the single-label model. Although the models are performant, we also show that such models need to be re-trained periodically as there is a strong temporal characteristic to the usage of different contexts. Therefore, a model trained on historical data will decrease in usefulness with time as newer and better contexts emerge and replace older ones. We hypothesize that such significant transitions in the context-usage will likely affect any model predicting chemical contexts trained on historical data. Consequently, training context prediction models warrants careful planning of what data is used for training and how often the model needs to be re-trained.
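
The top-3 accuracy quoted in this abstract can be illustrated with a minimal sketch. It assumes each reaction has one recorded context (the single-label case) and that a model returns one score per candidate context; the function name `top_k_accuracy` and the data layout are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   true_labels: Sequence[int],
                   k: int = 3) -> float:
    """Fraction of reactions whose recorded context index is among
    the k highest-scored contexts (hypothetical data layout)."""
    if not true_labels or len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must be non-empty and equal in length")
    hits = 0
    for row, label in zip(scores, true_labels):
        # Rank context indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)
```

A value near 0.9 from such a function would correspond to the "approximately 90%" figure, provided the evaluation set and the definition of a context match those used by the authors.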


Subjects
Machine Learning
5.
J Chem Inf Model ; 62(9): 2046-2063, 2022 05 09.
Article in English | MEDLINE | ID: mdl-34460269

ABSTRACT

Because of the strong relationship between the desired molecular activity and its structural core, the screening of focused, core-sharing chemical libraries is a key step in lead optimization. Despite the plethora of current research focused on in silico methods for molecule generation, to our knowledge, no tool capable of designing such libraries has been proposed. In this work, we present a novel tool for de novo drug design called LibINVENT. It is capable of rapidly proposing chemical libraries of compounds sharing the same core while maximizing a range of desirable properties. To further help the process of designing focused libraries, the user can list specific chemical reactions that can be used for the library creation. LibINVENT is therefore a flexible tool for generating virtual chemical libraries for lead optimization in a broad range of scenarios. Additionally, the shared core ensures that the compounds in the library are similar, possess desirable properties, and can also be synthesized under the same or similar conditions. The LibINVENT code is freely available in our public repository at https://github.com/MolecularAI/Lib-INVENT. The code necessary for data preprocessing is further available at: https://github.com/MolecularAI/Lib-INVENT-dataset.


Subjects
Drug Design; Small Molecule Libraries; Small Molecule Libraries/chemistry
6.
RSC Med Chem ; 12(3): 384-393, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-34041487

ABSTRACT

An innovative pre-competitive virtual screening collaboration was engaged to validate and subsequently explore an imidazo[1,2-a]pyridine screening hit for visceral leishmaniasis. In silico probing of five proprietary pharmaceutical company libraries enabled rapid expansion of the hit chemotype, alleviating initial concerns about the core chemical structure while simultaneously improving antiparasitic activity and selectivity index relative to the background cell line. Subsequent hit optimization informed by the structure-activity relationship enabled by this virtual screening allowed thorough investigation of the pharmacophore, opening avenues for further improvement and optimization of the chemical series.

7.
Chem Sci ; 11(1): 154-168, 2020 Jan 07.
Article in English | MEDLINE | ID: mdl-32110367

ABSTRACT

Computer Assisted Synthesis Planning (CASP) has gained considerable interest as of late. Herein we investigate a template-based retrosynthetic planning tool, trained on a variety of datasets consisting of up to 17.5 million reactions. We demonstrate that models trained on datasets such as internal Electronic Laboratory Notebooks (ELN), and the publicly available United States Patent Office (USPTO) extracts, are sufficient for the prediction of full synthetic routes to compounds of interest in medicinal chemistry. As such we have assessed the models on 1731 compounds from 41 virtual libraries for which experimental results were known. Furthermore, we show that accuracy is a misleading metric for assessment of the policy network, and propose that the number of successfully applied templates, in conjunction with the overall ability to generate full synthetic routes be examined instead. To this end we found that the specificity of the templates comes at the cost of generalizability, and overall model performance. This is supplemented by a comparison of the underlying datasets and their corresponding models.

8.
Front Pharmacol ; 10: 1303, 2019.
Article in English | MEDLINE | ID: mdl-31749705

ABSTRACT

In recent years, the development of high-throughput screening (HTS) technologies and their establishment in an industrialized environment have given scientists the possibility to test millions of molecules and profile them against a multitude of biological targets in a short period of time, generating data at a much faster pace and with higher quality than before. Besides the structure-activity data from traditional bioassays, more complex assays such as transcriptomics profiling or imaging have also been established as routine profiling experiments thanks to the advancement of Next Generation Sequencing or automated microscopy technologies. In industrial pharmaceutical research, these technologies are typically established in conjunction with automated platforms in order to enable efficient handling of screening collections of thousands to millions of compounds. To exploit the ever-growing amount of data that are generated by these approaches, computational techniques are constantly evolving. In this regard, artificial intelligence technologies such as deep learning and machine learning methods play a key role in cheminformatics and bio-image analytics fields to address activity prediction, scaffold hopping, de novo molecule design, reaction/retrosynthesis predictions, or high content screening analysis. Herein we summarize the current state of analyzing large-scale compound data in industrial pharmaceutical research and describe the impact it has had on the drug discovery process over the last two decades, with a specific focus on deep-learning technologies.

9.
J Chem Inf Model ; 59(3): 1230-1237, 2019 03 25.
Article in English | MEDLINE | ID: mdl-30726080

ABSTRACT

Iterative screening has emerged as a promising approach to increase the efficiency of high-throughput screening (HTS) campaigns in drug discovery. By learning from a subset of the compound library, inferences on what compounds to screen next can be made by predictive models. One of the challenges of iterative screening is to decide how many iterations to perform. This is mainly related to difficulties in estimating the prospective hit rate in any given iteration. In this article, a novel method based on Venn-ABERS predictors is proposed. The method provides accurate estimates of the number of hits retrieved in any given iteration during an HTS campaign. The estimates provide the necessary information to support the decision on the number of iterations needed to maximize the screening outcome. Thus, this method offers a prospective screening strategy for early-stage drug discovery.
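
One way such hit-count estimates can be formed is sketched below. It assumes a Venn-ABERS predictor has already produced, for each unscreened compound, a probability interval (p0, p1) that the compound is a hit; summing the lower and upper ends gives a range for the expected number of hits in the next iteration. The function names are hypothetical, the point-estimate merge p1 / (1 - p0 + p1) is a commonly used way to collapse such an interval, and none of this is confirmed to be the exact procedure of the article.

```python
from typing import Iterable, List, Tuple


def expected_hit_range(intervals: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Sum per-compound lower and upper hit probabilities to bound
    the expected number of hits among the listed compounds."""
    pairs: List[Tuple[float, float]] = list(intervals)
    for p0, p1 in pairs:
        if not 0.0 <= p0 <= p1 <= 1.0:
            raise ValueError("each interval must satisfy 0 <= p0 <= p1 <= 1")
    return sum(p0 for p0, _ in pairs), sum(p1 for _, p1 in pairs)


def expected_hit_point(intervals: Iterable[Tuple[float, float]]) -> float:
    """Single expected hit count, merging each interval into one
    probability with p1 / (1 - p0 + p1) (an assumed merging rule)."""
    return sum(p1 / (1.0 - p0 + p1) for p0, p1 in intervals)
```

Comparing the estimated range for the next batch with the cost of screening it is one simple way to decide whether another iteration is worthwhile.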


Subjects
Computational Biology/methods; Drug Evaluation, Preclinical/methods; High-Throughput Screening Assays; Machine Learning; Quantitative Structure-Activity Relationship
10.
Drug Discov Today Technol ; 32-33: 65-72, 2019 Dec.
Article in English | MEDLINE | ID: mdl-33386096

ABSTRACT

Application of AI technologies in synthesis prediction has developed very rapidly in recent years. We attempt here to give a comprehensive summary of the latest advances in retrosynthesis planning, forward synthesis prediction, and quantum chemistry-based reaction prediction models. Besides an introduction to the AI/ML models for addressing various synthesis-related problems, the sources of the reaction datasets used in model building are also covered. In addition to the predictive models, robotics-based high-throughput experimentation technology will be another crucial factor for conducting synthesis in an automated fashion. Some state-of-the-art high-throughput experimentation practices carried out in the pharmaceutical industry are highlighted in this chapter to give the reader a sense of how future chemistry will be conducted to make compounds faster and cheaper.


Subjects
Artificial Intelligence; Computer-Aided Design; Synthetic Drugs/chemistry; Humans
11.
Mol Inform ; 37(9-10): e1800041, 2018 09.
Article in English | MEDLINE | ID: mdl-29774657

ABSTRACT

Cheminformatics has established itself as a core discipline within large-scale drug discovery operations. It would be impossible to handle the amount of data generated today in a small molecule drug discovery project without persons skilled in cheminformatics. In addition, due to increased emphasis on "Big Data", machine learning and artificial intelligence, not only in society in general, but also in drug discovery, it is expected that the cheminformatics field will be even more important in the future. Traditional areas like virtual screening, library design and high-throughput screening analysis are highlighted in this review. Applying machine learning in drug discovery is an area that has become very important. Applications of machine learning in early drug discovery have been extended from predicting ADME properties and target activity to tasks like de novo molecular design and prediction of chemical reactions.


Subjects
Big Data; Drug Discovery/methods; Databases, Chemical; Drug Development/methods; Machine Learning
12.
ACS Cent Sci ; 4(1): 120-131, 2018 Jan 24.
Article in English | MEDLINE | ID: mdl-29392184

ABSTRACT

In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active toward a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria), it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery.

13.
J Chem Inf Model ; 57(11): 2741-2753, 2017 11 27.
Article in English | MEDLINE | ID: mdl-29068231

ABSTRACT

It is well-established that the number of publications of novel small molecule modulators, and their associated targets, has increased over the years. This work focuses on publishing trends over the years with a particular focus on the comparison between patents and scientific literature which is accessible via the ChEMBL and GOSTAR databases. More precisely, the patents and scientific literature associated with bioactive molecules and their target annotations have been compared to identify where novelty (in the meaning of the first modulator of a protein target) originated from. Comparing the published date of the first small molecule modulator published in literature and patents for a particular target (with either identical or different structure) shows that modulators are usually published in both scientific literature and in patents (45%), or in scientific literature alone (51%), but rarely in patents only. When looking at the time when first modulators are published in both sources, 65% of the time they are disseminated in literature first. Finally, when analyzing just the novel small molecule modulators, regardless of the protein targets they have been published with, those structures representing novel chemistry tend to be published in patents first 61% of the time.


Subjects
Drug Discovery/methods; Molecular Targeted Therapy; Small Molecule Libraries/pharmacology; Patents as Topic; Proteins/metabolism
14.
J Chem Inf Model ; 57(3): 445-453, 2017 03 27.
Article in English | MEDLINE | ID: mdl-28257198

ABSTRACT

The development of new antimalarial therapies is essential, and lowering the barrier of entry for the screening and discovery of new lead compound classes can spur drug development at organizations that may not have large compound screening libraries or resources to conduct high-throughput screens. Machine learning models have been long established to be more robust and have a larger domain of applicability with larger training sets. Screens over multiple data sets to find compounds with potential malaria blood stage inhibitory activity have been used to generate multiple Bayesian models. Here we describe a method by which Bayesian quantitative structure-activity relationship models, which contain information on thousands to millions of proprietary compounds, can be shared between collaborators at both for-profit and not-for-profit institutions. This model-sharing paradigm allows for the development of consensus models that have increased predictive power over any single model and yet does not reveal the identity of any compounds in the training sets.


Subjects
Antimalarials/pharmacology; Machine Learning; Malaria/drug therapy; Models, Theoretical; Quantitative Structure-Activity Relationship; Antimalarials/therapeutic use; Bayes Theorem; Drug Discovery; Malaria/blood; ROC Curve; Temperature
15.
Phytomedicine ; 23(5): 441-59, 2016 May 15.
Article in English | MEDLINE | ID: mdl-27064003

ABSTRACT

BACKGROUND: Lichens, as a symbiotic association of photobionts and mycobionts, display an unmatched environmental adaptability and a great chemical diversity. As an important morphological group, cetrarioid lichens are one of the most studied lichen taxa for their phylogeny, secondary chemistry, bioactivities and uses in folk medicines, especially the lichen Cetraria islandica. However, insufficient structure elucidation and discrepancies in bioactivity results could be found in a few studies. PURPOSE: This review aimed to present a more detailed and updated overview of the knowledge of secondary metabolites from cetrarioid lichens in a critical manner, highlighting their potential for pharmaceuticals as well as other applications. Here we also highlight the uses of molecular phylogenetics, metabolomics and the ChemGPS-NP model for future bioprospecting, taxonomy and drug screening to accelerate applications of those lichen substances. CHAPTERS: The paper starts with a short introduction to the studies of lichen secondary metabolites, the biological classification of cetrarioid lichens and the aim. In light of ethnic uses of cetrarioid lichens for therapeutic purposes, molecular phylogeny is proposed as a tool for future bioprospecting of cetrarioid lichens, followed by a brief discussion of the taxonomic value of lichen substances. Then a detailed description of the bioactivities, patents, updated chemical structures and lichen sources is presented, where lichen substances are grouped by their chemical structures and their bioactivity is discussed in comparison with reference compounds. To accelerate the discovery of bioactivities and potential drug targets of lichen substances, the application of the ChemGPS-NP model is highlighted. Finally, the safety concerns of lichen substances (i.e. toxicity and immunogenicity) and future prospects in the field are presented.
CONCLUSION: While the ethnic uses of cetrarioid lichens and the pharmaceutical potential of their secondary metabolites have been recognized, the knowledge of a large number of lichen substances with interesting structures is still limited to various in vitro assays with insufficient biological annotations, and this area still deserves more research in bioactivity, drug targets and screening. Attention should be paid to the accurate interpretation of their bioactivity for further applications, avoiding over-interpretation of various in vitro bioassays.


Subjects
Lichens/chemistry; Secondary Metabolism; Bioprospecting; Lichens/classification; Molecular Structure; Phylogeny
16.
J Chem Inf Model ; 55(11): 2375-90, 2015 Nov 23.
Article in English | MEDLINE | ID: mdl-26484706

ABSTRACT

In this study, biologically relevant areas of the chemical space were analyzed using ChemGPS-NP. This application enables comparing groups of ligands within a multidimensional space based on principal components derived from physicochemical descriptors. Also, 3D visualization of the ChemGPS-NP global map can be used to conveniently evaluate bioactive compound similarity and visually distinguish between different types or groups of compounds. To further establish ChemGPS-NP as a method to accurately represent the chemical space, a comparison with structure-based fingerprints has been performed. Interesting complementarities between the two descriptions of molecules were observed. It has been shown that the accuracy of describing molecules with physicochemical descriptors, as in ChemGPS-NP, is similar to the accuracy of structural fingerprints in retrieving bioactive molecules. Lastly, pharmacological similarity of structurally diverse compounds has been investigated in ChemGPS-NP space. These results further strengthen the case for using ChemGPS-NP as a tool to explore and visualize chemical space.


Subjects
Drug Discovery/methods; Computer-Aided Design; Databases, Pharmaceutical; Humans; Ligands; Models, Molecular; Software; Structure-Activity Relationship
17.
Drug Discov Today ; 20(1): 11-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25281855

ABSTRACT

One pragmatic way to improve compound quality, while enhancing and accelerating drug discovery projects, is the ability to access a high quality, novel, diverse building block collection. Here, we outline general principles that should be applied to ensure that a building block collection has the greatest impact on drug discovery projects, by discussing design principles for novel reagents and what types of reagents are popular with medicinal chemists in general. We initiated a program in 2009 to address this, which has already delivered three candidate drugs, and the success of that program provides evidence that focussing on building block design is a useful strategy for drug discovery.


Subjects
Drug Design; Indicators and Reagents/chemistry
18.
J Pharm Sci ; 104(3): 1197-206, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25546343

ABSTRACT

Recently, we built an in silico model to predict the unbound brain-to-plasma concentration ratio (Kp,uu,brain), a measure of the distribution of a compound between the blood plasma and the brain. Here, we validate the previous model with new additional data points expanding the chemical space and use that data also to renew the model. The model building process was similar to our previous approach; however, a new set of descriptors, molecular signatures, was included to facilitate the model interpretation from a structure perspective. The best consensus model shows better predictive power than the previous model (R² = 0.6 vs. R² = 0.53, when the same 99 compounds were used as test set). The two-class classification accuracy increased from 76% using the previous model to 81%. Furthermore, the atom-summarized gradient based on molecular signature descriptors was proposed as an interesting new approach to interpret the Kp,uu,brain machine learning model and scrutinize structure-Kp,uu,brain relationships for investigated compounds.
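
The two test-set metrics reported in this abstract can be sketched as follows. The sketch assumes R² is the coefficient of determination (the paper may instead report a squared correlation coefficient) and that the two classes are obtained by cutting observed and predicted values at one threshold; the function names and the `threshold` parameter are hypothetical, since the abstract does not give the class boundary.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def two_class_accuracy(observed: Sequence[float],
                       predicted: Sequence[float],
                       threshold: float) -> float:
    """Fraction of compounds for which observed and predicted values
    fall on the same side of the (assumed) class threshold."""
    agree = sum((o >= threshold) == (p >= threshold)
                for o, p in zip(observed, predicted))
    return agree / len(observed)
```

Both functions would be applied to the same held-out compounds so that the old and new models are compared on equal terms, as the abstract describes for its 99-compound test set.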


Subjects
Blood-Brain Barrier/metabolism; Capillary Permeability; Computer Simulation; Models, Biological; Pharmaceutical Preparations/blood; Pharmacokinetics; Animals; Humans; Pharmaceutical Preparations/administration & dosage; Protein Binding; Reproducibility of Results; Support Vector Machine
19.
J Chem Inf Model ; 53(7): 1825-35, 2013 Jul 22.
Article in English | MEDLINE | ID: mdl-23826858

ABSTRACT

This work describes a data driven method for scaffold hopping by fragment replacement. A search database of scaffolds is created by cutting bonds of existing compounds in a combinatorial fashion. Three-dimensional structures of the scaffolds are then generated and made searchable based on the relative orientation of the broken bonds using an auxiliary index file. The retrieved scaffolds are ranked using volume overlap and electrostatic similarity scores. A similar approach has been used before in the program CAVEAT and others. The present work introduces a novel indexing scheme for the attachment vector geometry, which allows for fast searching. A scaffold shape descriptor is defined, which allows for queries with a single attachment vector (R-groups) and improves the shape similarity between the query and the suggested replacement fragments. The program, called Scaffold Hopping, is shown to retrieve relevant bioisosteric replacement scaffolds for a set of example queries in a reasonable time frame, making the program suitable to be used in drug design work.


Subjects
Drug Design; Drug Evaluation, Preclinical/methods; Software; Feasibility Studies; Internet; Models, Molecular; Molecular Conformation; User-Computer Interface
20.
Drug Discov Today ; 18(19-20): 1014-24, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23127858

ABSTRACT

In this study, the screening collections of two major pharmaceutical companies (AstraZeneca and Bayer Pharma AG) have been compared using a 2D molecular fingerprint by a nearest neighborhood approach. Results revealed a low overlap between both collections in terms of compound identity and similarity. This emphasizes the value of screening multiple compound collections to expand the chemical space that can be accessed by high-throughput screening (HTS).
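
A nearest-neighbour comparison of two collections can be sketched as below. The sketch assumes each compound is represented by the set of "on" bit positions of a 2D fingerprint and that similarity is measured with the Tanimoto coefficient; the abstract does not name the similarity measure or a cutoff, so both, along with all function names, are assumptions made for illustration.

```python
from typing import FrozenSet, List, Sequence

Fingerprint = FrozenSet[int]  # indices of the bits set in a 2D fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by the total number of distinct bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbor_similarities(query: Sequence[Fingerprint],
                                  reference: Sequence[Fingerprint]) -> List[float]:
    """For each query compound, the similarity to its closest
    compound in the reference collection."""
    return [max(tanimoto(fp, ref) for ref in reference) for fp in query]


def fraction_with_close_neighbor(query: Sequence[Fingerprint],
                                 reference: Sequence[Fingerprint],
                                 cutoff: float) -> float:
    """Share of query compounds whose nearest reference neighbour
    reaches the (assumed) similarity cutoff."""
    sims = nearest_neighbor_similarities(query, reference)
    return sum(s >= cutoff for s in sims) / len(sims)
```

A low value from `fraction_with_close_neighbor` in both directions would correspond to the low overlap reported here. This brute-force version compares every pair, so collections of millions of compounds would need an indexed or batched search in practice.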


Subjects
Drug Discovery/trends; Drug Industry/trends; High-Throughput Screening Assays/trends; Small Molecule Libraries/chemistry; Animals; Drug Discovery/methods; Drug Evaluation, Preclinical/methods; Drug Evaluation, Preclinical/trends; Drug Industry/methods; High-Throughput Screening Assays/methods; Humans