Results 1 - 20 of 27
1.
J Chem Inf Model ; 64(12): 4687-4699, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38822782

ABSTRACT

The design of compounds during hit-to-lead often seeks to explore a vector from a core scaffold to form additional interactions with the target protein. A rational approach to this is to probe the region of a protein accessed by a vector with a systematic placement of pharmacophore features in 3D, particularly when bound structures are not available. Herein, we present bbSelect, an open-source tool built to map the placements of pharmacophore features in 3D Euclidean space from a library of R-groups, employing partitioning to drive a diverse and systematic selection to a user-defined size. An evaluation of bbSelect against established methods exemplified the superiority of bbSelect in its ability to perform diverse selections, achieving high levels of pharmacophore feature placement coverage with selection sizes of a fraction of the total set and without the introduction of excess complexity. bbSelect also reports visualizations and rationale to enable users to understand and interrogate results. This provides a tool for the drug discovery community to guide their hit-to-lead activities.


Subjects
Drug Discovery, Software, Drug Discovery/methods, Molecular Models, Drug Design, Proteins/chemistry, Pharmacophore
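
The partition-and-select idea described in this abstract can be illustrated with a minimal sketch. It assumes that pharmacophore feature placements are binned on a regular 3D grid and that R-groups are picked greedily to cover as many (feature type, grid cell) bins as possible. This is an illustration only, not bbSelect's actual algorithm or API; every name below (`cell_of`, `select_diverse`, `cell_size`) is hypothetical.

```python
def cell_of(point, cell_size):
    """Map a 3D coordinate to the integer index of the grid cell that contains it."""
    return tuple(int(c // cell_size) for c in point)


def select_diverse(rgroups, n_select, cell_size=1.0):
    """Greedily pick R-groups that cover the most new (feature, cell) bins.

    rgroups: dict mapping an R-group name to a list of
             (feature_type, (x, y, z)) placements (hypothetical input format).
    Returns the chosen names in pick order and the set of covered bins.
    """
    bins = {
        name: {(feat, cell_of(xyz, cell_size)) for feat, xyz in placements}
        for name, placements in rgroups.items()
    }
    covered, chosen = set(), []
    remaining = set(bins)
    while remaining and len(chosen) < n_select:
        # Choose the R-group adding the most uncovered bins;
        # iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(remaining), key=lambda n: len(bins[n] - covered))
        chosen.append(best)
        covered |= bins[best]
        remaining.remove(best)
    return chosen, covered
```

A greedy cover is used here only because it is short and easy to follow; the coverage it reports (`len(covered)` against the number of bins occupied by the full library) is the kind of quantity the abstract refers to as feature placement coverage.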
2.
Pharm Stat ; 20(4): 898-915, 2021 07.
Article in English | MEDLINE | ID: mdl-33768736

ABSTRACT

One of the main problems that the drug discovery research field confronts is identifying small molecules, modulators of protein function, which are likely to be therapeutically useful. Common practices rely on the screening of vast libraries of small molecules (often 1-2 million molecules) in order to identify a molecule, known as a lead molecule, which specifically inhibits or activates the protein function. To search for the lead molecule, we investigate the molecular structure, which generally consists of an extremely large number of fragments. Presence or absence of particular fragments, or groups of fragments, can strongly affect molecular properties. We study the relationship between a molecule's properties and its fragment composition by building a regression model, in which predictors, represented by binary variables indicating the presence or absence of fragments, are grouped in subsets and a bi-level penalization term is introduced to handle the high dimensionality of the problem. We evaluate the performance of this model in two simulation studies, comparing different penalization terms and different clustering techniques to derive the best structure for the predictor subsets. Both studies are characterized by small sets of data relative to the number of predictors under consideration. From the results of these simulation studies, we show that our approach can generate models able to identify key features and provide accurate predictions. The good performance of these models is then exhibited with real data about the MMP-12 enzyme.


Subjects
Drug Discovery, Cluster Analysis, Computer Simulation, Humans
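
To make the notion of a bi-level penalty concrete, the sketch below computes a sparse group lasso penalty, one well-known penalty that acts both on whole groups of predictors and on individual coefficients. The abstract does not say which penalization terms were compared, so the choice of this penalty, the function name and the `lam`/`alpha` parameters are all assumptions made for illustration.

```python
import math


def sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5):
    """Return lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2).

    beta:   list of regression coefficients, one per binary fragment indicator.
    groups: list of index lists, each naming the coefficients in one fragment group.
    alpha:  balance between the coefficient-level (L1) and group-level terms.
    """
    l1_term = sum(abs(b) for b in beta)
    group_term = 0.0
    for idx in groups:
        group_norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        # Scale by sqrt(group size) so larger groups are not favoured.
        group_term += math.sqrt(len(idx)) * group_norm
    return lam * (alpha * l1_term + (1 - alpha) * group_term)
```

The group-level term can switch off an entire fragment group at once, while the L1 term can still zero out individual fragments inside a retained group, which is the two-level behaviour the abstract describes.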
3.
J Chem Inf Model ; 60(12): 5699-5713, 2020 12 28.
Article in English | MEDLINE | ID: mdl-32659085

ABSTRACT

Deep learning approaches have become popular in recent years in the field of de novo molecular design. While a variety of different methods are available, it is still a challenge to assess and compare their performance. A particularly promising approach for automated drug design is to use recurrent neural networks (RNNs) as SMILES generators and train them with the learning procedure called "transfer learning". This involves first training the initial model on a large generic data set of molecules to learn the general syntax of SMILES, followed by fine-tuning on a smaller set of molecules, coming from, e.g., a lead optimization program. To create a well-performing transfer learning application which can be automated, it is important to understand how the size of the second data set affects the training process. In addition, extensive postfiltering using similarity metrics of the molecules generated after transfer learning should be avoided, as it can introduce new biases toward the selection of drug candidates. Here, we present results from the application of a gated recurrent unit cell (GRU)-RNN to transfer learning on data sets of varying sizes and complexity. Analysis of the results has allowed us to provide some general guidelines for transfer learning. In particular, we show that data set sizes containing at least 190 molecules are needed for effective GRU-RNN-based molecular generation using transfer learning. The methods presented here should be applicable generally to the benchmarking of other deep learning methodologies for molecule generation.


Subjects
Drug Design, Computer Neural Networks, Machine Learning
4.
J Comput Aided Mol Des ; 34(7): 747-765, 2020 07.
Article in English | MEDLINE | ID: mdl-31637565

ABSTRACT

This paper introduces BRADSHAW (Biological Response Analysis and Design System using an Heterogenous, Automated Workflow), a system for automated molecular design which integrates methods for chemical structure generation, experimental design, active learning and cheminformatics tools. The simple user interface is designed to facilitate access to large scale automated design whilst minimising software development required to introduce new algorithms, a critical requirement in what is a very fast moving field. The system embodies a philosophy of automation, best practice, experimental design and the use of both traditional cheminformatics and modern machine learning algorithms.


Subjects
Computer-Aided Design, Drug Design, Adenosine A2 Receptor Antagonists/chemistry, Algorithms, Cheminformatics/methods, Cheminformatics/statistics & numerical data, Cheminformatics/trends, Computer-Aided Design/statistics & numerical data, Computer-Aided Design/trends, Deep Learning, Drug Discovery/methods, Drug Discovery/statistics & numerical data, Drug Discovery/trends, Humans, Machine Learning, Matrix Metalloproteinase Inhibitors/chemistry, Quantitative Structure-Activity Relationship, Small Molecule Libraries, Software, User-Computer Interface, Workflow
5.
J Comput Aided Mol Des ; 34(7): 767, 2020 Jul.
Article in English | MEDLINE | ID: mdl-31691917

ABSTRACT

The original version of this article unfortunately contained some mistakes in the references.

6.
J Chem Inf Model ; 57(3): 445-453, 2017 03 27.
Article in English | MEDLINE | ID: mdl-28257198

ABSTRACT

The development of new antimalarial therapies is essential, and lowering the barrier of entry for the screening and discovery of new lead compound classes can spur drug development at organizations that may not have large compound screening libraries or resources to conduct high-throughput screens. Machine learning models have been long established to be more robust and have a larger domain of applicability with larger training sets. Screens over multiple data sets to find compounds with potential malaria blood stage inhibitory activity have been used to generate multiple Bayesian models. Here we describe a method by which Bayesian quantitative structure-activity relationship models, which contain information on thousands to millions of proprietary compounds, can be shared between collaborators at both for-profit and not-for-profit institutions. This model-sharing paradigm allows for the development of consensus models that have increased predictive power over any single model and yet does not reveal the identity of any compounds in the training sets.


Subjects
Antimalarials/pharmacology, Machine Learning, Malaria/drug therapy, Theoretical Models, Quantitative Structure-Activity Relationship, Antimalarials/therapeutic use, Bayes Theorem, Drug Discovery, Malaria/blood, ROC Curve, Temperature
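
The model-sharing paradigm in this abstract can be sketched as follows, under the assumption that each partner shares only a table of per-feature weights (for example, fingerprint-feature contributions from a naive Bayes model) and never the training compounds themselves. The weight-table format, the function names and the simple mean used for the consensus are illustrative assumptions, not the method reported in the paper.

```python
def score(model_weights, features):
    """Sum the shared weights of the features present in a compound.

    model_weights: dict mapping a feature identifier to its weight.
    features:      set of feature identifiers computed for the query compound.
    """
    return sum(model_weights.get(f, 0.0) for f in features)


def consensus(models, features):
    """Average the scores of several independently trained models."""
    if not models:
        raise ValueError("at least one model is required")
    scores = [score(m, features) for m in models]
    return sum(scores) / len(scores)
```

In practice the raw scores of different models are unlikely to share a scale, so a real consensus scheme would probably normalize or rank-transform them before combining; the plain mean is kept here only for brevity.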
7.
J Comput Aided Mol Des ; 31(3): 249-253, 2017 03.
Article in English | MEDLINE | ID: mdl-28070730

ABSTRACT

The acronym "CADD" is often used interchangeably to refer to "Computer Aided Drug Discovery" and "Computer Aided Drug Design". While the former definition implies the use of a computer to impact one or more aspects of discovering a drug, in this paper we contend that computational chemists are most effective when they enable teams to apply true design principles as they strive to create medicines to treat human disease. We argue that teams must bring to bear multiple sub-disciplines of computational chemistry in an integrated manner in order to utilize these principles to address the multi-objective nature of the drug discovery problem. Impact, resourcing principles, and future directions for the field are also discussed, including areas of future opportunity as well as a cautionary note about hype and hubris.


Subjects
Computational Biology/methods, Computer-Aided Design, Drug Design, Molecular Models, Molecular Structure, Software, Structure-Activity Relationship
8.
Nature ; 465(7296): 305-10, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20485427

ABSTRACT

Malaria is a devastating infection caused by protozoa of the genus Plasmodium. Drug resistance is widespread, no new chemical class of antimalarials has been introduced into clinical practice since 1996 and there is a recent rise of parasite strains with reduced sensitivity to the newest drugs. We screened nearly 2 million compounds in GlaxoSmithKline's chemical library for inhibitors of P. falciparum, of which 13,533 were confirmed to inhibit parasite growth by at least 80% at 2 µM concentration. More than 8,000 also showed potent activity against the multidrug resistant strain Dd2. Most (82%) compounds originate from internal company projects and are new to the malaria community. Analyses using historic assay data suggest several novel mechanisms of antimalarial action, such as inhibition of protein kinases and host-pathogen interaction related targets. Chemical structures and associated data are hereby made public to encourage additional drug lead identification efforts and further research into this disease.


Subjects
Antimalarials/analysis, Antimalarials/pharmacology, Drug Discovery, Falciparum Malaria/drug therapy, Plasmodium falciparum/drug effects, Small Molecule Libraries/analysis, Small Molecule Libraries/pharmacology, Animals, Antimalarials/chemistry, Antimalarials/toxicity, Tumor Cell Line, Multiple Drug Resistance/drug effects, Humans, Falciparum Malaria/parasitology, Biological Models, Phylogeny, Plasmodium falciparum/enzymology, Plasmodium falciparum/genetics, Plasmodium falciparum/growth & development, Small Molecule Libraries/chemistry, Small Molecule Libraries/toxicity
9.
J Chem Inf Model ; 54(10): 2636-46, 2014 Oct 27.
Article in English | MEDLINE | ID: mdl-25244105

ABSTRACT

There is an ever increasing resource in terms of both structural information and activity data for many protein targets. In this paper we describe OOMMPPAA, a novel computational tool designed to inform compound design by combining such data. OOMMPPAA uses 3D matched molecular pairs to generate 3D ligand conformations. It then identifies pharmacophoric transformations between pairs of compounds and associates them with their relevant activity changes. OOMMPPAA presents this data in an interactive application providing the user with a visual summary of important interaction regions in the context of the binding site. We present validation of the tool using openly available data for CDK2 and a GlaxoSmithKline data set for a SAM-dependent methyl-transferase. We demonstrate OOMMPPAA's application in optimizing both potency and cell permeability and use OOMMPPAA to highlight nuanced and cross-series SAR. OOMMPPAA is freely available to download at http://oommppaa.sgc.ox.ac.uk/OOMMPPAA/ .


Subjects
Cyclin-Dependent Kinase 2/antagonists & inhibitors, Enzyme Inhibitors/chemistry, Methyltransferases/antagonists & inhibitors, Small Molecule Libraries/chemistry, Software, Binding Sites, Cyclin-Dependent Kinase 2/chemistry, Drug Design, Enzyme Inhibitors/chemical synthesis, Humans, Ligands, Methyltransferases/chemistry, Molecular Docking Simulation, Protein Binding, Quantitative Structure-Activity Relationship, S-Adenosylmethionine/chemistry, Small Molecule Libraries/chemical synthesis
10.
J Comput Aided Mol Des ; 27(4): 321-36, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23615761

ABSTRACT

We describe the QSAR Workbench, a system for the building and analysis of QSAR models. The system is built around the Pipeline Pilot workflow tool and provides access to a variety of model building algorithms for both continuous and categorical data. Traditionally, models are built on a one-by-one basis, and fully exploring the model space of algorithms and descriptor subsets is a time-consuming task. The QSAR Workbench provides a framework to allow for multiple models to be built over a number of modeling algorithms, descriptor combinations and data splits (training and test sets). Methods to analyze and compare models are provided, enabling the user to select the most appropriate model. The Workbench provides a consistent set of routines for data preparation and chemistry normalization that are also applied for predictions. The Workbench provides a large degree of automation with the ability to publish preconfigured model building workflows for a variety of problem domains, whilst providing experienced users full access to the underlying parameterization if required. Methods are provided to allow for publication of selected models as web services, thus providing integration with the chemistry desktop. We describe the design and implementation of the QSAR Workbench and demonstrate its utility through application to two public domain datasets.


Subjects
Drug Design, Biological Models, Quantitative Structure-Activity Relationship, Algorithms, Pharmaceutical Databases, Humans, Workflow
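
The idea of building models over every combination of algorithm, descriptor set and data split can be sketched in a few lines. The function and the dictionary keys below are hypothetical and are not part of the QSAR Workbench or of Pipeline Pilot; the sketch only shows how such a model space might be enumerated before each specification is trained and compared.

```python
from itertools import product


def enumerate_model_specs(algorithms, descriptor_sets, splits):
    """List one model specification per (algorithm, descriptor set, split) combination."""
    return [
        {"algorithm": algorithm, "descriptors": descriptors, "split": split}
        for algorithm, descriptors, split in product(algorithms, descriptor_sets, splits)
    ]
```

The size of the resulting list is the product of the three input lengths, which is why exploring the model space one model at a time quickly becomes impractical.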
11.
J Cheminform ; 13(1): 13, 2021 Feb 22.
Article in English | MEDLINE | ID: mdl-33618772

ABSTRACT

Malaria is a disease affecting hundreds of millions of people across the world, mainly in developing countries and especially in sub-Saharan Africa. It is the cause of hundreds of thousands of deaths each year and there is an ever-present need to identify and develop effective new therapies to tackle the disease and overcome increasing drug resistance. Here, we extend a previous study in which a number of partners collaborated to develop a consensus in silico model that can be used to identify novel molecules that may have antimalarial properties. The performance of machine learning methods generally improves with the number of data points available for training. One practical challenge in building large training sets is that the data are often proprietary and cannot be straightforwardly integrated. Here, this was addressed by sharing QSAR models, each built on a private data set. We describe the development of an open-source software platform for creating such models, a comprehensive evaluation of methods to create a single consensus model and a web platform called MAIP available at https://www.ebi.ac.uk/chembl/maip/ . MAIP is freely available for the wider community to make large-scale predictions of potential malaria inhibiting compounds. This project also highlights some of the practical challenges in reproducing published computational methods and the opportunities that open-source software can offer to the community.

12.
J Med Chem ; 63(20): 11964-11971, 2020 10 22.
Article in English | MEDLINE | ID: mdl-32955254

ABSTRACT

Machine learning approaches promise to accelerate and improve success rates in medicinal chemistry programs by more effectively leveraging available data to guide molecular design. A key step of an automated computational design algorithm is molecule generation, where the machine is required to design high-quality, drug-like molecules within the appropriate chemical space. Many algorithms have been proposed for molecular generation; however, a challenge is how to assess the validity of the resulting molecules. Here, we report three Turing-inspired tests designed to evaluate the performance of molecular generators. Profound differences were observed between the performance of molecule generators in these tests, highlighting the importance of selecting the appropriate design algorithms for specific circumstances. One molecule generator, based on matched molecular pairs, performed excellently against all tests and thus provides a valuable component for machine-driven medicinal chemistry design workflows.


Subjects
Algorithms, Machine Learning, Pharmaceutical Chemistry, Drug Design, Humans, Molecular Structure
14.
SLAS Discov ; 23(6): 532-545, 2018 07.
Article in English | MEDLINE | ID: mdl-29699447

ABSTRACT

High-throughput screening (HTS) hits include compounds with undesirable properties. Many filters have been described to identify such hits. Notably, pan-assay interference compounds (PAINS) has been adopted by the community as the standard term to refer to such filters, and very useful guidelines adopted by the American Chemical Society (ACS) subsequently triggered a healthy scientific debate about the pitfalls of draconian use of filters. Using an inhibitory frequency index, we have analyzed in detail the promiscuity profile of the whole GlaxoSmithKline (GSK) HTS collection comprising more than 2 million unique compounds that have been tested in hundreds of screening assays. We provide a comprehensive analysis of many previously published filters and newly described classes of nuisance structures that may serve as a useful source of empirical information to guide the design or growth of HTS collections and hit triaging strategies.


Subjects
Drug Discovery/methods, High-Throughput Screening Assays/methods, Small Molecule Libraries/chemistry, Biological Assay/methods
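
A promiscuity measure of the kind named in this abstract might be computed as in the sketch below. The abstract does not define the inhibitory frequency index, so the form used here (the percentage of tested assays in which a compound's inhibition meets a threshold) and the default 50% threshold are assumptions for illustration; the function name is hypothetical.

```python
def inhibitory_frequency_index(responses, threshold=50.0):
    """Percentage of tested assays in which inhibition is at or above the threshold.

    responses: percent-inhibition values, one per assay; None marks an assay
               in which the compound was not tested.
    Returns None when the compound was not tested in any assay.
    """
    tested = [r for r in responses if r is not None]
    if not tested:
        return None
    hits = sum(1 for r in tested if r >= threshold)
    return 100.0 * hits / len(tested)
```

For example, a compound tested in four assays and active in two of them would score 50 under this definition, flagging it for closer inspection as a possible nuisance structure.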
16.
Acta Crystallogr D Struct Biol ; 73(Pt 3): 279-285, 2017 03 01.
Article in English | MEDLINE | ID: mdl-28291763

ABSTRACT

In this work, two freely available web-based interactive computational tools that facilitate the analysis and interpretation of protein-ligand interaction data are described. Firstly, WONKA, which assists in uncovering interesting and unusual features (for example residue motions) within ensembles of protein-ligand structures and enables the facile sharing of observations between scientists. Secondly, OOMMPPAA, which incorporates protein-ligand activity data with protein-ligand structural data using three-dimensional matched molecular pairs. OOMMPPAA highlights nuanced structure-activity relationships (SAR) and summarizes available protein-ligand activity data in the protein context. In this paper, the background that led to the development of both tools is described. Their implementation is outlined and their utility using in-house Structural Genomics Consortium (SGC) data sets and openly available data from the PDB and ChEMBL is described. Both tools are freely available to use and download at http://wonka.sgc.ox.ac.uk/WONKA/ and http://oommppaa.sgc.ox.ac.uk/OOMMPPAA/.


Subjects
Computer-Aided Design, Drug Design, Proteins/metabolism, Software, Binding Sites, Protein Databases, Humans, Ligands, Molecular Docking Simulation, Protein Binding, Proteins/chemistry, Structure-Activity Relationship
17.
Drug Discov Today ; 21(10): 1719-1727, 2016 10.
Article in English | MEDLINE | ID: mdl-27423371

ABSTRACT

In an attempt to seek increased understanding of compound attributes that influence successful drug pipeline progression, GlaxoSmithKline's portfolio of oral candidates was compared with reference sets of marketed oral drugs. The approach differs from other attrition studies by explicitly focusing on choosing 'the right compound' by applying relevant, experimentally derived properties. The analysis led to four proposed compound quality categories, created by combining specific criteria for three measures: dose, solubility and the property forecast index, a composite measure of lipophilicity using chromatographically determined LogD and aromaticity. The 'three properties' provide benchmarked guidelines for project teams to use when seeking and selecting clinical candidates, because they reflect the property distribution of marketed oral drugs.


Subjects
Drug Discovery, Oral Administration, Animals, Humans, Hydrophobic and Hydrophilic Interactions, Pharmaceutical Preparations/administration & dosage, Pharmaceutical Preparations/chemistry, Pharmaceutical Preparations/metabolism, Solubility
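
The property forecast index mentioned in this abstract is described only as a composite of chromatographic LogD and aromaticity. The sketch below assumes the additive form commonly given in the wider literature, chromatographic LogD at pH 7.4 plus the number of aromatic rings; that formula is not stated in the abstract itself, and the function name is hypothetical. No category thresholds are included because the abstract does not give them.

```python
def property_forecast_index(chrom_logd, n_aromatic_rings):
    """Return chromatographic LogD plus aromatic ring count (assumed additive form)."""
    if n_aromatic_rings < 0:
        raise ValueError("aromatic ring count cannot be negative")
    return chrom_logd + n_aromatic_rings
```

Under this assumed form, a compound with a chromatographic LogD of 3.5 and two aromatic rings has an index of 5.5, so either lowering lipophilicity or removing an aromatic ring reduces the value.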
18.
J Med Chem ; 45(23): 5069-80, 2002 Nov 07.
Article in English | MEDLINE | ID: mdl-12408718

ABSTRACT

Deriving quantitative structure-activity relationship (QSAR) models that are accurate, reliable, and easily interpretable is a difficult task. In this study, two new methods have been developed that aim to find useful QSAR models that represent an appropriate balance between model accuracy and complexity. Both methods are based on genetic programming (GP). The first method, referred to as genetic QSAR (or GPQSAR), uses a penalty function to control model complexity. GPQSAR is designed to derive a single linear model that represents an appropriate balance between the variance and the number of descriptors selected for the model. The second method, referred to as multiobjective genetic QSAR (MoQSAR), is based on multiobjective GP and represents a new way of thinking of QSAR. Specifically, QSAR is considered as a multiobjective optimization problem that comprises a number of competitive objectives. Typical objectives include model fitting, the total number of terms, and the occurrence of nonlinear terms. MoQSAR results in a family of equivalent QSAR models where each QSAR represents a different tradeoff in the objectives. A practical consideration often overlooked in QSAR studies is the need for the model to promote an understanding of the biochemical response under investigation. To accomplish this, chemically intuitive descriptors are needed but do not always give rise to statistically robust models. This problem is addressed by the addition of a further objective, called chemical desirability, that aims to reward models that consist of descriptors that are easily interpretable by chemists. GPQSAR and MoQSAR have been tested on various data sets including the Selwood data set and two different solubility data sets. The study demonstrates that the MoQSAR method is able to find models that are at least as good as models derived using standard statistical approaches and also yields models that allow a medicinal chemist to trade statistical robustness for chemical interpretability.


Subjects
Quantitative Structure-Activity Relationship, Algorithms, Drug Design, Statistical Models
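
The "family of equivalent models" produced by a multiobjective search corresponds to a set of non-dominated solutions. The sketch below filters candidate models down to that set, assuming every objective is to be minimized (for example, fitting error and number of terms). It illustrates Pareto dominance only and is not the genetic programming procedure used in MoQSAR; the function names and the `"objectives"` key are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(models):
    """Keep the models that no other model dominates.

    models: list of dicts, each with an "objectives" tuple of values to minimize.
    """
    return [
        m
        for m in models
        if not any(dominates(other["objectives"], m["objectives"]) for other in models)
    ]
```

Each surviving model represents a different tradeoff: none can be improved on one objective without getting worse on another, which is what lets a chemist choose between, say, a slightly less accurate but simpler model and a more accurate but more complex one.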
19.
Prog Med Chem ; 41: 61-97, 2003.
Article in English | MEDLINE | ID: mdl-12774691

ABSTRACT

Virtual screening of virtual libraries (VSVL) is a rapidly changing area of research. Great efforts are being made to produce better algorithms, selection methods and infrastructure. Yet, the number of successful examples in the literature is not impressive, although the quality of work certainly is high. Why is this? One reason is that these methods tend to be applied at the lead generation stage and therefore there is a large lead-time before successful examples appear in the literature. However, any computational chemist would confirm that these methods are successful and there exists a glut of start-up companies specialising in virtual screening. Moreover, the scientific community would not be focussing so much attention on this area if it were not yielding results. Even so, the paucity of literature data is certainly a hindrance to the development of better methods. The VSVL process is unique within the discovery process, in that it is the only method that can screen the >10^30 genuinely novel molecules out there. Already, some VSVL methods are evaluating 10^13 compounds, a capacity that high throughput screening can only dream of. There is a huge potential advantage for the company that develops efficient and effective methods for lead generation, lead hopping and optimization of both potency and ADME properties. To do this requires more than the software; it requires confidence to exploit the methodology, to commit synthesis on the basis of it, and to build this approach into the medicinal chemistry strategy. It is a fact that these tools remain quite daunting for the majority of scientists working at the bench. The routine use of these methods is not simply a matter of education and training. Integration of these methods into accessible and robust end user software, without dilution of the science, must be a priority. We have reached a coincidence, where several technologies have the required level of maturity: predictive computational chemistry methods, algorithms that manage the combinatorial explosion, high throughput crystallography and ADME measurements, and the massive increase in computational horsepower from distributed computing. The author is confident that the synergy of these technologies will bring great benefit to the industry, with more efficient production of higher quality clinical candidates. The future is bright. The future is virtual!


Subjects
Combinatorial Chemistry Techniques/methods, Drug Design, Peptide Library, Algorithms, Chemical Phenomena, Physical Chemistry, Preclinical Drug Evaluation/methods, Humans, Structure-Activity Relationship
20.
J Mol Graph Model ; 20(6): 491-8, 2002 Jun.
Article in English | MEDLINE | ID: mdl-12071283

ABSTRACT

When designing a combinatorial library it is usually desirable to optimise multiple properties of the library simultaneously and often the properties are in competition with one another. For example, a library that is designed to be focused around a given target molecule should ideally have minimum cost and also contain molecules that are bioavailable. In this paper, we describe the program MoSELECT for multiobjective library design that is based on a multiobjective genetic algorithm (MOGA). MoSELECT searches the product-space of a virtual combinatorial library to generate a family of equivalent solutions where each solution represents a combinatorial subset of the virtual library optimised over multiple objectives. The family of solutions allows the relationships between the objectives to be explored and thus enables the library designer to make an informed choice on an appropriate compromise solution. Experiments are reported where MoSELECT has been applied to the design of various focused libraries.


Subjects
Algorithms, Combinatorial Chemistry Techniques, Drug Design, Amides/chemistry, DNA Helicases/antagonists & inhibitors, Molecular Structure, Peptide Library, Quantitative Structure-Activity Relationship, Software, Thiazoles/chemistry