ABSTRACT
INTRODUCTION: Artificial intelligence (AI) has seen a massive resurgence in recent years with wide successes in computer vision, natural language processing, and games. The similar creation of robust and accurate AI models for ADME/Tox endpoint and activity prediction would be revolutionary to drug discovery pipelines. There have been numerous demonstrations of successful applications, but a key challenge remains: how generalizable are these predictive models? AREAS COVERED: The authors present a summary of current promising components of AI models in the context of early drug discovery where ADME/Tox endpoint and activity prediction is the main driver of the iterative drug design process. Following that is a review of applicability domains and dataset construction considerations which determine generalizability bottlenecks for AI deployment. Further reviewed is the role of promising learning frameworks - multitask, transfer, and meta learning - which leverage auxiliary data to overcome issues of generalizability. EXPERT OPINION: The authors conclude that the most promising direction toward integrating reliable and informative AI models into the drug discovery pipeline is a conjunction of learned feature representations, deep learning, and novel learning frameworks. Such a solution would address the sparse and incomplete datasets that are available for key endpoints related to drug discovery.
Subjects
Artificial Intelligence, Drug Design, Drug Discovery, Humans
ABSTRACT
Antagonism of CCR9 is a promising mechanism for treatment of inflammatory bowel disease, including ulcerative colitis and Crohn's disease. There is limited experimental data on CCR9 and its ligands, complicating efforts to identify new small molecule antagonists. We present here results of a successful virtual screening and rational hit-to-lead campaign that led to the discovery and initial optimization of novel CCR9 antagonists. This work uses a novel data fusion strategy to integrate the output of multiple computational tools, such as 2D similarity search, shape similarity, pharmacophore searching, and molecular docking, as well as the identification and incorporation of privileged chemokine fragments. The application of various ranking strategies, which combined consensus and parallel selection methods to achieve a balance of enrichment and novelty, resulted in 198 virtual screening hits in total, with an overall hit rate of 18%. Several hits were developed into early leads through targeted synthesis and purchase of analogs.
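The abstract does not say how the parallel selection step was implemented. As a minimal sketch only, one common form of parallel selection takes the top-ranked compounds from each method in turn until the desired number of unique compounds is reached. The function name `parallel_selection`, the compound identifiers, and the three example rankings are hypothetical and are not taken from the study.

```python
def parallel_selection(ranked_lists, n_select):
    """Pick compounds round-robin from several ranked lists (best first).

    Assumption: each list comes from one method (e.g. 2D similarity,
    shape, docking) and is already sorted from best to worst.
    """
    selected = []
    seen = set()
    max_depth = max((len(r) for r in ranked_lists), default=0)
    for depth in range(max_depth):
        for ranked in ranked_lists:
            if depth >= len(ranked):
                continue
            compound = ranked[depth]
            if compound not in seen:
                seen.add(compound)
                selected.append(compound)
                if len(selected) == n_select:
                    return selected
    return selected


# Hypothetical rankings from three methods.
by_similarity = ["c1", "c2", "c3"]
by_shape = ["c4", "c1", "c5"]
by_docking = ["c2", "c6", "c7"]

picks = parallel_selection([by_similarity, by_shape, by_docking], 4)
print(picks)
```

Because every method contributes its own top candidates, this style of selection tends to favor novelty, whereas consensus ranking favors compounds that several methods agree on.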
Subjects
Computer Simulation, Molecular Docking Simulation/methods, Receptors, CCR/agonists, Drug Discovery/methods, Drug Evaluation, Preclinical/methods, High-Throughput Screening Assays/methods, Ligands, Molecular Structure, Principal Component Analysis, Receptors, CXCR4/agonists, Receptors, G-Protein-Coupled/metabolism, Structure-Activity Relationship
ABSTRACT
Protein biosynthesis and extracellular secretion are essential biological processes for therapeutic protein production in mammalian cells, which offer the capacity for correct folding and proper post-translational modifications. In this study, we have generated bispecific therapeutic fusion proteins in mammalian cells by combining a peptide and an antibody into a single open reading frame. A neutralizing peptide directed against interleukin-17A (IL17A) was genetically fused to the N termini of an anti-IL22 antibody, through either the light chain, the heavy chain, or both chains. Although the resulting fusion proteins bound and inhibited IL22 with the same affinity and potency as the unmodified anti-IL22 antibody, the peptide modality in the fusion scaffold was not active in the cell-based assay due to N-terminal degradation. When a glutamine residue, which can be cyclized to form pyroglutamate in mammalian cells, was introduced at the N terminus, the IL17A neutralization activity of the fusion protein was restored. Interestingly, mass spectrometric analysis of the purified fusion protein revealed an unexpected O-linked glycosylation modification at threonine 5 of the anti-IL17A peptide. The subsequent removal of this post-translational modification by site-directed mutagenesis drastically enhanced the IL17A binding affinity and neutralization potency of the resulting fusion protein. These results provide direct experimental evidence that post-translational modifications during protein biosynthesis along secretory pathways play critical roles in determining the structure and function of therapeutic proteins produced by mammalian cells. The newly engineered peptide-antibody genetic fusion is promising for therapeutically targeting multiple antigens in a single antibody-like molecule.
Subjects
Antibodies, Bispecific/genetics, Interleukin-17/immunology, Interleukins/immunology, Polysaccharides/chemistry, Pyrrolidonecarboxylic Acid/chemistry, Amino Acid Sequence, Chromatography, Liquid, Enzyme-Linked Immunosorbent Assay, HEK293 Cells, Humans, Mass Spectrometry, Molecular Sequence Data, Mutagenesis, Site-Directed, Protein Processing, Post-Translational, Interleukin-22
ABSTRACT
An algorithm has been devised for the automatic design of peptide turn mimetics, particularly applicable to peptide-activated GPCRs. The method is based on flexible alignments using a new design paradigm and scoring system that aims to reduce the molecular weight of the compound and preferentially lead to drug-like molecules. The process can be applied either as a de novo design tool or as a virtual screening tool. Its use has been demonstrated by the design of novel double-digit nanomolar ligands for the melanocortin 4 receptor (MC4). The method is, in principle, applicable to any type of receptor, including orphan receptors.
Subjects
Drug Design, Molecular Mimicry, Peptides/chemistry, Receptor, Melanocortin, Type 4, Algorithms, Cell Line, Chromatography, High Pressure Liquid, Computers, Molecular, Humans, Ligands, Peptide Library, Peptides/metabolism, Pyrrolidines/chemistry, Receptor, Melanocortin, Type 4/chemistry, Receptor, Melanocortin, Type 4/metabolism, User-Computer Interface
ABSTRACT
Rapid overlay of chemical structures (ROCS) is a method that aligns molecules based on shape and/or chemical similarity. It is often used in 3D ligand-based virtual screening. Given a query consisting of a single conformation of an active molecule, ROCS can generate highly enriched hit lists. Typically, the chosen query conformation is a minimum-energy structure. Can better enrichment be obtained using conformations other than the minimum-energy structure? To answer this question, a methodology called CORAL (COnformational analysis, Rocs ALignment) has been developed. For a given set of molecule conformations, it computes optimized conformations for ROCS screening. It does so by clustering all conformations of a chosen molecule set using pairwise ROCS combo scores. The best representative conformation is the one with the highest average overlap with the rest of the conformations in the cluster. These best representative conformations are then used for virtual screening. CORAL was tested by performing virtual screening experiments with the 40 DUD (Directory of Useful Decoys) data sets. Both CORAL and minimum-energy queries were used. The recognition capability of each query was quantified as the area under the ROC curve (AUC). Results show that the CORAL AUC values are on average larger than the minimum-energy AUC values. This demonstrates that one can indeed obtain better ROCS enrichments with conformations other than the minimum-energy structure. As a result, CORAL analysis can be a valuable first step in virtual screening workflows using ROCS.
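The representative-selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the CORAL implementation: the pairwise score matrix holds toy values standing in for ROCS combo scores, the cluster membership is given rather than computed, and the name `best_representative` is hypothetical.

```python
def best_representative(cluster, score):
    """Return the member of `cluster` with the highest mean pairwise
    score to the other members of the same cluster.

    cluster: list of conformation indices
    score:   symmetric matrix, score[i][j] = similarity of i and j
    """
    if len(cluster) == 1:
        return cluster[0]

    def mean_overlap(i):
        others = [j for j in cluster if j != i]
        return sum(score[i][j] for j in others) / len(others)

    return max(cluster, key=mean_overlap)


# Toy pairwise scores for four conformations (illustrative values only).
scores = [
    [2.0, 1.6, 1.2, 0.4],
    [1.6, 2.0, 1.5, 0.5],
    [1.2, 1.5, 2.0, 0.3],
    [0.4, 0.5, 0.3, 2.0],
]

# Assume a clustering step has already grouped conformations 0, 1 and 2.
representative = best_representative([0, 1, 2], scores)
print(representative)
```

In this toy cluster, conformation 1 has the highest mean score to the other two members, so it would be the query used for screening.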
Subjects
Drug Design, Pharmaceutical Preparations/chemistry, Ligands, Molecular Conformation, Molecular Structure, ROC Curve, Software, Workflow
ABSTRACT
The root-mean-squared deviation (rmsd) is a widely used measure of distance between two aligned objects, often chemical structures. However, rmsd has a number of known limitations, including difficulty of interpretation, no limit on weighting for any portion of the alignment, and a lack of normalization. In this work, a Generally Applicable Replacement for rmsD (GARD) is proposed. In this implementation, atomic contributions are weighted by their relative importance to binding, as determined statistically by Andrews et al. (1), and as such this method is 'chemically aware'. This novel measure is normalized and does not have many of the failings of traditional rmsd. It is, thus, well suited for a wide variety of uses, including the assessment of the quality of poses produced by molecular docking programs and the comparison of conformers. Rmsd and GARD are compared in their ability to assess docking software, and multiple examples of the use of GARD to rescue essentially correct poses with a high rmsd are presented.
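To make the limitations of rmsd concrete, here is a minimal sketch of the conventional, unweighted rmsd between two already-aligned coordinate sets. It does not implement GARD, whose weighting scheme is not given in the abstract. The coordinates are made-up values in ångströms and the function name `rmsd` is chosen for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Unweighted rmsd between two aligned lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


# Three atoms overlay exactly; the fourth is displaced by 4 units along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (8.5, 0.0, 0.0)]

value = rmsd(reference, pose)
print(value)
```

A single displaced atom is enough to raise the value to 2.0 even though the other three atoms overlay exactly, which illustrates why an unbounded, unnormalized measure can flag an essentially correct pose as poor.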
Subjects
Algorithms, Proteins/metabolism, Animals, ErbB Receptors/chemistry, ErbB Receptors/metabolism, Glutathione Transferase/chemistry, Glutathione Transferase/metabolism, HIV Protease/chemistry, HIV Protease/metabolism, Humans, Ligands, Lymphocyte Specific Protein Tyrosine Kinase p56(lck)/chemistry, Lymphocyte Specific Protein Tyrosine Kinase p56(lck)/metabolism, Mice, Models, Molecular, Neuraminidase/chemistry, Neuraminidase/metabolism, Protein Binding, Protein Conformation, Proteins/chemistry, Viral Core Proteins/chemistry, Viral Core Proteins/metabolism
ABSTRACT
Molecular docking programs are widely used modeling tools for predicting ligand binding modes and for structure-based virtual screening. In this study, six molecular docking programs (DOCK, FlexX, GLIDE, ICM, PhDOCK, and Surflex) were evaluated using metrics intended to assess docking pose and virtual screening accuracy. Cognate ligand docking to 68 diverse, high-resolution X-ray complexes revealed that ICM, GLIDE, and Surflex generated ligand poses close to the X-ray conformation more often than the other docking programs. GLIDE and Surflex also outperformed the other docking programs when used for virtual screening, based on mean ROC AUC and ROC enrichment values obtained for the 40 protein targets in the Directory of Useful Decoys (DUD). Further analysis uncovered general trends in accuracy that are specific for particular protein families. Modifying basic parameters in the software was shown to have a significant effect on docking and virtual screening results, suggesting that expert knowledge is critical for optimizing the accuracy of these methods.
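The ROC AUC metric mentioned above can be computed directly from a ranked list of actives and decoys. The sketch below uses the pairwise (Mann-Whitney) formulation and assumes that a higher score means a better-ranked compound; many docking programs report energies where lower is better, in which case the scores would need to be negated first. The scores and labels are invented for illustration.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (active, decoy) pairs in which the
    active is scored higher; ties count as half.

    scores: one score per compound (assumption: higher = better)
    labels: 1/True for actives, 0/False for decoys
    """
    actives = [s for s, is_active in zip(scores, labels) if is_active]
    decoys = [s for s, is_active in zip(scores, labels) if not is_active]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives
        for d in decoys
    )
    return wins / (len(actives) * len(decoys))


# Hypothetical screening result: two actives among five compounds.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
example_labels = [1, 0, 1, 0, 0]

auc = roc_auc(example_scores, example_labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to every active being ranked above every decoy.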
Subjects
Drug Evaluation, Preclinical/methods, Models, Molecular, User-Computer Interface, Crystallography, X-Ray, Ligands, Molecular Conformation, Proteins/chemistry, Proteins/metabolism, ROC Curve
ABSTRACT
This paper describes the application of de novo design utilizing exclusively ligand information. In the current approach, ligand design criteria, including pharmacophores, similarity and desired properties are applied as part of a fitness function driving the design process, instead of using them as filters after the process. This allows relevant parts of chemical space to be explored more efficiently. Two case studies of successful ligand design are also presented, one aimed at scaffold hopping, the other for exploring substitution patterns around a novel scaffold.
Subjects
Computer-Aided Design, Drug Design, Algorithms, Humans, Ligands, Neurotransmitter Uptake Inhibitors, Receptors, Gonadotropin
ABSTRACT
Structure-based lead optimization approaches are increasingly playing a role in the drug-discovery process. Recent advances in 'high-throughput' molecular docking methods and examples of their successful use in lead optimization are reviewed. Measures of docking accuracy, scoring function comparisons, and consensus approaches are discussed. Differences in docking protocols typically used for lead optimization versus lead generation are highlighted; this section includes a discussion of the latest methods for the incorporation of protein flexibility. New approaches developed specifically for the design of combinatorial libraries as well as those designed or used for 'fragment' versus lead optimization are presented. Finally, potential future improvements to the technology are outlined.
Subjects
Combinatorial Chemistry Techniques, Computer-Aided Design, Drug Design, Pharmaceutical Preparations/chemistry, Proteins/chemistry, Technology, Pharmaceutical/methods, Binding Sites, Computer Simulation, Ligands, Models, Chemical, Models, Molecular, Molecular Structure, Pharmaceutical Preparations/metabolism, Protein Binding, Protein Conformation, Proteins/metabolism, Structure-Activity Relationship
ABSTRACT
This paper describes the development of a set of new 2D fingerprints for the purposes of virtual screening in a pharmaceutical environment. The new fingerprints are based on established ones: the changes in their design included the introduction of overlapping pharmacophore feature types, feature counts for pharmacophore and structural fingerprints, as well as changes in the resolution in property description for property fingerprints. The effects of each of these changes on virtual screening performance were monitored using two types of training sets, emulating different stages in the drug discovery process. The results demonstrate that these changes all lead to an improvement in virtual screening performance.
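The abstract does not state which similarity measure was used with the new fingerprints. Purely as an assumed illustration of why feature counts can matter, the sketch below compares a binary Tanimoto coefficient with a common count-based generalization (sum of minima over sum of maxima). The feature names and counts are hypothetical.

```python
def tanimoto_binary(fp_a, fp_b):
    """Tanimoto coefficient on feature presence only."""
    keys_a = {k for k, count in fp_a.items() if count > 0}
    keys_b = {k for k, count in fp_b.items() if count > 0}
    union = keys_a | keys_b
    if not union:
        return 0.0
    return len(keys_a & keys_b) / len(union)


def tanimoto_counts(fp_a, fp_b):
    """Count-based Tanimoto: sum of minima divided by sum of maxima."""
    keys = set(fp_a) | set(fp_b)
    total_max = sum(max(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    if total_max == 0:
        return 0.0
    total_min = sum(min(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    return total_min / total_max


# Two hypothetical molecules with the same feature types but different counts.
mol_a = {"aromatic_ring": 3, "hbond_donor": 1}
mol_b = {"aromatic_ring": 1, "hbond_donor": 1}

binary_similarity = tanimoto_binary(mol_a, mol_b)
count_similarity = tanimoto_counts(mol_a, mol_b)
print(binary_similarity, count_similarity)
```

The binary comparison treats the two molecules as identical, while the count-based comparison distinguishes them, which is one plausible way that adding feature counts could change screening results.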
Subjects
Chemistry, Pharmaceutical/methods, Drug Evaluation, Preclinical/methods, Ligands, Technology, Pharmaceutical/methods, Chemistry/methods, Combinatorial Chemistry Techniques, Computer Simulation, Databases, Factual, Drug Design, Models, Chemical, Quantitative Structure-Activity Relationship, ROC Curve, Receptors, G-Protein-Coupled/chemistry
ABSTRACT
A new consensus approach has been developed for ligand-based virtual screening. It involves combining highly disparate properties in order to improve performance in virtual screening. The properties include structural, 2D pharmacophore and property-based fingerprints, scores derived using BCUT descriptors, and 3D pharmacophore approaches. Different approaches for the combination of all or some of these methods have been tested. Logistic regression and sum ranks were found to be the most advantageous in different pharmaceutical applications. The three major reasons consensus scoring appears to enrich data sets better than single scoring functions are (1) using multiple scoring functions is similar to repeated samplings, in which case the mean is closer to the true value than any single value, (2) due to the better clustering of actives, multiple sampling will recover more actives than inactives, and (3) different methods seem to agree more on the ranking of the actives than on the inactives. Furthermore, consensus results are not only better but are also more consistent across receptor systems.
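The sum-rank combination named above can be sketched in a few lines. This is a generic illustration, not the authors' code: each method's scores are converted to ranks (1 = best, assuming higher scores are better, with ties broken by input order) and the ranks are summed per compound. The three score lists are invented.

```python
def ranks(scores):
    """Rank compounds within one method; rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def sum_rank(score_lists):
    """Sum each compound's rank across all methods (lower is better)."""
    per_method = [ranks(scores) for scores in score_lists]
    return [sum(column) for column in zip(*per_method)]


# Hypothetical scores for four compounds from three different methods.
structural_fp = [0.9, 0.2, 0.5, 0.7]
pharmacophore_2d = [0.6, 0.1, 0.8, 0.7]
property_fp = [0.8, 0.5, 0.3, 0.6]

consensus = sum_rank([structural_fp, pharmacophore_2d, property_fp])
best = min(range(len(consensus)), key=lambda i: consensus[i])
print(consensus, best)
```

Working on ranks rather than raw scores sidesteps the fact that the individual methods report values on different scales.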