ABSTRACT
Cytosolic sulfotransferases (SULTs) are a family of enzymes responsible for the sulfation of small endogenous and exogenous compounds. SULTs contribute to the conjugation phase of metabolism and share substrates with the uridine 5'-diphospho-glucuronosyltransferase (UGT) family of enzymes. UGTs are considered to be the most important enzymes in the conjugation phase, and SULTs are an auxiliary enzyme system to them. Understanding how the regioselectivity of SULTs differs from that of UGTs is essential from the perspective of developing novel drug candidates. We present a general ligand-based SULT model trained and tested using high-quality experimental regioselectivity data. The current study suggests that, unlike other metabolic enzymes in the modification and conjugation phases, the SULT regioselectivity is not strongly influenced by the activation energy of the rate-limiting step of the catalysis. Instead, the prominent role is played by the substrate binding site of SULT. Thus, the model is trained only on steric and orientation descriptors, which mimic the binding pocket of SULT. The resulting classification model, which predicts whether a site is metabolized, achieved a Cohen's kappa of 0.71.
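The abstract above reports classifier quality as Cohen's kappa, which corrects raw accuracy for the agreement expected by chance. As a minimal sketch of how that statistic is computed for a binary "metabolized / not metabolized" classifier, the function below works from the four cells of a confusion matrix. The function name and the example counts are illustrative assumptions, not data from the study.

```python
def cohens_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa for a binary confusion matrix (hypothetical helper)."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of sites where prediction matches experiment.
    observed = (tp + tn) / n
    # Chance agreement: product of the marginal rates, summed over both classes.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    expected = p_yes + p_no
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up counts for illustration only: 200 candidate sites.
    print(round(cohens_kappa(tp=40, fp=10, fn=10, tn=140), 3))
```

Kappa is useful here because non-metabolized sites usually far outnumber metabolized ones, so plain accuracy would look high even for a trivial model.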
Subjects
Sulfotransferases, Catalysis, Binding Sites, Sulfotransferases/chemistry, Sulfotransferases/metabolism
ABSTRACT
Unexpected metabolism could lead to the failure of many late-stage drug candidates or even the withdrawal of approved drugs. Thus, it is critical to predict and study the dominant routes of metabolism in the early stages of research. In this study, we describe the development and validation of a 'WhichEnzyme' model that accurately predicts the enzyme families most likely to be responsible for a drug-like molecule's metabolism. Furthermore, we combine this model with our previously published regioselectivity models for Cytochromes P450, Aldehyde Oxidases, Flavin-containing Monooxygenases, UDP-glucuronosyltransferases and Sulfotransferases (the most important Phase I and Phase II drug metabolising enzymes) and a 'WhichP450' model that predicts the Cytochrome P450 isoform(s) responsible for a compound's metabolism. The regioselectivity models are based on a mechanistic understanding of these enzymes' actions, and use quantum mechanical simulations with machine learning methods to accurately predict sites of metabolism and the resulting metabolites. We train a heuristic based on the outputs of the 'WhichEnzyme', 'WhichP450', and regioselectivity models to determine the most likely routes of metabolism and metabolites to be observed experimentally. Finally, we demonstrate that this combination delivers high sensitivity in identifying experimentally reported metabolites and higher precision than other methods for predicting in vivo metabolite profiles.
ABSTRACT
Animal pharmacokinetic (PK) data as well as human and animal in vitro systems are utilized in drug discovery to define the rate and route of drug elimination. Accurate prediction and mechanistic understanding of drug clearance and disposition in animals provide a degree of confidence for extrapolation to humans. In addition, prediction of in vivo properties can be used to improve design during drug discovery, help select compounds with better properties, and reduce the number of in vivo experiments. In this study, we generated machine learning models able to predict rat in vivo PK parameters and concentration-time PK profiles based on the molecular chemical structure and either measured or predicted in vitro parameters. The models were trained on internal in vivo rat PK data for over 3000 diverse compounds from multiple projects and therapeutic areas, and the predicted endpoints include clearance and oral bioavailability. We compared the performance of various traditional machine learning algorithms and deep learning approaches, including graph convolutional neural networks. The best models for PK parameters achieved R2 = 0.63 [root mean squared error (RMSE) = 0.26] for clearance and R2 = 0.55 (RMSE = 0.46) for bioavailability. The models provide a fast and cost-efficient way to guide the design of molecules with optimal PK profiles, to enable the prediction of virtual compounds at the point of design, and to drive prioritization of compounds for in vivo assays.
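The abstract above summarizes regression quality with R2 and RMSE. A minimal sketch of both metrics follows, assuming R2 means the coefficient of determination (the abstract does not say whether it is this or the squared Pearson correlation). The function names and the toy values are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy values for illustration only (for example, log-scaled clearance).
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [1.1, 1.9, 3.2, 3.8]
    print(round(r_squared(obs, pred), 3), round(rmse(obs, pred), 3))
```

RMSE is in the units of the endpoint, so the two reported RMSE values are only comparable if clearance and bioavailability were modelled on comparable scales.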
Subjects
Machine Learning, Biological Models, Animals, Biological Availability, Drug Discovery, Metabolic Clearance Rate, Pharmaceutical Preparations, Pharmacokinetics, Rats
ABSTRACT
We present a study based on density functional theory calculations to explore the rate-limiting steps of product formation for oxidation by Flavin-containing Monooxygenase (FMO) and glucuronidation by the UDP-glucuronosyltransferase (UGT) family of enzymes. FMOs are responsible for the modification phase of metabolism of a wide diversity of drugs, working in conjunction with the Cytochrome P450 (CYP) family of enzymes, and UGTs are the most important class of drug conjugation enzymes. Reactivity calculations are important for prediction of metabolism by CYPs, and reactivity alone explains around 70-85% of the experimentally observed sites of metabolism within CYP substrates. In the current work we extend this approach to propose model systems which can be used to calculate the activation energies, i.e. reactivity, for the rate-limiting steps for both FMO oxidation and glucuronidation of potential sites of metabolism. These results are validated by comparison with the experimentally observed reaction rates and sites of metabolism, indicating that the presented models are suitable to provide the basis of a reactivity component within generalizable models to predict either FMO or UGT metabolism.
Subjects
Cytochrome P-450 Enzyme System/metabolism, Glucuronosyltransferase/metabolism, Oxygenases/metabolism, Pharmaceutical Preparations/metabolism, Humans, Metabolic Inactivation, Biological Models, Molecular Models, Oxidation-Reduction, Pharmaceutical Preparations/chemistry
ABSTRACT
Predicting the sensory properties of compounds is challenging due to the subjective nature of the experimental measurements. This testing relies on a panel of human participants and is therefore also expensive and time-consuming. We describe the application of a state-of-the-art deep learning method, Alchemite™, to the imputation of sparse physicochemical and sensory data and compare the results with conventional quantitative structure-activity relationship methods and a multi-target graph convolutional neural network. The imputation model achieved a substantially higher accuracy of prediction, with improvements in R2 between 0.26 and 0.45 over the next best method for each sensory property. We also demonstrate that robust uncertainty estimates generated by the imputation model enable the most accurate predictions to be identified and that imputation also more accurately predicts activity cliffs, where small changes in compound structure result in large changes in sensory properties. In combination, these results demonstrate that the use of imputation, based on data from less expensive, early experiments, enables better selection of compounds for more costly studies, saving experimental time and resources.
Subjects
Deep Learning, Sensory Receptor Cells/physiology, Algorithms, Humans, Quantitative Structure-Activity Relationship, Uncertainty
ABSTRACT
The 12th International Society for the Study of Xenobiotics (ISSX) meeting, held in Portland, OR, USA from July 28 to 31, 2019, was attended by diverse members of the pharmaceutical sciences community. The ISSX New Investigators Group provides learning and professional growth opportunities for student and early career members of ISSX. To share meeting content with those who were unable to attend, the ISSX New Investigators herein elected to highlight the "Advances in the Study of Drug Metabolism" symposium, as it engaged attendees with diverse backgrounds. This session covered a wide range of current topics in drug metabolism research including predicting sites and routes of metabolism, metabolite identification, ligand docking, and medicinal and natural products chemistry, and highlighted approaches complemented by computational modeling. In silico tools have been increasingly applied in both academic and industrial settings, alongside traditional and evolving in vitro techniques, to strengthen and streamline pharmaceutical research. Approaches such as quantum mechanics simulations facilitate understanding of reaction energetics toward prediction of routes and sites of drug metabolism. Furthermore, in tandem with crystallographic and orthogonal wet lab techniques for structural validation of drug metabolizing enzymes, in silico models can aid understanding of substrate recognition by particular enzymes, identify metabolic soft spots and predict toxic metabolites for improved molecular design. Of note, integration of chemical synthesis and biosynthesis using natural products remains an important approach for identifying new chemical scaffolds in drug discovery. These subjects, compiled by the symposium organizers, presenters, and the ISSX New Investigators Group, are discussed in this review.
Subjects
Computational Biology, Drug Discovery, Xenobiotics, Congresses as Topic, Machine Learning, Pharmaceutical Preparations/chemistry, Pharmaceutical Preparations/metabolism, Quantum Theory
ABSTRACT
The acid dissociation constant (pKa) has an important influence on molecular properties crucial to compound development in synthesis, formulation, and optimization of absorption, distribution, metabolism, and excretion properties. We will present a method that combines quantum mechanical calculations, at a semi-empirical level of theory, with machine learning to accurately predict pKa for a diverse range of mono- and polyprotic compounds. The resulting model has been tested on two external data sets, one specifically used to test pKa prediction methods (SAMPL6) and the second covering known drugs containing basic functionalities. Both sets were predicted with excellent accuracy (root-mean-square errors of 0.7-1.0 log units), comparable to other methodologies using a much higher level of theory and computational cost.
Subjects
Quantum Theory, Solvents, Thermodynamics
ABSTRACT
Contemporary deep learning approaches still struggle to bring a useful improvement in the field of drug discovery because of the challenges of sparse, noisy, and heterogeneous data that are typically encountered in this context. We use a state-of-the-art deep learning method, Alchemite, to impute data from drug discovery projects, including multitarget biochemical activities, phenotypic activities in cell-based assays, and a variety of absorption, distribution, metabolism, and excretion (ADME) endpoints. The resulting model gives excellent predictions for activity and ADME endpoints, offering an average increase in R2 of 0.22 versus quantitative structure-activity relationship methods. The model accuracy is robust to combining data across uncorrelated endpoints and projects with different chemical spaces, enabling a single model to be trained for all compounds and endpoints. We demonstrate improvements in accuracy on the latest chemistry and data when updating models with new data as an ongoing medicinal chemistry project progresses.
Subjects
Deep Learning, Drug Discovery, Pharmaceutical Chemistry, Quantitative Structure-Activity Relationship
ABSTRACT
In the development of novel pharmaceuticals, the knowledge of how many, and which, Cytochrome P450 isoforms are involved in the phase I metabolism of a compound is important. Potential problems can arise if a compound is metabolised predominantly by a single isoform in terms of drug-drug interactions or genetic polymorphisms that would lead to variations in exposure in the general population. Combined with models of regioselectivities of metabolism by each isoform, such a model would also aid in the prediction of the metabolites likely to be formed by P450-mediated metabolism. We describe the generation of a multi-class random forest model to predict which, out of a list of the seven leading Cytochrome P450 isoforms, would be the major metabolising isoforms for a novel compound. The model has a 76% success rate with a top-1 criterion and an 88% success rate for a top-2 criterion and shows significant enrichment over randomised models.
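The top-1 and top-2 success rates quoted above can be illustrated with a short sketch. It assumes each compound has a set of experimentally determined major isoforms and that a prediction counts as a success when any of the k highest-scoring isoforms is in that set; the abstract does not spell out the exact criterion, so this is an assumption. The function name, isoform labels, and scores are illustrative only.

```python
from typing import Dict, List, Set


def top_k_success_rate(
    scores: List[Dict[str, float]],
    major_isoforms: List[Set[str]],
    k: int,
) -> float:
    """Fraction of compounds with a true major isoform among the top-k predictions."""
    if len(scores) != len(major_isoforms) or not scores:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for compound_scores, truth in zip(scores, major_isoforms):
        # Rank isoform labels from highest to lowest predicted score.
        ranked = sorted(compound_scores, key=compound_scores.get, reverse=True)
        if truth.intersection(ranked[:k]):
            hits += 1
    return hits / len(scores)


if __name__ == "__main__":
    # Made-up class scores for four compounds over three isoforms.
    predictions = [
        {"3A4": 0.6, "2D6": 0.3, "2C9": 0.1},
        {"3A4": 0.5, "2D6": 0.4, "2C9": 0.1},
        {"3A4": 0.2, "2D6": 0.3, "2C9": 0.5},
        {"3A4": 0.1, "2D6": 0.7, "2C9": 0.2},
    ]
    truths = [{"3A4"}, {"2D6"}, {"3A4"}, {"2D6"}]
    print(top_k_success_rate(predictions, truths, k=1))
    print(top_k_success_rate(predictions, truths, k=2))
```

With a random forest, the per-class scores would typically be the predicted class probabilities, and a randomised baseline can be scored with the same function to estimate enrichment.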
Subjects
Cytochrome P-450 Enzyme System/chemistry, Cytochrome P-450 Enzyme System/metabolism, Molecular Models, Area Under Curve, Cluster Analysis, Protein Databases, Drug Interactions, Molecular Structure, Protein Isoforms, Software, Structure-Activity Relationship
ABSTRACT
We describe methods for predicting cytochrome P450 (CYP) metabolism incorporating both pathway-specific reactivity and isoform-specific accessibility considerations. Semiempirical quantum mechanical (QM) simulations, parametrized using experimental data and ab initio calculations, estimate the reactivity of each potential site of metabolism (SOM) in the context of the whole molecule. Ligand-based models, trained using high-quality regioselectivity data, correct for orientation and steric effects of the different CYP isoform binding pockets. The resulting models identify a SOM in the top 2 predictions for between 82% and 91% of compounds in independent test sets across seven CYP isoforms. In addition to predicting the relative proportion of metabolite formation at each site, these methods estimate the activation energy at each site, from which additional information can be derived regarding their lability in absolute terms. We illustrate how this can guide the design of compounds to overcome issues with rapid CYP metabolism.
Subjects
Cytochrome P-450 Enzyme System/metabolism, Biological Models, Quantum Theory, Stereoisomerism, Substrate Specificity
ABSTRACT
All of the experimental compound data with which we work have significant uncertainties, due to imperfect correlations between experimental systems and the ultimate in vivo properties of compounds and the inherent variability in experimental conditions. When using these data to make decisions, it is essential that these uncertainties are taken into account to avoid making inappropriate decisions in the selection of compounds, which can lead to wasted effort and missed opportunities. In this paper we will consider approaches to rigorously account for uncertainties when selecting between compounds or assessing compounds against a property criterion; first for an individual measurement of a single property and then for multiple measurements of a property for the same compound. We will then explore how uncertainties in multiple properties can be combined when assessing compounds against a profile of criteria, a process known as multi-parameter optimisation. This guides rigorous decision-making using complex, uncertain data to focus on compounds with the best chance of success, while avoiding missed opportunities by inappropriately rejecting compounds.
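The ideas in the abstract above can be sketched numerically under one common simplifying assumption: that experimental errors are independent and Gaussian. The abstract does not state the error model used, so everything below (function names, the Gaussian assumption, the independence assumption when combining properties, and the example numbers) is illustrative rather than the authors' method.

```python
import math
from typing import Iterable


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_exceeds(measured: float, sd: float, threshold: float) -> float:
    """Probability that the true value is above a threshold, given Gaussian error."""
    return 1.0 - normal_cdf((threshold - measured) / sd)


def prob_a_better_than_b(a: float, sd_a: float, b: float, sd_b: float) -> float:
    """Probability that compound A's true value exceeds compound B's."""
    return normal_cdf((a - b) / math.sqrt(sd_a ** 2 + sd_b ** 2))


def sd_of_mean(sd: float, n_measurements: int) -> float:
    """Standard error after averaging n independent repeat measurements."""
    return sd / math.sqrt(n_measurements)


def prob_meets_profile(probabilities: Iterable[float]) -> float:
    """Joint probability of meeting every criterion, assuming independence."""
    return math.prod(probabilities)


if __name__ == "__main__":
    # Hypothetical pIC50 values with an assumed assay SD of 0.5 log units.
    print(round(prob_exceeds(6.0, 0.5, 5.0), 3))
    print(round(prob_a_better_than_b(6.0, 0.5, 5.5, 0.5), 3))
    print(sd_of_mean(0.5, 4))
```

Under these assumptions, a difference of 0.5 log units between two single measurements gives only about a three-in-four chance that the apparently better compound really is better, which is the kind of result that argues against hard cut-offs.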
Subjects
Statistical Data Interpretation, Decision Making, Drug Discovery/methods, Data Accuracy, Drug Discovery/statistics & numerical data, Metabolic Inactivation, Pharmacokinetics, Probability, Tissue Distribution, Uncertainty
ABSTRACT
We present a transferable MACE interatomic potential that is applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Including an accurate description of radical species extends the scope of possible applications to bond dissociation energy (BDE) prediction, for example, in the context of cytochrome P450 (CYP) metabolism. The transferability of the MACE potential was validated on the COMP6 data set, containing only closed-shell molecules, where it reaches better accuracy than the readily available general ANI-2x potential. MACE achieves similar accuracy on two CYP metabolism-specific data sets, which include open- and closed-shell structures. This model enables us to calculate the aliphatic C-H BDE, which allows us to compare reaction energies of hydrogen abstraction, which is the rate-limiting step of the aliphatic hydroxylation reaction catalyzed by CYPs. On the "CYP 3A4" data set, MACE achieves a BDE RMSE of 1.37 kcal/mol and better prediction of BDE ranks than alternatives: the semiempirical AM1 and GFN2-xTB methods and the ALFABET model that directly predicts bond dissociation enthalpies. Finally, we highlight the smoothness of the MACE potential over paths of sp3C-H bond elongation and show that a minimal extension is enough for the MACE model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. Altogether, this work lays the ground for further extensions of scope in terms of chemical elements, (CYP-mediated) reaction classes and modeling the full reaction paths, not only BDEs.
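The bond dissociation energy discussed above follows from three total energies: BDE = E(radical) + E(H atom) - E(parent molecule). The sketch below shows that arithmetic and a simple ranking of candidate C-H sites by BDE. It does not call MACE or any other potential; the energies, site labels, and function names are hypothetical placeholders, and treating the lowest BDE as the most labile site is a simplification that ignores binding-pocket effects.

```python
from typing import Dict, List

# Conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def bond_dissociation_energy(
    e_parent: float, e_radical: float, e_hydrogen: float
) -> float:
    """BDE in kcal/mol from total energies given in Hartree."""
    return (e_radical + e_hydrogen - e_parent) * HARTREE_TO_KCAL_PER_MOL


def rank_sites_by_bde(bde_by_site: Dict[str, float]) -> List[str]:
    """Site labels ordered from lowest to highest BDE."""
    return sorted(bde_by_site, key=bde_by_site.get)


if __name__ == "__main__":
    # Placeholder energies in Hartree; real values would come from the potential.
    print(round(bond_dissociation_energy(-100.0, -99.34, -0.5), 2))
    # Placeholder BDEs in kcal/mol for three aliphatic C-H sites.
    print(rank_sites_by_bde({"C1": 98.2, "C2": 92.5, "C3": 101.0}))
```

Because only the rank order of sites matters for regioselectivity, a systematic offset in the computed BDEs is less harmful than errors that reorder sites, which is why the abstract reports rank quality alongside RMSE.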
Subjects
Preclinical Drug Evaluation/methods, Drug-Related Side Effects and Adverse Reactions, Organ Culture Techniques/methods, Acute Toxicity Tests/methods, Animals, Drug Dose-Response Relationship, Heart/drug effects, Humans, Liver/drug effects, Pharmaceutical Preparations/administration & dosage, Pharmaceutical Preparations/blood, Pharmacokinetics
ABSTRACT
In this article, we discuss what we mean by 'design' and contrast this with the application of computational methods in drug discovery. We suggest that the predictivity of the computational models currently applied in drug discovery is not yet sufficient to permit a true design paradigm, as demonstrated by the large number of compounds that must currently be synthesised and tested to identify a successful drug. However, despite the uncertainties in predictions, computational methods have enormous potential value in narrowing the range of compounds to consider, by eliminating those that have negligible chance of being a successful drug, while focussing efforts on chemistries with the best likelihood of success. Applied appropriately, computational approaches can support decision-makers in achieving multi-parameter optimisation to guide the selection and design of compounds with the best chance of achieving an appropriate balance of properties for a drug discovery project's objectives. Finally, we consider some approaches that may contribute over the next 25 years to improve the accuracy and transferability of computational models in drug discovery and move towards a genuine design process.
Subjects
Computer Simulation/trends, Computer-Aided Design/trends, Drug Discovery/trends, Humans, Ligands, Molecular Models, Protein Binding
ABSTRACT
Unexpected metabolism in modification and conjugation phases can lead to the failure of many late-stage drug candidates or even withdrawal of approved drugs. Thus, it is critical to predict the sites of metabolism (SoM) for enzymes, which interact with drug-like molecules, in the early stages of the research. This study presents methods for predicting the isoform-specific metabolism for human AOs, FMOs, and UGTs and general CYP metabolism for preclinical species. The models use semi-empirical quantum mechanical simulations, validated using experimentally obtained data and DFT calculations, to estimate the reactivity of each SoM in the context of the whole molecule. Ligand-based models, trained and tested using high-quality regioselectivity data, combine the reactivity of the potential SoM with the orientation and steric effects of the binding pockets of the different enzyme isoforms. The resulting models achieve κ values of up to 0.94 and AUC of up to 0.92.
Subjects
Machine Learning, Humans, Ligands
ABSTRACT
A central challenge of antimalarial therapy is the emergence of resistance to the components of artemisinin-based combination therapies (ACTs) and the urgent need for new drugs acting through novel mechanisms of action. Over the last decade, compounds identified in phenotypic high throughput screens (HTS) have provided the starting point for six candidate drugs currently in the Medicines for Malaria Venture (MMV) clinical development portfolio. However, the published screening data which provided much of the new chemical matter for malaria drug discovery projects have been extensively mined. Here we present a new screening and selection cascade for generation of hit compounds active against the blood stage of Plasmodium falciparum. In addition, we validate our approach by testing a library of 141,786 compounds not reported earlier as being tested against malaria. The Hit Generation Library 1 (HGL1) was designed to maximise the chemical diversity and novelty of compounds with physicochemical properties associated with potential for further development. A robust HTS cascade containing orthogonal efficacy and cytotoxicity assays, including a newly developed and validated nanoluciferase-based assay, was used to profile the compounds. A total of 75 compounds (Screening Active hit rate of 0.05%) were identified that met our stringent selection criteria of potency in drug sensitive (NF54) and drug resistant (Dd2) parasite strains (IC50 ≤ 2 µM), rapid speed of action and cell viability in HepG2 cells (IC50 ≥ 10 µM). Following further profiling, 33 compounds were identified that meet the MMV Confirmed Active profile and are high quality starting points for new antimalarial drug discovery projects.
Subjects
Antimalarials, Malaria, Antimalarials/pharmacology, Drug Discovery, Humans, Luciferases, Malaria/drug therapy, Plasmodium falciparum
ABSTRACT
In this article we describe a computational method that automatically generates chemically relevant compound ideas from an initial molecule, closely integrated with in silico models, and a probabilistic scoring algorithm to highlight the compound ideas most likely to satisfy a user-defined profile of required properties. The new compound ideas are generated using medicinal chemistry 'transformation rules' taken from examples in the literature. We demonstrate that the set of 206 transformations employed is generally applicable, produces a wide range of new compounds, and is representative of the types of modifications previously made to move from lead-like to drug-like compounds. Furthermore, we show that more than 94% of the compounds generated by transformation of typical drug-like molecules are acceptable to experienced medicinal chemists. Finally, we illustrate an application of our approach to the lead that ultimately led to the discovery of duloxetine, a marketed serotonin reuptake inhibitor.
Subjects
Pharmaceutical Chemistry/methods, Computational Biology/methods, Computer Simulation, Drug Discovery/methods, Software, Algorithms, Drug Design, Duloxetine Hydrochloride, Humans, Quantitative Structure-Activity Relationship, Research Design, Serotonin Plasma Membrane Transport Proteins/chemistry, Serotonin Plasma Membrane Transport Proteins/metabolism, Selective Serotonin Reuptake Inhibitors/chemistry, Selective Serotonin Reuptake Inhibitors/metabolism, Thiophenes/chemistry, Thiophenes/metabolism
ABSTRACT
Since experimental measurements of NMR chemical shifts provide time- and ensemble-averaged values, we investigated how these effects should be included when chemical shifts are computed using density functional theory (DFT). We measured the chemical shifts of the N-formyl-L-methionyl-L-leucyl-L-phenylalanine-OMe (MLF) peptide in the solid state, and then used the X-ray structure to calculate the ¹³C chemical shifts using the gauge-including projector augmented wave (GIPAW) method, which accounts for the periodic nature of the crystal structure, obtaining an overall accuracy of 4.2 ppm. In order to understand the origin of the difference between experimental and calculated chemical shifts, we carried out first-principles molecular dynamics simulations to characterize the molecular motion of the MLF peptide on the picosecond time scale. We found that ¹³C chemical shifts experience very rapid fluctuations of more than 20 ppm that are averaged out over less than 200 fs. Taking account of these fluctuations in the calculation of the chemical shifts resulted in an accuracy of 3.3 ppm. To investigate the effects of averaging over longer time scales, we sampled the rotameric states populated by the MLF peptides in the solid state by performing a total of 5 µs of classical molecular dynamics simulations. By averaging the chemical shifts over these rotameric states, we increased the accuracy of the chemical shift calculations to 3.0 ppm, with less than 1 ppm error in 10 out of 22 cases. These results suggest that better DFT-based predictions of chemical shifts of peptides and proteins will be achieved by developing improved computational strategies capable of taking into account the averaging process up to the millisecond time scale on which the chemical shift measurements report.
Subjects
N-Formylmethionine Leucyl-Phenylalanine/chemistry, Crystallization, Biomolecular Nuclear Magnetic Resonance, Protein Conformation, Time Factors
ABSTRACT
In this article, we extend the application of the Gaussian processes technique to classification quantitative structure-activity relationship modeling problems. We explore two approaches, an intrinsic Gaussian processes classification technique and a probit treatment of the Gaussian processes regression method. Here, we describe the basic concepts of the methods and apply these techniques to building category models of absorption, distribution, metabolism, excretion, toxicity and target activity data. We also compare the performance of Gaussian processes for classification to other known computational methods, namely decision trees, random forest, support vector machines, and probit partial least squares. The results indicate that, while no method consistently generates the best model, the Gaussian processes classifier often produces more predictive models than those of the random forest or support vector machines and was rarely significantly outperformed.
Subjects
Classification/methods, Pharmaceutical Preparations/metabolism, Quantitative Structure-Activity Relationship, Artificial Intelligence, Blood-Brain Barrier/metabolism, Decision Trees, Drug Discovery, Ether-A-Go-Go Potassium Channels/antagonists & inhibitors, Humans, Hydrogen-Ion Concentration, Intestinal Absorption, Least-Squares Analysis, Biological Models, Normal Distribution, Pharmaceutical Preparations/chemistry, Solubility, Water/chemistry
ABSTRACT
When we build a predictive model of a drug property we rigorously assess its predictive accuracy, but we are rarely able to address the most important question, "How useful will the model be in making a decision in a practical context?" To answer this requires an understanding of the prior probability distribution ("the prior") and hence prevalence of negative outcomes due to the property being assessed. In this perspective, we illustrate the importance of the prior to assess the utility of a model in different contexts: to select or eliminate compounds, to prioritise compounds for further investigation using more expensive screens, or to combine models for different properties to select compounds with a balance of properties. In all three contexts, a better understanding of the prior probabilities of adverse events due to key factors will improve our ability to make good decisions in drug discovery, finding higher quality molecules more efficiently.
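The role of the prior described above can be made concrete with Bayes' rule. The sketch below computes the probability that a compound flagged by a model truly has the adverse property, given the model's sensitivity and specificity and the prevalence of that property. The function name and the numbers are illustrative assumptions, not values from the article.

```python
def prob_truly_positive(prior: float, sensitivity: float, specificity: float) -> float:
    """Posterior probability of a true positive, given a positive prediction."""
    for name, value in (("prior", prior), ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    if true_pos + false_pos == 0.0:
        raise ValueError("the model never predicts positive for these inputs")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Same hypothetical model (80% sensitivity and specificity), two priors.
    print(round(prob_truly_positive(0.10, 0.8, 0.8), 3))
    print(round(prob_truly_positive(0.50, 0.8, 0.8), 3))
```

With these example numbers, the same model is right about a flagged compound less than one time in three when the issue affects 10% of compounds, but four times in five when it affects half of them, which is why the prior determines whether a model is better suited to eliminating compounds or to prioritising them for further testing.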