1.
J Chem Theory Comput ; 20(1): 164-177, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38108269

ABSTRACT

We present a transferable MACE interatomic potential that is applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Including an accurate description of radical species extends the scope of possible applications to bond dissociation energy (BDE) prediction, for example, in the context of cytochrome P450 (CYP) metabolism. The transferability of the MACE potential was validated on the COMP6 data set, containing only closed-shell molecules, where it reaches better accuracy than the readily available general ANI-2x potential. MACE achieves similar accuracy on two CYP metabolism-specific data sets, which include open- and closed-shell structures. This model enables us to calculate the aliphatic C-H BDE and thereby compare reaction energies of hydrogen abstraction, the rate-limiting step of the aliphatic hydroxylation reaction catalyzed by CYPs. On the "CYP 3A4" data set, MACE achieves a BDE RMSE of 1.37 kcal/mol and better prediction of BDE ranks than alternatives: the semiempirical AM1 and GFN2-xTB methods and the ALFABET model that directly predicts bond dissociation enthalpies. Finally, we highlight the smoothness of the MACE potential over paths of sp3 C-H bond elongation and show that a minimal extension is enough for the MACE model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. Altogether, this work lays the ground for further extensions of scope in terms of chemical elements, (CYP-mediated) reaction classes and modeling the full reaction paths, not only BDEs.
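The BDE and RMSE quantities reported in this abstract can be illustrated with a minimal sketch. It assumes the BDE is taken as a plain electronic-energy difference for R-H -> R• + H• (ignoring zero-point and thermal corrections, which the abstract does not detail); the function names and all energy values are hypothetical and are not from the paper.

```python
import math


def bond_dissociation_energy(e_molecule, e_radical, e_hydrogen):
    """Energy of R-H -> R. + H. as products minus reactant (same unit throughout)."""
    return e_radical + e_hydrogen - e_molecule


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences of values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical energies in kcal/mol, for illustration only.
bde = bond_dissociation_energy(e_molecule=-100.0, e_radical=-40.0, e_hydrogen=-0.5)
error = rmse([98.0, 95.5, 101.0], [97.0, 96.0, 99.0])
```

A low RMSE does not by itself guarantee a correct ordering of sites, which is why the abstract reports BDE ranks as a separate measure.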

2.
Xenobiotica ; : 1-49, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37966132

ABSTRACT

Unexpected metabolism could lead to the failure of many late-stage drug candidates or even the withdrawal of approved drugs. Thus, it is critical to predict and study the dominant routes of metabolism in the early stages of research. In this study, we describe the development and validation of a 'WhichEnzyme' model that accurately predicts the enzyme families most likely to be responsible for a drug-like molecule's metabolism. Furthermore, we combine this model with our previously published regioselectivity models for Cytochromes P450, Aldehyde Oxidases, Flavin-containing Monooxygenases, UDP-glucuronosyltransferases and Sulfotransferases - the most important Phase I and Phase II drug metabolising enzymes - and a 'WhichP450' model that predicts the Cytochrome P450 isoform(s) responsible for a compound's metabolism. The regioselectivity models are based on a mechanistic understanding of these enzymes' actions, and use quantum mechanical simulations with machine learning methods to accurately predict sites of metabolism and the resulting metabolites. We train heuristics based on the outputs of the 'WhichEnzyme', 'WhichP450', and regioselectivity models to determine the most likely routes of metabolism and metabolites to be observed experimentally. Finally, we demonstrate that this combination delivers high sensitivity in identifying experimentally reported metabolites and higher precision than other methods for predicting in vivo metabolite profiles.

3.
J Chem Inf Model ; 63(11): 3340-3349, 2023 06 12.
Article in English | MEDLINE | ID: mdl-37229540

ABSTRACT

Cytosolic sulfotransferases (SULTs) are a family of enzymes responsible for the sulfation of small endogenous and exogenous compounds. SULTs contribute to the conjugation phase of metabolism and share substrates with the uridine 5'-diphospho-glucuronosyltransferase (UGT) family of enzymes. UGTs are considered to be the most important enzymes in the conjugation phase, and SULTs are an auxiliary enzyme system to them. Understanding how the regioselectivity of SULTs differs from that of UGTs is essential from the perspective of developing novel drug candidates. We present a general ligand-based SULT model trained and tested using high-quality experimental regioselectivity data. The current study suggests that, unlike other metabolic enzymes in the modification and conjugation phases, the SULT regioselectivity is not strongly influenced by the activation energy of the rate-limiting step of the catalysis. Instead, the prominent role is played by the substrate binding site of SULT. Thus, the model is trained only on steric and orientation descriptors, which mimic the binding pocket of SULT. The resulting classification model, which predicts whether a site is metabolized, achieved a Cohen's kappa of 0.71.
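For readers unfamiliar with the Cohen's kappa statistic quoted above, the following is a minimal sketch of the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the example labels are hypothetical; they are not data from the paper.

```python
from collections import Counter


def cohens_kappa(observed, predicted):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of items given the same label.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement from the marginal label frequencies.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[label] * pred_counts[label] for label in obs_counts) / n ** 2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels: 1 = site metabolized, 0 = site not metabolized.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Kappa corrects raw accuracy for agreement expected by chance, which matters when most candidate sites are not metabolized.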


Subject(s)
Sulfotransferases, Catalysis, Binding Sites, Sulfotransferases/chemistry, Sulfotransferases/metabolism
4.
J Med Chem ; 65(20): 14066-14081, 2022 10 27.
Article in English | MEDLINE | ID: mdl-36239985

ABSTRACT

Unexpected metabolism in modification and conjugation phases can lead to the failure of many late-stage drug candidates or even withdrawal of approved drugs. Thus, it is critical to predict the sites of metabolism (SoM) for enzymes, which interact with drug-like molecules, in the early stages of the research. This study presents methods for predicting the isoform-specific metabolism for human AOs, FMOs, and UGTs and general CYP metabolism for preclinical species. The models use semi-empirical quantum mechanical simulations, validated using experimentally obtained data and DFT calculations, to estimate the reactivity of each SoM in the context of the whole molecule. Ligand-based models, trained and tested using high-quality regioselectivity data, combine the reactivity of the potential SoM with the orientation and steric effects of the binding pockets of the different enzyme isoforms. The resulting models achieve κ values of up to 0.94 and AUC of up to 0.92.


Subject(s)
Machine Learning, Humans, Ligands
5.
SLAS Discov ; 27(6): 337-348, 2022 09.
Article in English | MEDLINE | ID: mdl-35872229

ABSTRACT

A central challenge of antimalarial therapy is the emergence of resistance to the components of artemisinin-based combination therapies (ACTs) and the urgent need for new drugs acting through novel mechanisms of action. Over the last decade, compounds identified in phenotypic high-throughput screens (HTS) have provided the starting point for six candidate drugs currently in the Medicines for Malaria Venture (MMV) clinical development portfolio. However, the published screening data which provided much of the new chemical matter for malaria drug discovery projects have been extensively mined. Here we present a new screening and selection cascade for generation of hit compounds active against the blood stage of Plasmodium falciparum. In addition, we validate our approach by testing a library of 141,786 compounds not reported earlier as being tested against malaria. The Hit Generation Library 1 (HGL1) was designed to maximise the chemical diversity and novelty of compounds with physicochemical properties associated with potential for further development. A robust HTS cascade containing orthogonal efficacy and cytotoxicity assays, including a newly developed and validated nanoluciferase-based assay, was used to profile the compounds. 75 compounds (Screening Active hit rate of 0.05%) were identified meeting our stringent selection criteria of potency in drug sensitive (NF54) and drug resistant (Dd2) parasite strains (IC50 ≤ 2 µM), rapid speed of action and cell viability in HepG2 cells (IC50 ≥ 10 µM). Following further profiling, 33 compounds were identified that meet the MMV Confirmed Active profile and are high quality starting points for new antimalarial drug discovery projects.


Subject(s)
Antimalarials, Malaria, Antimalarials/pharmacology, Drug Discovery, Humans, Luciferases, Malaria/drug therapy, Plasmodium falciparum
6.
J Comput Aided Mol Des ; 35(4): 541-555, 2021 04.
Article in English | MEDLINE | ID: mdl-32533369

ABSTRACT

We present a study based on density functional theory calculations to explore the rate-limiting steps of product formation for oxidation by Flavin-containing Monooxygenase (FMO) and glucuronidation by the UDP-glucuronosyltransferase (UGT) family of enzymes. FMOs are responsible for the modification phase of metabolism of a wide diversity of drugs, working in conjunction with the Cytochrome P450 (CYP) family of enzymes, and UGTs are the most important class of drug conjugation enzymes. Reactivity calculations are important for prediction of metabolism by CYPs, and reactivity alone explains around 70-85% of the experimentally observed sites of metabolism within CYP substrates. In the current work we extend this approach to propose model systems which can be used to calculate the activation energies, i.e. reactivity, for the rate-limiting steps for both FMO oxidation and glucuronidation of potential sites of metabolism. These results are validated by comparison with the experimentally observed reaction rates and sites of metabolism, indicating that the presented models are suitable to provide the basis of a reactivity component within generalizable models to predict either FMO or UGT metabolism.


Subject(s)
Cytochrome P-450 Enzyme System/metabolism, Glucuronosyltransferase/metabolism, Oxygenases/metabolism, Pharmaceutical Preparations/metabolism, Humans, Metabolic Inactivation, Biological Models, Molecular Models, Oxidation-Reduction, Pharmaceutical Preparations/chemistry
7.
J Chem Inf Model ; 60(6): 2848-2857, 2020 06 22.
Article in English | MEDLINE | ID: mdl-32478517

ABSTRACT

Contemporary deep learning approaches still struggle to bring a useful improvement in the field of drug discovery because of the challenges of sparse, noisy, and heterogeneous data that are typically encountered in this context. We use a state-of-the-art deep learning method, Alchemite, to impute data from drug discovery projects, including multitarget biochemical activities, phenotypic activities in cell-based assays, and a variety of absorption, distribution, metabolism, and excretion (ADME) endpoints. The resulting model gives excellent predictions for activity and ADME endpoints, offering an average increase in R² of 0.22 versus quantitative structure-activity relationship methods. The model accuracy is robust to combining data across uncorrelated endpoints and projects with different chemical spaces, enabling a single model to be trained for all compounds and endpoints. We demonstrate improvements in accuracy on the latest chemistry and data when updating models with new data as an ongoing medicinal chemistry project progresses.
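The improvement quoted above is expressed as a difference in R². As a minimal sketch, and assuming R² here means the coefficient of determination (the abstract does not say whether it is this or a squared correlation coefficient), the comparison between two models on the same endpoint could be computed as follows. The function name and all values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measurements and predictions for one endpoint.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.0, 3.0, 3.5]   # stands in for an imputation model
model_b = [2.0, 2.0, 3.0, 3.0]   # stands in for a baseline QSAR model
improvement = r_squared(observed, model_a) - r_squared(observed, model_b)
```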


Subject(s)
Deep Learning, Drug Discovery, Pharmaceutical Chemistry, Quantitative Structure-Activity Relationship
8.
J Comput Aided Mol Des ; 32(4): 537-546, 2018 04.
Article in English | MEDLINE | ID: mdl-29464466

ABSTRACT

In the development of novel pharmaceuticals, it is important to know how many, and which, Cytochrome P450 isoforms are involved in the phase I metabolism of a compound. If a compound is metabolised predominantly by a single isoform, problems can arise from drug-drug interactions or genetic polymorphisms that would lead to variations in exposure in the general population. Combined with models of the regioselectivity of metabolism by each isoform, a model that predicts the isoforms involved would also aid in the prediction of the metabolites likely to be formed by P450-mediated metabolism. We describe the generation of a multi-class random forest model to predict which, out of a list of the seven leading Cytochrome P450 isoforms, would be the major metabolising isoforms for a novel compound. The model has a 76% success rate with a top-1 criterion and an 88% success rate for a top-2 criterion and shows significant enrichment over randomised models.
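The top-1 and top-2 criteria mentioned above can be sketched as follows, assuming a prediction counts as a success when the experimentally observed major isoform is among the k highest-scoring classes. The function name, isoform labels, and scores are illustrative assumptions; the abstract does not list the seven isoforms or give per-compound scores.

```python
def top_k_success_rate(class_scores, true_labels, k):
    """Fraction of compounds whose true class is among the k highest-scoring classes."""
    if len(class_scores) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for scores, truth in zip(class_scores, true_labels):
        # Rank class names by descending score and keep the first k.
        top_k = sorted(scores, key=scores.get, reverse=True)[:k]
        if truth in top_k:
            hits += 1
    return hits / len(true_labels)


# Hypothetical per-compound class scores and observed major isoforms.
scores = [
    {"CYP3A4": 0.6, "CYP2D6": 0.3, "CYP2C9": 0.1},
    {"CYP3A4": 0.5, "CYP2D6": 0.4, "CYP2C9": 0.1},
    {"CYP3A4": 0.2, "CYP2D6": 0.3, "CYP2C9": 0.5},
]
truth = ["CYP3A4", "CYP2D6", "CYP3A4"]
top1 = top_k_success_rate(scores, truth, k=1)
top2 = top_k_success_rate(scores, truth, k=2)
```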


Subject(s)
Cytochrome P-450 Enzyme System/chemistry, Cytochrome P-450 Enzyme System/metabolism, Molecular Models, Area Under Curve, Cluster Analysis, Protein Databases, Drug Interactions, Molecular Structure, Protein Isoforms, Software, Structure-Activity Relationship
9.
J Chem Inf Model ; 56(11): 2180-2193, 2016 11 28.
Article in English | MEDLINE | ID: mdl-27753488

ABSTRACT

We describe methods for predicting cytochrome P450 (CYP) metabolism incorporating both pathway-specific reactivity and isoform-specific accessibility considerations. Semiempirical quantum mechanical (QM) simulations, parametrized using experimental data and ab initio calculations, estimate the reactivity of each potential site of metabolism (SOM) in the context of the whole molecule. Ligand-based models, trained using high-quality regioselectivity data, correct for orientation and steric effects of the different CYP isoform binding pockets. The resulting models identify a SOM in the top 2 predictions for between 82% and 91% of compounds in independent test sets across seven CYP isoforms. In addition to predicting the relative proportion of metabolite formation at each site, these methods estimate the activation energy at each site, from which additional information can be derived regarding their lability in absolute terms. We illustrate how this can guide the design of compounds to overcome issues with rapid CYP metabolism.


Subject(s)
Cytochrome P-450 Enzyme System/metabolism, Biological Models, Quantum Theory, Stereoisomerism, Substrate Specificity
10.
J Med Chem ; 59(9): 4267-77, 2016 05 12.
Article in English | MEDLINE | ID: mdl-26901568

ABSTRACT

Drug discovery is a multiparameter optimization process in which the goal of a project is to identify compounds that meet multiple property criteria required to achieve a therapeutic objective. However, once a profile of property criteria has been chosen, the impact of these criteria on the decisions made regarding progression of compounds or chemical series should be carefully considered. In some cases the decision is very sensitive to a specific property criterion, and such a criterion may artificially distort the direction of the project; any uncertainty in the "correct" value or the importance of this criterion may lead to valuable opportunities being missed. In this paper, we describe a method for analyzing the sensitivity of the prioritization of compounds to a multiparameter profile of property criteria. We show how the results can be easily interpreted and illustrate how this analysis can highlight new avenues for exploration.


Subject(s)
Drug Discovery, Probability, Uncertainty
11.
J Comput Aided Mol Des ; 29(9): 809-16, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26126976

ABSTRACT

All of the experimental compound data with which we work have significant uncertainties, due to imperfect correlations between experimental systems and the ultimate in vivo properties of compounds and the inherent variability in experimental conditions. When using these data to make decisions, it is essential that these uncertainties are taken into account to avoid making inappropriate decisions in the selection of compounds, which can lead to wasted effort and missed opportunities. In this paper we will consider approaches to rigorously account for uncertainties when selecting between compounds or assessing compounds against a property criterion; first for an individual measurement of a single property and then for multiple measurements of a property for the same compound. We will then explore how uncertainties in multiple properties can be combined when assessing compounds against a profile of criteria, a process known as multi-parameter optimisation. This guides rigorous decision-making using complex, uncertain data to focus on compounds with the best chance of success, while avoiding missed opportunities by inappropriately rejecting compounds.
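One common way to account for such uncertainties can be sketched as follows. The sketch assumes each measurement carries independent Gaussian error with a known standard deviation, that replicate measurements are combined through the standard error of their mean, and that properties are independent when combined into a profile. These are simplifying assumptions for illustration and may differ from the treatment in the paper; all function names and values are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_above(value, sigma, threshold):
    """Probability that the true value exceeds the threshold, given Gaussian error."""
    return normal_cdf((value - threshold) / sigma)


def prob_above_replicates(values, sigma, threshold):
    """Same probability for n replicates, using the standard error sigma / sqrt(n)."""
    n = len(values)
    mean = sum(values) / n
    return prob_above(mean, sigma / math.sqrt(n), threshold)


def prob_meets_profile(probabilities):
    """Probability of meeting every criterion, assuming independent properties."""
    return math.prod(probabilities)


# Hypothetical example: a pIC50 criterion of > 6.0 with an assay sigma of 0.5.
single = prob_above(6.5, 0.5, 6.0)
replicated = prob_above_replicates([6.5, 6.5, 6.5, 6.5], 0.5, 6.0)
overall = prob_meets_profile([single, 0.9])
```

In this example a single measurement 0.5 log units above the threshold leaves roughly a one-in-six chance that the compound fails the criterion, which illustrates why a hard cut-off on uncertain data can reject good compounds.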


Subject(s)
Statistical Data Interpretation, Decision Making, Drug Discovery/methods, Data Accuracy, Drug Discovery/statistics & numerical data, Metabolic Inactivation, Pharmacokinetics, Probability, Tissue Distribution, Uncertainty
12.
Future Med Chem ; 6(5): 577-93, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24649959

ABSTRACT

A number of alternative variables have appeared in the medicinal chemistry literature trying to provide a more rigorous formulation of the guidelines proposed by Lipinski to exclude chemical entities with poor pharmacokinetic properties early in the discovery process. Typically, these variables combine the affinity towards the target with physicochemical properties of the ligand and are named efficiencies or ligand efficiencies. Several formulations have been defined and used by different laboratories with different degrees of success. A unified formulation, ligand efficiency indices, was proposed that included efficiency in two complementary variables (i.e., size and polarity) to map and monitor the drug-discovery process (AtlasCBS). The use of this formulation in combination with an extended multiparameter optimization is presented, with examples, as a promising methodology to optimize the drug-discovery process in the future. Future perspectives and challenges for this approach are also discussed.


Subject(s)
Drug Discovery, Pharmaceutical Chemistry, Factual Databases, Ligands, Pharmaceutical Preparations/chemistry, Pharmaceutical Preparations/metabolism, Pharmacokinetics, Structure-Activity Relationship
13.
Drug Discov Today ; 19(5): 680-7, 2014 May.
Article in English | MEDLINE | ID: mdl-24451293

ABSTRACT

Drug discovery is a process of multiparameter optimisation, with the objective of finding compounds that achieve multiple, project-specific property criteria. These criteria are often based on the subjective opinion of the project team, but analysis of historical data can help to find the most appropriate profile. Computational 'rule induction' approaches enable an objective analysis of complex data to identify interpretable, multiparameter rules that distinguish compounds with the greatest likelihood of success for a project. Each property criterion highlights the most critical data that enable effective compound prioritisation decisions. We illustrate this with two applications: determining rules for simple, drug-like properties; and exploring experimental target inhibition data to find rules to reduce the risk of toxicity.


Subject(s)
Drug Design, Drug Discovery/methods, Drug Discovery/standards, Animals, Drug Delivery Systems/methods, Drug Delivery Systems/standards, Humans
14.
Drug Discov Today ; 19(5): 688-93, 2014 May.
Article in English | MEDLINE | ID: mdl-24451294

ABSTRACT

Prioritising compounds with a lower chance of causing toxicity, early in the drug discovery process, would help to address the high attrition rate in pharmaceutical R&D. Expert knowledge-based prediction of toxicity can alert chemists if their proposed compounds are likely to have an increased likelihood of causing toxicity. We will discuss how multiparameter optimisation approaches can be used to balance the potential for toxicity with other properties required in a high-quality candidate drug, giving appropriate weight to the alert in the selection of compounds. Furthermore, we will describe how information about the region of a compound that triggers a toxicity alert can be interactively visualised to guide the modification of a compound to reduce the likelihood of toxicity.


Subject(s)
Drug Design, Drug Discovery/methods, Drug-Related Side Effects and Adverse Reactions, Animals, Drug Discovery/trends, Drug-Related Side Effects and Adverse Reactions/prevention & control, Humans, Pharmaceutical Preparations/chemistry, Risk Factors
15.
Drug Discov Today ; 18(13-14): 659-66, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23458995

ABSTRACT

Many definitions of 'drug-like' compound properties have been published, based on the analysis of simple molecular properties of successful drugs. These are typically presented as rules that define acceptable boundaries for these properties. When a compound does not 'fit' within these boundaries then its properties differ from those of the majority of drugs, which could indicate a higher risk of poor pharmacokinetics or safety outcomes in vivo. Here, we review the strengths and weaknesses of these rules and note, in particular, that the overly rigid application of strict cut-off points can introduce artificial distinctions between similar compounds, running the risk of missing valuable opportunities. Alternatively, compounds can be ranked according to their similarity to marketed drugs using a continuous measure of drug-likeness. However, being similar to known drugs does not necessarily mean that a compound is more likely to become a drug. We demonstrate how a new approach, employing Bayesian methods, can be used to compare a set of successful drugs with a set of non-drug compounds to identify those properties that give the greatest distinction between the two sets, and hence the greatest increase in the likelihood of a compound becoming a successful drug. This analysis further illustrates that guidelines for drug-likeness might not be generally applicable across all compound and target classes or therapeutic indications. Therefore, it might be more appropriate to consider guidelines for drug-likeness that are project specific.


Subject(s)
Drug Discovery/methods, Pharmaceutical Preparations/classification, Pharmacology/classification, Terminology as Topic, Drug-Related Side Effects and Adverse Reactions, Statistical Models, Molecular Structure, Pharmaceutical Preparations/chemistry, Structure-Activity Relationship
16.
Curr Pharm Des ; 18(9): 1292-310, 2012.
Article in English | MEDLINE | ID: mdl-22316157

ABSTRACT

A successful, efficacious and safe drug must have a balance of properties, including potency against its intended target, appropriate absorption, distribution, metabolism, and elimination (ADME) properties and an acceptable safety profile. Achieving this balance of often conflicting requirements is a major challenge in drug discovery. Approaches to simultaneously optimizing many factors in a design are broadly described under the term 'multi-parameter optimization' (MPO). In this review, we will describe how MPO can be applied to efficiently design and select high quality compounds and describe the range of methods that have been employed in drug discovery, including: simple 'rules of thumb' such as Lipinski's rule; desirability functions; Pareto optimization; and probabilistic approaches that take into consideration the uncertainty in all drug discovery data due to predictive error and experimental variability. We will explore how these methods have been applied to predicted and experimental data to reduce attrition and improve the productivity of the drug discovery process.
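Two of the MPO methods named above, desirability functions and Pareto optimization, can be illustrated with a minimal sketch. It assumes a simple linear-ramp desirability and a Pareto front in which every objective is maximized; the function names, property ranges, and compound scores are hypothetical and are not taken from the review.

```python
def linear_desirability(value, low, high):
    """Map a property value to [0, 1]: 0 at or below low, 1 at or above high."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical compounds scored on two objectives, e.g. (potency, solubility).
compounds = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = pareto_front(compounds)
potency_score = linear_desirability(6.5, low=5.0, high=8.0)
```

A desirability function collapses several properties into one score and so requires the properties to be weighted against each other up front, whereas a Pareto front keeps the trade-offs visible and leaves that choice to the project team.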


Subject(s)
Drug Design, Drug Discovery/methods, Pharmacokinetics, Animals, Drug Delivery Systems, Drug-Related Side Effects and Adverse Reactions, Humans, Pharmaceutical Preparations/metabolism
17.
Drug Discov Today ; 15(13-14): 561-9, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20553956

ABSTRACT

Better individual and team decision-making should enhance R&D performance. Reproducible biases affecting human decision-making, known as cognitive biases, are well understood by psychologists. These threaten objectivity and balance and so are credible causes for continuing unpleasant surprises in Development and high operating costs. For four of the most common and insidious cognitive biases, we consider the risks to R&D decision-making and contrast current practice with use of evidence-based medicine by healthcare practitioners. Feedback on problem-solving performance in simulated environments could be one of the simplest ways to help teams improve their selection of compounds and effective screening sequences. Computational tools that encourage objective consideration of all of the available information might also contribute.


Subject(s)
Biomedical Research, Organizational Decision Making, Drug Discovery, Animals, Biomedical Research/economics, Biomedical Research/organization & administration, Computational Biology/methods, Management Decision Support Systems, Organizational Efficiency/economics, Evidence-Based Medicine/methods, Humans, Problem Solving, Industrial Psychology/methods
18.
J Chem Inf Model ; 50(6): 1053-61, 2010 Jun 28.
Article in English | MEDLINE | ID: mdl-20433177

ABSTRACT

In this article, we extend the application of the Gaussian processes technique to classification quantitative structure-activity relationship modeling problems. We explore two approaches, an intrinsic Gaussian processes classification technique and a probit treatment of the Gaussian processes regression method. Here, we describe the basic concepts of the methods and apply these techniques to building category models of absorption, distribution, metabolism, excretion, toxicity and target activity data. We also compare the performance of Gaussian processes for classification to other known computational methods, namely decision trees, random forest, support vector machines, and probit partial least squares. The results indicate that, while no method consistently generates the best model, the Gaussian processes classifier often produces more predictive models than those of the random forest or support vector machines and was rarely significantly outperformed.


Subject(s)
Classification/methods, Pharmaceutical Preparations/metabolism, Quantitative Structure-Activity Relationship, Artificial Intelligence, Blood-Brain Barrier/metabolism, Decision Trees, Drug Discovery, Ether-A-Go-Go Potassium Channels/antagonists & inhibitors, Humans, Hydrogen-Ion Concentration, Intestinal Absorption, Least-Squares Analysis, Biological Models, Normal Distribution, Pharmaceutical Preparations/chemistry, Solubility, Water/chemistry
19.
J Am Chem Soc ; 132(17): 5993-6000, 2010 May 05.
Article in English | MEDLINE | ID: mdl-20387894

ABSTRACT

Since experimental measurements of NMR chemical shifts provide time and ensemble averaged values, we investigated how these effects should be included when chemical shifts are computed using density functional theory (DFT). We measured the chemical shifts of the N-formyl-L-methionyl-L-leucyl-L-phenylalanine-OMe (MLF) peptide in the solid state, and then used the X-ray structure to calculate the ¹³C chemical shifts using the gauge-including projector augmented wave (GIPAW) method, which accounts for the periodic nature of the crystal structure, obtaining an overall accuracy of 4.2 ppm. In order to understand the origin of the difference between experimental and calculated chemical shifts, we carried out first-principles molecular dynamics simulations to characterize the molecular motion of the MLF peptide on the picosecond time scale. We found that ¹³C chemical shifts experience very rapid fluctuations of more than 20 ppm that are averaged out over less than 200 fs. Taking account of these fluctuations in the calculation of the chemical shifts resulted in an accuracy of 3.3 ppm. To investigate the effects of averaging over longer time scales we sampled the rotameric states populated by the MLF peptides in the solid state by performing a total of 5 µs of classical molecular dynamics simulations. By averaging the chemical shifts over these rotameric states, we increased the accuracy of the chemical shift calculations to 3.0 ppm, with less than 1 ppm error in 10 out of 22 cases. These results suggest that better DFT-based predictions of chemical shifts of peptides and proteins will be achieved by developing improved computational strategies capable of taking into account the averaging process up to the millisecond time scale on which the chemical shift measurements report.


Subject(s)
N-Formylmethionine Leucyl-Phenylalanine/chemistry, Crystallization, Biomolecular Nuclear Magnetic Resonance, Protein Conformation, Time Factors
20.
J Comput Aided Mol Des ; 22(6-7): 431-40, 2008.
Article in English | MEDLINE | ID: mdl-18273554

ABSTRACT

In this article, we present an automatic model generation process for building QSAR models using Gaussian Processes, a powerful machine learning modeling method. We describe the stages of the process that ensure models are built and validated within a rigorous framework: descriptor calculation, splitting data into training, validation and test sets, descriptor filtering, application of modeling techniques and selection of the best model. We apply this automatic process to data sets of blood-brain barrier penetration and aqueous solubility and compare the resulting automatically generated models with 'manually' built models using external test sets. The results demonstrate the effectiveness of the automatic model generation process for two types of data sets commonly encountered in building ADME QSAR models, a small set of in vivo data and a large set of physico-chemical data.


Subject(s)
Molecular Models, Blood-Brain Barrier, Quantitative Structure-Activity Relationship, Solubility, Water/chemistry