Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 5 de 5
Filtrar
1.
Int J Mol Sci ; 21(15)2020 Aug 03.
Artículo en Inglés | MEDLINE | ID: mdl-32756326

RESUMEN

Nowadays, the problem of the model's applicability domain (AD) definition is an active research topic in chemoinformatics. Although many various AD definitions for the models predicting properties of molecules (Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models) were described in the literature, no one for chemical reactions (Quantitative Reaction-Property Relationships (QRPR)) has been reported to date. The point is that a chemical reaction is a much more complex object than an individual molecule, and its yield, thermodynamic and kinetic characteristics depend not only on the structures of reactants and products but also on experimental conditions. The QRPR models' performance largely depends on the way that chemical transformation is encoded. In this study, various AD definition methods extensively used in QSAR/QSPR studies of individual molecules, as well as several novel approaches suggested in this work for reactions, were benchmarked on several reaction datasets. The ability to exclude wrong reaction types, increase coverage, improve the model performance and detect Y-outliers were tested. As a result, several "best" AD definitions for the QRPR models predicting reaction characteristics have been revealed and tested on a previously published external dataset with a clear AD definition problem.


Asunto(s)
Quimioinformática/tendencias , Dominios Proteicos , Relación Estructura-Actividad Cuantitativa , Termodinámica , Fenómenos Químicos , Cinética , Modelos Moleculares
2.
Molecules ; 25(2)2020 Jan 17.
Artículo en Inglés | MEDLINE | ID: mdl-31963467

RESUMEN

Pharmacophore modeling is usually considered as a special type of virtual screening without probabilistic nature. Correspondence of at least one conformation of a molecule to pharmacophore is considered as evidence of its bioactivity. We show that pharmacophores can be treated as one-class machine learning models, and the probability the reflecting model's confidence can be assigned to a pharmacophore on the basis of their precision of active compounds identification on a calibration set. Two schemes (Max and Mean) of probability calculation for consensus prediction based on individual pharmacophore models were proposed. Both approaches to some extent correspond to commonly used consensus approaches like the common hit approach or the one based on a logical OR operation uniting hit lists of individual models. Unlike some known approaches, the proposed ones can rank compounds retrieved by multiple models. These approaches were benchmarked on multiple ChEMBL datasets used for ligand-based pharmacophore modeling and externally validated on corresponding DUD-E datasets. The influence of complexity of pharmacophores and their performance on a calibration set on results of virtual screening was analyzed. It was shown that Max and Mean approaches have superior early enrichment to the commonly used approaches. Thus, a well-performing, easy-to-implement, and probabilistic alternative to existing approaches for pharmacophore-based virtual screening was proposed.


Asunto(s)
Evaluación Preclínica de Medicamentos/métodos , Preparaciones Farmacéuticas/análisis , Animales , Simulación por Computador , Humanos , Ligandos , Aprendizaje Automático , Modelos Químicos , Modelos Moleculares , Conformación Molecular , Unión Proteica
3.
J Chem Inf Model ; 59(11): 4569-4576, 2019 11 25.
Artículo en Inglés | MEDLINE | ID: mdl-31638794

RESUMEN

Here, we describe a concept of conjugated models for several properties (activities) linked by a strict mathematical relationship. This relationship can be directly integrated analytically into the ridge regression (RR) algorithm or accounted for in a special case of "twin" neural networks (NN). Developed approaches were applied to the modeling of the logarithm of the prototropic tautomeric constant (logKT) which can be expressed as the difference between the acidity constants (pKa) of two related tautomers. Both conjugated and individual RR and NN models for logKT and pKa were developed. The modeling set included 639 tautomeric constants and 2371 acidity constants of organic molecules in various solvents. A descriptor vector for each reaction resulted from the concatenation of structural descriptors and some parameters for reaction conditions. For the former, atom-centered substructural fragments describing acid sites in tautomer molecules were used. The latter were automatically identified using the condensed graph of reaction approach. Conjugated models performed similarly to the best individual models for logKT and pKa. At the same time, the physically grounded relationship between logKT and pKa was respected only for conjugated but not individual models.


Asunto(s)
Compuestos Orgánicos/química , Preparaciones Farmacéuticas/química , Ácidos/química , Algoritmos , Descubrimiento de Drogas , Modelos Químicos , Estructura Molecular , Redes Neurales de la Computación , Relación Estructura-Actividad Cuantitativa , Solventes/química , Estereoisomerismo
4.
J Control Release ; 353: 903-914, 2023 01.
Artículo en Inglés | MEDLINE | ID: mdl-36402234

RESUMEN

Active learning (AL) has become a subject of active recent research both in industry and academia as an efficient approach for rapid design and discovery of novel chemicals, materials, and polymers. Herein, we have assessed the applicability of AL for the discovery of polymeric micelle formulations for poorly soluble drugs. We were motivated by the key advantages of this approach making it a desirable strategy for rational design of drug delivery systems due toto its ability to (i) employ relatively small datasets for model development, (ii) iterate between model development and model assessment using small external datasets that can be either generated in focused experimental studies or formed from subsets of the initial training data, and (iii) progressively evolve models towards increasingly more reliable predictions and the identification of novel chemicals with the desired properties. In this study, we compared various AL protocols for their effectiveness in finding biologically active molecules using synthetic datasets. We have investigated the dependency of AL performance on the size of the initial training set, the relative complexity of the task, and the choice of the initial training dataset. We found that AL techniques as applied to regression modeling offer no benefits over random search, while AL used for classification tasks performs better than models built for randomly selected training sets but still quite far from perfect. Using the best performing AL protocol,. Finally, the best performing AL approach was employed to discover and experimentally validate novel binding polymers for a case study of asialoglycoprotein receptor (ASGPR).


Asunto(s)
Polímeros , Aprendizaje Basado en Problemas , Polímeros/química , Micelas , Sistemas de Liberación de Medicamentos , Péptidos
5.
Mol Inform ; 41(4): e2100138, 2022 04.
Artículo en Inglés | MEDLINE | ID: mdl-34726834

RESUMEN

In this paper, we compare the most popular Atom-to-Atom Mapping (AAM) tools: ChemAxon,[1] Indigo,[2] RDTool,[3] NameRXN (NextMove),[4] and RXNMapper[5] which implement different AAM algorithms. An open-source RDTool program was optimized, and its modified version ("new RDTool") was considered together with several consensus mapping strategies. The Condensed Graph of Reaction approach was used to calculate chemical distances and develop the "AAM fixer" algorithm for an automatized correction of erroneous mapping. The benchmarking calculations were performed on a Golden dataset containing 1851 manually mapped and curated reactions. The best performing RXNMapper program together with the AMM Fixer was applied to map the USPTO database. The Golden dataset, mapped USPTO and optimized RDTool are available in the GitHub repository https://github.com/Laboratoire-de-Chemoinformatique.


Asunto(s)
Benchmarking , Fenómenos Bioquímicos , Algoritmos , Bases de Datos Factuales
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA