Results 1 - 20 of 41
1.
Chemphyschem ; 23(24): e202200300, 2022 12 16.
Article in English | MEDLINE | ID: mdl-35929613

ABSTRACT

Machine-learning models were developed to predict the composition profile of a three-compound mixture in liquid-liquid equilibrium (LLE), given the global composition at a certain temperature and pressure. A chemoinformatics approach was explored, based on the MOLMAP technology to encode molecules and mixtures. The chemical systems involved an ionic liquid (IL) and two organic molecules. Two complementary models have been optimized for the IL-rich and IL-poor phases. The two global optimized models are highly accurate, and were validated with independent test sets, where combinations of molecule1+molecule2+IL are different from those in the training set. These results highlight the MOLMAP encoding scheme, based on atomic properties, as a way to train models that learn relationships between features of complex multi-component chemical systems and their profile of phase compositions.


Subject(s)
Cheminformatics , Ionic Liquids , Ionic Liquids/chemistry , Temperature
2.
J Chem Inf Model ; 61(1): 67-75, 2021 01 25.
Article in English | MEDLINE | ID: mdl-33350814

ABSTRACT

In this study, machine learning algorithms were investigated for the classification of organic molecules with one carbon chiral center according to the sign of optical rotation. Diverse heterogeneous data sets comprising up to 13,080 compounds and their corresponding optical rotation were retrieved from Reaxys and processed independently for three solvents: dichloromethane, chloroform, and methanol. The molecular structures were represented by chiral descriptors based on the physicochemical and topological properties of ligands attached to the chiral center. The sign of optical rotation was predicted by random forests (RF) and artificial neural networks for independent test sets with an accuracy of up to 75% for dichloromethane, 82% for chloroform, and 82% for methanol. RF probabilities and the availability of structures in the training set with the same spheres of atom types around the chiral center defined applicability domains in which the accuracy is higher.


Subject(s)
Machine Learning , Neural Networks, Computer , Algorithms , Molecular Structure , Optical Rotation , Stereoisomerism
3.
Bioinformatics ; 34(1): 120-121, 2018 01 01.
Article in English | MEDLINE | ID: mdl-28968640

ABSTRACT

Summary: The representation of metabolic reactions strongly relies on visualization, which is a major barrier for blind users. The NavMol software renders the communication and interpretation of molecular structures and reactions accessible by integrating chemoinformatics and assistive technology. NavMol 3.0 provides a molecular editor for metabolic reactions. The user can start with templates of reactions and build from such cores. Atom-to-atom mapping enables changes in the reactants to be reflected in the products (and vice-versa) and the reaction centres to be automatically identified. Blind users can easily interact with the software using the keyboard and text-to-speech technology. Availability and implementation: NavMol 3.0 is free and open source under the GNU general public license (GPLv3), and can be downloaded at http://sourceforge.net/projects/navmol as a JAR file. Contact: joao@airesdesousa.com.


Subject(s)
Blindness , Metabolic Networks and Pathways , Sensory Aids , Software , Humans
4.
Mar Drugs ; 16(7)2018 Jul 13.
Article in English | MEDLINE | ID: mdl-30011882

ABSTRACT

Computational methodologies are assisting the exploration of marine natural products (MNPs) to make the discovery of new leads more efficient, to repurpose known MNPs, to target new metabolites on the basis of genome analysis, to reveal mechanisms of action, and to optimize leads. In silico efforts in drug discovery of NPs have mainly focused on two tasks: dereplication and prediction of bioactivities. The exploration of new chemical spaces and the application of predicted spectral data must be included in new approaches to select species, extracts, and growth conditions with maximum probabilities of medicinal chemistry novelty. In this review, the most relevant current computational dereplication methodologies are highlighted. Structure-based (SB) and ligand-based (LB) chemoinformatics approaches have become essential tools for the virtual screening of NPs either in small datasets of isolated compounds or in large-scale databases. The most common LB techniques include Quantitative Structure-Activity Relationships (QSAR), estimation of drug likeness, prediction of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, similarity searching, and pharmacophore identification. Analogously, molecular dynamics, docking and binding cavity analysis have been used in SB approaches. Their significance and achievements are the main focus of this review.


Subject(s)
Aquatic Organisms , Biological Products/chemistry , Computational Biology/methods , Drug Discovery/methods , Models, Biological , Biological Products/pharmacology , Chemistry, Pharmaceutical/methods , Drug Design , Models, Chemical , Models, Molecular , Molecular Structure , Quantitative Structure-Activity Relationship
5.
Molecules ; 23(11)2018 Nov 18.
Article in English | MEDLINE | ID: mdl-30453681

ABSTRACT

A series of π-conjugated molecules, based on pyridazine and thiophene heterocycles 3a-e, were synthesized using commercially or readily available coupling components, through a palladium catalyzed Suzuki-Miyaura cross-coupling reaction. The electron-deficient pyridazine heterocycle was functionalized by a thiophene electron-rich heterocycle at position six, and different (hetero)aromatic moieties (phenyl, thienyl, furanyl) were functionalized with electron acceptor groups at position three. Density Functional Theory (DFT) calculations were carried out to obtain information on the conformation, electronic structure, electron distribution, dipolar moment, and molecular nonlinear response of the synthesized push-pull pyridazine derivatives. Hyper-Rayleigh scattering in 1,4-dioxane solutions, using a fundamental wavelength of 1064 nm, was used to evaluate their second-order nonlinear optical properties. The thienylpyridazine functionalized with the cyano-phenyl moiety exhibited the largest first hyperpolarizability (β = 175 × 10⁻³⁰ esu, using the T convention) indicating its potential as a second harmonic generation (SHG) chromophore.


Subject(s)
Models, Theoretical , Oxidative Coupling , Pyridazines/chemical synthesis , Pyridazines/chemistry , Pyridazines/pharmacology , Spectrum Analysis
6.
J Chem Inf Model ; 57(1): 11-21, 2017 01 23.
Article in English | MEDLINE | ID: mdl-28033004

ABSTRACT

Machine learning algorithms were explored for the fast estimation of HOMO and LUMO orbital energies calculated by DFT B3LYP, on the basis of molecular descriptors exclusively based on connectivity. The whole project involved the retrieval and generation of molecular structures, quantum chemical calculations for a database with >111 000 structures, development of new molecular descriptors, and training/validation of machine learning models. Several machine learning algorithms were screened, and an applicability domain was defined based on Euclidean distances to the training set. Random forest models predicted an external test set of 9989 compounds achieving mean absolute error (MAE) up to 0.15 and 0.16 eV for the HOMO and LUMO orbitals, respectively. The impact of the quantum chemical calculation protocol was assessed with a subset of compounds. Inclusion of the orbital energy calculated by PM7 as an additional descriptor significantly improved the quality of estimations (reducing the MAE by >30%).
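
An applicability domain based on Euclidean distances to the training set can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say whether the nearest-neighbour distance, a mean distance, or some other rule was used, so the nearest-neighbour criterion, the function names, the threshold, and the toy descriptor vectors below are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_distance_to_training(query, training_set):
    """Distance from a query vector to its nearest training-set vector."""
    return min(euclidean(query, t) for t in training_set)

def in_domain(query, training_set, threshold):
    """Assumed rule: a query is inside the domain if at least one
    training vector lies within `threshold` of it."""
    return min_distance_to_training(query, training_set) <= threshold

# Toy descriptor vectors (hypothetical values, not taken from the study).
training = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(in_domain([0.5, 0.5], training, threshold=1.0))  # close to the training data
print(in_domain([5.0, 5.0], training, threshold=1.0))  # far from the training data
```

In practice the threshold would have to be calibrated, for example against the prediction errors observed at different distances.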


Subject(s)
Machine Learning , Quantum Theory
7.
Mol Inform ; 43(1): e202300190, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37885368

ABSTRACT

GUIDEMOL is a Python computer program based on the RDKit software to process molecular structures and calculate molecular descriptors with a graphical user interface using the tkinter package. It can calculate descriptors already implemented in RDKit as well as grid representations of 3D molecular structures using the electrostatic potential or voxels. The GUIDEMOL app provides easy access to RDKit tools for chemoinformatics users with no programming skills and can be adapted to calculate other descriptors or to trigger other procedures. A command line interface (CLI) is also provided for the calculation of grid representations. The source code is available at https://github.com/jairesdesousa/guidemol.


Subject(s)
Cheminformatics , Software , Adaptor Proteins, Signal Transducing
8.
Mol Inform ; 42(1): e2200193, 2023 01.
Article in English | MEDLINE | ID: mdl-36167940

ABSTRACT

Random Forest (RF) QSPR models were developed with a data set of homolytic bond dissociation energies (BDE) previously calculated by B3LYP/6-311++G(d,p)//DFTB for 2263 sp³ C-H covalent bonds. The best set of attributes consisted of 114 descriptors of the carbon atom (counts of atom types in 5 spheres around the kernel atom and ring descriptors). The optimized model predicted the DFT-calculated BDE of an independent test set of 224 bonds with MAE=2.86 kcal/mol. A new data set of 409 bonds from the iBonD database (http://ibond.nankai.edu.cn) was predicted by the RF with a modest MAE (5.36 kcal/mol) but a relatively high R² (0.75) against experimental energies. A prediction scheme was explored that corrects the RF prediction with the average deviation observed for the k nearest neighbours (KNN) in an additional memory of experimental data. The corrected predictions achieved MAE=2.22 kcal/mol for an independent test set of 145 bonds and the corresponding experimental bond energies.
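
The correction scheme described above (adding the average deviation of the k nearest neighbours in a memory of experimental data to the model prediction) can be sketched as below. This is a minimal illustration, not the authors' implementation: the Euclidean metric, the descriptor space, the dictionary layout of the memory, and all numeric values are assumptions made for the example.

```python
import math

def knn_corrected(query_desc, query_pred, memory, k=3):
    """Add the mean (experimental - predicted) deviation of the k nearest
    memory entries to the model prediction for the query.

    Each memory entry is a dict with keys "desc" (descriptor vector),
    "pred" (model prediction) and "exp" (experimental value).
    """
    def dist(a, b):
        # Assumed metric: Euclidean distance in descriptor space.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(memory, key=lambda m: dist(query_desc, m["desc"]))[:k]
    mean_dev = sum(m["exp"] - m["pred"] for m in nearest) / len(nearest)
    return query_pred + mean_dev

# Hypothetical memory of bonds with predicted and experimental energies (kcal/mol).
memory = [
    {"desc": [0.0, 0.0], "pred": 95.0, "exp": 98.0},
    {"desc": [0.1, 0.0], "pred": 96.0, "exp": 98.0},
    {"desc": [5.0, 5.0], "pred": 90.0, "exp": 80.0},
]
corrected = knn_corrected([0.05, 0.0], 95.5, memory, k=2)
print(corrected)
```

The two nearest entries deviate by +3.0 and +2.0, so the query prediction is shifted by their mean; the distant third entry does not contribute when k=2.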


Subject(s)
Machine Learning , Thermodynamics , Calibration
9.
J Chem Inf Model ; 52(12): 3116-22, 2012 Dec 21.
Article in English | MEDLINE | ID: mdl-23167287

ABSTRACT

Machine learning (SVM and JRip rule learner) methods have been used in conjunction with the Condensed Graph of Reaction (CGR) approach to identify errors in the atom-to-atom mapping of chemical reactions produced by an automated mapping tool by ChemAxon. The modeling has been performed on the three first enzymatic classes of metabolic reactions from the KEGG database. Each reaction has been converted into a CGR representing a pseudomolecule with conventional (single, double, aromatic, etc.) bonds and dynamic bonds characterizing chemical transformations. The ChemAxon tool was used to automatically detect the matching atom pairs in reagents and products. These automated mappings were analyzed by the human expert and classified as "correct" or "wrong". ISIDA fragment descriptors generated for CGRs for both correct and wrong mappings were used as attributes in machine learning. The learned models have been validated in n-fold cross-validation on the training set followed by a challenge to detect correct and wrong mappings within an external test set of reactions, never used for learning. Results show that both SVM and JRip models detect most of the wrongly mapped reactions. We believe that this approach could be used to identify erroneous atom-to-atom mapping performed by any automated algorithm.


Subject(s)
Computational Biology/methods , Support Vector Machine , Automation , Databases, Protein , False Positive Reactions , Models, Biological
10.
J Org Chem ; 76(22): 9312-9, 2011 Nov 18.
Article in English | MEDLINE | ID: mdl-21970444

ABSTRACT

Quantitative structure-property relationships (QSPRs) were investigated for the estimation of the Mayr electrophilicity parameter using a data set of 64 compounds, all currently available uncharged electrophiles in Mayr's Database of Reactivity Parameters. Three collections of empirical descriptors were employed, from Dragon, Adriana.Code, and CDK. Models were built with multilinear regressions, k nearest neighbors, model trees, random forests, support vector machines (SVMs), associative neural networks, and counterpropagation neural networks. Quantum chemical descriptors were calculated with density functional theory (DFT) methods and incorporated in QSPR models. The best results were achieved with SVM using seven empirical and DFT descriptors; an R² of 0.92 was obtained for the test set (21 compounds). The final seven descriptors were the Parr electrophilicity index, ε(LUMO), hardness, and four CDK descriptors (FNSA-3, ATSc5, Kier2, and nAtomLAC). Screening of correlations between individual descriptors and Mayr electrophilicity revealed the highest absolute value of correlation for DFT ε(LUMO) (R = -0.82) and comparable correlations for some empirical descriptors, e.g., Dragon's folding degree index (R = -0.80), Kier flexibility index (R = -0.78), and Kier S2K index (R = -0.78). High correlations were observed in the training set between reactivity descriptors calculated by the PM6 semiempirical and DFT methods (R = 0.96 for ε(LUMO) and 0.94 for the electrophilicity index).

11.
J Comput Aided Mol Des ; 25(6): 533-54, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21660515

ABSTRACT

The Online Chemical Modeling Environment is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. A user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. As compared to other similar systems, OCHEM is not intended to re-implement the existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and to become members of the growing research community. Our intention is to make OCHEM a widely used platform to perform the QSPR/QSAR studies online and share it with other users on the Web. The ultimate goal of OCHEM is collecting all possible chemoinformatics tools within one simple, reliable and user-friendly resource. The OCHEM is free for web users and it is available online at http://www.ochem.eu.


Subject(s)
Databases, Factual , Internet , Models, Chemical , Information Dissemination , Information Management , Quantitative Structure-Activity Relationship , User-Computer Interface
12.
Sci Rep ; 11(1): 23720, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34887473

ABSTRACT

Machine learning (ML) algorithms were explored for the classification of the UV-Vis absorption spectrum of organic molecules based on molecular descriptors and fingerprints generated from 2D chemical structures. Training and test data (~75k molecules and associated UV-Vis data) were assembled from a database with lists of experimental absorption maxima. They were labeled with positive class (related to photoreactive potential) if an absorption maximum is reported in the range between 290 and 700 nm (UV/Vis) with molar extinction coefficient (MEC) above 1000 L mol⁻¹ cm⁻¹, and as negative if no such peak is in the list. Random forests were selected among several algorithms. The models were validated with two external test sets comprising 998 organic molecules, obtaining a global accuracy up to 0.89, sensitivity of 0.90 and specificity of 0.88. The ML output (UV-Vis spectrum class) was explored as a predictor of the 3T3 NRU phototoxicity in vitro assay for a set of 43 molecules. Comparable results were observed with the classification directly based on experimental UV-Vis data in the same format.
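
The labelling rule and the reported metrics can be illustrated with a short sketch. The thresholds (290 to 700 nm, MEC above 1000 L mol⁻¹ cm⁻¹) come from the abstract; treating the wavelength bounds as inclusive, the function names, and the example peak lists are assumptions made here for illustration.

```python
def uv_vis_class(peaks, lo_nm=290.0, hi_nm=700.0, min_mec=1000.0):
    """Return 1 if any (wavelength_nm, mec) peak lies in [lo_nm, hi_nm]
    with MEC strictly above min_mec, else 0.
    Inclusive wavelength bounds are an assumption of this sketch."""
    return int(any(lo_nm <= wl <= hi_nm and mec > min_mec for wl, mec in peaks))

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical peak lists: (absorption maximum in nm, MEC in L mol^-1 cm^-1).
labels = [
    uv_vis_class([(254.0, 12000.0), (310.0, 1500.0)]),  # strong peak at 310 nm
    uv_vis_class([(254.0, 12000.0)]),                   # only a peak below 290 nm
    uv_vis_class([(400.0, 800.0)]),                     # in range but too weak
]
print(labels)
print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```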

13.
Eur J Med Chem ; 210: 112985, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33189435

ABSTRACT

Aiming at generating a series of monoterpene indole alkaloids with enhanced multidrug resistance (MDR) reversing activity in cancer, two major epimeric alkaloids isolated from Tabernaemontana elegans, tabernaemontanine (1) and dregamine (2), were derivatized by alkylation of the indole nitrogen. Twenty-six new derivatives (3-28) were prepared by reaction with different aliphatic and aromatic halides, whose structures were elucidated mainly by NMR, including 2D NMR experiments. Their MDR reversal ability was evaluated through a functional assay, using as models resistant human colon adenocarcinoma and human ABCB1-gene transfected L5178Y mouse lymphoma cells, overexpressing P-glycoprotein (P-gp), by flow cytometry. A considerable increase of activity was found for most of the derivatives, being the strongest P-gp inhibitors those sharing N-phenethyl moieties, displaying outstanding inhibitory activity, associated with weak cytotoxicity. Chemosensitivity assays were also performed in a model of combination chemotherapy in the same cell lines, by studying the in vitro interactions between the compounds and the antineoplastic drug doxorubicin. Most of the compounds have shown strong synergistic interactions with doxorubicin, highlighting their potential as MDR reversers. QSAR models were also explored for insights on drug-receptor interaction, and it was found that lipophilicity and bulkiness features were associated with inhibitory activity, although linear correlations were not observed.


Subject(s)
ATP Binding Cassette Transporter, Subfamily B, Member 1/antagonists & inhibitors , Antineoplastic Agents/pharmacology , Indole Alkaloids/pharmacology , Alkylation , Animals , Antineoplastic Agents/chemical synthesis , Antineoplastic Agents/chemistry , Cell Proliferation/drug effects , Dose-Response Relationship, Drug , Drug Screening Assays, Antitumor , Indole Alkaloids/chemical synthesis , Indole Alkaloids/chemistry , Mice , Molecular Structure , Quantitative Structure-Activity Relationship , Tumor Cells, Cultured
14.
Mol Inform ; 39(9): e2000001, 2020 09.
Article in English | MEDLINE | ID: mdl-32469147

ABSTRACT

The increasing application of new ionic liquids (IL) creates a need for liquid-liquid equilibria data for both miscible and quasi-immiscible systems. In this study, equilibrium concentrations at different temperatures for ionic liquid+water two-phase systems were modeled using a Quantitative-Structure-Property Relationship (QSPR) method. Data on equilibrium concentrations were taken from the ILThermo Ionic Liquids database, curated and used to make models that predict the weight fraction of water in the ionic liquid-rich phase and of ionic liquid in the aqueous phase as two separate properties. The major modeling challenge stems from the fact that each single IL is characterized by several data points, since equilibrium concentrations are temperature dependent. Thus, new approaches for the detection of potential data point outliers, testing set selection, and quality prediction have been developed. The training set comprised equilibrium concentration data for 67 and 68 ILs in the case of water in IL and IL in water modeling, respectively. SiRMS, MOLMAPS, Rcdk and Chemaxon descriptors were used to build Random Forest models for both properties. Models were subjected to the Y-scrambling test for robustness assessment. The best models have also been validated using an external test set that is not part of the ILThermo database. A two-phase equilibrium diagram for one of the external test set ILs is presented for better visualization of the results and potential derivation of tie lines.


Subject(s)
Ionic Liquids/chemistry , Models, Chemical , Quantitative Structure-Activity Relationship , Water/chemistry , Data Curation , Datasets as Topic , Osmolar Concentration , Pressure , Temperature
15.
Bioinformatics ; 24(19): 2236-44, 2008 Oct 01.
Article in English | MEDLINE | ID: mdl-18676416

ABSTRACT

MOTIVATION: The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer-aided validation of classification systems, to genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Comparison of metabolic reactions has been mostly based on Enzyme Commission (EC) numbers, which are extremely useful and widespread, but not always straightforward to apply, and often problematic when an enzyme catalyzes several reactions, when the same reaction is catalyzed by different enzymes, when official full EC numbers are unavailable or when reactions are not catalyzed by enzymes. Different methods should be available to compare metabolic reactions. Simultaneously, methods are required for the automatic assignment of EC numbers to reactions still not officially classified. RESULTS: We have proposed the MOLMAP reaction descriptors to numerically encode the structural transformations resulting from a chemical reaction. Here, such descriptors are applied to the mapping of a genome-scale database of almost 4000 metabolic reactions by Kohonen self-organizing maps (SOMs), and its screening for inconsistencies in EC numbers. This approach allowed for the SOMs to assign EC numbers at the class, subclass and sub-subclass levels for reactions of independent test sets with accuracies up to 92, 80 and 70%, respectively. Different levels of similarity between training and test sets were explored. The approach also led to the identification of a number of similar reactions bearing differences at the EC class level. AVAILABILITY: The programs to generate MOLMAP descriptors from atomic properties included in SDF files are available upon request for evaluation.


Subject(s)
Enzymes/classification , Genomics , Computational Biology/methods , Databases, Protein , Enzymes/genetics , Enzymes/metabolism , Genome , Terminology as Topic
16.
J Comput Aided Mol Des ; 23(7): 419-29, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19468693

ABSTRACT

Databases of chemical reactions contain knowledge about the reactivity of specific reagents. Although information is in general only explicitly available for compounds reported to react, it is possible to derive information about substructures that do not react in the reported reactions. Both types of information (positive and negative) can be used to train machine learning techniques to predict if a compound reacts or not with a specific reagent. The whole process was implemented with two databases of reactions, one involving BuNH2 as the reagent, and the other NaCNBH3. Negative information was derived using MOLMAP molecular descriptors, and classification models were developed with Random Forests also based on MOLMAP descriptors. MOLMAP descriptors were based exclusively on calculated physicochemical features of molecules. Correct predictions were achieved for approximately 90% of independent test sets. While NaCNBH3 is a selective reducing reagent widely used in organic synthesis, BuNH2 is a nucleophile that mimics the reactivity of the lysine side chain (involved in an initiating step of the mechanism leading to skin sensitization).


Subject(s)
Artificial Intelligence , Borohydrides/chemistry , Butylamines/chemistry , Quantitative Structure-Activity Relationship , Computer Simulation , Databases, Factual , Models, Chemical , Molecular Structure
17.
Spectrochim Acta A Mol Biomol Spectrosc ; 223: 117289, 2019 Dec 05.
Article in English | MEDLINE | ID: mdl-31255865

ABSTRACT

A chemoinformatics method was applied to the assignment of absolute configurations and to the quantitative prediction of specific optical rotations using a data set of 88 chiral fluorinated molecules (44 pairs of enantiomers). Counterpropagation neural networks were explored for the classification of enantiomers as dextrorotatory or levorotatory. Regression models were trained using multilayer perceptrons (MLP), random forests (RF) or multilinear regressions (MLR), on the basis of physicochemical atomic stereo (PAS) descriptors. New descriptors were also derived considering the common structural features of the data set (cPAS descriptors), which enabled RF models to predict the whole data set with R = 0.964, mean absolute error (MAE) of 9.8° and root mean square error (RMSE) of 12.5° in leave-one-pair-out cross-validation experiments. The predictions for the 30 compounds measured in chloroform were obtained with R = 0.971, MAE = 9.1° and RMSE = 12.5°, which compares favorably with quantum chemistry calculations reported in the literature.

18.
J Cheminform ; 10(1): 43, 2018 Aug 22.
Article in English | MEDLINE | ID: mdl-30136001

ABSTRACT

Machine learning (ML) algorithms were explored for the fast estimation of molecular dipole moments calculated by density functional theory (DFT) by B3LYP/6-31G(d,p) on the basis of molecular descriptors generated from DFT-optimized geometries and partial atomic charges obtained by empirical or ML schemes. A database was used with 10,071 structures, new molecular descriptors were designed and the models were validated with external test sets. Several ML algorithms were screened. Random forest regression models predicted an external test set of 3368 compounds achieving mean absolute error up to 0.44 D. The results represent a significant improvement of the dipole moments calculated using empirical point charges located at the nucleus, even assuming the DFT-optimized geometry (root mean square error, RMSE, of 0.68 D vs. 1.53 D and R² = 0.87 vs. 0.66).
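
The baseline mentioned above, a dipole moment computed from point charges located at the nuclei, follows the standard relation μ = Σ qᵢ rᵢ. The sketch below illustrates that relation only; the function name, the example charges, and the geometry are hypothetical, and it is not the descriptor or model used in the study.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*Angstrom expressed in debye

def dipole_moment(charges, coords):
    """Magnitude (in debye) of the dipole of point charges.

    charges: partial charges in units of e
    coords:  matching (x, y, z) nuclear positions in Angstrom
    For a neutral set of charges the result does not depend on the origin.
    """
    mu = [sum(q * r[axis] for q, r in zip(charges, coords)) for axis in range(3)]
    return E_ANGSTROM_TO_DEBYE * math.sqrt(sum(c * c for c in mu))

# Hypothetical diatomic: charges of +0.5 e and -0.5 e separated by 1 Angstrom.
mu = dipole_moment([0.5, -0.5], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
print(round(mu, 4))
```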

19.
Med Chem ; 13(5): 439-447, 2017.
Article in English | MEDLINE | ID: mdl-28185538

ABSTRACT

BACKGROUND: Tuberculosis (TB) is the second leading cause of mortality worldwide, being a highly contagious and insidious illness caused by Mycobacterium tuberculosis, Mtb. Additionally, the emergence of multidrug-resistant and extensively drug-resistant strains of Mtb, together with significant levels of co-infection with HIV and TB (HIV/TB) make the search for new antitubercular drugs urgent and challenging. METHODS: This work was based on the hypothesis that an active compound could be obtained if substituents present in some other active compounds were attached on a core of an important structure, in this case the indole scaffold, thus generating a hybrid compound. A QSAR-oriented design based on classification and regression models along with the estimation of physicochemical and biological properties have also been used to assist in the selection of compounds. Chosen compounds were synthesized using various synthetic procedures and evaluated against M. tuberculosis H37Rv strain. RESULTS: Selected compounds possess substituents at positions C5, C2 and N1 of the indole ring. The substituents involve p-halophenyl, pyridyl, benzyloxy and benzylamine groups. Four compounds were synthesised using suitable synthetic procedures to attain the desired substitution at the indole core. From these, three compounds are new and have been fully characterized, and tested in vitro against the H37Rv ATCC27294T Mtb strain, using isoniazid as a control. One of them, compound 2, with the pyridyl group at N1, has an experimental log (1/MIC) very close to 5 and can be considered as being (weakly) active. In fact, it is more active than 64% of all indole molecules in our data sets of experimental results from literature. The most active indole in these data sets has log (1/MIC)=5.93 with only 6 compounds with log (1/MIC) above 5.5.
CONCLUSION: Despite the lower activity found for the tested compounds, when compared to other reported indole-derivatives, these structures, which rely on a hybrid design concept, may constitute interesting scaffolds to prepare a new family of TB inhibitors with improved activity.


Subject(s)
Antitubercular Agents/pharmacology , Indoles/pharmacology , Pyridines/pharmacology , Antitubercular Agents/chemical synthesis , Drug Design , Indoles/chemical synthesis , Isoniazid/pharmacology , Machine Learning , Mycobacterium tuberculosis/drug effects , Neural Networks, Computer , Pyridines/chemical synthesis , Quantitative Structure-Activity Relationship
20.
Mol Inform ; 35(2): 62-9, 2016 02.
Article in English | MEDLINE | ID: mdl-27491791

ABSTRACT

To enable the fast estimation of atom condensed Fukui functions, machine learning algorithms were trained with databases of DFT pre-calculated values for ca. 23,000 atoms in organic molecules. The problem was approached as the ranking of atom types with the Bradley-Terry (BT) model, and as the regression of the Fukui function. Random Forests (RF) were trained to predict the condensed Fukui function, to rank atoms in a molecule, and to classify atoms as high/low Fukui function. Atomic descriptors were based on counts of atom types in spheres around the kernel atom. The BT coefficients assigned to atom types enabled the identification (93-94 % accuracy) of the atom with the highest Fukui function in pairs of atoms in the same molecule with differences ≥0.1. In whole molecules, the atom with the top Fukui function could be recognized in ca. 50 % of the cases and, on the average, about 3 of the top 4 atoms could be recognized in a shortlist of 4. Regression RF yielded predictions for test sets with R² = 0.68-0.69, improving the ability of BT coefficients to rank atoms in a molecule. Atom classification (as high/low Fukui function) was obtained with RF with sensitivity of 55-61 % and specificity of 94-95 %.
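
Ranking atoms with Bradley-Terry coefficients can be sketched as follows. In the standard BT model on a log scale, the probability that item i outranks item j is 1 / (1 + exp(-(λᵢ - λⱼ))). The atom-type labels and coefficient values below are invented for the example, and the abstract does not state how the coefficients were fitted, so this shows only how fitted coefficients might be used.

```python
import math

def bt_win_probability(coef_i, coef_j):
    """Bradley-Terry probability that item i outranks item j,
    given log-scale coefficients."""
    return 1.0 / (1.0 + math.exp(-(coef_i - coef_j)))

def rank_atoms(atom_types, coefficients):
    """Atom indices sorted from highest to lowest BT coefficient
    of their atom type."""
    return sorted(
        range(len(atom_types)),
        key=lambda i: coefficients[atom_types[i]],
        reverse=True,
    )

# Hypothetical atom-type coefficients (not values from the study).
coefficients = {"C.ar": 0.2, "N.am": 1.1, "O.co": 0.7}
atoms = ["C.ar", "O.co", "N.am"]  # atom types of atoms 0, 1, 2 in one molecule

ranking = rank_atoms(atoms, coefficients)
p = bt_win_probability(coefficients["N.am"], coefficients["C.ar"])
print(ranking, round(p, 3))
```

The first index in `ranking` is the atom predicted to have the highest Fukui function; a shortlist of the top 4, as discussed in the abstract, would simply be `ranking[:4]`.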


Subject(s)
Machine Learning , Models, Chemical , Quantitative Structure-Activity Relationship