J Chem Inf Model ; 2020 Feb 03.
Article in English | MEDLINE | ID: mdl-32013419


The copper(I)-catalyzed alkyne-azide cycloaddition (CuAAC) reaction, a major click chemistry reaction, is widely employed in drug discovery and chemical biology. However, the success rate of the CuAAC reaction is not as satisfactory as expected, and to improve its performance we developed a recurrent neural network (RNN) model to predict its feasibility. First, we designed and synthesized a structurally diverse library of 700 compounds with the CuAAC reaction to obtain experimental data. Then, using reaction SMILES as input, we built a bidirectional long short-term memory model with a self-attention mechanism (BiLSTM-SA). Our best prediction model has an overall accuracy of 80%. With the self-attention mechanism, adverse substructures responsible for negative reactions were recognized and derived as quantitative descriptors. DFT investigations were conducted to provide evidence for the correlation between bromo-α-C hybrid types and the success rate of the reaction. The quantitative descriptors, combined with RDKit descriptors, were fed to three machine learning models (a support vector machine, a random forest, and logistic regression) and resulted in improved performance. The BiLSTM-SA model for predicting the feasibility of the CuAAC reaction is superior to other conventional learning methods and advances heuristic chemical rules.
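The way attention weights can point at an adverse substructure may be illustrated with a minimal sketch. It assumes each SMILES token already has a scalar relevance score; the tokens, the scores, and the helper names `softmax` and `top_attended` are hypothetical and are not taken from the paper, whose BiLSTM-SA model learns its scores from data.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_attended(tokens, scores, k=2):
    """Return the k (token, weight) pairs with the largest attention weight."""
    weights = softmax(scores)
    ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
    return ranked[:k]


# Hypothetical tokens of a reactant fragment and made-up scores
# (a trained model would produce the scores; these are for illustration only).
tokens = ["Br", "C", "C", "#", "C"]
scores = [2.0, 1.0, 0.0, 0.0, 0.0]

for token, weight in top_attended(tokens, scores):
    print(f"{token}: {weight:.3f}")
```

Because the weights sum to one, they can be read as a distribution over tokens, which is what makes it possible to rank substructures by their contribution to a prediction.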

J Chem Inf Model ; 60(1): 47-55, 2020 Jan 27.
Article in English | MEDLINE | ID: mdl-31825611


Synthesis planning is the process of recursively decomposing target molecules into available precursors. Computer-aided retrosynthesis can potentially assist chemists in designing synthetic routes; however, at present, it is cumbersome and cannot provide satisfactory results. In this study, we developed a template-free self-corrected retrosynthesis predictor (SCROP) to predict retrosynthesis using transformer neural networks. In this method, retrosynthesis planning is converted to a machine translation problem from the products to the molecular linear notations of the reactants. By coupling with a neural network-based syntax corrector, our method achieved an accuracy of 59.0% on a standard benchmark data set, which outperformed other deep learning methods by >21% and template-based methods by >6%. More importantly, our method was 1.7 times more accurate than other state-of-the-art methods for compounds not appearing in the training set.
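The translation framing described above can be sketched as follows, assuming reactions are stored as reaction SMILES of the form `reactants>agents>products`. The function names and the example reaction are illustrative assumptions. The parenthesis check is only a crude rule-based stand-in for a syntax check; it is not the neural network-based syntax corrector used in SCROP.

```python
def to_translation_pair(reaction_smiles):
    """Split 'reactants>agents>products' into a (source, target) pair.

    For retrosynthesis the source is the product and the target is the
    dot-separated reactant string.
    """
    parts = reaction_smiles.split(">")
    if len(parts) != 3:
        raise ValueError("expected the form reactants>agents>products")
    reactants, _agents, products = parts
    return products, reactants


def parentheses_balanced(smiles):
    """Crude validity check: every '(' must be closed by a later ')'."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Illustrative example: acetyl chloride + ethanol -> ethyl acetate
source, target = to_translation_pair("CC(=O)Cl.OCC>>CC(=O)OCC")
print(source, "->", target)
print(parentheses_balanced(target))
```

A sequence model trained on such pairs emits the target string token by token, so a predicted string can be syntactically invalid; that is the failure mode a syntax corrector is meant to address.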

J Chem Inf Model ; 59(2): 914-923, 2019 Feb 25.
Article in English | MEDLINE | ID: mdl-30669836


Recognizing substructures and their relations embedded in a molecular structure representation is a key process for structure-activity or structure-property relationship (SAR/SPR) studies. A molecular structure can be explicitly represented as either a connection table (CT) or a linear notation, such as SMILES, which is a language describing the connectivity of atoms in the molecular structure. Conventional SAR/SPR approaches rely on partitioning the CT into a set of predefined substructures as structural descriptors. In this work, we propose a new method for identifying SAR/SPR through syntax analysis of linear notations (for example, SMILES) with a self-attention mechanism, an interpretable deep learning architecture. The method has been evaluated by predicting chemical properties, toxicology, and bioactivity from experimental data sets. Our results demonstrate that the method yields superior performance compared with state-of-the-art models. Moreover, the method can produce chemically interpretable results, which chemists can use to design and synthesize compounds with improved activity or properties.

/methods, Deep Learning, Solubility, Structure-Activity Relationship, Water/chemistry
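Treating SMILES as a language, as in the abstract above, starts with splitting a string into tokens. A minimal sketch follows, assuming a regular-expression tokenizer. The pattern and the `tokenize` helper are illustrative assumptions rather than the authors' implementation, and the pattern covers only common SMILES symbols.

```python
import re

# Bracket atoms first, then two-letter halogens, two-digit ring closures,
# single-letter atoms, ring-closure digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#$:/\\.()+\-@~*]"
)


def tokenize(smiles):
    """Split a SMILES string into tokens; raise if any character is skipped."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


print(tokenize("CC(=O)Cl"))
print(tokenize("C[C@H](N)C(=O)O"))
```

Keeping `Cl`, `Br`, and bracket atoms as single tokens matters for interpretation: attention weights are assigned per token, so a token that splits an atom symbol would not map back to a chemical substructure.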
J Cheminform ; 11(1): 5, 2019 Jan 17.
Article in English | MEDLINE | ID: mdl-30656426


Biogenic compounds are important materials for drug discovery and chemical biology. In this work, we report a quasi-biogenic molecule generator (QBMG) to compose virtual quasi-biogenic compound libraries by means of gated recurrent unit recurrent neural networks. The library includes stereochemical properties, which are crucial features of natural products. QBMG can reproduce the property distribution of the underlying training set while being able to generate realistic, novel molecules outside of the training set. Furthermore, these compounds are associated with known bioactivities. A focused compound library based on a given chemotype/scaffold can also be generated by combining this approach with transfer learning. This approach can be used to generate virtual compound libraries for pharmaceutical lead identification and optimization.
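For readers unfamiliar with the gated recurrent unit mentioned above, one update step can be sketched in scalar form. This is a minimal illustration of one common GRU convention, not the QBMG implementation; the parameter names in `p` are hypothetical, and a real model uses weight matrices and vector states.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU update; p holds hypothetical scalar weights and biases."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    # Candidate state, computed from the input and the reset-scaled old state.
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Interpolate between the old state and the candidate state.
    return (1.0 - z) * h + z * cand


# With all parameters at zero, both gates equal 0.5 and the candidate is 0,
# so the new state is half of the old one.
zeros = {k: 0.0 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
print(gru_step(1.0, 0.8, zeros))
```

In a molecule generator, the hidden state after each step would feed a layer that predicts the next SMILES token; that output layer is omitted here.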