1.
J Comput Aided Mol Des ; 34(7): 783-803, 2020 07.
Article in English | MEDLINE | ID: mdl-32112286

ABSTRACT

Reaction-based de novo design refers to the in-silico generation of novel chemical structures by combining reagents using structural transformations derived from known reactions. The driver for using reaction-based transformations is to increase the likelihood of the designed molecules being synthetically accessible. We have previously described a reaction-based de novo design method based on reaction vectors which are transformation rules that are encoded automatically from reaction databases. A limitation of reaction vectors is that they account for structural changes that occur at the core of a reaction only, and they do not consider the presence of competing functionalities that can compromise the reaction outcome. Here, we present the development of a Reaction Class Recommender to enhance the reaction vector framework. The recommender is intended to be used as a filter on the reaction vectors that are applied during de novo design to reduce the combinatorial explosion of in-silico molecules produced while limiting the generated structures to those which are most likely to be synthesisable. The recommender has been validated using an external data set extracted from the recent medicinal chemistry literature and in two simulated de novo design experiments. Results suggest that the use of the recommender drastically reduces the number of solutions explored by the algorithm while preserving the chance of finding relevant solutions and increasing the global synthetic accessibility of the designed molecules.


Subjects
Drug Design , Algorithms , Chemistry Techniques, Synthetic/methods , Chemistry Techniques, Synthetic/statistics & numerical data , Chemistry, Pharmaceutical/methods , Chemistry, Pharmaceutical/statistics & numerical data , Computer Simulation , Computer-Aided Design , Databases, Chemical , Databases, Pharmaceutical , Humans , Machine Learning , Small Molecule Libraries
2.
J Chem Inf Model ; 59(10): 4167-4187, 2019 10 28.
Article in English | MEDLINE | ID: mdl-31529948

ABSTRACT

Reaction classification has often been considered an important task for many different applications, and has traditionally been accomplished using hand-coded rule-based approaches. However, the availability of large collections of reactions enables data-driven approaches to be developed. We present the development and validation of a 336-class machine learning-based classification model integrated within a Conformal Prediction (CP) framework to associate reaction class predictions with confidence estimations. We also propose a data-driven approach for "dynamic" reaction fingerprinting to maximize the effectiveness of reaction encoding, as well as developing a novel reaction classification system that organizes labels into four hierarchical levels (SHREC: Sheffield Hierarchical REaction Classification). We show that the performance of the CP augmented model can be improved by defining confidence thresholds to detect predictions that are less likely to be false. For example, the external validation of the model reports 95% of predictions as correct by filtering out less than 15% of the uncertain classifications. The application of the model is demonstrated by classifying two reaction data sets: one extracted from an industrial ELN and the other from the medicinal chemistry literature. We show how confidence estimations and class compositions across different levels of information can be used to gain immediate insights on the nature of reaction collections and hidden relationships between reaction classes.
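
The idea of attaching confidence estimates to class predictions and then filtering on a threshold can be illustrated with a minimal sketch of class-conditional inductive conformal prediction. This is not the authors' implementation: the nonconformity measure (1 minus the predicted probability), the function names, the two reaction class labels, and the calibration values are all assumptions chosen for illustration.

```python
def conformal_p_value(calibration_scores, test_score):
    """Share of calibration nonconformity scores at least as large as the test score."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def predict_with_confidence(class_probs, calibration_scores):
    """Return (label, confidence, credibility) for one reaction.

    class_probs: dict mapping class label -> probability from any classifier.
    calibration_scores: dict mapping class label -> nonconformity scores of
        held-out calibration reactions of that class.
    """
    # Assumption: nonconformity is 1 - predicted probability for the class.
    p_values = {
        label: conformal_p_value(calibration_scores[label], 1.0 - prob)
        for label, prob in class_probs.items()
    }
    ranked = sorted(p_values.items(), key=lambda kv: kv[1], reverse=True)
    best_label, credibility = ranked[0]
    second_p = ranked[1][1] if len(ranked) > 1 else 0.0
    # Confidence is 1 minus the second-largest p-value.
    return best_label, 1.0 - second_p, credibility


def keep_confident(predictions, threshold):
    """Keep only predictions whose confidence reaches the threshold."""
    return [p for p in predictions if p[1] >= threshold]


# Hypothetical calibration data for two illustrative reaction classes.
calibration = {
    "amide coupling": [0.1, 0.2, 0.3, 0.4],
    "Suzuki coupling": [0.1, 0.2, 0.3, 0.4],
}
prediction = predict_with_confidence(
    {"amide coupling": 0.95, "Suzuki coupling": 0.05}, calibration
)
```

Raising the threshold passed to `keep_confident` discards more uncertain classifications in exchange for a higher share of correct predictions among those kept, which is the trade-off the abstract reports.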


Subjects
Chemistry, Pharmaceutical , Databases, Chemical , Machine Learning , Models, Chemical , Molecular Structure
3.
Mol Inform ; 43(4): e202300183, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38258328

ABSTRACT

De novo design has been a hotly pursued topic for many years. Most recent developments have involved the use of deep learning methods for generative molecular design. Despite increasing levels of algorithmic sophistication, the design of molecules that are synthetically accessible remains a major challenge. Reaction-based de novo design takes a conceptually simpler approach and aims to address synthesisability directly by mimicking synthetic chemistry and driving structural transformations by known reactions that are applied in a stepwise manner. However, the use of a small number of hand-coded transformations restricts the chemical space that can be accessed and there are few examples in the literature where molecules and their synthetic routes have been designed and executed successfully. Here we describe the application of reaction-based de novo design to the design of synthetically accessible and biologically active compounds as proof-of-concept of our reaction vector-based software. Reaction vectors are derived automatically from known reactions and allow access to a wide region of synthetically accessible chemical space. The design was aimed at producing molecules that are active against PARP1 and which have improved brain penetration properties compared to existing PARP1 inhibitors. We synthesised a selection of the designed molecules according to the provided synthetic routes and tested them experimentally. The results demonstrate that reaction vectors can be applied to the design of novel molecules of biological relevance that are also synthetically accessible.


Subjects
Drug Design , Poly(ADP-ribose) Polymerase Inhibitors , Poly(ADP-ribose) Polymerase Inhibitors/chemistry , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Poly(ADP-ribose) Polymerase Inhibitors/chemical synthesis , Humans , Poly (ADP-Ribose) Polymerase-1/antagonists & inhibitors , Poly (ADP-Ribose) Polymerase-1/metabolism , Software
4.
Methods Mol Biol ; 2390: 125-151, 2022.
Article in English | MEDLINE | ID: mdl-34731467

ABSTRACT

Within the context of the latest resurgence in the application of artificial intelligence approaches, deep learning has undergone a renaissance over recent years. These methods have been applied to a number of problems in computational chemistry. Compared to other machine learning approaches, the practical performance advantages of deep neural networks are often unclear. However, deep learning does appear to offer a number of other advantages such as the facile incorporation of multitask learning and the enhancement of generative modeling. The high complexity of contemporary network architectures represents a potentially significant barrier to their future adoption due to the costs of training such models and challenges in interpreting their predictions. When combined with the relative paucity of very large datasets, it is interesting to reflect on whether deep learning is likely to have the kind of transformational impact on computational chemistry that it is commonly held to have had in other domains such as image recognition.


Subjects
Deep Learning , Computational Chemistry
5.
Mol Inform ; 41(4): e2100207, 2022 04.
Article in English | MEDLINE | ID: mdl-34750989

ABSTRACT

Reaction-based de novo design refers to the generation of synthetically accessible molecules using transformation rules extracted from known reactions in the literature. In this context, we have previously described the extraction of reaction vectors from a reactions database and their coupling with a structure generation algorithm for the generation of novel molecules from a starting material. An issue when designing molecules from a starting material is the combinatorial explosion of possible product molecules that can be generated, especially for multistep syntheses. Here, we present the development of RENATE, a reaction-based de novo design tool, which is based on a pseudo-retrosynthetic fragmentation of a reference ligand and an inside-out approach to de novo design. The reference ligand is fragmented; each fragment is used to search for similar fragments as building blocks; the building blocks are combined into products using reaction vectors; and a synthetic route is suggested for each product molecule. The RENATE methodology is presented followed by a retrospective validation to recreate a set of approved drugs. Results show that RENATE can generate very similar or even identical structures to the corresponding input drugs, hence validating the fragmentation, search, and design heuristics implemented in the tool.


Subjects
Algorithms , Ligands , Retrospective Studies
6.
J Chem Inf Model ; 49(12): 2820-36, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19908874

ABSTRACT

Adenosine receptors are among the most studied targets in medicinal chemistry, and there is growing interest in the different adenosine receptor (AR) subtypes. AR subtype selectivity is highly desirable in the development of potent ligands with therapeutic potential. So far, very few ligand-based strategies have been investigated to predict receptor subtype selectivity. In the present study, we have carried out a novel application of the multilabel classification approach by combining our recently reported autocorrelated molecular descriptors encoding the molecular electrostatic potential (autoMEP) with support vector machines (SVMs). Three models, based on decreasing thresholds of potency, have been generated as quantitative sieves applied in series for the simultaneous prediction of the hA(1)R, hA(2A)R, hA(2B)R, and hA(3)R subtype potency profile and selectivity of a large collection (more than 500) of known inverse agonists such as xanthine, pyrazolo-triazolo-pyrimidine, and triazolo-pyrimidine analogues. The robustness and reliability of our multilabel classification models were assessed by predicting an internal test set. Finally, we have applied our strategy to 13 newly synthesized pyrazolo-triazolo-pyrimidine derivatives, inferring their full adenosine receptor potency spectrum and hAR subtype selectivity profile.


Subjects
Computational Biology , Drug Discovery/methods , Purinergic P1 Receptor Antagonists , Artificial Intelligence , Humans , Protein Subunits/antagonists & inhibitors , Pyrimidines/chemistry , Pyrimidines/pharmacology , Reproducibility of Results , Static Electricity , Substrate Specificity , Time Factors , Xanthine/chemistry , Xanthine/pharmacology
7.
Comb Chem High Throughput Screen ; 13(1): 54-66, 2010 Jan.
Article in English | MEDLINE | ID: mdl-20214575

ABSTRACT

Nature, especially the plant kingdom, is a rich source of novel bioactive compounds that can be used as lead compounds for drug development. In order to exploit this resource, two neural network-based virtual screening techniques, novelty detection with self-organizing maps (SOMs) and counterpropagation neural networks, were evaluated as tools for efficient lead structure discovery. As an application scenario, significant descriptors for acetylcholinesterase (AChE) inhibitors were determined and used for model building, theoretical model validation, and virtual screening. Top-ranked virtual hits from both approaches were docked into the AChE binding site to confirm the initial hits. Finally, in vitro testing of selected compounds led to the identification of forsythoside A and (+)-sesamolin as novel AChE inhibitors.


Subjects
Acetylcholinesterase/metabolism , Biological Products/pharmacology , Cholinesterase Inhibitors/pharmacology , Data Mining/methods , Acetylcholinesterase/chemistry , Biological Products/chemistry , Cholinesterase Inhibitors/chemistry , Drug Discovery , Models, Molecular
8.
J Chem Inf Model ; 48(1): 56-67, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18069808

ABSTRACT

Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) to exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods, cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (ML-kNN), were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model the reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.


Subjects
Plants/chemistry , Plants/classification , Terpenes/chemistry , Terpenes/isolation & purification , Artificial Intelligence , Computer Simulation , Lactones/chemistry , Plants/metabolism , Sesquiterpenes/chemistry , Terpenes/metabolism
9.
J Chem Inf Model ; 47(6): 2044-62, 2007.
Article in English | MEDLINE | ID: mdl-17854167

ABSTRACT

We describe a novel method for ligand-based virtual screening, based on utilizing Self-Organizing Maps (SOM) as a novelty detection device. Novelty detection (or one-class classification) refers to the attempt to identify patterns that do not belong to the space covered by a given data set. In ligand-based virtual screening, chemical structures perceived as novel lie outside the known activity space and can therefore be discarded from further investigation. In this context, the concept of "novel structure" refers to a compound that is unlikely to share the activity of the query structures. Compounds not perceived as "novel" are suspected to share the activity of the query structures. Nowadays, various databases contain active structures, but access to compounds that have been found to be inactive in a biological assay is limited. This work addresses this problem via novelty detection, which does not require proven inactive compounds. The structures are described by spatial autocorrelation functions weighted by atomic physicochemical properties. Different methods for selecting a subset of targets from a larger set are discussed. A comparison with similarity search based on Daylight fingerprints followed by data fusion is presented. The two methods complement each other to a large extent. In a retrospective screening of the WOMBAT database, novelty detection with SOM gave enrichment factors between 105 and 462, an improvement of between 25% and 100% over the similarity search based on Daylight fingerprints, when the 100 top-ranked structures were considered. Novelty detection with SOM is applicable (1) to improve the retrieval of potentially active compounds, also in concert with other virtual screening methods; (2) as a library design tool for discarding a large number of compounds that are unlikely to possess a given biological activity; and (3) for selecting a small number of potentially active compounds from a large data set.
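
The enrichment factors quoted above can be made concrete with a short sketch. It assumes the commonly used definition (hit rate among the top-ranked structures divided by the hit rate in the whole database); the abstract does not state the exact formula used, and the function name and toy ranking below are illustrative only.

```python
def enrichment_factor(ranked_is_active, top_n):
    """Enrichment factor for the first top_n entries of a ranked screening list.

    ranked_is_active: list of booleans, best-ranked compound first,
        True where the compound is a known active.
    """
    total = len(ranked_is_active)
    total_actives = sum(ranked_is_active)
    if total == 0 or total_actives == 0 or top_n <= 0:
        raise ValueError("need a non-empty ranking, at least one active, and top_n > 0")
    top_n = min(top_n, total)
    hits_in_top = sum(ranked_is_active[:top_n])
    # Hit rate in the selection divided by hit rate in the whole database.
    return (hits_in_top / top_n) / (total_actives / total)


# Toy ranking: 1000 compounds, 10 actives, 5 of them ranked in the top 100.
ranking = [False] * 1000
for position in list(range(0, 5)) + list(range(500, 505)):
    ranking[position] = True

ef_top_100 = enrichment_factor(ranking, 100)
```

With this definition, an enrichment factor of 1 corresponds to random selection, so values in the hundreds imply that actives are very rare in the screened database and strongly concentrated at the top of the ranking.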


Subjects
Enzymes/chemistry , Enzymes/metabolism , Enzyme Inhibitors/chemistry , Enzyme Inhibitors/pharmacology , Ligands , Molecular Structure , Time Factors
10.
J Chem Inf Model ; 47(1): 9-19, 2007.
Article in English | MEDLINE | ID: mdl-17238243

ABSTRACT

In a recent publication we described the application of an unsupervised learning method using self-organizing maps to the separation of three tribes and seven subtribes of the plant family Asteraceae based on a set of sesquiterpene lactones (STLs) isolated from individual species. In the present work, two different structure representations, atom counts (2D) and radial distribution function (RDF) (3D), and two supervised classification methods, counterpropagation neural networks and k-nearest neighbors (k-NN), were used to predict the tribe in which a given STL occurs. The data set was extended from 144 to 921 STLs, and the number of Asteraceae tribes was increased from three to seven. The k-NN classifier with k = 1 showed the best performance, while the RDF code outperformed the atom counts. The quality of the obtained model was assessed with two test sets, which exemplified two possible applications: (1) finding a plant source for a desired compound and (2) using a plant species' chemical profile (its STLs) to (a) study the relationship between the current taxonomic classification and the plant's chemistry and (b) assign a species to a tribe by majority vote. In addition, the problem of defining the applicability domain of the models was assessed by means of two different approaches: principal component analysis combined with Hotelling's T² statistic, and an a posteriori probability-based rule.
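
The combination of a k = 1 nearest-neighbor classifier with a majority vote over a species' compounds can be sketched as follows. This is only an illustration under stated assumptions: the two-dimensional descriptor vectors are toy stand-ins for the RDF codes, Euclidean distance is assumed as the similarity measure, and the function names and training data are hypothetical.

```python
import math
from collections import Counter


def nearest_neighbor_tribe(query, training):
    """1-NN: return the tribe of the training compound closest to the query.

    training: list of (descriptor_vector, tribe) pairs.
    """
    _, tribe = min(training, key=lambda item: math.dist(query, item[0]))
    return tribe


def assign_species_to_tribe(species_compounds, training):
    """Classify each compound of a species, then take the majority vote."""
    votes = Counter(nearest_neighbor_tribe(c, training) for c in species_compounds)
    return votes.most_common(1)[0][0]


# Toy training set with made-up descriptors for two Asteraceae tribes.
training_set = [
    ((0.0, 0.0), "Heliantheae"),
    ((1.0, 1.0), "Anthemideae"),
]
# Hypothetical chemical profile of one species: three isolated compounds.
species_profile = [(0.1, 0.0), (0.2, 0.1), (0.9, 1.0)]

predicted_tribe = assign_species_to_tribe(species_profile, training_set)
```

Voting over all compounds of a species makes the tribe assignment tolerant of individual compounds that also occur in other tribes, which the abstracts in this list note is frequent for STLs.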


Subjects
Asteraceae/chemistry , Asteraceae/classification , Lactones , Models, Statistical , Neural Networks, Computer , Sesquiterpenes , Classification , Expert Systems , Models, Biological , Plants/chemistry , Plants/classification , Probability
11.
J Comput Aided Mol Des ; 21(10-11): 617-40, 2007.
Article in English | MEDLINE | ID: mdl-18008169

ABSTRACT

Four different ligand-based virtual screening scenarios are studied: (1) prioritizing compounds for subsequent high-throughput screening (HTS); (2) selecting a predefined (small) number of potentially active compounds from a large chemical database; (3) assessing the probability that a given structure will exhibit a given activity; (4) selecting the most active structure(s) for a biological assay. Each of the four scenarios is exemplified by performing retrospective ligand-based virtual screening for eight different biological targets using two large databases, MDDR and WOMBAT. A comparison between the chemical spaces covered by these two databases is presented. The performance of two techniques for ligand-based virtual screening, similarity search with subsequent data fusion (SSDF) and novelty detection with Self-Organizing Maps (ndSOM), is investigated. Three different structure representations are compared: 2,048-dimensional Daylight fingerprints, topological autocorrelation weighted by atomic physicochemical properties (sigma electronegativity, polarizability, partial charge, and identity), and radial distribution functions weighted by the same atomic physicochemical properties. Both methods were found applicable in scenario one. The similarity search was found to perform slightly better in scenario two, while the SOM novelty detection is preferred in scenario three. No method/descriptor combination achieved significant success in scenario four.


Subjects
Drug Design , Drug Evaluation, Preclinical/methods , User-Computer Interface , Artificial Intelligence , Computer-Aided Design , Databases, Factual , Drug Evaluation, Preclinical/statistics & numerical data , Ligands , ROC Curve , Structure-Activity Relationship
12.
Talanta ; 59(1): 123-36, 2003 Jan 02.
Article in English | MEDLINE | ID: mdl-18968892

ABSTRACT

An automated, fast and reliable procedure has been developed for flame atomic absorption analysis of Ca, Fe and Mn in moss. The method is suitable for routine analysis of a large number of moss samples and allows sequential determination of all three elements in the same solution. To suppress matrix interference on Ca and to level out the diverse analytical behaviour of the moss matrix, approximately 1% La was added to both samples and standard solutions. An integrated system of 'sandwich-type' air-segmented discrete sample introduction and flame atomic absorption detection (ASDI-FAAS) was successfully applied. It works in 'solvent-air-sample-air-solvent' mode, which tolerates the introduction of solutions with high salt content, reduces reagent and sample consumption, and allows data treatment models to be applied to pseudo-steady-state signals to improve repeatability. For moss samples containing high Ca and Fe concentrations, an equivalent procedure was used with the burner head turned by 45 degrees, without worsening the analytical characteristics. For these three elements, the method is suggested as a cheaper, easier and more reliable alternative, with better precision, to inductively coupled plasma-mass spectrometry (ICP-MS). The ASDI-FAAS results were used to select appropriate isotopes and correction procedures for ICP-MS determination. Both methods show good agreement for the Ca, Fe and Mn results obtained on the moss reference materials tested.
