ABSTRACT
We have adopted and extended the CHMTRN language and used it for the knowledge base of a computer program to generate a large database of synthetically accessible, drug-like chemical structures, the Synthetically Accessible Virtual Inventory (SAVI) Database. CHMTRN is a powerful language originally developed in the LHASA (Logic and Heuristics Applied to Synthetic Analysis) project at Harvard University and used together with the chemical pattern description language, PATRAN, to describe chemical retro-reactions. The languages have proven to be useful beyond the design of retrosynthetic routes and have the potential for much wider use in chemistry; this paper describes CHMTRN and PATRAN as now reimplemented for the forward-synthetic SAVI project but able to describe both forward and retro-reactions.
Subject(s)
Combinatorial Chemistry Techniques; Software; Databases, Factual; Humans
ABSTRACT
Knowledge-based systems for toxicity prediction are typically based on rules, known as structural alerts, that describe relationships between structural features and different toxic effects. The identification of structural features associated with toxicological activity can be a time-consuming process and often requires significant input from domain experts. Here, we describe an emerging pattern mining method for the automated identification of activating structural features in toxicity data sets that is designed to help expedite the process of alert development. We apply the contrast pattern tree mining algorithm to generate a set of emerging patterns of structural fragment descriptors. Using the emerging patterns it is possible to form hierarchical clusters of compounds that are defined by the presence of common structural features and represent distinct chemical classes. The method has been tested on a large public in vitro mutagenicity data set and a public hERG channel inhibition data set and is shown to be effective at identifying common toxic features and recognizable classes of toxicants. We also describe how knowledge developers can use emerging patterns to improve the specificity and sensitivity of an existing expert system.
Subject(s)
Data Mining/methods; Toxicology; Algorithms; Endpoint Determination; Ether-A-Go-Go Potassium Channels/antagonists & inhibitors; Mutagenicity Tests; Potassium Channel Blockers/toxicity
ABSTRACT
BACKGROUND: In Cameroon, herbs are traditionally used to meet health care needs, and plans are under way to integrate traditional medicine into the health care system, although these plans have not yet been put into action. The country has a rich biodiversity, with ~8,620 plant species, some of which are commonly used in the treatment of several microbial infections and a range of diseases (malaria, trypanosomiasis, leishmaniasis, diabetes and tuberculosis). METHODS: Our survey consisted of collecting published data from literature sources, mainly PhD theses in Cameroonian university libraries, and also of author queries in major natural product and medicinal chemistry journals. The collected data include plant sources, uses of plant material in traditional medicine, plant families, region of collection of plant material, isolated metabolites and type (e.g. flavonoid, terpenoid, etc.), measured biological activities of isolated compounds, and any comments on the significance of isolated metabolites for the chemotaxonomic classification of the plant species. The data were compiled in an Excel sheet and analysed. RESULTS: In this study, a literature survey led to the collection of data on 2,700 secondary metabolites that have previously been isolated or derived from Cameroonian medicinal plants. These represent distinct phytochemicals derived from 312 plant species belonging to 67 plant families. The plant species are investigated in terms of chemical composition with respect to the various plant families. A correlation between the known biological activities of isolated compounds and the ethnobotanical uses of the plants is also attempted. Insight into future directions for natural product searches within the Cameroonian forest and savanna is provided.
CONCLUSIONS: It can be verified that a phytochemical search for active secondary metabolites, inspired by knowledge of the ethnobotanical uses of medicinal plants, could be vital in a drug discovery program based on plant-derived bioactive compounds.
Subject(s)
Plant Extracts/pharmacology; Plants, Medicinal/chemistry; Plants, Medicinal/classification; Cameroon; Databases, Bibliographic; Ethnobotany; Humans; Medicine, Traditional; Phytotherapy; Plant Extracts/analysis; Plant Extracts/metabolism; Plants, Medicinal/metabolism
ABSTRACT
The design of new alerts, that is, collections of structural features observed to result in toxicological activity, can be a slow process and may require significant input from toxicology and chemistry experts. A method has therefore been developed to help automate alert identification by mining descriptions of activating structural features directly from toxicity data sets. The method is based on jumping emerging pattern mining which is applied to a set of toxic and nontoxic compounds that are represented using atom pair descriptors. Using the resulting jumping emerging patterns, it is possible to cluster toxic compounds into groups defined by the presence of shared structural features and to arrange the clusters into hierarchies. The methodology has been tested on a number of data sets for Ames mutagenicity, oestrogenicity, and hERG channel inhibition end points. These tests have shown the method to be effective at clustering the data sets around minimal jumping-emerging structural patterns and finding descriptions of potentially activating structural features. Furthermore, the mined structural features have been shown to be related to some of the known alerts for all three tested end points.
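The idea of a jumping emerging pattern can be illustrated with a minimal Python sketch. It assumes each compound is already reduced to a set of descriptor labels (the strings below are hypothetical stand-ins for atom pair descriptors) and uses a brute-force search rather than the mining algorithm of the paper, which the abstract does not specify. A pattern "jumps" when it occurs in at least one toxic compound and in no nontoxic compound; only minimal patterns are kept.

```python
from itertools import combinations


def support(pattern, dataset):
    """Fraction of compounds whose descriptor set contains every item of pattern."""
    if not dataset:
        return 0.0
    return sum(1 for descriptors in dataset if pattern <= descriptors) / len(dataset)


def minimal_jumping_emerging_patterns(toxic, nontoxic, max_size=3):
    """Brute-force search for minimal patterns present in toxic and absent from nontoxic.

    Illustrative only: real data sets need a dedicated mining algorithm.
    """
    items = sorted(set().union(*toxic))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            # Skip supersets of a pattern already found: they are not minimal.
            if any(smaller <= pattern for smaller in found):
                continue
            if support(pattern, toxic) > 0 and support(pattern, nontoxic) == 0:
                found.append(pattern)
    return found


# Hypothetical descriptor sets, not taken from any of the cited data sets.
toxic = [{"N-O", "C-N", "C-C"}, {"N-O", "C-C"}, {"C-Cl", "C-C", "C-N"}]
nontoxic = [{"C-C", "C-N"}, {"C-C", "C-O"}]
patterns = minimal_jumping_emerging_patterns(toxic, nontoxic)
```

Grouping the toxic compounds by the minimal pattern they contain gives the kind of feature-defined clusters the abstract describes.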
Subject(s)
Data Mining/methods; Estrogens/chemistry; Mutagens/chemistry; Pattern Recognition, Automated/methods; Cluster Analysis; Estrogens/toxicity; Ether-A-Go-Go Potassium Channels/antagonists & inhibitors; Humans; Mutagens/toxicity
Subject(s)
Drug Contamination/statistics & numerical data; Drug and Narcotic Control/statistics & numerical data; Expert Systems; Models, Statistical; Mutagenicity Tests/statistics & numerical data; Mutagens/adverse effects; Patient Safety/statistics & numerical data; Animals; Computer Simulation; Data Interpretation, Statistical; Drug Contamination/legislation & jurisprudence; Drug and Narcotic Control/legislation & jurisprudence; Humans; Molecular Structure; Mutagens/analysis; Mutagens/chemistry; Mutagens/classification; Patient Safety/legislation & jurisprudence; Policy Making; Quantitative Structure-Activity Relationship; Reproducibility of Results; Risk Assessment
ABSTRACT
We have made available a database of over 1 billion compounds predicted to be easily synthesizable, called the Synthetically Accessible Virtual Inventory (SAVI). The compounds were created by a set of transforms based on an adaptation and extension of the CHMTRN/PATRAN programming languages describing chemical synthesis expert knowledge, which originally stem from the LHASA project. The chemoinformatics toolkit CACTVS was used to apply a total of 53 transforms to about 150,000 readily available building blocks (enamine.net). Only single-step, two-reactant syntheses were calculated for this database even though the technology can execute multi-step reactions. The ability to incorporate scoring systems in CHMTRN allowed us to subdivide the database of 1.75 billion compounds into sets according to their predicted synthesizability, with the most-synthesizable class comprising 1.09 billion synthetic products. Properties calculated for all SAVI products show that the database should be well suited for drug discovery. It is being made publicly available for free download from https://doi.org/10.35115/37n9-5738.
ABSTRACT
The applicability domain of a (quantitative) structure-activity relationship ([Q]SAR) must be defined, if a model is to be used successfully for toxicity prediction, particularly for regulatory purposes. Previous efforts to set guidelines on the definition of applicability domains have often been biased toward quantitative, rather than qualitative, models. As a result, novel techniques are still required to define the applicability domains of structural alert models and knowledge-based systems. By using Derek for Windows as an example, this study defined the domain for the skin sensitisation structural alert rule-base. This was achieved by fragmenting the molecules within a training set of compounds, then searching the fragments for those created from a test compound. This novel method was able to highlight test chemicals which differed from those in the training set. The information was then used to designate chemicals as being either within or outside the domain of applicability for the structural alert on which that training set was based.
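The fragment-lookup idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the published procedure: molecules are assumed to be already fragmented into sets of fragment strings (the fragmentation scheme is not given in the abstract), and the decision rule used here, that a test compound is in domain only if every one of its fragments was seen in the training set, is an assumption.

```python
def build_fragment_pool(training_fragment_sets):
    """Collect every fragment seen in the training compounds for one alert."""
    pool = set()
    for fragments in training_fragment_sets:
        pool.update(fragments)
    return pool


def unseen_fragments(test_fragments, pool):
    """Fragments of the test compound that never occur in the training set."""
    return set(test_fragments) - pool


def in_domain(test_fragments, pool):
    """Assumed rule: in domain only when no fragment of the test compound is new."""
    return not unseen_fragments(test_fragments, pool)


# Hypothetical fragment strings used purely for illustration.
pool = build_fragment_pool([{"c1ccccc1", "C=O"}, {"C=O", "CN"}])
familiar = {"C=O", "CN"}
novel = {"C=O", "N=N"}
```

Reporting the unseen fragments, rather than only a yes/no answer, shows the user which part of the test chemical differs from the training set.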
Subject(s)
Expert Systems; Models, Chemical; Quantitative Structure-Activity Relationship; Toxicology/methods; Animal Testing Alternatives/methods; Humans; Skin Irritancy Tests/methods; Toxicity Tests/methods
ABSTRACT
Molecular modeling has been employed in the search for lead compounds for cancer chemotherapy. In this study, pharmacophore models have been generated and validated for use in virtual screening protocols for eight known anticancer drug targets, including tyrosine kinase, protein kinase B β, cyclin-dependent kinase, protein farnesyltransferase, human protein kinase, glycogen synthase kinase, and indoleamine 2,3-dioxygenase 1. Pharmacophore models were validated through receiver operating characteristic and Güner-Henry scoring methods, indicating that several of the models generated could be useful for the identification of potential anticancer agents from natural product databases. The validated pharmacophore models were used as three-dimensional search queries for virtual screening of the newly developed AfroCancer database (~400 compounds from African medicinal plants), along with the Naturally Occurring Plant-based Anticancer Compound-Activity-Target dataset (comprising ~1,500 published naturally occurring plant-based compounds from around the world). Additionally, an in silico assessment of the toxicity of the two datasets was carried out using 88 toxicity end points predicted by Lhasa's expert knowledge-based system (Derek), showing that only an insignificant proportion of the promising anticancer agents would be likely to show high toxicity profiles. A diversity study of the two datasets, carried out using principal component analysis of the most important physicochemical properties often used to assess the drug-likeness of compound datasets, showed that the two datasets do not occupy the same chemical space.
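For readers unfamiliar with the Güner-Henry (goodness of hit) score, a minimal Python sketch of the commonly quoted form of the formula follows. The function and parameter names are illustrative choices and the numbers in the example are invented; nothing here reproduces the validation sets of the study.

```python
def guner_henry_score(active_hits, total_hits, actives_in_db, db_size):
    """Goodness-of-hit score: GH = [Ha(3A + Ht) / (4 Ht A)] * [1 - (Ht - Ha) / (D - A)].

    Ha = active_hits, Ht = total_hits, A = actives_in_db, D = db_size.
    """
    if total_hits == 0 or actives_in_db == 0 or db_size == actives_in_db:
        raise ValueError("score is undefined for empty hit lists or decoy-free databases")
    # Weighted mix of hit-list yield (3/4) and recall of actives (1/4).
    yield_and_recall = active_hits * (3 * actives_in_db + total_hits) / (
        4 * total_hits * actives_in_db
    )
    # Penalty for the fraction of inactive compounds that were retrieved.
    false_positive_penalty = 1 - (total_hits - active_hits) / (db_size - actives_in_db)
    return yield_and_recall * false_positive_penalty


# Invented example: 60 hits from a 1,000-compound set holding 50 actives, 40 of them retrieved.
example_score = guner_henry_score(active_hits=40, total_hits=60, actives_in_db=50, db_size=1000)
```

The score ranges from 0 to 1, with 1 corresponding to a hit list that contains all actives and nothing else.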
Subject(s)
Computer Simulation; Plants, Medicinal/chemistry; Proto-Oncogene Proteins c-akt/chemistry; Proto-Oncogene Proteins c-akt/pharmacology; Antineoplastic Agents/administration & dosage; Antineoplastic Agents/chemistry; Databases, Factual; Drug Design; Humans; Models, Molecular; Proto-Oncogene Proteins c-akt/metabolism
ABSTRACT
A pilot toxicology database system has been created which is accessible on-line via the world-wide web or in-house via an intranet. It is intended to be suitable as a source of toxicological information and to support structure-activity relationship studies, and it can be searched on chemical structural and substructural as well as toxicological and physico-chemical data. Successful completion of the pilot has led to an ongoing project to develop and expand the system.
Subject(s)
Databases as Topic; Toxicology; Feasibility Studies; International Cooperation; Pilot Projects; Software; Structure-Activity Relationship
ABSTRACT
The paper begins with a discussion of the goals of metabolic predictions in early drug research, and of some difficulties in meeting this objective, mainly the various substrate and product selectivities characteristic of drug metabolism. The major in silico approaches to predicting drug metabolism are then classified and summarized, and a distinction is made between 'local' and 'global' systems. The second part reports an evaluation of METEOR, a rule-based expert system used to predict the metabolism of drugs and other xenobiotics. The published metabolic data of ten substrates were used in this evaluation, the overall results being discussed in terms of correct vs. disputable (i.e., false-positive and false-negative) predictions. The predictions for four representative substrates are presented in detail (Figs. 1-4), illustrating the value of such an evaluation in identifying where and how predictive rules can be improved.
Subject(s)
Expert Systems; Galantamine/metabolism; Indinavir/metabolism; Pyridines/metabolism; Thiazepines/metabolism; Tramadol/metabolism; Analgesics, Opioid/metabolism; Cardiovascular Agents/metabolism; Computer Simulation; HIV Protease Inhibitors/metabolism; Humans; Molecular Structure; Parasympathomimetics/metabolism; Software
ABSTRACT
A previous paper¹ described new metrics, veracity and utility, for assessing the performance of toxicity prediction systems that report confidence in their predictions. Assessing the performance of systems that predict mammalian metabolism is complicated by the absence of comprehensive sets of negative observations and predictions. This paper presents an approach to assessing the performance of such systems using veracity and utility.
Subject(s)
Metabolome/physiology; Models, Biological; Humans
ABSTRACT
This paper suggests guidelines for good computer modelling practice (GCMP) when predicting chemical toxicity, with similar purposes to those for Good Laboratory Practice (GLP). The purpose of GCMP is not to specify what should be delivered with models or predictions but to set out what must be done to ensure that work can be audited, on site, in a way analogous to the auditing of studies conforming to GLP; it is intended to confirm that work has been done properly, as distinct from providing advice on how to do it. Comments are made on the guidelines and how they might be followed, based on practical experience with the implementation of such a scheme in the development of knowledge-based and quantitative structure activity relationship models. It is hoped that publication of this paper will encourage wider discussion of the subject leading to adoption of measures to ensure the trustworthiness of computer modelling work that is carried out in connection with regulatory submissions.
Subject(s)
Computer Simulation/standards; Models, Molecular
ABSTRACT
Traditional medicinal practices play a key role in health care systems in countries with developing economies. The aim of this survey was to validate the use of traditional medicine within local Nigerian communities. In this review, we examine the ethnobotanical uses of selected plant species from the Nigerian flora and attempt to correlate the activities of the isolated bioactive principles with known uses of the plant species in African traditional medicine. Thirty-three (33) plant species were identified and about 100 out of the 120 compounds identified with these plants matched with the ethnobotanical uses of the plants.
ABSTRACT
We attempt to evaluate the "drug-likeness" of a collection of ~1500 natural products, exhibiting in vitro or in vivo activities against cancers of various forms, by using a set of calculated molecular descriptors. Compliance with Lipinski's "Rule of Five" and Jorgensen's "Rule of Three" has been used to assess oral availability, by making use of popular parameters such as molecular weights, predicted lipophilicities, number of hydrogen bond donors/acceptors, predicted aqueous solubilities, number of primary metabolites and Caco-2 permeabilities. Meanwhile, 24 descriptors have been used to predict properties related to absorption, distribution, metabolism, elimination, and toxicity (ADMET). The ADMET profiles of the anticancer natural products have been analyzed in comparison with the range of properties for 95 % of known drugs. Our results show that the computed parameters fall within the recommended range for about 42 % of the studied compounds, while 63 % and 69 % of the corresponding 'drug-like' and 'lead-like' subsets, respectively, had properties predicted to fall within the recommended range for 95 % of known drugs. The aim is to give a picture of how drug-like these compounds are and to bring out the need to return to natural sources when searching for anticancer lead compounds.
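As an illustration of the kind of compliance check mentioned above, the following Python sketch counts violations of Lipinski's Rule of Five from precomputed descriptors. The thresholds are the standard ones (molecular weight ≤ 500, log P ≤ 5, at most 5 hydrogen bond donors and 10 acceptors); the `Descriptors` class, the field names, the tolerance of one violation and the example values are assumptions made for this sketch and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    mol_weight: float       # in Da
    log_p: float            # predicted octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d):
    """Count how many of the four Lipinski criteria the compound breaks."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_rule_of_five_compliant(d, max_violations=1):
    """Common convention (an assumption here): tolerate at most one violation."""
    return rule_of_five_violations(d) <= max_violations


# Invented descriptor values for a small and a large compound.
small = Descriptors(mol_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(mol_weight=720.0, log_p=6.1, h_bond_donors=6, h_bond_acceptors=12)
```

Applying the same counting pattern to a longer list of descriptor ranges gives the per-compound violation counts from which percentages such as those quoted above are derived.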
Subject(s)
Antineoplastic Agents, Phytogenic/chemistry; Computer Simulation; Models, Biological; Antineoplastic Agents, Phytogenic/pharmacokinetics; Biological Availability; Blood Proteins/chemistry; Blood-Brain Barrier; Databases, Chemical; Drug Screening Assays, Antitumor; Humans; Hydrogen Bonding; Molecular Weight; Permeability; Protein Binding
ABSTRACT
BACKGROUND: Drug metabolism and pharmacokinetic (DMPK) assessment has come to occupy a place of interest during the early stages of drug discovery today. Computer-based methods are slowly gaining ground in this area and are often used as initial tools to eliminate compounds likely to present uninteresting pharmacokinetic profiles and unacceptable levels of toxicity from the list of potential drug candidates, hence cutting down the cost of the discovery of a drug. RESULTS: In the present study, we present an in silico assessment of the DMPK profile of our recently published natural products database of 1,859 unique compounds derived from 224 species of medicinal plants from the Cameroonian forest. In this analysis, we have used 46 computed physico-chemical properties or molecular descriptors to predict the absorption, distribution, metabolism and elimination (ADME) of the compounds. This survey demonstrated that about 50% of the compounds within the Cameroonian medicinal plant and natural products (CamMedNP) database are compliant, having properties which fall within the range of ADME properties of >95% of currently known drugs, while >73% of the compounds have ≤2 violations. Moreover, about 72% of the compounds within the corresponding 'drug-like' subset showed compliance. CONCLUSIONS: In addition to the previously verified levels of 'drug-likeness' and the diversity and the wide range of measured biological activities, the compounds in the CamMedNP database show interesting DMPK profiles and, hence, could represent an important starting point for hit/lead discovery from medicinal plants in Africa.
ABSTRACT
PURPOSE: Drug metabolism and pharmacokinetics (DMPK) assessment has come to occupy a place of interest during the early stages of drug discovery today. This study uses computer modelling to predict the DMPK and toxicity properties of a natural product library derived from medicinal plants from Central Africa (named ConMedNP). Material from some of the plant sources is currently employed in African Traditional Medicine. METHODS: Computer-based methods are slowly gaining ground in this area and are often used as preliminary criteria for the elimination of compounds likely to present uninteresting pharmacokinetic profiles and unacceptable levels of toxicity from the list of potential drug candidates, hence cutting down the cost of discovery of a drug. In the present study, we present an in silico assessment of the DMPK and toxicity profile of a natural product library containing ~3,200 compounds, derived from 379 species of medicinal plants from 10 countries in the Congo Basin forests and savannas, which have been published in the literature. In this analysis, we have used 46 computed physico-chemical properties or molecular descriptors to predict the absorption, distribution, metabolism, elimination and toxicity (ADMET) of the compounds. RESULTS: This survey demonstrated that about 45% of the compounds within the ConMedNP compound library are compliant, having properties which fall within the range of ADME properties of 95% of currently known drugs, while about 69% of the compounds have ≤ 2 violations. Moreover, about 73% of the compounds within the corresponding "drug-like" subset showed compliance. CONCLUSIONS: In addition to the verified levels of "drug-likeness", diversity and the wide range of measured biological activities, the compounds from medicinal plants in Central Africa show interesting DMPK profiles and hence could represent an important starting point for hit/lead discovery.
ABSTRACT
Prediction of mutagenicity by computer is now routinely used in research and by regulatory authorities. Broadly, two different approaches are in wide use. The first is based on statistical analysis of data to find patterns associated with mutagenic activity. The resultant models are generally termed quantitative structure-activity relationships (QSAR). The second is based on capturing human knowledge about the causes of mutagenicity and applying it in ways that mimic human reasoning. These systems are generally called knowledge-based systems. Other methods for finding patterns in data, such as the application of neural networks, are in use but less widely so.
Subject(s)
Expert Systems; Mutagens/chemistry; Mutagens/toxicity; Quantitative Structure-Activity Relationship; Animals; Computer Simulation; Humans; Models, Biological; Mutagens/metabolism; Neural Networks, Computer; Software
ABSTRACT
Foreign substances can have a dramatic and unpredictable adverse effect on human health. In the development of new therapeutic agents, it is essential that the potential adverse effects of all candidates be identified as early as possible. The field of predictive toxicology strives to profile the potential for adverse effects of novel chemical substances before they occur, both with traditional in vivo experimental approaches and increasingly through the development of in vitro and computational methods which can supplement and reduce the need for animal testing. To be maximally effective, the field needs access to the largest possible knowledge base of previous toxicology findings, and such results need to be made available in such a fashion so as to be interoperable, comparable, and compatible with standard toolkits. This necessitates the development of open, public, computable, and standardized toxicology vocabularies and ontologies so as to support the applications required by in silico, in vitro, and in vivo toxicology methods and related analysis and reporting activities. Such ontology development will support data management, model building, integrated analysis, validation and reporting, including regulatory reporting and alternative testing submission requirements as required by guidelines such as the REACH legislation, leading to new scientific advances in a mechanistically-based predictive toxicology. Numerous existing ontology and standards initiatives can contribute to the creation of a toxicology ontology supporting the needs of predictive toxicology and risk assessment. Additionally, new ontologies are needed to satisfy practical use cases and scenarios where gaps currently exist. Developing and integrating these resources will require a well-coordinated and sustained effort across numerous stakeholders engaged in a public-private partnership. 
In this communication, we set out a roadmap for the development of an integrated toxicology ontology, harnessing existing resources where applicable. We describe the stakeholders' requirements analysis from the academic and industry perspectives, timelines, and expected benefits of this initiative, with a view to engagement with the wider community.
Subject(s)
Toxicology/methods; Vocabulary, Controlled; Animal Testing Alternatives; Animals; Computational Biology; Databases, Factual; Humans; Research; Risk Assessment; Toxicology/economics; Toxicology/legislation & jurisprudence
ABSTRACT
The field of predictive toxicology requires the development of open, public, computable, standardized toxicology vocabularies and ontologies to support the applications required by in silico, in vitro, and in vivo toxicology methods and related analysis and reporting activities. In this article we review ontology developments based on a set of perspectives showing how ontologies are being used in predictive toxicology initiatives and applications. Perspectives on resources and initiatives reviewed include OpenTox, eTOX, Pistoia Alliance, ToxWiz, Virtual Liver, EU-ADR, BEL, ToxML, and Bioclipse. We also review existing ontology developments in neighboring fields that can contribute to establishing an ontological framework for predictive toxicology. A significant set of resources is already available to provide a foundation for an ontological framework for 21st century mechanistic-based toxicology research. Ontologies such as ToxWiz provide a basis for application to toxicology investigations, whereas other ontologies under development in the biological, chemical, and biomedical communities could be incorporated in an extended future framework. OpenTox has provided a semantic web framework for the implementation of such ontologies into software applications and linked data resources. Bioclipse developers have shown the benefit of interoperability obtained through ontology by being able to link their workbench application with remote OpenTox web services. Although these developments are promising, an increased international coordination of efforts is greatly needed to develop a more unified, standardized, and open toxicology ontology framework.
Subject(s)
Toxicology/methods; Vocabulary, Controlled; Animals; Databases, Factual; Gene Expression Regulation/drug effects; Humans
ABSTRACT
Integrated testing strategies are an important and useful approach to reduce animal usage in toxicity testing. Increased usage of integrated testing strategies is foreseen in current chemical legislation, e.g. REACH. Skin sensitisation is a well studied endpoint and many in silico models have been developed for the prediction of the skin sensitising potential of chemicals. This paper discusses the use of the OECD (Q)SAR Application Toolbox, Derek for Windows, the CAESAR global model and SMARTS rules for reactivity within a weight of evidence approach to predict skin sensitisation. Conclusions drawn from a weight of evidence approach can be used within an integrated testing strategy to reduce the requirement for in vivo tests. Using all four models in this manner enabled 76% of the conclusive predictions made regarding the test data to be in agreement with the observed toxicities. In addition, using all four models in conjunction identified areas where further information is required, as confounding results were produced. The actual data requirements for an integrated testing strategy are discussed along with what considerations need to be made for the remaining compounds that were misclassified or for which the programs contradicted one another and a definitive conclusion could not be reached.
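One way to picture a weight of evidence combination is the Python sketch below. The combination rule is an assumption made for illustration (the abstract does not state how the four predictions were weighed): a call is made only when every model that gives a prediction agrees, and any contradiction is reported as inconclusive. The dictionary keys are plain labels, not calls into any of the named tools.

```python
def weight_of_evidence(predictions):
    """Combine per-model calls: True = sensitiser, False = non-sensitiser, None = no prediction.

    Assumed rule: unanimous conclusive predictions give a call; anything else is inconclusive.
    """
    conclusive = {call for call in predictions.values() if call is not None}
    if len(conclusive) == 1:
        return "sensitiser" if conclusive.pop() else "non-sensitiser"
    return "inconclusive"


def agreement_rate(calls, observed):
    """Share of conclusive calls that match the observed outcome."""
    pairs = [(c, o) for c, o in zip(calls, observed) if c != "inconclusive"]
    if not pairs:
        return 0.0
    return sum(1 for c, o in pairs if c == o) / len(pairs)


# Invented predictions for two hypothetical chemicals.
agreeing = {"Toolbox": True, "Derek": True, "CAESAR": None, "SMARTS": True}
conflicting = {"Toolbox": True, "Derek": False, "CAESAR": True, "SMARTS": None}
calls = [weight_of_evidence(agreeing), weight_of_evidence(conflicting)]
```

Chemicals that come out as inconclusive are the ones for which, as the abstract notes, further information would be needed before a definitive conclusion is reached.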