ABSTRACT
Multiple approaches to quantitative structure-activity relationship (QSAR) modeling using various statistical or machine learning techniques and different types of chemical descriptors have been developed over the years. Models are often combined in consensus to make more accurate predictions, at the expense of model interpretability. We propose a simple, fast, and reliable method termed Multi-Descriptor Read Across (MuDRA) for developing models that are both accurate and interpretable. The method is conceptually related to the well-known k-nearest neighbors (kNN) approach but uses different types of chemical descriptors simultaneously for similarity assessment. To benchmark the new method, we have built MuDRA models for six different end points (Ames mutagenicity, aquatic toxicity, hepatotoxicity, hERG liability, skin sensitization, and endocrine disruption) and compared the results with those generated with conventional consensus QSAR modeling. We find that models built with MuDRA show consistently high external accuracy similar to that of conventional QSAR models. However, MuDRA models excel in terms of transparency, interpretability, and computational efficiency. We posit that due to its methodological simplicity and reliable predictive accuracy, MuDRA provides a powerful alternative to much more complex consensus QSAR modeling. MuDRA is implemented and freely available at the Chembench web portal (https://chembench.mml.unc.edu/mudra).
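The idea of a kNN-like prediction that assesses similarity in several descriptor spaces at once can be illustrated with a minimal sketch. This is not the published MuDRA algorithm: the set-based fingerprints, the Tanimoto similarity, the similarity-weighted averaging, and all names (`read_across`, `fingerprint_1`, `cmpd_A`, and so on) are assumptions chosen for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical representation: descriptor type -> set of "on" fingerprint bits.
Fingerprints = Dict[str, Set[int]]


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def read_across(
    query: Fingerprints,
    training: List[Tuple[str, Fingerprints, float]],
    k: int = 1,
) -> Tuple[Optional[float], List[Tuple[str, str, float, float]]]:
    """Pool the k nearest neighbors found separately in each descriptor
    space and return their similarity-weighted mean activity, together
    with the neighbors themselves (kept so the prediction can be inspected)."""
    neighbors = []  # (descriptor type, compound name, similarity, activity)
    for dtype, query_fp in query.items():
        scored = [
            (tanimoto(query_fp, fps[dtype]), name, activity)
            for name, fps, activity in training
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        for sim, name, activity in scored[:k]:
            neighbors.append((dtype, name, sim, activity))

    total = sum(sim for _, _, sim, _ in neighbors)
    if total == 0:
        return None, neighbors  # nothing similar enough to read across from
    prediction = sum(sim * act for _, _, sim, act in neighbors) / total
    return prediction, neighbors


# Toy data: two made-up compounds, two made-up descriptor types.
training = [
    ("cmpd_A", {"fingerprint_1": {1, 2, 3}, "fingerprint_2": {10, 11}}, 1.0),
    ("cmpd_B", {"fingerprint_1": {4, 5, 6}, "fingerprint_2": {10, 11, 12}}, 0.0),
]
query = {"fingerprint_1": {1, 2, 3}, "fingerprint_2": {10, 11, 12}}

prediction, neighbors = read_across(query, training, k=1)
print(prediction)
for dtype, name, sim, activity in neighbors:
    print(dtype, name, round(sim, 2), activity)
```

Returning the neighbors alongside the prediction is what makes this kind of model easy to interpret: each prediction can be traced back to specific similar compounds in a specific descriptor space.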
Subject(s)
Quantitative Structure-Activity Relationship, Algorithms, Factual Databases, Humans, Internet, Biological Models, Mutagens/toxicity, Software, Toxicity Tests
ABSTRACT
Elucidation of the mechanistic relationships between drugs, their targets, and diseases is at the core of modern drug discovery research. Thousands of studies relevant to the drug-target-disease (DTD) triangle have been published and annotated in the Medline/PubMed database. Mining this database affords rapid identification of all published studies that confirm connections between vertices of this triangle or enable new inferences of such connections. To this end, we describe the development of Chemotext, a publicly available Web server that mines the entire compendium of published literature in PubMed annotated with Medical Subject Headings (MeSH) terms. The goal of Chemotext is to identify all known DTD relationships and infer missing links between vertices of the DTD triangle. As a proof of concept, we show that Chemotext could be instrumental in generating new drug repurposing hypotheses or annotating clinical outcome pathways for known drugs. The Chemotext Web server is freely available at http://chemotext.mml.unc.edu.
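Inferring a missing edge of the DTD triangle from annotated articles can be sketched as follows. This is only an illustration under stated assumptions, not Chemotext's actual implementation: the input format (one dictionary of term sets per article), the co-occurrence rule, and every name in the example (`infer_missing_links`, `drug_X`, `target_T`, `disease_A`, `disease_B`) are hypothetical.

```python
from collections import defaultdict
from itertools import product


def infer_missing_links(articles):
    """Suggest drug-disease pairs that never co-occur in one article
    but are bridged by at least one shared target.

    `articles` is an iterable of dicts with the keys 'drugs', 'targets'
    and 'diseases', each holding a set of annotation terms.
    Returns {(drug, disease): set of bridging targets}.
    """
    drug_targets = defaultdict(set)     # drug -> targets it co-occurs with
    target_diseases = defaultdict(set)  # target -> diseases it co-occurs with
    known = set()                       # (drug, disease) pairs seen together

    for article in articles:
        for drug, target in product(article["drugs"], article["targets"]):
            drug_targets[drug].add(target)
        for target, disease in product(article["targets"], article["diseases"]):
            target_diseases[target].add(disease)
        known.update(product(article["drugs"], article["diseases"]))

    hypotheses = defaultdict(set)
    for drug, targets in drug_targets.items():
        for target in targets:
            for disease in target_diseases.get(target, ()):
                if (drug, disease) not in known:
                    hypotheses[(drug, disease)].add(target)
    return dict(hypotheses)


# Toy corpus with placeholder terms (not real annotations).
articles = [
    {"drugs": {"drug_X"}, "targets": {"target_T"}, "diseases": {"disease_A"}},
    {"drugs": set(), "targets": {"target_T"}, "diseases": {"disease_B"}},
]

links = infer_missing_links(articles)
print(links)
```

In this toy corpus the pair (`drug_X`, `disease_A`) is already known, so only (`drug_X`, `disease_B`) is proposed, with `target_T` as the bridging vertex. A real system would also need to rank such hypotheses, for example by the number of supporting articles.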
Subject(s)
Data Mining/methods, Chemical Databases, Drug Delivery Systems, Drug Therapy, Internet, Medical Subject Headings, PubMed, Drug Discovery, Humans, Programming Languages, User-Computer Interface
ABSTRACT
The enormous increase in the amount of publicly available chemical genomics data and the growing emphasis on data sharing and open science mandate that cheminformaticians also make their models publicly available for broad use by the scientific community. Chembench is one of the first publicly accessible, integrated cheminformatics Web portals. It has been extensively used by researchers from different fields for curation, visualization, analysis, and modeling of chemogenomics data. Since its launch in 2008, Chembench has been accessed more than 1 million times by more than 5000 users from a total of 98 countries. We report on the recent updates and improvements that increase the simplicity of use, computational efficiency, accuracy, and accessibility of a broad range of tools and services for computer-assisted drug design and computational toxicology available on Chembench. Chembench remains freely accessible at https://chembench.mml.unc.edu.
Subject(s)
Informatics/methods, Internet, User-Computer Interface, Programming Languages, Quantitative Structure-Activity Relationship
ABSTRACT
MOTIVATION: Advances in the field of cheminformatics have been hindered by a lack of freely available tools. We have created Chembench, a publicly available cheminformatics portal for analyzing experimental chemical structure-activity data. Chembench provides a broad range of tools for data visualization and embeds a rigorous workflow for creating and validating predictive Quantitative Structure-Activity Relationship models and using them for virtual screening of chemical libraries to prioritize compound selection for drug discovery and/or chemical safety assessment. AVAILABILITY: Freely accessible at http://chembench.mml.unc.edu. CONTACT: alex_tropsha@unc.edu
Subject(s)
Drug Discovery, Software, Computational Biology, Quantitative Structure-Activity Relationship, Small Molecule Libraries, Structure-Activity Relationship
ABSTRACT
We built a novel web-based platform for performing discrete molecular dynamics simulations of proteins. In silico protein folding involves searching for minimal frustration in the vast conformational landscape. Conventional approaches for simulating protein folding do not adequately address the problem of simulating at the time and length scales necessary for a mechanistic understanding of the underlying biomolecular phenomena. Discrete molecular dynamics (DMD) offers an opportunity to bridge the size and timescale gaps and uncover the structural and biological properties of experimentally undetectable protein dynamics. The iFold server supports large-scale simulations of protein folding, thermal denaturation, thermodynamic scan, simulated annealing and p(fold) analysis using DMD and a coarse-grained protein model with structure-based Gō interactions between amino acids. AVAILABILITY: http://ifold.dokhlab.org
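A structure-based Gō interaction rewards only those residue contacts that are present in the native structure. The sketch below illustrates that idea with a one-bead-per-residue model and a square-well style contact test. It is a simplified assumption, not the iFold force field: the function names, the distance cutoff, the sequence-separation rule, and the toy coordinates are all hypothetical.

```python
import math
from itertools import combinations


def contacts(coords, cutoff, min_separation=3):
    """Return residue pairs (i, j) that are at least `min_separation`
    apart in sequence and closer than `cutoff` in space."""
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if j - i >= min_separation and math.dist(coords[i], coords[j]) < cutoff
    }


def go_energy(coords, native_contacts, cutoff, epsilon=1.0):
    """Go-like energy: each native contact that is currently formed
    contributes -epsilon; non-native contacts contribute nothing."""
    formed = contacts(coords, cutoff) & native_contacts
    return -epsilon * len(formed)


# Toy 4-bead chain: a compact "native" state and an extended state.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

cutoff = 1.5
native_contacts = contacts(native, cutoff)

print(native_contacts)
print(go_energy(native, native_contacts, cutoff))
print(go_energy(extended, native_contacts, cutoff))
```

Because the energy changes only when a pair crosses the cutoff distance, a potential of this stepwise form suits event-driven simulation, where the dynamics advance from one such crossing to the next instead of in fixed time steps.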