ABSTRACT
We are now seeing the benefit of investments made over the last decade in high-throughput screening (HTS), which are resulting in large structure-activity datasets entering public and open databases such as ChEMBL and PubChem. The growth of academic HTS centers and the increasing move to academia for early-stage drug discovery suggest a great need for informatics tools and methods to mine such data and learn from it. Collaborative Drug Discovery, Inc. (CDD) has developed a number of tools for storing, mining, securely and selectively sharing, as well as learning from such HTS data. We present a new web-based data mining and visualization module directly within the CDD Vault platform for high-throughput drug discovery data that makes use of a novel technology stack following modern reactive design principles. We also describe CDD Models within the CDD Vault platform, which enables researchers to share models, share predictions from models, and create models from distributed, heterogeneous data. Our system is built on top of the Collaborative Drug Discovery Vault Activity and Registration data repository ecosystem, which allows users to manipulate and visualize thousands of molecules in real time. This can be performed in any browser on any platform. In this chapter we present examples of its use with public datasets in CDD Vault. Such approaches can complement other cheminformatics tools, whether open source or commercial, in providing approaches for data mining and modeling of HTS data.
Subject(s)
Computational Biology/methods , Data Mining/methods , Databases, Pharmaceutical , Datasets as Topic , Drug Discovery/methods , Software
ABSTRACT
Anemia, the most common hematological disorder in human immunodeficiency virus (HIV) infection and acquired immunodeficiency syndrome (AIDS), is associated with decreased quality of life and survival. Hypogonadism is prevalent in advanced HIV disease; however, low testosterone levels have not been customarily implicated in HIV-associated anemia. This study was undertaken to determine whether testosterone levels and androgen use are related to anemia in HIV, and to characterize other clinical correlates of HIV-associated anemia. This was a cross-sectional, observational study of 200 HIV-positive patients at a public hospital HIV clinic from July 2000 to August 2001. A written questionnaire detailed previous and current medication use, opportunistic infections, and malignancies. Hematological and virological parameters, testosterone, and erythropoietin levels were measured; CD4(+) T lymphocyte count and viral load nadir and peak levels were obtained from the computerized medical record. Anemia was defined as hemoglobin <13.5 g/dl in men and <11.6 g/dl in women. Twenty-four percent of women and 28% of men were anemic. Anemia was associated with lymphopenia (adjusted OR 4.0, 95% CI 1.36-11.80), high erythropoietin levels (adjusted OR 7.73, 95% CI 2.92-20.48), and low testosterone levels (adjusted OR 3.27, 95% CI 1.01-10.60). Anemia was negatively associated with female sex (adjusted OR 0.30, 95% CI 0.11-0.85), current antiretroviral therapy (adjusted OR 0.43, 95% CI 0.20-0.95), current androgen use (adjusted OR 0.20, 95% CI 0.05-0.84), and macrocytosis (adjusted OR 0.23, 95% CI 0.09-0.61). Low testosterone levels may have a positive association and supplemental androgens a negative association with anemia in HIV disease.
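The sex-specific anemia definition stated in this abstract can be written as a small check. This is only an illustrative sketch: the function name `is_anemic` and the `"male"`/`"female"` labels are assumptions made here, and only the two hemoglobin cutoffs come from the abstract.

```python
# Hemoglobin cutoffs (g/dl) as stated in the abstract; a value strictly
# below the cutoff counts as anemia.
ANEMIA_CUTOFF_G_DL = {"male": 13.5, "female": 11.6}


def is_anemic(hemoglobin_g_dl: float, sex: str) -> bool:
    """Return True if hemoglobin is below the sex-specific cutoff.

    `sex` must be "male" or "female" (hypothetical labels for this sketch).
    """
    if sex not in ANEMIA_CUTOFF_G_DL:
        raise ValueError(f"unknown sex label: {sex!r}")
    return hemoglobin_g_dl < ANEMIA_CUTOFF_G_DL[sex]
```

For example, a hemoglobin of 13.0 g/dl would be classed as anemic for a man but not for a woman under this definition.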
Subject(s)
Anemia/blood , HIV Infections/blood , Testosterone/blood , Adult , Anemia/etiology , CD4 Lymphocyte Count , Cross-Sectional Studies , Female , HIV Infections/complications , HIV Infections/immunology , Humans , Male , Middle Aged
ABSTRACT
The search for molecules with activity against Mycobacterium tuberculosis (Mtb) employs many approaches in parallel, including high-throughput screening and computational methods. We have developed a database (CDD TB) to capture public and private Mtb data while enabling data mining and collaborations with other researchers. We have used the public data along with several cheminformatics approaches to produce models that describe active and inactive compounds. We have compared these datasets to those for known FDA-approved drugs and between Mtb active and inactive compounds. The distribution of polar surface area and pK(a) of active compounds was found to be a statistically significant determinant of activity against Mtb. Hydrophobicity was not always statistically significant. Bayesian classification models for 220,463 molecules were generated and tested with external molecules, and enabled the discrimination of active or inactive substructures from other datasets in the CDD TB. Computational pharmacophores based on known Mtb drugs were able to map to and retrieve a small subset of some of the Mtb datasets, including a high percentage of Mtb actives. The combination of the database, dataset analysis, Bayesian and pharmacophore models provides new insights into molecular properties and features that are determinants of activity in whole cells. This study provides novel insights into the key 1D molecular descriptors, 2D chemical substructures and 3D pharmacophores which can be used to mine the chemistry space, prioritizing those molecules with a higher probability of activity against Mtb.
Subject(s)
Computational Biology/methods , Databases, Factual , Tuberculosis , Animals , Drug Discovery , Humans
ABSTRACT
There is an urgent need for new drugs against tuberculosis, which annually claims 1.7-1.8 million lives. One approach to identify potential leads is to screen small molecules in vitro against Mycobacterium tuberculosis (Mtb). Until recently there was no central repository to collect information on compounds screened. Consequently, it has been difficult to analyze molecular properties of compounds that inhibit the growth of Mtb in vitro. We have collected data from publicly available sources on over 300 000 small molecules deposited in the Collaborative Drug Discovery TB Database. A cheminformatics analysis of these compounds indicates that inhibitors of the growth of Mtb have statistically higher mean logP and rule of 5 alerts, while also having lower HBD count, atom count and PSA (ChemAxon descriptors), compared to compounds that are classed as inactive. Additionally, Bayesian models for selecting Mtb active compounds were evaluated with over 100 000 compounds, and they demonstrated 10-fold enrichment over random for the top-ranked 600 compounds. This represents a promising approach for finding compounds active against Mtb in whole cells screened under the same in vitro conditions. Several sets of Mtb hit molecules were also examined using filtering rules widely used in the pharmaceutical industry to identify compounds with potentially reactive moieties. We found differences between the number of compounds flagged by these rules in Mtb datasets, malaria hits, FDA-approved drugs and antibiotics. Combining these approaches may enable selection of compounds with increased probability of inhibition of whole cell Mtb activity.
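The "fold enrichment over random" reported for the top-ranked compounds is commonly computed as the hit rate in the top N divided by the hit rate of the whole screened set. The sketch below illustrates that standard definition only; the abstract does not state the exact formula the authors used, the function name `enrichment_factor` is hypothetical, and the labels are made-up toy data, not values from the study.

```python
def enrichment_factor(ranked_labels: list[int], top_n: int) -> float:
    """Fold enrichment of actives in the top_n ranked compounds.

    ranked_labels: 1 for active, 0 for inactive, ordered from the
    best-scored compound to the worst-scored one.
    """
    total = len(ranked_labels)
    if not 0 < top_n <= total:
        raise ValueError("top_n must be between 1 and the number of compounds")
    total_actives = sum(ranked_labels)
    if total_actives == 0:
        raise ValueError("enrichment is undefined when there are no actives")
    hits_in_top = sum(ranked_labels[:top_n])
    # (hits_in_top / top_n) / (total_actives / total), rearranged so that
    # only a single division is performed.
    return (hits_in_top * total) / (top_n * total_actives)


# Toy ranking (invented for illustration): 20 compounds, 4 actives,
# 3 of which the model placed in the top 5.
toy_labels = [1, 1, 0, 1, 0] + [0] * 14 + [1]
toy_ef = enrichment_factor(toy_labels, top_n=5)
```

In the toy ranking the top 5 have a hit rate of 3/5 against an overall rate of 4/20, so the enrichment is 3-fold; a value of 1 would mean the ranking performs no better than random selection.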
Subject(s)
Antitubercular Agents/analysis , Antitubercular Agents/pharmacology , Databases, Factual , Drug Evaluation, Preclinical , Mycobacterium tuberculosis/drug effects , Small Molecule Libraries/analysis , Small Molecule Libraries/pharmacology , Antitubercular Agents/chemistry , Bayes Theorem , Small Molecule Libraries/chemistry
ABSTRACT
A convergence of different commercial and publicly accessible chemical informatics, databases and social networking tools is positioned to change the way that research collaborations are initiated, maintained and expanded, particularly in the realm of neglected diseases. A community-based platform that combines traditional drug discovery informatics with Web 2.0 features in secure groups is believed to be the key to facilitating richer, instantaneous collaborations involving sensitive drug discovery data and intellectual property. Heterogeneous chemical and biological data from low-throughput or high-throughput experiments are archived, mined and then selectively shared, either securely among specifically designated colleagues or openly on the Internet in standardized formats. We will illustrate several case studies for anti-malarial research enabled by this platform, which we suggest could be easily expanded more broadly for pharmaceutical research in general.