1.
ACS Omega ; 6(34): 22400-22409, 2021 Aug 31.
Article in English | MEDLINE | ID: mdl-34497929

ABSTRACT

Chemical mixtures have recently attracted attention as a subject for open standards and data structures that capture machine-readable descriptions for informatics uses. At present, essentially all information about mixtures is transmitted as short text descriptions that are readable only by trained scientists, and there are no accessible repositories of marked-up mixture data. We have designed a machine learning tool that can interpret mixture descriptions and upgrade them to the high-level Mixfile format, which can in turn be used to generate Mixtures InChI notation. The interpretation achieves a high success rate and can be used at scale to mark up large catalogs and inventories, with some expert checking to catch edge cases. The training data accumulated during the project are made openly available, along with previously released mixture editing tools and utilities.

2.
J Cheminform ; 11(1): 33, 2019 May 23.
Article in English | MEDLINE | ID: mdl-31124006

ABSTRACT

We describe a file format that is designed to represent mixtures of compounds in a way that is fully machine readable. This Mixfile format is intended to fill the same role for substances that are composed of multiple components as the venerable Molfile does for specifying individual structures. This much-needed data structure is intended to replace current practices for communicating information about mixtures, which usually rely on human-readable text descriptions, drawing several species within a single molecular diagram, or mutually incompatible ad hoc solutions. We describe an open source software application for editing mixture files, which can also be deployed as a web-ready tool for manipulating the file format. We also present a corpus of mixture examples, which we have extracted from collections of text-based descriptions. Furthermore, we present an early look at the proposed IUPAC Mixtures InChI specification, instances of which can be automatically generated using the Mixfile format as a precursor.
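
As a rough illustration of what a machine-readable, hierarchical mixture description might look like, the sketch below models a mixture as a tree of components and serializes it to JSON. The class name `Component`, its field names, and the saline example are assumptions made for illustration only; they are not taken from the Mixfile specification.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Component:
    """One node in a mixture tree (illustrative fields, not the Mixfile spec)."""
    name: str
    smiles: Optional[str] = None      # structure, when one is known
    quantity: Optional[float] = None  # numeric amount, if stated
    units: Optional[str] = None       # free-text units such as "% w/v"
    contents: List["Component"] = field(default_factory=list)


def leaf_names(node: Component) -> List[str]:
    """Collect the names of all components that have no sub-components."""
    if not node.contents:
        return [node.name]
    names: List[str] = []
    for child in node.contents:
        names.extend(leaf_names(child))
    return names


# Hypothetical example: a two-component solution.
saline = Component(
    name="saline solution",
    contents=[
        Component(name="sodium chloride", smiles="[Na+].[Cl-]",
                  quantity=0.9, units="% w/v"),
        Component(name="water", smiles="O"),
    ],
)

# asdict() recurses into nested dataclasses, so the whole tree serializes.
serialized = json.dumps(asdict(saline), indent=2)
print(leaf_names(saline))
```

The tree shape is the point of the sketch: a text description such as "0.9% sodium chloride in water" becomes nested records that software can traverse, which free text does not allow.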

3.
Methods Mol Biol ; 1755: 197-221, 2018.
Article in English | MEDLINE | ID: mdl-29671272

ABSTRACT

We are now seeing the benefit of investments made over the last decade in high-throughput screening (HTS), which are resulting in large structure-activity datasets entering public and open databases such as ChEMBL and PubChem. The growth of academic HTS centers and the increasing move to academia for early-stage drug discovery suggest a great need for informatics tools and methods to mine such data and learn from it. Collaborative Drug Discovery, Inc. (CDD) has developed a number of tools for storing, mining, securely and selectively sharing, as well as learning from such HTS data. We present a new web-based data mining and visualization module directly within the CDD Vault platform for high-throughput drug discovery data that makes use of a novel technology stack following modern reactive design principles. We also describe CDD Models within the CDD Vault platform, which enables researchers to share models, share predictions from models, and create models from distributed, heterogeneous data. Our system is built on top of the Collaborative Drug Discovery Vault Activity and Registration data repository ecosystem, which allows users to manipulate and visualize thousands of molecules in real time. This can be performed in any browser on any platform. In this chapter we present examples of its use with public datasets in CDD Vault. Such approaches can complement other cheminformatics tools, whether open source or commercial, in providing approaches for data mining and modeling of HTS data.


Subjects
Computational Biology/methods; Data Mining/methods; Databases, Pharmaceutical; Datasets as Topic; Drug Discovery/methods; Software
4.
Drug Discov Today ; 22(3): 555-565, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27884746

ABSTRACT

Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public-private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs for tuberculosis. Collaborative Drug Discovery has provided a commercial web-based platform called CDD Vault which is a hosted collaborative solution for securely sharing diverse chemistry and biology data. Using CDD Vault alongside other commercial and free cheminformatics tools has enabled support of this and other large collaborative projects, aiding drug discovery efforts and fostering collaboration. We will describe CDD's efforts in assisting with the MM4TB project.


Subjects
Antitubercular Agents; Drug Discovery; Animals; Antitubercular Agents/therapeutic use; Humans; Machine Learning; Molecular Targeted Therapy; Tuberculosis/drug therapy
5.
Pharm Res ; 33(1): 194-205, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26311555

ABSTRACT

PURPOSE: We propose a framework with simple proxies to dissect the relative energy contributions responsible for standard drug discovery binding activity. METHODS: We explore a rule of thumb using hydrogen-bond donors, hydrogen-bond acceptors and rotatable bonds as relative proxies for the thermodynamic terms. We apply this methodology to several datasets (e.g., multiple small molecules profiled against kinases, Mycobacterium tuberculosis (Mtb) high-throughput screening (HTS) and structure-based drug design (SBDD) derived compounds, and FDA approved drugs). RESULTS: We found that Mtb active compounds developed through SBDD methods had statistically significantly larger PEnthalpy values than HTS derived compounds, suggesting these compounds had relatively more hydrogen-bond donors and acceptors compared to rotatable bonds. In recent FDA approved medicines we found that compounds identified via target-based approaches had a more balanced enthalpic relationship between these descriptors compared to compounds identified via phenotypic screens. CONCLUSIONS: As it is common to experimentally optimize directly for total binding energy, these computational methods provide alternative calculations and approaches useful for compound optimization alongside other common metrics in available software and databases.


Subjects
Drug Discovery/methods; Thermodynamics; Computational Biology; Databases, Factual; Entropy; High-Throughput Screening Assays; Hydrogen Bonding; Mycobacterium tuberculosis/drug effects; Phosphotransferases/chemistry; Receptors, Drug/chemistry; Small Molecule Libraries; Structure-Activity Relationship
6.
PLoS Negl Trop Dis ; 9(6): e0003878, 2015.
Article in English | MEDLINE | ID: mdl-26114876

ABSTRACT

BACKGROUND: Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity. METHODOLOGY/PRINCIPAL FINDINGS: In the present study we developed a computational approach that utilized data from several public whole-cell, phenotypic high throughput screens that have been completed for T. cruzi by the Broad Institute, including a single screen of over 300,000 molecules in the search for chemical probes as part of the NIH Molecular Libraries program. We have also compiled and curated relevant biological and chemical compound screening data including (i) compounds and biological activity data from the literature, (ii) high throughput screening datasets, and (iii) predicted metabolites of T. cruzi metabolic pathways. This information was used to help us identify compounds and their potential targets. We have constructed a Pathway Genome Data Base for T. cruzi. In addition, we have developed Bayesian machine learning models that were used to virtually screen libraries of compounds. Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 < 10 µM. We progressed five compounds to an in vivo mouse efficacy model of Chagas disease and validated that the machine learning model could identify in vitro active compounds not in the training set, as well as known positive controls. The antimalarial pyronaridine possessed 85.2% efficacy in the acute Chagas mouse model. We have also proposed potential targets (for future verification) for this compound based on structural similarity to known compounds with targets in T. cruzi. CONCLUSIONS/SIGNIFICANCE: We have demonstrated how combining chemoinformatics and bioinformatics for T. cruzi drug discovery can bring interesting in vivo active molecules to light that may have been overlooked. The approach we have taken is broadly applicable to other NTDs.


Subjects
Chagas Disease/parasitology; Drug Discovery/methods; Genome, Protozoan/genetics; Machine Learning; Trypanocidal Agents/pharmacology; Trypanosoma cruzi/genetics; Animals; Bayes Theorem; Cell Line; Chagas Disease/drug therapy; Computational Biology; Disease Models, Animal; Female; High-Throughput Screening Assays; Humans; Metabolic Networks and Pathways; Mice; Mice, Inbred BALB C; Trypanocidal Agents/isolation & purification; Trypanosoma cruzi/drug effects
7.
J Chem Inf Model ; 54(10): 2996-3004, 2014 Oct 27.
Article in English | MEDLINE | ID: mdl-25244007

ABSTRACT

In a decade with over half a billion dollars of investment, more than 300 chemical probes have been identified to have biological activity through NIH funded screening efforts. We have collected the evaluations of an experienced medicinal chemist on the likely chemistry quality of these probes based on a number of criteria including literature related to the probe and potential chemical reactivity. Over 20% of these probes were found to be undesirable. Analysis of the molecular properties of the compounds scored as desirable suggested higher pKa, molecular weight, heavy atom count, and rotatable bond number. We were particularly interested in whether the human evaluation aspect of medicinal chemistry due diligence could be computationally predicted. We used a process of sequential Bayesian model building and iterative testing as we included additional probes. Following external validation of these methods and comparing different machine learning methods, we identified Bayesian models with accuracy comparable to other measures of drug-likeness and filtering rules created to date.


Subjects
Artificial Intelligence; Models, Statistical; Molecular Probes/chemistry; Bayes Theorem; Computer Simulation; Humans; Molecular Probes/economics; Molecular Weight; Quality Control; Sensitivity and Specificity
8.
PeerJ ; 2: e524, 2014.
Article in English | MEDLINE | ID: mdl-25165633

ABSTRACT

Bioinformatics and computer aided drug design rely on the curation of a large number of protocols for biological assays that measure the ability of potential drugs to achieve a therapeutic effect. These assay protocols are generally published by scientists in the form of plain text, which needs to be more precisely annotated in order to be useful to software methods. We have developed a pragmatic approach to describing assays according to the semantic definitions of the BioAssay Ontology (BAO) project, using a hybrid of machine learning based on natural language processing, and a simplified user interface designed to help scientists curate their data with minimum effort. We have carried out this work based on the premise that pure machine learning is insufficiently accurate, and that expecting scientists to find the time to annotate their protocols manually is unrealistic. By combining these approaches, we have created an effective prototype for which annotation of bioassay text within the domain of the training set can be accomplished very quickly. Well-trained annotations require single-click user approval, while annotations from outside the training set domain can be identified using the search feature of a well-designed user interface, and subsequently used to improve the underlying models. By drastically reducing the time required for scientists to annotate their assays, we can realistically advocate for semantic annotation to become a standard part of the publication process. Once even a small proportion of the public body of bioassay data is marked up, bioinformatics researchers can begin to construct sophisticated and useful searching and analysis algorithms that will provide a diverse and powerful set of tools for drug discovery researchers.

9.
Tuberculosis (Edinb) ; 94(2): 162-9, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24440548

ABSTRACT

The search for compounds active against Mycobacterium tuberculosis is reliant upon high-throughput screening (HTS) in whole cells. We have used Bayesian machine learning models which can predict anti-tubercular activity to filter an internal library of over 150,000 compounds prior to in vitro testing. We used this to select and test 48 compounds in vitro; 11 were active with MIC values ranging from 0.4 µM to 10.2 µM, giving a high hit rate of 22.9%. Among the hits, we identified several compounds belonging to the same series including five quinolones (including ciprofloxacin), three molecules with long aliphatic linkers and three singletons. This approach represents a rapid method to prioritize compounds for testing that can be used alongside medicinal chemistry insight and other filters to identify active molecules. Such models can significantly increase the hit rate of HTS, above the usual 1% or lower rates seen. In addition, the potential targets for the 11 molecules were predicted using TB Mobile and clustering alongside a set of over 740 molecules with known M. tuberculosis target annotations. These predictions may serve as a mechanism for prioritizing compounds for further optimization.
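
The hit-rate figure quoted in this abstract follows directly from the counts it reports (11 actives out of 48 compounds tested). A minimal sketch of that arithmetic is given below; the function name `hit_rate_percent` is an illustrative choice, and the 1% baseline is simply the upper end of the "usual 1% or lower" range mentioned in the abstract.

```python
def hit_rate_percent(actives: int, tested: int) -> float:
    """Return the percentage of tested compounds that were active."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return 100.0 * actives / tested


# Counts reported in the abstract: 11 actives among 48 compounds tested.
rate = hit_rate_percent(11, 48)
print(f"hit rate: {rate:.1f}%")

# Ratio relative to a 1% baseline hit rate (upper end of the typical HTS range).
print(f"fold over a 1% baseline: {rate / 1.0:.1f}x")
```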


Subjects
Antitubercular Agents/pharmacology; Mycobacterium tuberculosis/drug effects; Small Molecule Libraries/pharmacology; Antitubercular Agents/chemistry; Bayes Theorem; Databases, Protein; Drug Discovery; Female; Humans; Male; Reproducibility of Results
11.
PLoS One ; 8(5): e63240, 2013.
Article in English | MEDLINE | ID: mdl-23667592

ABSTRACT

High-throughput screening (HTS) in whole cells is widely pursued to find compounds active against Mycobacterium tuberculosis (Mtb) for further development towards new tuberculosis (TB) drugs. Hit rates from these screens, usually conducted at 10 to 25 µM concentrations, typically range from less than 1% to the low single digits. New approaches to increase the efficiency of hit identification are urgently needed to learn from past screening data. The pharmaceutical industry has for many years taken advantage of computational approaches to optimize compound libraries for in vitro testing, a practice not fully embraced by academic laboratories in the search for new TB drugs. Adapting these proven approaches, we have recently built and validated Bayesian machine learning models for predicting compounds with activity against Mtb based on publicly available large-scale HTS data from the Tuberculosis Antimicrobial Acquisition Coordinating Facility. We now demonstrate the largest prospective validation to date in which we computationally screened 82,403 molecules with these Bayesian models, assayed a total of 550 molecules in vitro, and identified 124 actives against Mtb. Individual hit rates for the different datasets varied from 15-28%. We have identified several FDA approved and late stage clinical candidate kinase inhibitors with activity against Mtb which may represent starting points for further optimization. The computational models developed herein and the commercially available molecules derived from them are now available to any group pursuing Mtb drug discovery.


Subjects
Antitubercular Agents/pharmacology; Drug Discovery; Models, Theoretical; Mycobacterium tuberculosis/drug effects; Bayes Theorem; Databases, Protein; Dose-Response Relationship, Drug; ROC Curve; Reproducibility of Results; Small Molecule Libraries/pharmacology
12.
Methods Mol Biol ; 993: 139-54, 2013.
Article in English | MEDLINE | ID: mdl-23568469

ABSTRACT

The broad goals of Collaborative Drug Discovery (CDD) are to enable a collaborative "cloud-based" tool to be used to bring together neglected disease researchers and other researchers from usually separate areas, to collaborate and to share compounds and drug discovery data in the research community, which will ultimately result in long-term improvements in the research enterprise and health care delivery. This chapter briefly introduces CDD software and describes applications in antimalarial and tuberculosis research.


Subjects
Cooperative Behavior; Databases, Pharmaceutical; Drug Discovery/methods; Software; Animals; Antimalarials/pharmacology; Antitubercular Agents/pharmacology; Humans; Internet; Mycobacterium tuberculosis/drug effects
13.
Chem Biol ; 20(3): 370-8, 2013 Mar 21.
Article in English | MEDLINE | ID: mdl-23521795

ABSTRACT

Identification of unique leads represents a significant challenge in drug discovery. This hurdle is magnified in neglected diseases such as tuberculosis. We have leveraged public high-throughput screening (HTS) data to experimentally validate a virtual screening approach employing Bayesian models built with bioactivity information (single-event model) as well as bioactivity and cytotoxicity information (dual-event model). We virtually screened a commercial library and experimentally confirmed actives with hit rates exceeding typical HTS results by one to two orders of magnitude. This initial dual-event Bayesian model identified compounds with antitubercular whole-cell activity and low mammalian cell cytotoxicity from a published set of antimalarials. The most potent hit exhibits the in vitro activity and in vitro/in vivo safety profile of a drug lead. These Bayesian models offer significant economies in time and cost to drug discovery.
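
The distinction between single-event and dual-event models can be pictured as a difference in how training labels are assigned. The sketch below shows one possible labeling scheme, assuming a potency cutoff on MIC and a selectivity index (CC50 divided by MIC) for cytotoxicity. The function names, the units, the use of a selectivity index, and the cutoff values in the example are all assumptions for illustration; the abstract does not state how the authors defined their labels.

```python
def single_event_label(mic_um: float, mic_cutoff_um: float) -> bool:
    """Positive when the compound is potent enough against the bacterium."""
    return mic_um <= mic_cutoff_um


def dual_event_label(mic_um: float, cc50_um: float,
                     mic_cutoff_um: float, min_selectivity: float) -> bool:
    """Positive only when the compound is both potent and selective.

    Selectivity is taken here as CC50 / MIC, so a larger value means the
    compound harms mammalian cells only at much higher concentrations.
    Assumes mic_um is greater than zero.
    """
    potent = mic_um <= mic_cutoff_um
    selective = (cc50_um / mic_um) >= min_selectivity
    return potent and selective


# Hypothetical compounds and cutoffs, chosen only to show the difference.
print(single_event_label(1.0, mic_cutoff_um=10.0))
print(dual_event_label(1.0, 50.0, mic_cutoff_um=10.0, min_selectivity=10.0))
print(dual_event_label(1.0, 5.0, mic_cutoff_um=10.0, min_selectivity=10.0))
```

Under this scheme a potent but cytotoxic compound counts as a positive for the single-event model and as a negative for the dual-event model, which is why the latter can steer selection toward hits with low mammalian cell cytotoxicity.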


Subjects
Antitubercular Agents/pharmacology; Antitubercular Agents/toxicity; Drug Discovery; Animals; Bayes Theorem; Chlorocebus aethiops; Drug Evaluation, Preclinical; Female; Inhibitory Concentration 50; Macrophages/drug effects; Mice; Mycobacterium tuberculosis/drug effects; Vero Cells
14.
Pharm Res ; 29(8): 2115-27, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22477069

ABSTRACT

PURPOSE: New strategies for developing inhibitors of Mycobacterium tuberculosis (Mtb) are required in order to identify the next generation of tuberculosis (TB) drugs. Our approach leverages the integration of intensive data mining and curation and computational approaches, including cheminformatics combined with bioinformatics, to suggest biological targets and their small molecule modulators. METHODS: We now describe an approach that uses the TBCyc pathway and genome database, the Collaborative Drug Discovery database of molecules with activity against Mtb and their associated targets, a 3D pharmacophore approach and Bayesian models of TB activity in order to select pathways and metabolites and ultimately prioritize molecules that may be acting as substrate mimics and exhibit activity against TB. RESULTS: In this study we combined the TB cheminformatics and pathways databases, which enabled us to computationally search >80,000 vendor available molecules and ultimately test 23 compounds in vitro. This resulted in two compounds (N-(2-furylmethyl)-N'-[(5-nitro-3-thienyl)carbonyl]thiourea and N-[(5-nitro-3-thienyl)carbonyl]-N'-(2-thienylmethyl)thiourea) proposed as mimics of D-fructose 1,6-bisphosphate (MICs of 20 and 40 µg/ml, respectively). CONCLUSION: This is a simple yet novel approach that has the potential to identify inhibitors of bacterial growth, as illustrated by compounds identified in this study that have activity against Mtb.


Subjects
Antitubercular Agents/chemistry; Antitubercular Agents/pharmacology; Computational Biology/methods; Drug Discovery/methods; Mycobacterium tuberculosis/drug effects; Tuberculosis/drug therapy; Animals; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Bayes Theorem; Data Mining; Databases, Factual; Humans; Metabolic Networks and Pathways/drug effects; Models, Molecular; Molecular Targeted Therapy/methods; Mycobacterium tuberculosis/enzymology; Mycobacterium tuberculosis/genetics; Mycobacterium tuberculosis/metabolism; Tuberculosis/microbiology
15.
Pharm Res ; 29(7): 1717-21, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22362409

ABSTRACT

Understanding the regulation of gene expression is critical to many areas of biology, while control via RNAs has attracted considerable interest as a tool for scientific discovery and potential therapeutic applications. For example, whole-genome RNA interference (RNAi) screens and whole-proteome scans provide views of how the entire transcriptome or proteome responds to biological, chemical or environmental perturbations of a gene's activity. Small RNA (sRNA) or microRNA (miRNA) are known to regulate pathways and bind mRNA, while the function of miRNAs discovered in experimental studies is often unknown. In both cases, RNAi and miRNA require labor-intensive studies to tease out their functions within gene networks. Analyzing these relationships with available software is currently an ad hoc and often manual process that can take up to several hours for a single candidate RNAi or miRNA. With experiments frequently highlighting tens to hundreds of candidates, this represents a considerable bottleneck. We suggest there is a gap in miRNA and RNAi research caused by inadequate current software that could be improved. For example, a new software application could be created that provides interactive, comprehensive target analysis and leverages past datasets to produce statistically stronger analyses.


Subjects
MicroRNAs/genetics; RNA Interference; Software; Animals; Databases, Genetic; Genomics/methods; Humans
17.
Mol Biosyst ; 6(11): 2316-2324, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20835433

ABSTRACT

There is an urgent need for new drugs against tuberculosis, which annually claims 1.7-1.8 million lives. One approach to identify potential leads is to screen small molecules in vitro against Mycobacterium tuberculosis (Mtb). Until recently there was no central repository to collect information on compounds screened. Consequently, it has been difficult to analyze molecular properties of compounds that inhibit the growth of Mtb in vitro. We have collected data from publicly available sources on over 300 000 small molecules deposited in the Collaborative Drug Discovery TB Database. A cheminformatics analysis of these compounds indicates that inhibitors of the growth of Mtb have statistically higher mean logP and rule of 5 alerts, while also having lower HBD count, atom count and PSA (ChemAxon descriptors), compared to compounds that are classed as inactive. Additionally, Bayesian models for selecting Mtb active compounds were evaluated with over 100 000 compounds, and they demonstrated 10-fold enrichment over random for the top ranked 600 compounds. This represents a promising approach for finding compounds active against Mtb in whole cells screened under the same in vitro conditions. Various sets of Mtb hit molecules were also examined using filtering rules that are widely applied in the pharmaceutical industry to identify compounds with potentially reactive moieties. We found differences between the number of compounds flagged by these rules in Mtb datasets, malaria hits, FDA approved drugs and antibiotics. Combining these approaches may enable selection of compounds with increased probability of inhibition of whole cell Mtb activity.
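
"Enrichment over random" for a top-ranked subset is commonly computed as the fraction of actives among the top-scored compounds divided by the fraction of actives in the whole set. The sketch below implements that general definition; the function name `enrichment_factor` and the toy data are illustrative assumptions, and the abstract does not state the exact formula the authors used.

```python
from typing import Sequence


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      top_n: int) -> float:
    """Active rate in the top_n scored compounds divided by the overall rate.

    labels holds 1 for active and 0 for inactive compounds.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < top_n <= len(scores):
        raise ValueError("top_n must be between 1 and the number of compounds")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")

    # Rank compounds from highest to lowest model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_actives = sum(label for _, label in ranked[:top_n])

    top_rate = top_actives / top_n
    overall_rate = total_actives / len(labels)
    return top_rate / overall_rate


# Toy data: 10 compounds, 2 actives, both ranked at the top by the model.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(toy_scores, toy_labels, top_n=2))
```

A value of 1 means the ranking is no better than picking compounds at random, so the 10-fold figure in the abstract corresponds to a top-ranked subset containing ten times as many actives as a random selection of the same size would be expected to contain.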


Subjects
Antitubercular Agents/analysis; Antitubercular Agents/pharmacology; Databases, Factual; Drug Evaluation, Preclinical; Mycobacterium tuberculosis/drug effects; Small Molecule Libraries/analysis; Small Molecule Libraries/pharmacology; Antitubercular Agents/chemistry; Bayes Theorem; Small Molecule Libraries/chemistry
18.
Drug Metab Dispos ; 38(11): 2083-90, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20693417

ABSTRACT

Ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors [e.g., Chemistry Development Kit (CDK)] and modeling algorithms, because this would negate the requirement for proprietary commercial software. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model demonstrated that CDK descriptors together with a set of SMILES Arbitrary Target Specification (SMARTS) keys had good statistics [κ = 0.43, sensitivity = 0.57, specificity = 0.91, and positive predictive value (PPV) = 0.64], equivalent to those of models built with commercial Molecular Operating Environment 2D (MOE2D) and the same set of SMARTS keys (κ = 0.43, sensitivity = 0.58, specificity = 0.91, and PPV = 0.63). Extending the dataset to ∼193,000 molecules and generating a continuous model using Cubist with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. When the continuous predictions and actual values were binned to get a categorical score we observed a similar κ statistic (0.42). The same combination of descriptor set and modeling method was applied to passive permeability and P-glycoprotein efflux data with similar model testing statistics. In summary, open source tools demonstrated predictive results comparable to those of commercial software with attendant cost savings. We discuss the advantages and disadvantages of open source descriptors and the opportunity for their use as a tool for organizations to share data precompetitively, avoiding repetition and assisting drug discovery.
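
The statistics quoted in this abstract (κ, sensitivity, specificity and PPV) are all derived from a binary confusion matrix. The sketch below computes them using the standard definitions, with Cohen's kappa as the agreement statistic. The function name and the counts in the example are made up for illustration, since the abstract does not report the underlying confusion matrices.

```python
from typing import Dict


def binary_classification_stats(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute Cohen's kappa, sensitivity, specificity and PPV from counts."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    observed = (tp + tn) / n  # fraction of predictions that agree with truth
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)

    return {
        "kappa": (observed - expected) / (1.0 - expected),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
    }


# Made-up counts, chosen only to give round numbers.
stats = binary_classification_stats(tp=40, fp=20, fn=10, tn=30)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Kappa corrects raw accuracy for chance agreement, which matters when classes are imbalanced: a model can reach a high specificity such as the 0.91 reported here while its κ stays moderate.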


Subjects
Computational Biology/methods; Drug Discovery/methods; Models, Biological; Pharmaceutical Preparations/metabolism; Software; Toxicology/methods; Absorption; Algorithms; Computer Simulation; Drug Stability; Humans; Microsomes, Liver/metabolism; Pharmaceutical Preparations/chemistry; Predictive Value of Tests; Solubility; Tissue Distribution
19.
Pharm Res ; 27(10): 2035-9, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20683645

ABSTRACT

Cheminformatics is at a turning point. The pharmaceutical industry benefits from using the various methods developed over the last twenty years, but in our opinion we need to see greater development of novel approaches that non-experts can use. This will be achieved by more collaborations between software companies, academics and the evolving pharmaceutical industry. We suggest that cheminformatics should also be looking for inspiration to other industries that use high performance computing technologies. We describe the needs and opportunities which may benefit from the development of open cheminformatics technologies, mobile computing, the movement of software to the cloud and precompetitive initiatives.


Subjects
Chemistry, Pharmaceutical; Informatics; Information Storage and Retrieval; Quantitative Structure-Activity Relationship; Chemistry, Pharmaceutical/methods; Chemistry, Pharmaceutical/trends; Databases, Factual; Informatics/methods; Informatics/trends; Information Storage and Retrieval/methods; Information Storage and Retrieval/trends; Software
20.
Mol Biosyst ; 6(5): 840-51, 2010 May.
Article in English | MEDLINE | ID: mdl-20567770

ABSTRACT

The search for molecules with activity against Mycobacterium tuberculosis (Mtb) is employing many approaches in parallel, including high throughput screening and computational methods. We have developed a database (CDD TB) to capture public and private Mtb data while enabling data mining and collaborations with other researchers. We have used the public data along with several cheminformatics approaches to produce models that describe active and inactive compounds. We have compared these datasets to those for known FDA approved drugs and between Mtb active and inactive compounds. The distribution of polar surface area and pKa of active compounds was found to be a statistically significant determinant of activity against Mtb. Hydrophobicity was not always statistically significant. Bayesian classification models for 220,463 molecules were generated and tested with external molecules, and enabled the discrimination of active or inactive substructures from other datasets in the CDD TB. Computational pharmacophores based on known Mtb drugs were able to map to and retrieve a small subset of some of the Mtb datasets, including a high percentage of Mtb actives. The combination of the database, dataset analysis, Bayesian and pharmacophore models provides new insights into molecular properties and features that are determinants of activity in whole cells. This study provides novel insights into the key 1D molecular descriptors, 2D chemical substructures and 3D pharmacophores which can be used to mine the chemistry space, prioritizing those molecules with a higher probability of activity against Mtb.


Subjects
Computational Biology/methods; Databases, Factual; Tuberculosis; Animals; Drug Discovery; Humans