1.
Bioinformatics ; 31(12): 1966-73, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25697821

ABSTRACT

MOTIVATION: Cytochrome P450s are a family of enzymes responsible for the metabolism of approximately 90% of FDA-approved drugs. Medicinal chemists often want to know which atoms of a molecule (its metabolized sites) are oxidized by Cytochrome P450s in order to modify their metabolism. Consequently, there are several methods that use literature-derived, atom-resolution data to train models that can predict a molecule's sites of metabolism. There is, however, much more data available at a lower resolution, where the exact site of metabolism is not known, but the region of the molecule that is oxidized is known. Until now, no site-of-metabolism models made use of region-resolution data. RESULTS: Here, we describe XenoSite-Region, the first reported method for training site-of-metabolism models with region-resolution data. Our approach uses the Expectation Maximization algorithm to train a site-of-metabolism model. Region-resolution metabolism data was simulated from a large site-of-metabolism dataset, containing 2000 molecules with 3400 metabolized and 30 000 un-metabolized sites and covering nine Cytochrome P450 isozymes. When training on the same molecules (but with only region-level information), we find that this approach yields models almost as accurate as models trained with atom-resolution data. Moreover, we find that atom-resolution trained models are more accurate when also trained with region-resolution data from additional molecules. Our approach, therefore, opens up a way to extend the applicable domain of site-of-metabolism models into larger regions of chemical space. This meets a critical need in drug development by tapping into underutilized data commonly available in most large drug companies. AVAILABILITY AND IMPLEMENTATION: The algorithm, data and a web server are available at http://swami.wustl.edu/xregion.


Subjects
Algorithms; Computational Biology/methods; Cytochrome P-450 Enzyme System/metabolism; Models, Molecular; Small Molecule Libraries/metabolism; Xenobiotics/metabolism; Cytochrome P-450 Enzyme System/chemistry; Humans; Molecular Docking Simulation; Structure-Activity Relationship
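
The Expectation Maximization idea in the abstract above can be illustrated with a minimal sketch. It is not the XenoSite-Region implementation: the abstract does not describe the model or the features, so the logistic model, the single made-up feature, and the assumption that each region contains exactly one metabolized atom are all hypothetical choices made here for illustration. Under that assumption, and treating atoms as independent, the chance that a given atom is the site is proportional to its predicted odds, which is what the E-step computes.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(weights, x):
    # Log-odds of an atom being a site: bias plus weighted features.
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

def predict(weights, x):
    return sigmoid(score(weights, x))

def e_step(weights, atoms, regions):
    # ASSUMPTION: exactly one atom per region is the true site.
    # Responsibility of each atom is its odds, normalized within the region.
    # Atoms outside every region keep a soft label of 0 (un-metabolized).
    soft = [0.0] * len(atoms)
    for region in regions:
        scores = [score(weights, atoms[i]) for i in region]
        top = max(scores)
        odds = [math.exp(s - top) for s in scores]
        total = sum(odds)
        for i, o in zip(region, odds):
            soft[i] = o / total
    return soft

def m_step(weights, atoms, soft, lr=0.5, steps=200):
    # Refit the logistic model to the soft labels by gradient descent.
    w = list(weights)
    n = len(atoms)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(atoms, soft):
            err = predict(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def train_em(atoms, regions, n_features, iterations=20):
    weights = [0.0] * (n_features + 1)
    for _ in range(iterations):
        soft = e_step(weights, atoms, regions)
        weights = m_step(weights, atoms, soft)
    return weights

# Toy data (invented): one feature per atom, two molecules of three atoms each.
atoms = [[0.9], [0.1], [0.2], [0.8], [0.15], [0.1]]
# Each region lists the atom indices known to contain the oxidized site.
regions = [[0, 1], [3, 4]]

weights = train_em(atoms, regions, n_features=1)
soft = e_step(weights, atoms, regions)
```

On this toy data the model starts by splitting each region's label evenly, then shifts responsibility toward the atoms whose feature value resembles the other region's likely site. Real region-resolution data would need a richer model and a way to handle regions with more than one site.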
2.
J Comput Aided Mol Des ; 27(5): 469-78, 2013 May.
Article in English | MEDLINE | ID: mdl-23585219

ABSTRACT

In a typical high-throughput screening (HTS) campaign, less than 1% of the small-molecule library is characterized by confirmatory experiments. As much as 99% of the library's molecules are set aside and excluded from downstream analysis, although some of these molecules would prove active were they sent for confirmatory testing. These missing experimental measurements prevent active molecules from being identified by screeners. In this study, we propose managing missing measurements using imputation, a powerful technique from the machine learning community, to fill in accurate guesses where measurements are missing. We then use these imputed measurements to construct an imputed visualization of HTS results, based on the scaffold tree visualization from the literature. This imputed visualization identifies almost all groups of active molecules from an HTS, even those that would otherwise be missed. We validate our methodology by simulating HTS experiments using the data from eight quantitative HTS campaigns, and the implications for drug discovery are discussed. In particular, this method can rapidly and economically identify novel active molecules, each of which could have novel function in either binding or selectivity in addition to representing new intellectual property.


Subjects
High-Throughput Screening Assays; Small Molecule Libraries; Artificial Intelligence; Drug Discovery; Humans; Software
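
To make the imputation step in the abstract above concrete, here is a small sketch. The abstract does not say which imputation model the authors used, so the similarity-weighted nearest-neighbor rule below is an assumed stand-in, and the fingerprints, molecule names, and scaffold labels are invented. The flat per-scaffold average is also a simplification of the hierarchical scaffold tree mentioned in the abstract.

```python
def tanimoto(a, b):
    # Tanimoto similarity between two fingerprints stored as sets of bits.
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def impute(fingerprints, measured, k=2):
    # Return an activity for every molecule. Measured values are kept;
    # missing ones get a similarity-weighted average of the k most
    # similar measured molecules (ASSUMED method, not from the paper).
    filled = dict(measured)
    for mol, fp in fingerprints.items():
        if mol in measured:
            continue
        neighbors = sorted(
            measured,
            key=lambda m: tanimoto(fp, fingerprints[m]),
            reverse=True,
        )[:k]
        sims = [tanimoto(fp, fingerprints[m]) for m in neighbors]
        total = sum(sims)
        if total == 0:
            filled[mol] = 0.0  # nothing similar was measured: assume inactive
        else:
            filled[mol] = sum(s * measured[m] for s, m in zip(sims, neighbors)) / total
    return filled

def scaffold_summary(filled, scaffolds):
    # Average the (measured or imputed) activity within each scaffold group.
    groups = {}
    for mol, value in filled.items():
        groups.setdefault(scaffolds[mol], []).append(value)
    return {s: sum(v) / len(v) for s, v in groups.items()}

# Toy library (invented): only m1 and m3 went to confirmatory testing.
fingerprints = {
    "m1": {1, 2, 3},
    "m2": {1, 2, 4},
    "m3": {7, 8, 9},
    "m4": {7, 8, 10},
}
measured = {"m1": 1.0, "m3": 0.0}
scaffolds = {"m1": "A", "m2": "A", "m3": "B", "m4": "B"}

filled = impute(fingerprints, measured)
summary = scaffold_summary(filled, scaffolds)
```

Here the untested molecule m2 inherits the activity of its close neighbor m1, so scaffold A is flagged as active even though only one of its members was confirmed experimentally.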
3.
J Biomol Screen ; 17(8): 1071-9, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22693105

ABSTRACT

Public databases that store the data from small-molecule screens are a rich and untapped resource of chemical and biological information. However, screening databases are unorganized, which makes interpreting their data difficult. We propose a method of inferring workflow graphs (which encode the relationships between assays in screening projects) directly from screening data and using these workflows to organize each project's data. On the basis of four heuristics regarding the organization of screening projects, we designed an algorithm that extracts a project's workflow graph from screening data. Where possible, the algorithm is evaluated by comparing each project's inferred workflow to its documentation. In the majority of cases, there are no discrepancies between the two. Most errors can be traced to points in the project where screeners chose additional molecules to test based on structural similarity to promising molecules, a case our algorithm is not yet capable of handling. Nonetheless, these workflows accurately organize most of the data and also provide a method of visualizing a screening project. This method is robust enough to build a workflow-oriented front-end to PubChem and is currently being used regularly by both our lab and our collaborators. A Python implementation of the algorithm is available online, and a searchable database of all PubChem workflows is available at http://swami.wustl.edu/flow.


Subjects
Algorithms; Data Mining/methods; Databases, Chemical; Drug Evaluation, Preclinical/methods; Computational Biology; Computer Graphics; Database Management Systems; High-Throughput Screening Assays/methods; Molecular Structure; Small Molecule Libraries/pharmacology
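
The abstract above does not list its four heuristics, so the following sketch should be read as one plausible rule rather than the published algorithm. It assumes that a follow-up assay tests (almost) only compounds that were already tested in its parent assay, and links each assay to the smallest larger assay that covers it. The assay names, the `min_overlap` threshold, and the function name `infer_workflow` are hypothetical.

```python
def infer_workflow(assays, min_overlap=0.9):
    # assays: dict mapping assay id -> set of tested compound ids.
    # Returns a sorted list of (parent, child) edges.
    # ASSUMED heuristic: a child assay's compounds are mostly a subset of
    # its parent's, and the tightest (smallest) such parent is chosen.
    edges = []
    for child, child_cpds in assays.items():
        if not child_cpds:
            continue  # an empty assay carries no evidence about its parent
        candidates = []
        for parent, parent_cpds in assays.items():
            if parent == child or len(parent_cpds) <= len(child_cpds):
                continue
            overlap = len(child_cpds & parent_cpds) / len(child_cpds)
            if overlap >= min_overlap:
                candidates.append((len(parent_cpds), parent))
        if candidates:
            edges.append((min(candidates)[1], child))
    return sorted(edges)

# Toy project (invented): a primary screen, a confirmation, a dose-response.
assays = {
    "primary": set(range(1000)),
    "confirm": set(range(0, 1000, 100)),
    "dose": {0, 100, 200},
}

edges = infer_workflow(assays)
```

On this toy project the sketch returns a chain from the primary screen through the confirmation assay to the dose-response assay. It would also fail in the situation the abstract names as the main source of errors: compounds added by structural similarity are absent from the parent assay, which lowers the overlap and can hide the true edge.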