Hidden active information in a random compound library: extraction using a pseudo-structure-activity relationship model.

Fukunishi, Hiroaki; Teramoto, Reiji; Shimada, Jiro

Affiliation

Fukunishi H; Nano Electronics Research Laboratories, Central Research Laboratories, NEC Corporation, 34, Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan. h-fukunishi@bu.jp.nec.com

J Chem Inf Model ; 48(3): 575-82, 2008 Mar.

Article in English | MEDLINE | ID: mdl-18278890

ABSTRACT

We propose a hypothesis that "a model of active compound can be provided by integrating information of compounds high-ranked by docking simulation of a random compound library". In our hypothesis, the inclusion of true active compounds among the high-ranked compounds is not necessary. We regard the high-ranked compounds as pseudo-active compounds. As a method to embody our hypothesis, we introduce a pseudo-structure-activity relationship (PSAR) model. Although the PSAR model is the same as a quantitative structure-activity relationship (QSAR) model in terms of statistical methodology, the implications of the training data are different. Known active compounds (ligands) are used as training data in the QSAR model, whereas the pseudo-active compounds are used in the PSAR model. In this study, Random Forest was used as the machine-learning algorithm. From tests on four functionally different targets, estrogen receptor antagonist (ER), thymidine kinase (TK), thrombin, and acetylcholine esterase (AChE), using five scoring functions, we obtained three conclusions: (1) the PSAR models gave significantly higher percentages of known ligands found than random sampling, and these results are sufficient to support our hypothesis; (2) the PSAR models gave higher percentages of known ligands found than normal scoring by scoring function, and these results demonstrate the practical usefulness of the PSAR model; and (3) the PSAR model can assess compounds that failed in the docking simulation. Note that PSAR and QSAR models are used in different situations; the advantage of the PSAR model emerges when no ligand is available as training data or when one wants to find novel types of ligands, whereas the QSAR model is effective for finding compounds similar to known ligands when the ligands are already known.
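
The two bookkeeping steps the abstract relies on, labelling high-ranked docked compounds as pseudo-active and measuring the percentage of known ligands found in a top-ranked subset, can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: the function names, the `top_fraction` cutoff, the compound IDs, the scores, and the convention that a lower docking score means better predicted binding are all assumptions made for illustration. The model-training step (Random Forest in the paper) is deliberately left out; the labels produced here are what such a model would be trained on.

```python
from typing import Dict, List, Set


def pseudo_labels(docking_scores: Dict[str, float], top_fraction: float) -> Dict[str, int]:
    """Mark the best-scoring fraction of a docked library as pseudo-active (1), the rest as 0.

    Assumption: a lower (more negative) docking score means better predicted binding.
    """
    ranked = sorted(docking_scores, key=docking_scores.get)
    n_top = max(1, int(len(ranked) * top_fraction))
    top = set(ranked[:n_top])
    return {cid: int(cid in top) for cid in ranked}


def percent_ligands_found(ranking: List[str], known_ligands: Set[str], top_fraction: float) -> float:
    """Percentage of known ligands that appear in the top fraction of a ranked list."""
    n_top = max(1, int(len(ranking) * top_fraction))
    found = sum(1 for cid in ranking[:n_top] if cid in known_ligands)
    return 100.0 * found / len(known_ligands)


# Hypothetical toy data: compound IDs and docking scores are made up.
toy_scores = {"a": -9.0, "b": -3.0, "c": -7.5, "d": -1.0, "e": -8.0}
toy_labels = pseudo_labels(toy_scores, top_fraction=0.4)

# Hypothetical ranking produced by some trained model, best first.
toy_ranking = ["a", "c", "e", "b", "d"]
toy_hit_rate = percent_ligands_found(toy_ranking, {"a", "b"}, top_fraction=0.4)
print(toy_labels, toy_hit_rate)
```

Under random sampling, the expected percentage of known ligands found in the top fraction is simply that fraction (for example, 40% for `top_fraction=0.4`), which is the baseline that conclusion (1) compares against.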

Subjects

Models, Molecular; Proteins/chemistry; ROC Curve; Structure-Activity Relationship

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Models, Molecular Type of study: Clinical_trials / Prognostic_studies Language: En Year of publication: 2008 Document type: Article
