Computational Prediction of Compound-Protein Interactions for Orphan Targets Using CGBVS.

Kanai, Chisato; Kawasaki, Enzo; Murakami, Ryuta; Morita, Yusuke; Yoshimori, Atsushi

Affiliation

Kanai C; Data Science Division, INTAGE Healthcare Inc., 2F NREG Midosuji Bldg., 3-5-7 Kawara-Machi, Chuo-ku, Osaka 541-0048, Japan.
Kawasaki E; Data Science Division, INTAGE Healthcare Inc., 2F NREG Midosuji Bldg., 3-5-7 Kawara-Machi, Chuo-ku, Osaka 541-0048, Japan.
Murakami R; Data Science Division, INTAGE Healthcare Inc., 2F NREG Midosuji Bldg., 3-5-7 Kawara-Machi, Chuo-ku, Osaka 541-0048, Japan.
Morita Y; Business Development Division, Advanced Technology Department, INTAGE Inc., Akihabara Building, 3 Kanda-Neribeicho, Chiyoda-ku, Tokyo 101-8201, Japan.
Yoshimori A; Institute for Theoretical Medicine Inc., 26-1 Muraoka-Higashi 2-Chome, Fujisawa 251-0012, Japan.

Molecules; 26(17), 2021 Aug 24.

Article in English | MEDLINE | ID: mdl-34500569

ABSTRACT

A variety of artificial intelligence (AI)-based machine learning techniques have been developed for in silico prediction of compound-protein interactions (CPIs). One of these is a technique we refer to as chemical genomics-based virtual screening (CGBVS). Its main feature is prediction by a pairwise kernel-based support vector machine (SVM), which gives high prediction accuracy with simple implementation and easy handling. We studied whether CGBVS can identify ligands for targets without ligand information (orphan targets), using data from G protein-coupled receptor (GPCR) families. For validation, we tested whether ligand prediction was correct for a virtual orphan GPCR, that is, a selected target for which all ligand information was omitted from the training data. We expressed the results of this study as an applicability index and developed a method to determine whether CGBVS can be used to predict GPCR ligands. Validation showed that prediction accuracy differed greatly between GPCRs, but models using multiple sequence alignment (MSA) as the protein descriptor performed well in overall prediction accuracy. We also found that the type of compound descriptor affected prediction accuracy less than the type of protein descriptor. Furthermore, the accuracy of ligand prediction depends on the amount of ligand information available for GPCRs related to the target: accuracy tends to be high when a large amount of ligand information for related proteins is used in training.
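
The virtual orphan validation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy compounds and proteins, the names `GPCR_A` to `GPCR_C`, the product form of the pairwise kernel (Tanimoto on compound fingerprints times a Gaussian kernel on protein descriptors), and the simple signed-similarity scorer that stands in for a trained SVM.

```python
import math

# Hypothetical toy data: compounds as fingerprint bit sets,
# proteins as small numeric descriptor vectors.
compounds = {"c1": {1, 2, 3}, "c2": {2, 3, 4}, "c3": {7, 8}}
proteins = {"GPCR_A": [1.0, 0.0], "GPCR_B": [0.9, 0.1], "GPCR_C": [0.0, 1.0]}

# (compound, protein, label); label 1 = interacts, 0 = does not.
pairs = [
    ("c1", "GPCR_A", 1), ("c3", "GPCR_A", 0),
    ("c2", "GPCR_B", 1), ("c3", "GPCR_B", 0),
    ("c3", "GPCR_C", 1), ("c1", "GPCR_C", 0),
]

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def rbf(u, v, gamma=1.0):
    """Gaussian (RBF) kernel between two descriptor vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(u, v)))

def pairwise_kernel(pair_a, pair_b):
    """Product kernel over (compound, protein) pairs (assumed form)."""
    (ca, pa), (cb, pb) = pair_a, pair_b
    return tanimoto(compounds[ca], compounds[cb]) * rbf(proteins[pa], proteins[pb])

def virtual_orphan_split(all_pairs, target):
    """Hold out every pair involving `target`, mimicking an orphan target."""
    train = [p for p in all_pairs if p[1] != target]
    test = [p for p in all_pairs if p[1] == target]
    return train, test

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

train, test = virtual_orphan_split(pairs, "GPCR_A")

# Stand-in scorer (NOT an SVM): signed sum of kernel similarities
# between each held-out pair and every training pair.
scores = [
    sum((1 if y == 1 else -1) * pairwise_kernel((c, p), (ct, pt))
        for ct, pt, y in train)
    for c, p, _ in test
]
print(auroc([y for _, _, y in test], scores))
```

In a fuller setup, the kernel matrix over training pairs would be passed to an SVM that accepts precomputed kernels, and the hold-out would be repeated with each GPCR in turn as the virtual orphan.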

Keywords

area under receiver operating characteristics (AUROC); enrichment factor (EF); orphan GPCR; virtual orphan GPCR

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Pharmaceutical Preparations / Proteins
Type of study: Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: Molecules
Journal subject: BIOLOGIA
Year: 2021
Document type: Article
Affiliation country: Japan
