Discovery of moiety preference by Shapley value in protein kinase family using random forest models.
Huang, Yu-Wei; Hsu, Yen-Chao; Chuang, Yi-Hsuan; Chen, Yun-Ti; Lin, Xiang-Yu; Fan, You-Wei; Pathak, Nikhil; Yang, Jinn-Moon.
Affiliation
  • Huang YW; Institute of Biomedical Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Hsu YC; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Chuang YH; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Chen YT; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Lin XY; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Fan YW; Institute of Molecular Medicine and Bioengineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Pathak N; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
  • Yang JM; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan. moon@faculty.nctu.edu.tw.
BMC Bioinformatics ; 23(Suppl 4): 130, 2022 Apr 15.
Article in English | MEDLINE | ID: mdl-35428180
ABSTRACT

BACKGROUND:

Human protein kinases play important roles in cancers, are highly co-regulated by kinase families rather than a single kinase, and complementarily regulate signaling pathways. Even though there are > 100,000 protein kinase inhibitors, only 67 kinase drugs are currently approved by the Food and Drug Administration (FDA).

RESULTS:

In this study, we used "merged moiety-based interpretable features (MMIFs)," which merge four moiety-based compound features (Checkmol fingerprint, PubChem fingerprint, rings in drugs, and in-house moieties), as the input features for building random forest (RF) models. Using > 200,000 bioactivity test data, we framed the machine learning task as classifying compounds as kinase family inhibitors or non-inhibitors. The RF models achieved good accuracy (> 0.8) for the 10 kinase families. In addition, using the Shapley Additive exPlanations (SHAP) approach, we identified moieties that are common across kinase families and moieties that are specific to individual families. We also verified our results using protein kinase complex structures containing important interactions of the hinges, DFGs, or P-loops in the ATP pocket of active sites.
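
As an illustration of the idea behind Shapley-based attribution, the following minimal sketch computes exact Shapley values for a toy scoring function over three binary moiety features. It is not the authors' pipeline: the moiety names, the scores, and the function `toy_model` are hypothetical stand-ins for a trained RF model, and real SHAP implementations for tree models estimate these values against a background dataset rather than enumerating subsets as done here.

```python
from itertools import combinations
from math import factorial

# Hypothetical moiety names, for illustration only (not taken from the paper).
FEATURES = ["aminopyrimidine", "urea", "morpholine"]


def toy_model(present):
    """Hypothetical stand-in for a classifier's inhibitor score.

    `present` is the set of moieties found in a compound.
    """
    score = 0.1  # baseline score with no moieties present
    if "aminopyrimidine" in present:
        score += 0.5
    if "urea" in present:
        score += 0.2
    if "aminopyrimidine" in present and "urea" in present:
        score += 0.1  # interaction between the two moieties
    return score


def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


phi = shapley_values(FEATURES, toy_model)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the contributions add up to the difference between the score with all moieties present and the baseline score, and the interaction term is split equally between the two moieties involved. A moiety that never changes the score (here `morpholine`) receives a value of zero.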

CONCLUSIONS:

In summary, we not only constructed highly accurate models for predicting inhibitors of kinase families but also discovered inhibitor moieties that are common to, or specific to, different kinase families, providing new opportunities for designing protein kinase inhibitors.

Full text: 1 Database: MEDLINE Main subject: Protein Kinases / Machine Learning Language: En Publication year: 2022 Document type: Article
