Reducing false positive rate of docking-based virtual screening by active learning.
Wang, Lei; Shi, Shao-Hua; Li, Hui; Zeng, Xiang-Xiang; Liu, Su-You; Liu, Zhao-Qian; Deng, Ya-Feng; Lu, Ai-Ping; Hou, Ting-Jun; Cao, Dong-Sheng.
Affiliation
  • Wang L; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.
  • Shi SH; Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China.
  • Li H; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.
  • Zeng XX; Department of Computer Science, Hunan University, Changsha 410082, Hunan, China.
  • Liu SY; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.
  • Liu ZQ; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.
  • Deng YF; CarbonSilicon AI Technology Co., Ltd, Hangzhou, Zhejiang 310018, China.
  • Lu AP; Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China.
  • Hou TJ; Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China.
  • Cao DS; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.
Brief Bioinform; 24(1), 2023-01-19.
Article in En | MEDLINE | ID: mdl-36642412
ABSTRACT
Machine learning-based scoring functions (MLSFs) have become a favored alternative to classical scoring functions because of their potentially superior screening performance. However, information about the negative data used to construct MLSFs is rarely reported in the literature, and the putative inactive molecules recorded in existing databases are usually clearly biased relative to the active molecules. Here we propose an easy-to-use method named AMLSF that combines MLSFs with active learning based on negative molecular selection strategies; it iteratively improves the quality of the inactive sets and thus reduces the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database with AMLSF and compared the screening results with those of four control models. The results show that the number of active molecules among the top 1000 molecules identified by AMLSF was significantly higher than the numbers identified by the control models. In addition, free energy calculations for the top 10 molecules screened out by AMLSF, the null model and the control models based on DUD-E also indicated that AMLSF can identify more active molecules and reduce the false positive rate.
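
The abstract does not spell out the negative selection strategy, so the following is only a minimal sketch of how an iterative refinement of the inactive set could look, not the authors' implementation. It assumes molecules are already encoded as numeric feature vectors, that every pool molecule is a presumed inactive, and that each round moves the pool molecules the current model ranks as most active-like (the likely false positives) into the inactive training set. The nearest-centroid scorer is a stand-in for the MLSF, and all names (`centroid`, `score`, `refine_inactives`, `rounds`, `batch`) are hypothetical.

```python
import math


def centroid(rows):
    """Mean feature vector of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def score(x, active_c, inactive_c):
    """Stand-in scorer (assumption): higher means closer to the actives
    than to the inactives."""
    return math.dist(x, inactive_c) - math.dist(x, active_c)


def refine_inactives(actives, inactives, pool, rounds=3, batch=5):
    """Iteratively move the most active-like presumed inactives from the
    pool into the inactive training set, refitting the scorer each round."""
    inactives = list(inactives)
    pool = list(pool)
    for _ in range(rounds):
        if not pool:
            break
        # "Retrain" the stand-in model on the current training sets.
        a_c, i_c = centroid(actives), centroid(inactives)
        # Rank the pool from most to least active-like.
        pool.sort(key=lambda x: score(x, a_c, i_c), reverse=True)
        # The top-ranked presumed inactives become new hard negatives.
        hard, pool = pool[:batch], pool[batch:]
        inactives.extend(hard)
    return inactives, pool


# Toy two-dimensional data, purely illustrative.
actives = [[1.0, 1.0], [1.2, 0.9]]
inactives = [[-1.0, -1.0], [-0.8, -1.1]]
pool = [[0.9, 1.0], [-1.0, -0.9], [0.0, 0.0], [1.1, 1.1]]

refined, remaining = refine_inactives(actives, inactives, pool, rounds=1, batch=2)
```

In this toy run the two pool vectors lying closest to the actives are the ones added to the inactive set, which is the kind of hard negative a biased decoy set would otherwise lack.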

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins Type of study: Diagnostic_studies / Prognostic_studies / Screening_studies Language: En Journal: Brief Bioinform Journal subject: BIOLOGY / MEDICAL INFORMATICS Year of publication: 2023 Document type: Article Country of affiliation: China