TargetDBP+: Enhancing the Performance of Identifying DNA-Binding Proteins via Weighted Convolutional Features.
Hu, Jun; Rao, Liang; Zhu, Yi-Heng; Zhang, Gui-Jun; Yu, Dong-Jun.
Affiliation
  • Hu J; College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, P. R. China.
  • Rao L; Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou 363000, P. R. China.
  • Zhu YH; College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, P. R. China.
  • Zhang GJ; School of Computer Science and Engineering, Nanjing University of Science and Technology, Xiaolingwei 200, Nanjing 210094, P. R. China.
  • Yu DJ; College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, P. R. China.
J Chem Inf Model; 61(1): 505-515, 2021 Jan 25.
Article in English | MEDLINE | ID: mdl-33410688
ABSTRACT
Protein-DNA interactions exist ubiquitously and play important roles in the life cycles of living cells. The accurate identification of DNA-binding proteins (DBPs) is one of the key steps toward understanding the mechanisms of protein-DNA interactions. Although many DBP identification methods have been proposed, their performance is still unsatisfactory. In this study, a new method, called TargetDBP+, is developed to further enhance the performance of identifying DBPs. In TargetDBP+, five convolutional features are first extracted from five feature sources, i.e., the amino acid one-hot matrix (AAOHM), position-specific scoring matrix (PSSM), predicted secondary structure probability matrix (PSSPM), predicted solvent accessibility probability matrix (PSAPM), and predicted probabilities of DNA-binding sites (PPDBSs). Second, the five features are weighted and serially combined, with the weights of all of the elements learned by the differential evolution algorithm. Finally, the DBP identification model of TargetDBP+ is trained using the support vector machine (SVM) algorithm. To evaluate TargetDBP+ and compare it with existing methods, a new gold-standard benchmark data set, called UniSwiss, is constructed; it consists of 4881 DBPs and 4881 non-DBPs extracted from the UniProtKB/Swiss-Prot database. Experimental results demonstrate that, on the independent validation subset of UniSwiss, TargetDBP+ achieves an accuracy of 85.83% and a precision of 88.45% while covering 82.41% of all DBP data, with a Matthews correlation coefficient (MCC) of 0.718 that is significantly higher than those of other state-of-the-art control methods. The web server of TargetDBP+ is accessible at http://csbio.njust.edu.cn/bioinf/targetdbpplus/; the UniSwiss data set and stand-alone program of TargetDBP+ are accessible at https://github.com/jun-csbio/TargetDBPplus.
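As an illustration of two ideas in the abstract, the following minimal Python sketch shows (a) what a weighted serial combination of feature vectors could look like and (b) how accuracy, precision, coverage of positives (recall), and MCC are computed from a confusion matrix. It is not the authors' implementation: the function names `weighted_serial_combine` and `binary_metrics`, the toy feature vectors, the weights, and the confusion-matrix counts are all hypothetical, and the weights are fixed by hand here rather than learned by differential evolution.

```python
from math import sqrt


def weighted_serial_combine(features, weights):
    """Scale each feature vector element-wise by its weights, then concatenate.

    `features` and `weights` are parallel lists of equal-length vectors.
    In the paper the weights are learned by differential evolution;
    here they are simply passed in (hypothetical values).
    """
    if len(features) != len(weights):
        raise ValueError("need one weight vector per feature vector")
    combined = []
    for vec, w in zip(features, weights):
        if len(vec) != len(w):
            raise ValueError("weight vector length must match feature length")
        combined.extend(x * wi for x, wi in zip(vec, w))
    return combined


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall, and MCC for a binary classifier."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mcc": mcc}


# Toy example: two short "features" instead of five convolutional ones.
toy_features = [[1.0, 2.0], [3.0]]
toy_weights = [[0.5, 0.5], [2.0]]
combined = weighted_serial_combine(toy_features, toy_weights)

# Hypothetical confusion-matrix counts, not taken from the paper.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

The combined vector would then be the input to a classifier such as an SVM. MCC is a useful summary alongside accuracy because it uses all four cells of the confusion matrix.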
Subjects

Full text: 1 | Database: MEDLINE | Main subject: DNA-Binding Proteins / Support Vector Machine | Study type: Prognostic_studies | Language: En | Publication year: 2021 | Document type: Article
