ProkDBP: Toward more precise identification of prokaryotic DNA binding proteins.
Pradhan, Upendra Kumar; Meher, Prabina Kumar; Naha, Sanchita; Das, Ritwika; Gupta, Ajit; Parsad, Rajender.
Affiliation
  • Pradhan UK; Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, India.
  • Meher PK; Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, India.
  • Naha S; Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, India.
  • Das R; Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, India.
  • Gupta A; Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, India.
  • Parsad R; ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, India.
Protein Sci ; 33(6): e5015, 2024 Jun.
Article in En | MEDLINE | ID: mdl-38747369
ABSTRACT
Prokaryotic DNA binding proteins (DBPs) play pivotal roles in governing gene regulation, DNA replication, and various cellular functions. Accurate computational models for predicting prokaryotic DBPs hold immense promise in accelerating the discovery of novel proteins, fostering a deeper understanding of prokaryotic biology, and facilitating the development of targeted therapeutics for potential disease interventions. However, existing generic prediction models often exhibit lower accuracy in predicting prokaryotic DBPs. To address this gap, we introduce ProkDBP, a novel machine learning-driven computational model for the prediction of prokaryotic DBPs. For prediction, a total of nine shallow learning algorithms and five deep learning models were utilized, with the shallow learning models demonstrating higher performance metrics than their deep learning counterparts. The light gradient boosting machine (LGBM), coupled with evolutionarily significant features selected via the random forest variable importance measure (RF-VIM), yielded the highest five-fold cross-validation accuracy. The model achieved the highest auROC (0.9534) and auPRC (0.9575) among the 14 machine learning models evaluated. Additionally, ProkDBP demonstrated substantial performance on an independent dataset, exhibiting high values of auROC (0.9332) and auPRC (0.9371). Notably, when benchmarked against several cutting-edge existing models, ProkDBP showcased superior predictive accuracy. Furthermore, to promote accessibility and usability, ProkDBP (https://iasri-sg.icar.gov.in/prokdbp/) is available as an online prediction tool, free to access for interested users. This tool stands as a significant contribution, enhancing the repertoire of resources for accurate and efficient prediction of prokaryotic DBPs.
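The abstract reports performance as auROC and auPRC. As a minimal sketch of what these two numbers measure, the following standard-library Python computes both from binary labels and predicted scores. It is not the authors' evaluation code: the function names `au_roc` and `au_prc` and the toy data are hypothetical, auROC is computed through the pairwise ranking (Mann-Whitney) identity, and auPRC is approximated by average precision, which is one common estimator among several.

```python
from typing import Sequence


def au_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise ranking identity.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count correctly ordered (positive, negative) pairs; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def au_prc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, used here as an estimate of the area under the PR curve.

    Assumption: tied scores keep their input order (no special tie handling).
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / n_pos


# Toy data (hypothetical): 1 = DNA binding protein, 0 = non-binding protein.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"auROC = {au_roc(labels, scores):.4f}")
print(f"auPRC = {au_prc(labels, scores):.4f}")
```

In this toy example, 7 of the 9 positive/negative pairs are ordered correctly, so auROC is 7/9; a classifier that ranks every positive above every negative would score 1.0 on both measures.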

Full text: 1 Database: MEDLINE Main subject: Bacterial Proteins / DNA-Binding Proteins / Machine Learning Language: En Publication year: 2024 Document type: Article
