Search | BVS España Search Portal

ATPdock: a template-based method for ATP-specific protein-ligand docking.

Rao, Liang; Jia, Ning-Xin; Hu, Jun; Yu, Dong-Jun; Zhang, Gui-Jun.

Bioinformatics ; 38(2): 556-558, 2022 01 03.

Article in English | MEDLINE | ID: mdl-34546290

ABSTRACT

MOTIVATION: Accurately identifying protein-ATP binding poses is valuable for both basic structural biology and drug discovery. Although many docking methods have been designed, most of them require a user-defined binding site and struggle to produce high-quality protein-ATP docking results. A protein-ATP-specific blind docking method that works without user-defined binding sites is therefore needed. RESULTS: Here, we present ATPdock, a template-based method for docking ATP into proteins. For each query protein, if no pocket site is given, ATPdock first identifies the most likely pocket using ATPbind, an ATP-binding site predictor. Second, the template pocket most similar to the given or identified pocket is retrieved from a database of pocket-ligand structures using APoc, a pocket structural alignment tool. Third, a rough docking pose of ATP (rdATP) is generated using LS-align, a ligand structural alignment tool, to align the initial ATP pose to the template ligand of the template pocket. Finally, a Metropolis Monte Carlo simulation fine-tunes the rdATP under the guidance of the AutoDock Vina energy function. Benchmark tests show that ATPdock significantly outperforms other state-of-the-art methods in docking accuracy. AVAILABILITY AND IMPLEMENTATION: https://jun-csbio.github.io/atpdock/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
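The final refinement step named in the abstract, a Metropolis Monte Carlo search guided by an energy function, can be illustrated with a minimal sketch. This is not the ATPdock implementation: `toy_energy` is a hypothetical stand-in for the AutoDock Vina score, the pose is reduced to three translation coordinates, and the step size, temperature, and step count are arbitrary assumptions.

```python
import math
import random


def toy_energy(pose):
    """Hypothetical stand-in for a docking score: squared distance to a fixed optimum."""
    target = (1.0, -2.0, 0.5)
    return sum((p - t) ** 2 for p, t in zip(pose, target))


def metropolis_refine(pose, energy_fn, steps=2000, step_size=0.2, temperature=0.5, seed=0):
    """Refine a pose with Metropolis Monte Carlo and return the best pose seen."""
    rng = random.Random(seed)
    current = tuple(pose)
    current_e = energy_fn(current)
    best, best_e = current, current_e
    for _ in range(steps):
        # Propose a small random perturbation of the current pose.
        candidate = tuple(x + rng.uniform(-step_size, step_size) for x in current)
        candidate_e = energy_fn(candidate)
        delta = candidate_e - current_e
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


rough_pose = (0.0, 0.0, 0.0)  # plays the role of the rough docking pose (rdATP)
refined_pose, refined_energy = metropolis_refine(rough_pose, toy_energy)
```

Occasionally accepting uphill moves is what lets the search escape shallow local minima around the rough pose. Tracking the best pose separately ensures the returned energy is never worse than the starting one.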

Subject(s)

Adenosine Triphosphate; Proteins; Ligands; Proteins/chemistry; Binding Sites; Protein Binding; Adenosine Triphosphate/metabolism; Molecular Docking Simulation

Improving DNA-Binding Protein Prediction Using Three-Part Sequence-Order Feature Extraction and a Deep Neural Network Algorithm.

Hu, Jun; Zeng, Wen-Wu; Jia, Ning-Xin; Arif, Muhammad; Yu, Dong-Jun; Zhang, Gui-Jun.

J Chem Inf Model ; 63(3): 1044-1057, 2023 02 13.

Article in English | MEDLINE | ID: mdl-36719781

ABSTRACT

Identifying DNA-binding proteins (DBPs) helps uncover information embedded in DNA-protein interactions, which is important for understanding the mechanisms of DNA replication, transcription, and repair. Although existing computational methods for predicting DBPs from protein sequences have been highly successful, there is still room for improvement because these methods do not fully exploit sequence-order information. In this study, a new three-part sequence-order feature extraction strategy (called TPSO) is developed to extract more discriminative information from protein sequences for predicting DBPs. For each query protein, TPSO first divides its primary sequence features into N- and C-terminal fragments and then extracts numerical pseudo features from three parts: the full sequence and these two fragments. Based on TPSO, a novel deep learning-based method, called TPSO-DBP, is proposed, which combines sequence-based single-view features with bidirectional long short-term memory (BiLSTM) and fully connected (FC) neural networks to learn the DBP prediction model. Empirical results show that TPSO-DBP achieves an accuracy of 87.01%, covering 85.30% of all DBPs, and a Matthews correlation coefficient (0.741) that is significantly higher than those of most existing state-of-the-art DBP prediction methods. Detailed data analyses indicate that the advantages of TPSO-DBP lie in the use of TPSO, which helps extract more hidden salient patterns, and in the deep neural network framework composed of BiLSTM and FC layers, which learns the nonlinear relationships between input features and DBPs. The standalone package and web server of TPSO-DBP are freely available at https://jun-csbio.github.io/TPSO-DBP/.
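The three-part idea described in the abstract (features from the full sequence plus its N- and C-terminal fragments) can be sketched as follows. This is an illustration under stated assumptions, not the TPSO method itself: the abstract does not give the split point or the pseudo-feature definition, so the sketch assumes a midpoint split and uses plain amino acid composition as a stand-in feature. The function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq (all zeros if empty)."""
    n = len(seq)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def three_part_features(seq, split=None):
    """Concatenate features of the full sequence, N-terminal part, and C-terminal part."""
    if split is None:
        split = len(seq) // 2  # assumption: split at the midpoint
    n_term, c_term = seq[:split], seq[split:]
    # Stand-in feature: amino acid composition (not the paper's pseudo features).
    return composition(seq) + composition(n_term) + composition(c_term)


features = three_part_features("MKVLAAGIVGLLLA")  # 3 parts x 20 values = 60 features
```

Computing the same feature on both fragments keeps some positional information that a single whole-sequence composition vector would average away.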

Subject(s)

DNA-Binding Proteins; Neural Networks, Computer; DNA-Binding Proteins/metabolism; Algorithms; Amino Acid Sequence

Predicting RNA solvent accessibility from multi-scale context feature via multi-shot neural network.

Fan, Xue-Qiang; Hu, Jun; Tang, Yu-Xuan; Jia, Ning-Xin; Yu, Dong-Jun; Zhang, Gui-Jun.

Anal Biochem ; 654: 114802, 2022 10 01.

Article in English | MEDLINE | ID: mdl-35809650

ABSTRACT

RNA solvent accessibility has recently attracted attention because of the growing awareness of its importance for key biological processes. Accurately predicting the solvent accessibility of RNA is crucial for understanding its 3D structure and biological function. In this study, we develop a novel computational method, termed M2pred, for accurately predicting the solvent accessibility of RNA from sequence-based multi-scale context features. In M2pred, three single-view features, i.e., base-pairing probabilities, a position-specific frequency matrix, and a binary one-hot encoding, are first generated as three feature sources and then concatenated into a super feature. Second, the matrix-format feature of each nucleotide is extracted from the super feature using an initialized sliding window technique, and these are stacked into a cube-format feature. Then, using a multi-scale context feature extraction strategy, a pyramid feature composed of contextual features at four scales around the target nucleotides is extracted from the cube-format feature. Finally, a customized multi-shot neural network framework, equipped with receptive fields at four different scales and built mainly from several residual attention blocks, is designed to extract discriminative information from the contextual pyramid feature. Experimental results demonstrate that the proposed M2pred achieves high prediction performance and outperforms existing state-of-the-art methods for predicting RNA solvent accessibility.
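Two of the steps in the abstract, the binary one-hot encoding and the sliding-window extraction of one matrix per nucleotide, can be sketched in a few lines. The sketch makes assumptions the abstract does not confirm: zero padding at the sequence ends, a window half-width of 2, and one-hot rows as the only feature source. The function names are hypothetical.

```python
NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode each nucleotide as a 4-element binary row in A, C, G, U order."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq]


def sliding_windows(rows, half_width):
    """Return one (2 * half_width + 1) x dim matrix per position, zero-padded at the ends.

    Assumes rows is non-empty and all rows have the same length.
    """
    dim = len(rows[0])
    pad = [[0] * dim for _ in range(half_width)]
    padded = pad + rows + pad
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(rows))]


encoded = one_hot("GGACU")
cube = sliding_windows(encoded, half_width=2)  # 5 windows, each 5 x 4
```

Stacking the per-nucleotide matrices gives the cube-shaped input the abstract refers to. The centre row of each window is the target nucleotide and the other rows are its sequence context.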

Subject(s)

Neural Networks, Computer; RNA; Nucleotides; RNA/chemistry; Solvents/chemistry

Improved protein relative solvent accessibility prediction using deep multi-view feature learning framework.

Fan, Xue-Qiang; Hu, Jun; Jia, Ning-Xin; Yu, Dong-Jun; Zhang, Gui-Jun.

Anal Biochem ; 631: 114358, 2021 10 15.

Article in English | MEDLINE | ID: mdl-34478704

ABSTRACT

The accurate prediction of the relative solvent accessibility of a protein is critical to understanding its 3D structure and biological function. In this study, a novel deep multi-view feature learning (DMVFL) framework is developed to accurately predict the relative solvent accessibility of proteins. The framework integrates three different neural network units, i.e., a bidirectional long short-term memory recurrent neural network, squeeze-and-excitation, and a fully connected hidden layer, with four sequence-based single-view features, i.e., position-specific scoring matrix, position-specific frequency matrix, predicted secondary structure, and roughly predicted three-state relative solvent accessibility probability. On the basis of this framework, a new protein relative solvent accessibility predictor, called DMVFL-RSA, is proposed. It employs a customized multiple feedback mechanism that helps extract discriminative information embedded in the four single-view features. In benchmark tests on the TEST524 and CASP14-derived (CASP14set) datasets, DMVFL-RSA outperforms other existing state-of-the-art protein relative solvent accessibility predictors when predicting two-state (exposure threshold of 25%), three-state (exposure thresholds of 9% and 36%), and four-state (exposure thresholds of 4%, 25%, and 50%) discrete values. For real-valued prediction on TEST524 and CASP14set, DMVFL-RSA also achieves high Pearson correlation coefficient values, indicating a positive correlation between the predicted and native relative solvent accessibility. Detailed analyses show that the major advantages of DMVFL-RSA lie in the high efficiency of the DMVFL framework, the applied multiple feedback mechanism, and the strong sensitivity of the sequence-based features. The web server of DMVFL-RSA is freely available at https://jun-csbio.github.io/DMVFL-RSA/ for academic use. The standalone package of DMVFL-RSA is downloadable at https://github.com/XueQiangFan/DMVFL-RSA.
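The exposure thresholds quoted in the abstract define how a real-valued relative solvent accessibility (RSA) is turned into two, three, or four discrete states. A minimal sketch of that mapping follows. The thresholds come from the abstract, but the abstract does not say how a value exactly on a threshold is classified, so this sketch assumes it falls into the more exposed state. The names `THRESHOLDS` and `rsa_state` are hypothetical.

```python
from bisect import bisect_right

# Exposure thresholds from the abstract, expressed as fractions of full exposure.
THRESHOLDS = {
    2: [0.25],
    3: [0.09, 0.36],
    4: [0.04, 0.25, 0.50],
}


def rsa_state(rsa, n_states):
    """Map an RSA value in [0, 1] to a state index from 0 (most buried) to n_states - 1.

    Assumption: a value exactly on a threshold is assigned to the more exposed state.
    """
    return bisect_right(THRESHOLDS[n_states], rsa)


states = [rsa_state(0.10, n) for n in (2, 3, 4)]  # the same residue under each scheme
```

The same residue can land in different relative positions depending on the scheme, so two-, three-, and four-state accuracies are not directly comparable with one another.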

Subject(s)

Computational Biology/methods; Deep Learning; Proteins/chemistry; Solvents/chemistry; Databases, Protein; Feedback; Internet; Neural Networks, Computer; Protein Structure, Secondary

Protein-DNA Binding Residue Prediction via Bagging Strategy and Sequence-Based Cube-Format Feature.

Hu, Jun; Bai, Yan-Song; Zheng, Lin-Lin; Jia, Ning-Xin; Yu, Dong-Jun; Zhang, Gui-Jun.

IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3635-3645, 2022.

Article in English | MEDLINE | ID: mdl-34714748

ABSTRACT

Protein-DNA interactions play an important role in diverse biological processes. Accurately identifying protein-DNA binding residues is a critical but challenging task for protein function annotation and drug design. Although wet-lab experimental methods are the most accurate way to identify protein-DNA binding residues, they are time-consuming and labor-intensive. There is an urgent need for computational methods that rapidly and accurately predict protein-DNA binding residues. In this study, we propose a novel sequence-based method, named PredDBR, for predicting DNA-binding residues. In PredDBR, for each query protein, its position-specific frequency matrix (PSFM), predicted secondary structure (PSS), and predicted probabilities of ligand-binding residues (PPLBR) are first generated as three feature sources. Second, for each feature source, the sliding window technique is employed to extract the matrix-format feature of each residue. Then, we design two strategies, i.e., square root (SR) and average (AVE), to transform the matrix-format features of each residue into three corresponding cube-format features, applying one strategy to the PSFM-based features and the other to the two predicted feature sources (PSS-based and PPLBR-based). Finally, after serially combining the three cube-format features, an ensemble classifier is generated by applying a bagging strategy to multiple base classifiers built on a 2D convolutional neural network framework. The computational experimental results demonstrate that the proposed PredDBR achieves an average overall accuracy of 93.7% and a Matthews correlation coefficient of 0.405 on two independent validation datasets and outperforms several state-of-the-art sequence-based protein-DNA binding residue predictors. The PredDBR web server is available at https://jun-csbio.github.io/PredDBR/.
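The bagging strategy mentioned in the abstract trains several base classifiers on bootstrap resamples of the training data and combines their outputs. The sketch below illustrates only that ensemble idea. The base learner is a hypothetical one-dimensional threshold classifier standing in for the 2D convolutional networks, the data are made-up toy values, and averaging the votes is an assumed combination rule.

```python
import random


def fit_threshold_classifier(samples):
    """Toy base learner (stand-in for a CNN): threshold midway between the class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:
        # Degenerate bootstrap sample with a single class: always predict that class.
        constant = 1.0 if pos else 0.0
        return lambda x: constant
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    pos_above = pos_mean > neg_mean
    return lambda x: 1.0 if (x > threshold) == pos_above else 0.0


def bagging_fit(data, n_models=25, seed=0):
    """Train one base classifier per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_threshold_classifier([rng.choice(data) for _ in data])
            for _ in range(n_models)]


def bagging_predict(models, x):
    """Average the base classifiers' votes into a score between 0 and 1."""
    return sum(model(x) for model in models) / len(models)


# Illustrative toy data: (feature value, label), with 1 meaning "binding".
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0), (1.6, 1), (1.7, 1), (1.8, 1), (1.9, 1)]
models = bagging_fit(data)
score = bagging_predict(models, 2.5)
```

Because each base classifier sees a slightly different resample, averaging their votes reduces the variance of the final prediction compared with any single model.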

Subject(s)

Neural Networks, Computer; Proteins; Proteins/chemistry; Protein Binding; Protein Structure, Secondary; DNA/chemistry
