Search | BVS Enfermagem Research Portal

Identification of DNA-protein binding residues through integration of Transformer encoder and Bi-directional Long Short-Term Memory.

Zhao, Haipeng; Zhu, Baozhong; Jiang, Tengsheng; Cui, Zhiming; Wu, Hongjie.

Math Biosci Eng ; 21(1): 170-185, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38303418

ABSTRACT

DNA-protein binding is crucial for the normal development and function of organisms. Accurately identifying DNA-protein binding sites matters for disease prevention and for the development of innovative approaches to disease treatment. In the present study, we introduce a precise and robust identifier for DNA-protein binding residues. For protein representation, we combine the evolutionary information of the protein, represented by its position-specific scoring matrix, with the spatial information of the protein's secondary structure, enriching the overall informational content. The approach first employs a combination of Bi-directional Long Short-Term Memory and a Transformer encoder to jointly extract the interdependencies among residues within the protein sequence. Convolutional operations are then applied to the resulting feature matrix to capture local features of the residues. Experimental results on the benchmark dataset demonstrate that our method is competitive with contemporary classifiers. Specifically, our method achieved an MCC of 0.349, SP of 96.50%, SN of 44.03% and ACC of 94.59% on the PDNA-41 dataset.

Subjects

Short-Term Memory, Proteins, Protein Binding, Proteins/chemistry, Binding Sites, DNA/chemistry
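
The abstract above reports MCC, SP, SN and ACC for residue-level binding prediction. As a minimal sketch of how these four figures are usually derived from a binary confusion matrix (binding vs. non-binding residues), the following uses the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding residues predicted correctly / missed.
    tn/fp: non-binding residues predicted correctly / wrongly flagged.
    """
    # Sensitivity (SN): fraction of true binding residues that are recovered.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity (SP): fraction of non-binding residues correctly rejected.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy (ACC): fraction of all residues classified correctly.
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient (MCC); defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sn, "SP": sp, "ACC": acc, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced residue set (illustration only).
    for name, value in binary_metrics(tp=40, tn=900, fp=30, fn=50).items():
        print(f"{name}: {value:.3f}")
```

Because binding residues are typically a small minority of all residues, ACC and SP can be high while SN stays modest, which is why MCC is commonly reported alongside them.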

AttentionMGT-DTA: A multi-modal drug-target affinity prediction using graph transformer and attention mechanism.

Wu, Hongjie; Liu, Junkai; Jiang, Tengsheng; Zou, Quan; Qi, Shujie; Cui, Zhiming; Tiwari, Prayag; Ding, Yijie.

Neural Netw ; 169: 623-636, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37976593

ABSTRACT

The accurate prediction of drug-target affinity (DTA) is a crucial step in drug discovery and design. Traditional experiments are very expensive and time-consuming. Recently, deep learning methods have achieved notable performance improvements in DTA prediction. However, one challenge for deep learning-based models is obtaining appropriate and accurate representations of drugs and targets; target representations in particular remain underexplored. Another challenge is how to comprehensively capture the interaction information between different instances, which is also important for predicting DTA. In this study, we propose AttentionMGT-DTA, a multi-modal attention-based model for DTA prediction. AttentionMGT-DTA represents drugs and targets by a molecular graph and a binding pocket graph, respectively. Two attention mechanisms are adopted to integrate information across different protein modalities and to model interactions between drug-target pairs. The experimental results showed that our proposed model outperformed state-of-the-art baselines on two benchmark datasets. In addition, AttentionMGT-DTA offers high interpretability by modeling the interaction strength between drug atoms and protein residues. Our code is available at https://github.com/JK-Liu7/AttentionMGT-DTA.

Subjects

Benchmarking, Drug Discovery

MV-H-RKM: A Multiple View-Based Hypergraph Regularized Restricted Kernel Machine for Predicting DNA-Binding Proteins.

Guan, Shixuan; Qian, Yuqing; Jiang, Tengsheng; Jiang, Min; Ding, Yijie; Wu, Hongjie.

IEEE/ACM Trans Comput Biol Bioinform ; 20(2): 1246-1256, 2023.

Article in English | MEDLINE | ID: mdl-35731758

ABSTRACT

DNA-binding proteins (DBPs) have a significant impact on many life activities, so identifying DBPs is a crucial issue that also helps in understanding the mechanism of protein-DNA interactions. Identifying DBPs with traditional experimental methods is time-consuming and labor-intensive. In recent years, many researchers have proposed DBP identification methods based on machine learning algorithms to overcome these shortcomings. However, most existing methods do not achieve satisfactory results. In this paper, we focus on developing a new predictor of DBPs, called Multi-View Hypergraph Restricted Kernel Machines (MV-H-RKM). In this method, we extract five features from three views of the proteins. To fuse these features, we couple them by means of a shared hidden vector. In addition, we employ hypergraph regularization to enforce structural consistency between the original features and the hidden vector. Experimental results show that the accuracy of MV-H-RKM is 84.09% and 85.48% on the PDB1075 and PDB186 data sets, respectively, and demonstrate that our proposed method performs better than other state-of-the-art approaches. The code is publicly available at https://github.com/ShixuanGG/MV-H-RKM.

Subjects

DNA-Binding Proteins, Support Vector Machine, DNA-Binding Proteins/chemistry, Algorithms, DNA/chemistry, Machine Learning

TrGPCR: GPCR-ligand Binding Affinity Predicting based on Dynamic Deep Transfer Learning.

Lu, Yaoyao; Zhang, Runhua; Jiang, Tengsheng; Fu, Qiming; Cui, Zhiming; Wu, Hongjie.

IEEE J Biomed Health Inform ; PP, 2023 Aug 23.

Article in English | MEDLINE | ID: mdl-37610904

ABSTRACT

Predicting G protein-coupled receptor (GPCR)-ligand binding affinity plays a crucial role in drug development. However, determining GPCR-ligand binding affinities is time-consuming and resource-intensive. Although many studies have used data-driven methods to predict binding affinity, most of these methods require the protein 3D structure, which is often unknown. Moreover, some of these studies considered only the sequence characteristics of the protein, ignoring its secondary structure. The number of known GPCRs available for affinity prediction is only a few thousand, which is insufficient for deep learning training. Therefore, this study proposes a deep transfer learning method called TrGPCR, which uses dynamic transfer learning to address the problem of insufficient GPCR data. We used the Binding Database (BindingDB) as the source domain and the GLASS (GPCR-Ligand Association) database as the target domain. We also introduced protein secondary structures, called pockets, as features to predict binding affinities. Compared with DeepDTA, our model improved by 5.2% on RMSE (root mean square error) and 4.5% on MAE (mean absolute error).
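
The comparison against DeepDTA above is expressed through two regression error measures. A minimal sketch of both, using their standard definitions, is given below. The abstract does not say whether the reported percentages are relative or absolute reductions, so the `relative_improvement` helper is an assumption added for illustration, and all numbers in the example are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    residuals = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(residuals) / len(residuals))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    residuals = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(residuals) / len(residuals)


def relative_improvement(baseline_error, model_error):
    """Fractional error reduction relative to a baseline (assumed convention)."""
    return (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical affinities and predictions (illustration only).
    y_true = [6.1, 7.4, 5.0, 8.2]
    baseline = [5.5, 7.9, 5.8, 7.5]
    model = [5.8, 7.6, 5.4, 7.9]
    gain = relative_improvement(rmse(y_true, baseline), rmse(y_true, model))
    print(f"RMSE reduction vs. baseline: {gain:.1%}")
```

RMSE penalizes large residuals more heavily than MAE, so the two measures can rank models differently when a few predictions are far off.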

DNA protein binding recognition based on lifelong learning.

Liu, Yongsan; Guan, ShiXuan; Jiang, TengSheng; Fu, Qiming; Ma, Jieming; Cui, Zhiming; Ding, Yijie; Wu, Hongjie.

Comput Biol Med ; 164: 107094, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37459792

ABSTRACT

In recent years, research in the field of bioinformatics has focused on predicting the raw sequences of proteins, and some scholars consider DNA-binding protein prediction as a classification task. Many statistical and machine learning-based methods have been widely used in DNA-binding protein research. These methods are indeed more efficient than those based on manual classification, but there is still room for improvement in terms of prediction accuracy and speed. In this study, researchers used Average Blocks, Discrete Cosine Transform, Discrete Wavelet Transform, Global encoding, Normalized Moreau-Broto Autocorrelation and Pseudo position-specific scoring matrix to extract evolutionary features. A dynamic deep network based on a lifelong learning architecture was then proposed to fuse the six features and thus allow for more efficient classification of DNA-binding proteins. The multi-feature fusion allows for a more accurate description of the desired protein information than single features. This model offers a fresh perspective on the dichotomous classification problem in bioinformatics and broadens the application field of lifelong learning. The researchers ran trials on three datasets and compared the results with other classification techniques to show the model's effectiveness. The findings demonstrated that the model used in this research was superior to other approaches in terms of single-sample specificity (81.0%, 83.0%) and single-sample sensitivity (82.4%, 90.7%), and achieved high accuracy on the benchmark dataset (88.4%, 80.0%, and 76.6%).

Subjects

DNA-Binding Proteins, Machine Learning, Protein Binding, DNA-Binding Proteins/metabolism, Computational Biology/methods, DNA

DNA-binding protein prediction based on deep transfer learning.

Yan, Jun; Jiang, Tengsheng; Liu, Junkai; Lu, Yaoyao; Guan, Shixuan; Li, Haiou; Wu, Hongjie; Ding, Yijie.

Math Biosci Eng ; 19(8): 7719-7736, 2022 May 24.

Article in English | MEDLINE | ID: mdl-35801442

ABSTRACT

The study of DNA-binding proteins (DBPs) is of great importance in the biomedical field. At present, many researchers are working on the prediction and detection of DBPs. Traditional DBP prediction mainly uses machine learning methods. Although these methods can obtain relatively high prediction accuracy, they consume large quantities of human effort and material resources. Transfer learning has certain advantages in dealing with such prediction problems. Therefore, in the present study, two features were extracted from a protein sequence, a transfer learning method was used, and two classical transfer learning algorithms were compared to transfer samples and construct data sets. In the final step, DBPs are detected by building a deep neural network model that uses attention mechanisms.

Subjects

DNA-Binding Proteins, Computer Neural Networks, Algorithms, Humans, Machine Learning

G Protein-Coupled Receptor Interaction Prediction Based on Deep Transfer Learning.

Jiang, Tengsheng; Chen, Yuhui; Guan, Shixuan; Hu, Zhongtian; Lu, Weizhong; Fu, Qiming; Ding, Yijie; Li, Haiou; Wu, Hongjie.

IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3126-3134, 2022.

Article in English | MEDLINE | ID: mdl-34780331

ABSTRACT

G protein-coupled receptors (GPCRs) account for about 40% to 50% of drug targets. Many human diseases are related to G protein-coupled receptors. Accurate prediction of GPCR interaction is not only essential to understand its structural role, but also helps design more effective drugs. At present, the prediction of GPCR interaction mainly uses machine learning methods. Machine learning methods generally require a large number of independent and identically distributed samples to achieve good results. However, labeled GPCR samples are scarce. Transfer learning has a strong advantage in dealing with such small-sample problems. Therefore, this paper proposes a transfer learning method based on sample similarity, using XGBoost as a weak classifier and using the TrAdaBoost algorithm based on JS divergence for data weight initialization to transfer samples to construct a data set. After that, a deep neural network based on the attention mechanism is used for model training. The existing GPCR is used for prediction. In short-distance contact prediction, the accuracy of our method is 0.26 higher than that of similar methods.

Subjects

Algorithms, G Protein-Coupled Receptors, Humans, G Protein-Coupled Receptors/chemistry, Computer Neural Networks, Machine Learning
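
The abstract above mentions JS (Jensen-Shannon) divergence as the basis for initializing sample weights before transfer. The sketch below shows only the standard JS divergence between two discrete distributions, computed in base 2 so the value lies in [0, 1]. How the paper turns this value into TrAdaBoost weights is not described in the abstract, so the `similarity_weight` helper is a hypothetical illustration rather than the authors' scheme.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded in [0, 1] with log base 2."""
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def similarity_weight(p, q):
    """Hypothetical mapping: more similar distributions get a larger weight."""
    return 1.0 - js_divergence(p, q)


if __name__ == "__main__":
    # Hypothetical feature histograms for a source and a target sample set.
    source = [0.5, 0.3, 0.2]
    target = [0.4, 0.4, 0.2]
    print(f"JS divergence: {js_divergence(source, target):.4f}")
    print(f"Similarity weight: {similarity_weight(source, target):.4f}")
```

Unlike KL divergence, JS divergence is symmetric and always finite, which makes it convenient as a similarity score between source-domain and target-domain samples.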
