Results 1 - 11 of 11
1.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36631407

ABSTRACT

Recently, peptide-based drugs have gained unprecedented interest in antifungal drug discovery and development due to their high efficacy, broad-spectrum activity, low toxicity and few side effects. However, identifying antifungal peptides (AFPs) experimentally is time-consuming and expensive, so computational methods that accurately predict AFPs are in high demand. In this work, we develop AFP-MFL, a novel deep learning model that predicts AFPs from peptide sequences alone, without using any structural information. AFP-MFL first constructs comprehensive feature profiles of AFPs, including contextual semantic information derived from a pre-trained protein language model, evolutionary information, and physicochemical properties. Subsequently, a co-attention mechanism integrates the contextual semantic information with the evolutionary information and with the physicochemical properties separately. Extensive experiments show that AFP-MFL outperforms state-of-the-art models on four independent test datasets. Furthermore, the SHAP method is employed to explore the contribution of each feature to AFP prediction. Finally, a user-friendly web server for AFP-MFL is freely accessible at http://inner.wei-group.net/AFPMFL/ and can serve as a tool for the rapid screening and identification of novel AFPs.


Subjects
Antifungal Agents , alpha-Fetoproteins , Antifungal Agents/pharmacology , Algorithms , Peptides/chemistry , Computational Biology/methods
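
The co-attention step described in the AFP-MFL abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of plain scaled dot-product cross-attention, and the assumption that all three feature views have already been projected to a common dimension are illustrative choices that the abstract does not confirm.

```python
import numpy as np

def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Let one feature view attend over another (hypothetical helper).

    query_feats:   (n_query, d) array, e.g. per-residue language-model features
    context_feats: (n_context, d) array, e.g. per-residue evolutionary features
    Returns the attended context, shape (n_query, d), and the attention weights.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)  # (n_query, n_context)
    weights = softmax(scores, axis=-1)                    # each row sums to 1
    return weights @ context_feats, weights

def fuse_views(semantic, evolutionary, physicochemical):
    """Pair the semantic view with each other view separately, then concatenate."""
    sem_evo, _ = cross_attention(semantic, evolutionary)
    sem_phy, _ = cross_attention(semantic, physicochemical)
    return np.concatenate([sem_evo, sem_phy], axis=-1)    # (n_query, 2 * d)
```

For a peptide of 6 residues with 8-dimensional views, `fuse_views` returns a 6 x 16 array. A trained model would add learned projection matrices for queries, keys and values, which are omitted here.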
2.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36562719

ABSTRACT

BACKGROUND: Cell-penetrating peptides (CPPs) have received considerable attention as a means of transporting pharmacologically active molecules into living cells without damaging the cell membrane, and thus hold great promise as future therapeutics. Recently, several machine learning-based algorithms have been proposed for predicting CPPs. However, most existing predictive methods do not consider the agreement (disagreement) between similar (dissimilar) CPPs and depend heavily on handcrafted features based on expert knowledge. RESULTS: In this study, we present SiameseCPP, a novel deep learning framework for automated CPP prediction. SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network consisting of a transformer and gated recurrent units. Contrastive learning is used for the first time to build a CPP predictive model. Comprehensive experiments demonstrate that our proposed SiameseCPP is superior to existing baseline models for predicting CPPs. Moreover, SiameseCPP also achieves good performance on other functional peptide datasets, exhibiting satisfactory generalization ability.


Subjects
Cell-Penetrating Peptides , Cell-Penetrating Peptides/metabolism , Algorithms , Biological Transport , Computer Neural Networks , Machine Learning
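
The idea of pulling similar peptides together and pushing dissimilar ones apart, as mentioned in the SiameseCPP abstract, can be sketched with a classic margin-based pairwise contrastive loss. The abstract does not state which loss the authors use, so the formula, the `margin` parameter and the function name below are assumptions made for illustration.

```python
import numpy as np

def contrastive_loss(z1, z2, same_label, margin=1.0):
    """Margin-based pairwise contrastive loss (illustrative, not the paper's code).

    z1, z2:     (n_pairs, dim) embeddings produced by the two Siamese branches
    same_label: (n_pairs,) 1 if both peptides share a class, else 0
    """
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    y = np.asarray(same_label, dtype=float)
    dist = np.linalg.norm(z1 - z2, axis=1)                  # Euclidean distance per pair
    pull = y * dist ** 2                                    # same class: penalize distance
    push = (1.0 - y) * np.maximum(margin - dist, 0.0) ** 2  # different class: penalize closeness
    return float(np.mean(pull + push))
```

Pairs from different classes contribute nothing once their embeddings are at least `margin` apart, so the loss focuses on the pairs that are still confusable.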
3.
Bioinformatics ; 40(6)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38867692

ABSTRACT

MOTIVATION: Macrocyclic peptides hold great promise as therapeutics targeting intracellular proteins. This stems from their remarkable ability to bind flat protein surfaces with high affinity and specificity while potentially traversing the cell membrane. Research has already explored their use in developing inhibitors for intracellular proteins, such as KRAS, a well-known driver in various cancers. However, computational approaches for de novo macrocyclic peptide design remain largely unexplored. RESULTS: Here, we introduce HELM-GPT, a novel method that combines the strengths of the hierarchical editing language for macromolecules (HELM) representation and the generative pre-trained transformer (GPT) for de novo macrocyclic peptide design. Our experiments demonstrate that, through reinforcement learning (RL), HELM-GPT can generate valid macrocyclic peptides and optimize their properties. Furthermore, we introduce a contrastive preference loss during the RL process, which further enhances the optimization performance. Finally, to co-optimize peptide permeability and KRAS binding affinity, we propose a step-by-step optimization strategy and demonstrate its effectiveness in generating molecules that fulfil both criteria. In conclusion, the HELM-GPT method can be used to identify novel macrocyclic peptides to target intracellular proteins. AVAILABILITY AND IMPLEMENTATION: The code and data of HELM-GPT are freely available on GitHub (https://github.com/charlesxu90/helm-gpt).


Subjects
Cyclic Peptides , Cyclic Peptides/chemistry , Computational Biology/methods , Drug Design , Peptides/chemistry , Humans , Algorithms , Software
4.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33822870

ABSTRACT

MOTIVATION: Peptides have recently emerged as promising therapeutic agents against various diseases. For both research and safety regulation purposes, it is of high importance to develop computational methods to accurately predict the potential toxicity of peptides within the vast number of candidate peptides. RESULTS: In this study, we propose ATSE, a peptide toxicity predictor that exploits structural and evolutionary information based on graph neural networks and an attention mechanism. More specifically, it consists of four modules: (i) a sequence processing module for converting peptide sequences to molecular graphs and evolutionary profiles, (ii) a feature extraction module designed to learn discriminative features from graph structural information and evolutionary information, (iii) an attention module employed to optimize the features and (iv) an output module determining a peptide as toxic or non-toxic, using optimized features from the attention module. CONCLUSION: Comparative studies demonstrate that the proposed ATSE significantly outperforms all other competing methods. We found that structural information is complementary to the evolutionary information, effectively improving the predictive performance. Importantly, the data-driven features learned by ATSE can be interpreted and visualized, providing additional information for further analysis. Moreover, we present a user-friendly online computational platform that implements the proposed ATSE, which is now available at http://server.malab.cn/ATSE. We expect that it can be a powerful and useful tool for interested researchers.


Subjects
Computational Biology/methods , Machine Learning , Computer Neural Networks , Peptides/toxicity , Software , Protein Databases , Datasets as Topic , Molecular Evolution , Humans , Peptides/chemistry
5.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34117740

ABSTRACT

The prediction of peptide secondary structures is fundamentally important to reveal the functional mechanisms of peptides with potential applications as therapeutic molecules. In this study, we propose a multi-view deep learning method named Peptide Secondary Structure Prediction based on Multi-View Information, Restriction and Transfer learning (PSSP-MVIRT) for peptide secondary structure prediction. To sufficiently exploit discriminative information, we introduce a multi-view fusion strategy that integrates information from multiple perspectives, including sequential information, evolutionary information and hidden state information, and generates a unified feature space. Moreover, we construct a hybrid network architecture of Convolutional Neural Network and Bi-directional Gated Recurrent Unit to extract global and local features of peptides. Furthermore, we utilize transfer learning to effectively alleviate the lack of training samples (peptides with experimentally validated structures). Comparative results on independent tests demonstrate that our proposed method significantly outperforms state-of-the-art methods. In particular, our method exhibits better performance at the segment level, suggesting the strong ability of our model to capture local discriminative information. The case study also shows that our PSSP-MVIRT achieves promising and robust performance in the prediction of new peptide secondary structures. Importantly, we establish a webserver to implement the proposed method, which is currently accessible via http://server.malab.cn/PSSP-MVIRT. We expect it can be a useful tool for interested researchers, facilitating the wide use of our method.


Subjects
Algorithms , Computational Biology/methods , Deep Learning , Molecular Models , Peptides/chemistry , Protein Secondary Structure , Protein Databases , Reproducibility of Results , Web Browser
6.
Bioinformatics ; 38(6): 1514-1524, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34999757

ABSTRACT

MOTIVATION: Recently, peptides have emerged as a promising class of pharmaceuticals for the treatment of various diseases, positioned between traditional small-molecule drugs and therapeutic proteins. However, one of the key bottlenecks preventing their therapeutic use is their toxicity toward human cells, and few of the available toxicity prediction algorithms are specifically designed for short peptides. RESULTS: We present ToxIBTL, a novel deep learning framework that utilizes the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins. Specifically, we use evolutionary information and physicochemical properties of peptide sequences and integrate the information bottleneck principle into a feature representation learning scheme, by which relevant information is retained and the redundant information is minimized in the obtained features. Moreover, transfer learning is introduced to transfer the common knowledge contained in proteins to peptides, with the aim of improving the feature representation capability. Extensive experimental results demonstrate that ToxIBTL not only achieves higher prediction performance than state-of-the-art methods on the peptide dataset, but also performs competitively on the protein dataset. Furthermore, a user-friendly online web server is established as the implementation of the proposed ToxIBTL. AVAILABILITY AND IMPLEMENTATION: ToxIBTL and its data are freely accessible at http://server.wei-group.net/ToxIBTL. Our source code is available at https://github.com/WLYLab/ToxIBTL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Peptides , Humans , Proteins , Software , Algorithms
7.
Methods ; 204: 418-427, 2022 08.
Article in English | MEDLINE | ID: mdl-35114401

ABSTRACT

Elucidating the mechanisms of Compound-Protein Interactions (CPIs) plays an essential role in drug discovery and development. Many computational efforts have been made to accelerate the development of this field. However, the current predictive performance is still not satisfactory, and existing methods consider only protein and compound features, ignoring their interactive information. In this study, we propose a multi-view deep learning method named MDL-CPI for CPI prediction. To sufficiently extract discriminative information, we introduce a hybrid architecture that leverages BERT (Bidirectional Encoder Representations from Transformers) and a CNN (Convolutional Neural Network) to extract protein features from a sequential perspective, uses a GNN (Graph Neural Network) to extract compound features from a structural perspective, and generates a unified feature space by using an AE2 (Autoencoder in Autoencoder Networks) network to learn the interactive information between the BERT-CNN and graph embeddings. Comparative results on benchmark datasets show that our proposed method performs better than existing CPI prediction methods, demonstrating the strong predictive ability of our model. Importantly, we demonstrate that the learned interactive information between compounds and proteins is critical to improving predictive performance. We release our source code and dataset at: https://github.com/Longwt123/MDL-CPI.


Subjects
Deep Learning , Cyclopropanes , Indoles , Computer Neural Networks , Proteins/chemistry , Software
8.
Methods ; 207: 103-109, 2022 11.
Article in English | MEDLINE | ID: mdl-36155250

ABSTRACT

The task of predicting drug-target affinity (DTA) plays an increasingly important role in the early stage of in silico drug discovery and development. Currently, a variety of machine learning-based methods have been presented for DTA prediction and have achieved outstanding performance, which is beneficial for speeding up the development of new drugs. However, most convolutional neural network (CNN)-based methods ignore the significance of information from CNN layers with different scales for DTA prediction. In addition, each feature contributes differently to the final task. Therefore, in this study, we propose a novel end-to-end deep learning-based framework, MultiscaleDTA, to predict drug-target binding affinity. MultiscaleDTA incorporates multi-scale CNNs and a self-attention mechanism to capture multi-scale and comprehensive features for characterizing the intrinsic properties of drugs and targets. Extensive experimental results on both regression and binary classification tasks demonstrate that MultiscaleDTA achieves competitive performance compared to state-of-the-art methods.


Subjects
Machine Learning , Computer Neural Networks , Drug Development , Drug Discovery
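
The multi-scale convolution idea in the MultiscaleDTA abstract can be illustrated in plain NumPy: the same encoded sequence is scanned with kernels of several widths, and the pooled responses are concatenated. This is a simplified sketch rather than the published architecture; the kernel widths, the ReLU plus global max pooling, and the function names are assumptions, and the self-attention stage is left out.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """1D convolution (cross-correlation, no padding) over the sequence axis.

    x:      (length, channels) encoded drug or protein sequence
    kernel: (width, channels) filter; returns an array of length - width + 1 responses
    """
    width = kernel.shape[0]
    n_out = x.shape[0] - width + 1
    return np.array([np.sum(x[i:i + width] * kernel) for i in range(n_out)])

def multiscale_features(x, kernels):
    """Apply filters of different widths, then ReLU and global max pooling.

    Returns one pooled value per kernel, so short and long motifs
    both contribute to the final feature vector.
    """
    pooled = []
    for kernel in kernels:
        response = np.maximum(conv1d_valid(x, kernel), 0.0)  # ReLU
        pooled.append(response.max())                         # global max pool
    return np.array(pooled)
```

With a 5 x 2 input holding the values 0 to 9 and all-ones kernels of widths 2, 3 and 4, `multiscale_features` returns the largest window sum at each scale, namely 30, 39 and 44.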
9.
Comput Biol Med ; 150: 106145, 2022 11.
Article in English | MEDLINE | ID: mdl-37859276

ABSTRACT

Identifying drug-target affinity (DTA) has great practical importance in the process of designing efficacious drugs for known diseases. Recently, numerous deep learning-based computational methods have been developed to predict drug-target affinity and have achieved impressive performance. However, most of them construct the molecule (drug or target) encoder without considering the weights of the features of each node (atom or residue). In addition, they generally combine drug and target representations directly, so the result may contain task-irrelevant information. In this study, we develop GSAML-DTA, an interpretable deep learning framework for DTA prediction. GSAML-DTA integrates a self-attention mechanism and graph neural networks (GNNs) to build representations of drugs and target proteins from the structural information. In addition, mutual information is introduced to filter out redundant information and retain relevant information in the combined representations of drugs and targets. Extensive experimental results demonstrate that GSAML-DTA outperforms state-of-the-art methods for DTA prediction on two benchmark datasets. Furthermore, GSAML-DTA is interpretable enough to analyze binding atoms and residues, which may be conducive to chemical biology studies from data. Overall, GSAML-DTA can serve as a powerful and interpretable tool suitable for DTA modelling.


Subjects
Benchmarking , Drug Design , Drug Delivery Systems , Computer Neural Networks
10.
Curr Med Chem ; 29(5): 881-893, 2022.
Article in English | MEDLINE | ID: mdl-34544332

ABSTRACT

Owing to its superior performance, the Transformer model, based on the 'Encoder-Decoder' paradigm, has become the mainstream model in natural language processing. Bioinformatics has likewise embraced machine learning, which has led to remarkable progress in drug design and protein property prediction. Cell-penetrating peptides (CPPs) are a type of permeable protein that serves as a convenient 'postman' in drug penetration tasks. However, only a few CPPs have been discovered, limiting their practical applications in drug permeability. CPPs have led to a new approach that enables the uptake of only macromolecules into cells (i.e., without other potentially harmful materials found in the drug). Most previous studies have utilized trivial machine learning techniques and hand-crafted features to construct a simple classifier. CPPFormer was constructed by implementing the attention structure of the Transformer, rebuilding the network around the short length characteristic of CPPs, and using an automatic feature extractor together with a few manually engineered features to co-direct the predicted results. Compared with all previous methods and other classic text classification models, the empirical results show that our proposed deep model-based method achieves the best performance, with an accuracy of 92.16% on the CPP924 dataset, and passes various index tests.


Subjects
Cell-Penetrating Peptides , Biological Transport , Cell-Penetrating Peptides/chemistry , Computational Biology/methods , Drug Design , Machine Learning
11.
Front Genet ; 12: 663572, 2021.
Article in English | MEDLINE | ID: mdl-33868390

ABSTRACT

MOTIVATION: DNA N4-methylcytosine (4mC) and N6-methyladenine (6mA) are two important DNA modifications that play crucial roles in a variety of biological processes. Accurate identification of the modifications is essential to better understand their biological functions and mechanisms. However, existing methods to identify 4mC or 6mA sites are all single-task methods, meaning that each can identify only a certain modification in one species. Therefore, it is desirable to develop a novel computational method to identify the modification sites in multiple species simultaneously. RESULTS: In this study, we propose a computational method, called iDNA-MT, to identify 4mC sites and 6mA sites in multiple species. The proposed iDNA-MT mainly employs multi-task learning coupled with bidirectional gated recurrent units (BGRU) to capture the information shared among different species directly from DNA primary sequences. Comparative experimental results on two benchmark datasets, each containing different species, show that for identifying either 4mC or 6mA sites in multiple species, the proposed iDNA-MT outperforms other state-of-the-art single-task methods. These promising results demonstrate that iDNA-MT has great potential to be a powerful and practically useful tool for accurately identifying DNA modifications.
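
The multi-task setup described in the iDNA-MT abstract, one shared sequence encoder with a separate output per species, can be sketched as follows. This is a deliberately reduced illustration: a single dense `tanh` layer stands in for the BGRU encoder, and the names `one_hot_dna`, `multitask_predict`, `shared_w` and `heads` are hypothetical, not taken from the paper.

```python
import numpy as np

BASES = "ACGT"

def one_hot_dna(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix.

    Bases outside A/C/G/T (such as N) are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(seq), len(BASES)))
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded

def multitask_predict(seq, shared_w, heads):
    """Score one sequence for every species using a shared representation.

    shared_w: (hidden, 4 * len(seq)) weights shared by all species
              (a stand-in for the BGRU encoder, which is not reproduced here)
    heads:    dict mapping species name -> (hidden,) task-specific weights
    Returns a dict of species name -> probability of a modification site.
    """
    x = one_hot_dna(seq).ravel()
    hidden = np.tanh(shared_w @ x)  # representation shared across tasks
    return {
        species: float(1.0 / (1.0 + np.exp(-(w @ hidden))))  # sigmoid output
        for species, w in heads.items()
    }
```

Because every species head reads the same `hidden` vector, gradients from all species would update `shared_w` during training, which is how information is shared across tasks in this kind of design.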
