1.
Inflammation ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38954260

ABSTRACT

BACKGROUND: Non-alcoholic steatohepatitis (NASH) is a disorder related to metabolic dysregulation that is generally characterized by lipid metabolism dysfunction and an excessive inflammatory response. Currently, no pharmacological interventions are authorized specifically for managing NASH. Ginkgolide C has been reported to exhibit anti-inflammatory effects and to modulate lipid metabolism. However, the impact and function of Ginkgolide C in diet-induced NASH are unclear. METHODS: In this study, NASH was induced in mice with a Western diet (WD), and the mice received different doses of Ginkgolide C with or without Compound C, an adenosine 5'-monophosphate (AMP)-activated protein kinase (AMPK) inhibitor. The effects of Ginkgolide C were evaluated by assessing liver damage, steatosis, fibrosis, and AMPK expression. RESULTS: Ginkgolide C significantly alleviated liver damage, steatosis, and fibrosis in the WD-induced mice. In addition, Ginkgolide C markedly improved insulin resistance and attenuated hepatic inflammation. Importantly, Ginkgolide C exerted its protective effects by activating the AMPK signaling pathway, and these effects were reversed by AMPK inhibition. CONCLUSION: Ginkgolide C alleviated WD-induced NASH in mice, potentially by activating the AMPK signaling pathway.

2.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38305458

ABSTRACT

MOTIVATION: Diabetes is a chronic metabolic disorder and a major cause of blindness, kidney failure, heart attacks, stroke, and lower limb amputation across the world. To alleviate the impact of diabetes, researchers have developed the next generation of anti-diabetic drugs, known as dipeptidyl peptidase IV inhibitory peptides (DPP-IV-IPs). However, the discovery of these promising drugs has been restricted by the lack of effective peptide-mining tools. RESULTS: Here, we present StructuralDPPIV, a deep learning model designed for DPP-IV-IP identification, which takes advantage of both the molecular graph features of amino acids and sequence information. Experimental results on the independent test dataset and two wet-lab experimental datasets show that our model outperforms other state-of-the-art methods. Moreover, to better study what StructuralDPPIV learns, we used CAM technology and perturbation experiments to analyze our model, which yielded interpretable insights into the reasoning behind its predictions. AVAILABILITY AND IMPLEMENTATION: The project code is available at https://github.com/WeiLab-BioChem/Structural-DPP-IV.


Subjects
Deep Learning, Diabetes Mellitus, Humans, Dipeptidyl Peptidase 4, Amino Acids, Peptides
3.
J Chem Inf Model ; 64(7): 2854-2862, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37565997

ABSTRACT

Identifying synergistic drug combinations is fundamentally important for treating a variety of complex diseases while avoiding severe adverse drug-drug interactions. Although several computational methods have been proposed, they rely heavily on handcrafted feature engineering and cannot effectively learn interactive information between drug pairs, which often results in relatively low performance. Recently, deep-learning methods, especially graph neural networks, have been widely developed in this area and have demonstrated their ability to address complex biological problems. In this study, we proposed AttenSyn, an attention-based deep graph neural network for accurately predicting synergistic drug combinations. In particular, we adopted a graph neural network module to extract high-latent features based on the molecular graphs only and exploited an attention-based pooling module to learn interactive information between drug pairs, strengthening the representations of drug pairs. Comparative results on the benchmark datasets demonstrated that AttenSyn performs better than the state-of-the-art methods in the prediction of anticancer synergistic drug combinations. Additionally, to provide good interpretability of our model, we explored and visualized some crucial substructures in drugs through attention mechanisms. Furthermore, we verified the effectiveness of AttenSyn on two cell lines by visualizing the features of drug combinations learnt by our model, which exhibited satisfactory generalization ability.


Subjects
Benchmarking, Learning, Cell Line, Computer Neural Networks
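
The attention-based pooling mentioned in the AttenSyn abstract can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes per-atom feature vectors are already available, scores each one against a hypothetical query vector (in a real model this would be learned), and returns the softmax-weighted sum as a single drug-level vector. The names `attention_pool`, `node_features`, and `query`, and all numbers, are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(node_features, query):
    """Pool a list of node vectors into one vector.

    Each node is scored by its dot product with `query` (an assumed,
    normally learned, parameter). The scores are normalised with softmax
    and used as weights in a weighted sum of the node vectors.
    """
    scores = [sum(f * q for f, q in zip(node, query)) for node in node_features]
    weights = softmax(scores)
    dim = len(node_features[0])
    pooled = [
        sum(w * node[d] for w, node in zip(weights, node_features))
        for d in range(dim)
    ]
    return pooled, weights


# Toy molecule with three atoms and two features per atom (made-up numbers).
atoms = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(atoms, query=[2.0, 0.0])
print("attention weights:", weights)
print("pooled vector:", pooled)
```

The weights returned alongside the pooled vector are what makes this kind of pooling inspectable: atoms with larger weights contributed more to the drug-level representation.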
4.
J Chem Inf Model ; 64(7): 2807-2816, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37252890

ABSTRACT

Anticancer peptides (ACPs) have recently been receiving increasing attention in cancer therapy due to their low consumption, few adverse side effects, and easy accessibility. However, identifying anticancer peptides via experimental approaches remains a great challenge, as it requires expensive and time-consuming experimental studies. In addition, traditional machine-learning-based methods for ACP prediction depend mainly on hand-crafted feature engineering, which normally yields low prediction performance. In this study, we propose CACPP (Contrastive ACP Predictor), a deep learning framework based on the convolutional neural network (CNN) and contrastive learning for accurately predicting anticancer peptides. In particular, we introduce the TextCNN model to extract high-latent features based on the peptide sequences only and exploit a contrastive learning module to learn more distinguishable feature representations and make better predictions. Comparative results on the benchmark data sets indicate that CACPP outperforms all the state-of-the-art methods in the prediction of anticancer peptides. Moreover, to show intuitively that our model has good classification ability, we visualize the dimension reduction of the features from our model and explore the relationship between ACP sequences and anticancer functions. Furthermore, we discuss the influence of data set construction on model prediction and explore our model's performance on data sets with verified negative samples.


Subjects
Benchmarking, Drug-Related Side Effects and Adverse Reactions, Humans, Machine Learning, Computer Neural Networks, Peptides/pharmacology
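
The contrastive learning idea in the CACPP abstract (pulling same-class representations together and pushing different-class ones apart) can be sketched with a generic pairwise, margin-based contrastive loss. The abstract does not state which loss CACPP uses, so the formula, the `margin` parameter, and the toy embeddings below are assumptions chosen purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise margin-based contrastive loss (a generic textbook form).

    Same-class pairs are penalised by their squared distance; pairs from
    different classes are penalised only when closer than `margin`.
    """
    d = euclidean(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Made-up 2-D embeddings for two ACPs and one non-ACP.
acp_1, acp_2, non_acp = [0.0, 0.0], [0.0, 0.5], [3.0, 4.0]

loss_same = contrastive_loss(acp_1, acp_2, same_class=True)
loss_far_negative = contrastive_loss(acp_1, non_acp, same_class=False)
loss_near_negative = contrastive_loss(acp_1, acp_2, same_class=False)
print(loss_same, loss_far_negative, loss_near_negative)
```

A negative pair that is already farther apart than the margin contributes no loss, so training effort concentrates on pairs the model still confuses.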
5.
J Chem Inf Model ; 64(1): 316-326, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38135439

ABSTRACT

Antimicrobial peptides are peptides that are effective against bacteria and viruses, and the discovery of new antimicrobial peptides is of great importance to human life and health. Although the design of antimicrobial peptides using machine learning methods has achieved good results in recent years, it remains a challenge to learn and design novel antimicrobial peptides with multiple properties of interest from peptide data with certain property labels. To this end, we propose Multi-CGAN, a deep generative model-based architecture that can learn from single-attribute peptide data and generate antimicrobial peptide sequences with multiple attributes that we need, which may have a potentially wide range of uses in drug discovery. In particular, we verified that our Multi-CGAN generated peptides with the desired properties have good performance in terms of generation rate. Moreover, a comprehensive statistical analysis demonstrated that our generated peptides are diverse and have a low probability of being homologous to the training data. Interestingly, we found that the performance of many popular deep learning methods on the antimicrobial peptide prediction task can be improved by using Multi-CGAN to expand the data on the training set of the original task, indicating the high quality of our generated peptides and the robust ability of our method. In addition, we also investigated whether it is possible to directionally generate peptide sequences with specified properties by controlling the input noise sampling for our model.


Subjects
Antimicrobial Peptides, Peptides, Humans, Peptides/pharmacology, Peptides/chemistry, Machine Learning, Drug Discovery
6.
Comput Biol Med ; 167: 107631, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37948966

ABSTRACT

The accurate prediction of peptide contact maps remains a challenging task due to the difficulty in obtaining the interactive information between residues on short sequences. To address this challenge, we propose ConPep, a deep learning framework designed for predicting the contact map of peptides based on sequences only. To sufficiently incorporate the sequential semantic information between residues in peptide sequences, we use a pre-trained biological language model and transfer prior knowledge from large scale databases. Additionally, to extract and integrate sequential local information and residue-based global correlations, our model incorporates Bidirectional Gated Recurrent Unit and attention mechanisms. They can obtain multi-view features and thus enhance the accuracy and robustness of our prediction. Comparative results on independent tests demonstrate that our proposed method significantly outperforms state-of-the-art methods even with short peptides. Notably, our method exhibits superior performance at the sequence level, suggesting the robust ability of our model compared with the multiple sequence alignment (MSA) analysis-based methods. We expect it can be meaningful research for facilitating the wide use of our method.


Subjects
Algorithms, Proteins, Proteins/chemistry, Computational Biology/methods, Peptides, Language, Protein Databases
7.
Bioinformatics ; 39(12), 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38015872

ABSTRACT

MOTIVATION: Identifying the functional sites of a protein, such as the binding sites of proteins, peptides, or other biological components, is crucial for understanding related biological processes and drug design. However, existing sequence-based methods have limited predictive accuracy, as they only consider sequence-adjacent contextual features and lack structural information. RESULTS: In this study, DeepProSite is presented as a new framework for identifying protein binding sites that utilizes both protein structure and sequence information. DeepProSite first generates protein structures from ESMFold and sequence representations from pretrained language models. It then uses a Graph Transformer and formulates binding site prediction as graph node classification. In predicting protein-protein/peptide binding sites, DeepProSite outperforms state-of-the-art sequence- and structure-based methods on most metrics. Moreover, DeepProSite maintains its performance when predicting unbound structures, in contrast to competing structure-based prediction methods. DeepProSite is also extended to the prediction of binding sites for nucleic acids and other ligands, verifying its generalization capability. Finally, an online server for predicting multiple types of residue is established as the implementation of the proposed DeepProSite. AVAILABILITY AND IMPLEMENTATION: The datasets and source code can be accessed at https://github.com/WeiLab-Biology/DeepProSite. The proposed DeepProSite can be accessed at https://inner.wei-group.net/DeepProSite/.


Subjects
Peptides, Proteins, Protein Binding, Proteins/chemistry, Binding Sites, Software
8.
Brief Bioinform ; 24(6), 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37861173

ABSTRACT

NcRNA-encoded small peptides (ncPEPs) have recently emerged as promising targets and biomarkers for cancer immunotherapy. Therefore, identifying cancer-associated ncPEPs is crucial for cancer research. In this work, we propose CoraL, a novel supervised contrastive meta-learning framework for predicting cancer-associated ncPEPs. Specifically, the proposed meta-learning strategy enables our model to learn meta-knowledge from different types of peptides and train a promising predictive model even with few labeled samples. The results show that our model is capable of making high-confidence predictions on unseen cancer biomarkers with only five samples, potentially accelerating the discovery of novel cancer biomarkers for immunotherapy. Moreover, our approach remarkably outperforms existing deep learning models on 15 cancer-associated ncPEPs datasets, demonstrating its effectiveness and robustness. Interestingly, our model exhibits outstanding performance when extended for the identification of short open reading frames derived from ncPEPs, demonstrating the strong prediction ability of CoraL at the transcriptome level. Importantly, our feature interpretation analysis discovers unique sequential patterns as the fingerprint for each cancer-associated ncPEPs, revealing the relationship among certain cancer biomarkers that are validated by relevant literature and motif comparison. Overall, we expect CoraL to be a useful tool to decipher the pathogenesis of cancer and provide valuable information for cancer research. The dataset and source code of our proposed method can be found at https://github.com/Johnsunnn/CoraL.


Subjects
Anthozoa, Neoplasms, Animals, Anthozoa/genetics, Neoplasms/genetics, Tumor Biomarkers/genetics, Immunotherapy, Peptides/genetics, Untranslated RNA
9.
Int J Biol Macromol ; 246: 125412, 2023 Aug 15.
Article in English | MEDLINE | ID: mdl-37327922

ABSTRACT

Interleukin-6 (IL-6) is a potential therapeutic target for many diseases, and it is of great significance in accurately predicting IL-6-induced peptides for IL-6 research. However, the cost of traditional wet experiments to detect IL-6-induced peptides is huge, and the discovery and design of peptides by computer before the experimental stage have become a promising technology. In this study, we developed a deep learning model called MVIL6 for predicting IL-6-inducing peptides. Comparative results demonstrated the outstanding performance and robustness of MVIL6. Specifically, we employ a pre-trained protein language model MG-BERT and the Transformer model to process two different sequence-based descriptors and integrate them with a fusion module to improve the prediction performance. The ablation experiment demonstrated the effectiveness of our fusion strategy for the two models. In addition, to provide good interpretability of our model, we explored and visualized the amino acids considered important for IL-6-induced peptide prediction by our model. Finally, a case study presented using MVIL6 to predict IL-6-induced peptides in the SARS-CoV-2 spike protein shows that MVIL6 achieves higher performance than existing methods and can be useful for identifying potential IL-6-induced peptides in viral proteins.


Subjects
COVID-19, Interleukin-6, Humans, SARS-CoV-2, Peptides/pharmacology
10.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37225420

ABSTRACT

Enzymatic reactions are crucial to explore the mechanistic function of metabolites and proteins in cellular processes and to understand the etiology of diseases. The increasing number of interconnected metabolic reactions allows the development of in silico deep learning-based methods to discover new enzymatic reaction links between metabolites and proteins to further expand the landscape of existing metabolite-protein interactome. Computational approaches to predict the enzymatic reaction link by metabolite-protein interaction (MPI) prediction are still very limited. In this study, we developed a Variational Graph Autoencoders (VGAE)-based framework to predict MPI in genome-scale heterogeneous enzymatic reaction networks across ten organisms. By incorporating molecular features of metabolites and proteins as well as neighboring information in the MPI networks, our MPI-VGAE predictor achieved the best predictive performance compared to other machine learning methods. Moreover, when applying the MPI-VGAE framework to reconstruct hundreds of metabolic pathways, functional enzymatic reaction networks and a metabolite-metabolite interaction network, our method showed the most robust performance among all scenarios. To the best of our knowledge, this is the first MPI predictor by VGAE for enzymatic reaction link prediction. Furthermore, we implemented the MPI-VGAE framework to reconstruct the disease-specific MPI network based on the disrupted metabolites and proteins in Alzheimer's disease and colorectal cancer, respectively. A substantial number of novel enzymatic reaction links were identified. We further validated and explored the interactions of these enzymatic reactions using molecular docking. These results highlight the potential of the MPI-VGAE framework for the discovery of novel disease-related enzymatic reactions and facilitate the study of the disrupted metabolisms in diseases.


Subjects
Machine Learning, Metabolic Networks and Pathways, Molecular Docking Simulation, Cell Physiological Phenomena
11.
Bioinformatics ; 39(3), 2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36897030

ABSTRACT

MOTIVATION: Plant Small Secreted Peptides (SSPs) play an important role in plant growth, development, and plant-microbe interactions. Therefore, the identification of SSPs is essential for revealing the functional mechanisms. Over the last few decades, machine learning-based methods have been developed, accelerating the discovery of SSPs to some extent. However, existing methods depend heavily on handcrafted feature engineering, which easily ignores the latent feature representations and impairs predictive performance. RESULTS: Here, we propose ExamPle, a novel deep learning model using a Siamese network and multi-view representation for the explainable prediction of plant SSPs. Benchmarking comparison results show that ExamPle performs significantly better than existing methods in the prediction of plant SSPs. Also, our model shows excellent feature extraction ability. Importantly, by utilizing in silico mutagenesis experiments, ExamPle can discover sequential characteristics and identify the contribution of each amino acid to the predictions. The key novel principle learned by our model is that the head region of the peptide and some specific sequential patterns are strongly associated with the SSPs' functions. Thus, ExamPle is expected to be a useful tool for predicting plant SSPs and designing effective plant SSPs. AVAILABILITY AND IMPLEMENTATION: Our codes and datasets are available at https://github.com/Johnsunnn/ExamPle.


Subjects
Deep Learning, Peptides, Machine Learning, Amino Acids, Benchmarking
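
The in silico mutagenesis experiment named in the ExamPle abstract can be sketched generically: substitute every residue with every other amino acid, re-score the mutant, and treat the average score drop as that position's contribution. The code below is an illustration under stated assumptions, not ExamPle's code. In particular, `toy_score` is a hypothetical stand-in for a trained predictor, and its rule (hydrophobic residues count more near the start of the sequence) is invented for the demonstration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(sequence):
    """Stand-in for a trained predictor (hypothetical, for illustration only).

    Rewards hydrophobic residues, with larger rewards near the start.
    """
    hydrophobic = set("AILMFVW")
    return sum(1.0 / (i + 1) for i, aa in enumerate(sequence) if aa in hydrophobic)


def in_silico_mutagenesis(sequence, score_fn):
    """Return, per position, the mean score drop over all single substitutions."""
    baseline = score_fn(sequence)
    importance = []
    for pos, original in enumerate(sequence):
        drops = []
        for aa in AMINO_ACIDS:
            if aa == original:
                continue  # skip the wild-type residue
            mutant = sequence[:pos] + aa + sequence[pos + 1:]
            drops.append(baseline - score_fn(mutant))
        importance.append(sum(drops) / len(drops))
    return importance


peptide = "MKLSG"
importance = in_silico_mutagenesis(peptide, toy_score)
for residue, value in zip(peptide, importance):
    print(residue, round(value, 3))
```

A positive value means mutating that position tends to lower the score, so the wild-type residue supports the prediction. A negative value means typical substitutions would raise it.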
12.
Adv Sci (Weinh) ; 10(11): e2206151, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36794291

ABSTRACT

Accurately predicting peptide secondary structures remains a challenging task due to the lack of discriminative information in short peptides. In this study, PHAT is proposed, a deep hypergraph learning framework for the prediction of peptide secondary structures and the exploration of downstream tasks. The framework includes a novel interpretable deep hypergraph multi-head attention network that uses residue-based reasoning for structure prediction. The algorithm can incorporate sequential semantic information from large-scale biological corpus and structural semantic information from multi-scale structural segmentation, leading to better accuracy and interpretability even with extremely short peptides. The interpretable models are able to highlight the reasoning of structural feature representations and the classification of secondary substructures. The importance of secondary structures in peptide tertiary structure reconstruction and downstream functional analysis is further demonstrated, highlighting the versatility of our models. To facilitate the use of the model, an online server is established which is accessible via http://inner.wei-group.net/PHAT/. The work is expected to assist in the design of functional peptides and contribute to the advancement of structural biology research.


Subjects
Algorithms, Peptides, Protein Secondary Structure, Peptides/chemistry
13.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36631407

ABSTRACT

Recently, peptide-based drugs have gained unprecedented interest in discovering and developing antifungal drugs due to their high efficacy, broad-spectrum activity, low toxicity and few side effects. However, it is time-consuming and expensive to identify antifungal peptides (AFPs) experimentally. Therefore, computational methods for accurately predicting AFPs are highly required. In this work, we develop AFP-MFL, a novel deep learning model that predicts AFPs only relying on peptide sequences without using any structural information. AFP-MFL first constructs comprehensive feature profiles of AFPs, including contextual semantic information derived from a pre-trained protein language model, evolutionary information, and physicochemical properties. Subsequently, the co-attention mechanism is utilized to integrate contextual semantic information with evolutionary information and physicochemical properties separately. Extensive experiments show that AFP-MFL outperforms state-of-the-art models on four independent test datasets. Furthermore, the SHAP method is employed to explore each feature contribution to the AFPs prediction. Finally, a user-friendly web server of the proposed AFP-MFL is developed and freely accessible at http://inner.wei-group.net/AFPMFL/, which can be considered as a powerful tool for the rapid screening and identification of novel AFPs.


Subjects
Antifungal Agents, alpha-Fetoproteins, Antifungal Agents/pharmacology, Algorithms, Peptides/chemistry, Computational Biology/methods
14.
Poult Sci ; 102(3): 102479, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36669355

ABSTRACT

This study was conducted to investigate the protective effects of chlorogenic acid (CGA) on broilers subjected to diquat (DQ)-induced oxidative stress. In experiment 1, one hundred and ninety-two male one-day-old Ross 308 broiler chicks were distributed into 4 groups and fed a basal diet supplemented with 0, 250, 500, or 1,000 mg/kg CGA for 21 d. In experiment 2, an equivalent number of male one-day-old chicks were allocated to 4 treatments for a 21-d trial: 1) Control group, normal birds fed a basal diet; 2) DQ group, DQ-challenged birds fed a basal diet; and 3) and 4) CGA-treated groups, DQ-challenged birds fed a basal diet supplemented with 500 or 1,000 mg/kg CGA. The intraperitoneal DQ challenge was performed at 20 d. In experiment 1, CGA administration linearly increased 21-d body weight, and weight gain and feed intake during 1 to 21 d (P < 0.05). CGA linearly and/or quadratically increased total antioxidant capacity, catalase, superoxide dismutase, and glutathione peroxidase activities, elevated glutathione level, and reduced malondialdehyde accumulation in serum, liver, and/or jejunum (P < 0.05). In experiment 2, compared with the control group, DQ challenge reduced body weight ratio (P < 0.05), which was reversed by CGA administration (P < 0.05). DQ challenge increased serum total protein level, aspartate aminotransferase activity, and total bilirubin concentration (P < 0.05), which were normalized when supplementing 500 mg/kg and/or 1,000 mg/kg CGA (P < 0.05). DQ administration elevated hepatic interleukin-1β, tumor necrosis factor-α, and interleukin-6 levels (P < 0.05), and the values of interleukin-1β were normalized to control values when supplementing CGA (P < 0.05). DQ injection decreased serum superoxide dismutase activity, hepatic catalase activity, and serum and hepatic glutathione level, but increased malondialdehyde concentration in serum and liver (P < 0.05), and the values of these parameters (except hepatic catalase activity) were reversed by 500 and/or 1,000 mg/kg CGA. The results suggested that CGA could improve growth performance, alleviate oxidative stress, and ameliorate hepatic inflammation in DQ-challenged broilers.


Subjects
Antioxidants, Chickens, Chlorogenic Acid, Animals, Male, Animal Feed/analysis, Antioxidants/metabolism, Body Weight, Catalase/metabolism, Chickens/metabolism, Chlorogenic Acid/pharmacology, Diet/veterinary, Dietary Supplements, Diquat/toxicity, Glutathione/metabolism, Inflammation/chemically induced, Inflammation/veterinary, Interleukin-1beta, Malondialdehyde, Oxidative Stress, Superoxide Dismutase/metabolism
15.
Bioinformatics ; 38(13): 3351-3360, 2022 Jun 27.
Article in English | MEDLINE | ID: mdl-35604077

ABSTRACT

SUMMARY: Identifying the protein-peptide binding residues is fundamentally important to understand the mechanisms of protein functions and explore drug discovery. Although several computational methods have been developed, most of them highly rely on third-party tools or complex data preprocessing for feature design, easily resulting in low computational efficacy and suffering from low predictive performance. To address the limitations, we propose PepBCL, a novel BERT (Bidirectional Encoder Representation from Transformers) -based contrastive learning framework to predict the protein-peptide binding residues based on protein sequences only. PepBCL is an end-to-end predictive model that is independent of feature engineering. Specifically, we introduce a well pre-trained protein language model that can automatically extract and learn high-latent representations of protein sequences relevant for protein structures and functions. Further, we design a novel contrastive learning module to optimize the feature representations of binding residues underlying the imbalanced dataset. We demonstrate that our proposed method significantly outperforms the state-of-the-art methods under benchmarking comparison, and achieves more robust performance. Moreover, we found that we further improve the performance via the integration of traditional features and our learnt features. Interestingly, the interpretable analysis of our model highlights the flexibility and adaptability of deep learning-based protein language model to capture both conserved and non-conserved sequential characteristics of peptide-binding residues. Finally, to facilitate the use of our method, we establish an online predictive platform as the implementation of the proposed PepBCL, which is now available at http://server.wei-group.net/PepBCL/. AVAILABILITY AND IMPLEMENTATION: https://github.com/Ruheng-W/PepBCL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Deep Learning, Proteins/chemistry, Peptides, Protein Binding, Amino Acid Sequence
16.
Bioinformatics ; 38(9): 2602-2611, 2022 Apr 28.
Article in English | MEDLINE | ID: mdl-35212728

ABSTRACT

MOTIVATION: The development of microscopic imaging techniques enables us to study protein subcellular locations from the tissue level down to the cell level, contributing to the rapid development of image-based protein subcellular location prediction approaches. However, existing methods suffer from intrinsic limitations, such as poor feature representation ability, data imbalance, and the multi-label classification problem, which greatly impact model performance and generalization. RESULTS: In this study, we propose MSTLoc, a novel multi-scale end-to-end deep learning model to identify protein subcellular locations in an imbalanced multi-label immunohistochemistry (IHC) image dataset. In MSTLoc, we deploy a deep convolution neural network to extract multi-scale features from the IHC images, aggregate the high-level features and low-level features via feature fusion to sufficiently exploit the dependencies amongst various subcellular locations, and utilize Vision Transformer (ViT) to model the relationship amongst the features and enhance the feature representation ability. We demonstrate that the proposed MSTLoc achieves better performance than current state-of-the-art models in multi-label subcellular location prediction. Through feature visualization and interpretation analysis, we demonstrate that, compared with hand-crafted features, the multi-scale deep features learnt by our model exhibit better ability in capturing discriminative patterns underlying protein subcellular locations, and that the features from different scales are complementary for the improvement in performance. Finally, case study results indicate that MSTLoc can successfully identify some biomarkers from proteins that are closely involved with cancer development. AVAILABILITY AND IMPLEMENTATION: For the convenient use of our method, we establish a user-friendly webserver available at http://server.wei-group.net/MSTLoc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Deep Learning, Immunohistochemistry, Protein Transport, Proteins/metabolism, Computer Neural Networks
17.
Bioinformatics ; 38(6): 1514-1524, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-34999757

ABSTRACT

MOTIVATION: Recently, peptides have emerged as a promising class of pharmaceuticals for the treatment of various diseases, poised between traditional small molecule drugs and therapeutic proteins. However, one of the key bottlenecks preventing them from becoming therapeutic peptides is their toxicity toward human cells, and few of the available toxicity-prediction algorithms are specially designed for short peptides. RESULTS: We present ToxIBTL, a novel deep learning framework that utilizes the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins. Specifically, we use evolutionary information and physicochemical properties of peptide sequences and integrate the information bottleneck principle into a feature representation learning scheme, by which relevant information is retained and redundant information is minimized in the obtained features. Moreover, transfer learning is introduced to transfer the common knowledge contained in proteins to peptides, which aims to improve the feature representation capability. Extensive experimental results demonstrate that ToxIBTL not only achieves higher prediction performance than state-of-the-art methods on the peptide dataset, but also has competitive performance on the protein dataset. Furthermore, a user-friendly online web server is established as the implementation of the proposed ToxIBTL. AVAILABILITY AND IMPLEMENTATION: The proposed ToxIBTL and data are freely accessible at http://server.wei-group.net/ToxIBTL. Our source code is available at https://github.com/WLYLab/ToxIBTL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning, Peptides, Humans, Proteins, Software, Algorithms
18.
Brief Bioinform ; 23(2), 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-35043144

ABSTRACT

Predicting the response of cancer patients to a particular treatment is a major goal of modern oncology and an important step toward personalized treatment. In clinical practice, clinicians prefer to obtain the most suitable drugs for a particular patient rather than the exact values of drug sensitivity. Instead of predicting the exact value of drug response, we proposed a deep learning-based method, named Siamese Response Deep Factorization Machines (SRDFM) Network, for personalized anti-cancer drug recommendation, which directly ranks the drugs and provides the most effective ones. A Siamese network (SN), a type of deep learning network composed of identical subnetworks that share the same architecture, parameters and weights, was used to measure the relative position (RP) between drugs for each cell line. By minimizing the difference between the real RP and the predicted RP, an optimal SN model was established to rank all the candidate drugs. Specifically, the subnetwork on each side of the SN consists of a feature generation level and a predictor construction level. At the feature generation level, both drug properties and gene expression were used to build a concatenated feature vector, which even enables recommendation for newly designed drugs for which only the chemical properties are known. In particular, we developed a response unit that generates a weighted genetic feature vector to simulate the biological interaction mechanism between a specific drug and the genes. The predictor construction level integrates a factorization machine (FM) component with a deep neural network component. The FM handles the discrete chemical information well, and both low-order and high-order feature interactions can be sufficiently learned. The SRDFM works well on both single-drug recommendation and synergistic drug combination. Experimental results on both single-drug and synergistic drug data sets have shown the efficiency of the SRDFM. The Python implementation of the proposed SRDFM is available at https://github.com/RanSuLab/SRDFM. Contact: ran.su@tju.edu.cn, gbx@mju.edu.cn and weileyi@sdu.edu.cn.
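
The pairwise ranking idea in this abstract can be illustrated with a minimal sketch. It is not the SRDFM implementation: the linear `score` function is a hypothetical stand-in for one Siamese branch, and the sigmoid-plus-cross-entropy form of the RP loss is an assumption borrowed from standard pairwise ranking, not something the abstract specifies. All function names are illustrative.

```python
import math

def score(features, weights):
    # Hypothetical stand-in for one Siamese branch: a linear scorer.
    # Both drugs in a pair are scored with the SAME weights.
    return sum(f * w for f, w in zip(features, weights))

def predicted_rp(feat_a, feat_b, weights):
    # Assumed form: predicted relative position is the sigmoid of the
    # score difference, read as "probability that drug A ranks above B".
    diff = score(feat_a, weights) - score(feat_b, weights)
    return 1.0 / (1.0 + math.exp(-diff))

def rp_loss(pairs, weights):
    # Mean cross-entropy between real RP (1.0 if A ranks above B, else 0.0)
    # and predicted RP over all (feat_a, feat_b, real_rp) pairs.
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for feat_a, feat_b, real_rp in pairs:
        p = predicted_rp(feat_a, feat_b, weights)
        total -= real_rp * math.log(p + eps) + (1.0 - real_rp) * math.log(1.0 - p + eps)
    return total / len(pairs)

def rank_drugs(drug_features, weights):
    # Return drug indices ordered from highest to lowest score.
    return sorted(range(len(drug_features)),
                  key=lambda i: score(drug_features[i], weights),
                  reverse=True)
```

Because both members of a pair pass through the same scorer, a trained scorer can rank any number of candidate drugs for a cell line by scoring each once, which is what `rank_drugs` does.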


Subjects
Antineoplastic Agents , Neoplasms , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Neural Networks (Computer)
19.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34882198

ABSTRACT

Metastasis is a major cause of cancer morbidity and mortality, and most cancer deaths are caused by metastasis rather than by the primary tumor. The prediction of metastasis with computational methods has not been explored much in previous research. In this study, we proposed a graph convolutional network (GCN) embedded with a graph learning (GL) module, named glmGCN, to predict the distant metastasis of cancer. Both mRNA and lncRNA expression were used to provide more genetic information than mRNA alone, and we used them to construct a gene interaction graph representation that accounts for the effect of genetic interaction. The prediction of cancer metastasis was then performed under a GCN framework, which extracted informative and advanced features from the constructed non-regular graph structures. In particular, a GL module was embedded in the proposed glmGCN to learn an optimal graph representation of the gene interaction. We first constructed the protein-protein interaction network to represent the initial gene (node) relationship graph. Then, through the GL module, a new graph representation was built that optimally learned the gene interaction strength. Finally, the GCN was adopted to identify the distant metastasis cases. It is worth mentioning that the proposed method pays more attention to the gene-gene relation than the previous GCN-based method, so more accurate prediction performance can be obtained. The glmGCN was trained on two types of cancer and was further validated using two other cancer types. A series of experiments have shown the effectiveness of the proposed method. The implementation of the proposed method is available at https://github.com/RanSuLab/Metastasis-glmGCN.
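
As a rough illustration of the graph-convolution step mentioned above, the sketch below applies one standard GCN layer (symmetric normalization with self-loops, followed by ReLU) to a weighted gene graph. This is the widely used GCN propagation rule, assumed here for illustration; the abstract does not state which rule glmGCN uses, and the learned edge strengths from the GL module are represented simply by the entries of a hypothetical `adj` matrix.

```python
import math

def normalize_adjacency(adj):
    # Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j).
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, features, weights):
    # One layer: ReLU(normalized_adj @ features @ weights).
    # adj:      n x n non-negative edge strengths (hypothetical gene graph)
    # features: n x d node features (e.g. expression values per gene)
    # weights:  d x h layer weights
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]
```

With two connected genes carrying features 1.0 and 3.0 and an identity weight, each node ends up with the average of the pair, which shows how a node's representation mixes in its neighbors' values.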


Subjects
Neoplasms , Long Noncoding RNA , Humans , Machine Learning , Neoplasms/genetics , Neural Networks (Computer) , Long Noncoding RNA/genetics
20.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34882225

ABSTRACT

Recently, machine learning methods have been developed to identify various peptide bioactivities. However, because experimentally validated peptides are scarce, machine learning methods cannot provide a sufficiently trained model, which easily results in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using few samples from various functional peptides, MIMML can sufficiently learn the discriminative information among various functions and characterize functional differences. Experimental results show excellent performance of MIMML despite its use of far fewer training samples than the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what the meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the discovery of new functional peptides. The source code and datasets are available at https://github.com/TearsWaiting/MIMML.


Subjects
Machine Learning , Peptides , Peptides/chemistry , Software