Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 11 de 11
Filtrar
1.
Bioinformatics ; 40(2)2024 Feb 01.
Artículo en Inglés | MEDLINE | ID: mdl-38305458

RESUMEN

MOTIVATION: Diabetes is a chronic metabolic disorder that has been a major cause of blindness, kidney failure, heart attacks, stroke, and lower limb amputation across the world. To alleviate the impact of diabetes, researchers have developed the next generation of anti-diabetic drugs, known as dipeptidyl peptidase IV inhibitory peptides (DPP-IV-IPs). However, the discovery of these promising drugs has been restricted due to the lack of effective peptide-mining tools. RESULTS: Here, we presented StructuralDPPIV, a deep learning model designed for DPP-IV-IP identification, which takes advantage of both molecular graph features in amino acid and sequence information. Experimental results on the independent test dataset and two wet experiment datasets show that our model outperforms the other state-of-art methods. Moreover, to better study what StructuralDPPIV learns, we used CAM technology and perturbation experiment to analyze our model, which yielded interpretable insights into the reasoning behind prediction results. AVAILABILITY AND IMPLEMENTATION: The project code is available at https://github.com/WeiLab-BioChem/Structural-DPP-IV.


Asunto(s)
Aprendizaje Profundo , Diabetes Mellitus , Humanos , Dipeptidil Peptidasa 4 , Aminoácidos , Péptidos
2.
J Chem Inf Model ; 64(7): 2807-2816, 2024 Apr 08.
Artículo en Inglés | MEDLINE | ID: mdl-37252890

RESUMEN

Anticancer peptides (ACPs) recently have been receiving increasing attention in cancer therapy due to their low consumption, few adverse side effects, and easy accessibility. However, it remains a great challenge to identify anticancer peptides via experimental approaches, requiring expensive and time-consuming experimental studies. In addition, traditional machine-learning-based methods are proposed for ACP prediction mainly depending on hand-crafted feature engineering, which normally achieves low prediction performance. In this study, we propose CACPP (Contrastive ACP Predictor), a deep learning framework based on the convolutional neural network (CNN) and contrastive learning for accurately predicting anticancer peptides. In particular, we introduce the TextCNN model to extract the high-latent features based on the peptide sequences only and exploit the contrastive learning module to learn more distinguishable feature representations to make better predictions. Comparative results on the benchmark data sets indicate that CACPP outperforms all the state-of-the-art methods in the prediction of anticancer peptides. Moreover, to intuitively show that our model has good classification ability, we visualize the dimension reduction of the features from our model and explore the relationship between ACP sequences and anticancer functions. Furthermore, we also discuss the influence of data set construction on model prediction and explore our model performance on the data sets with verified negative samples.


Asunto(s)
Benchmarking , Efectos Colaterales y Reacciones Adversas Relacionados con Medicamentos , Humanos , Aprendizaje Automático , Redes Neurales de la Computación , Péptidos/farmacología
3.
Brief Bioinform ; 24(6)2023 09 22.
Artículo en Inglés | MEDLINE | ID: mdl-37861173

RESUMEN

NcRNA-encoded small peptides (ncPEPs) have recently emerged as promising targets and biomarkers for cancer immunotherapy. Therefore, identifying cancer-associated ncPEPs is crucial for cancer research. In this work, we propose CoraL, a novel supervised contrastive meta-learning framework for predicting cancer-associated ncPEPs. Specifically, the proposed meta-learning strategy enables our model to learn meta-knowledge from different types of peptides and train a promising predictive model even with few labeled samples. The results show that our model is capable of making high-confidence predictions on unseen cancer biomarkers with only five samples, potentially accelerating the discovery of novel cancer biomarkers for immunotherapy. Moreover, our approach remarkably outperforms existing deep learning models on 15 cancer-associated ncPEPs datasets, demonstrating its effectiveness and robustness. Interestingly, our model exhibits outstanding performance when extended for the identification of short open reading frames derived from ncPEPs, demonstrating the strong prediction ability of CoraL at the transcriptome level. Importantly, our feature interpretation analysis discovers unique sequential patterns as the fingerprint for each cancer-associated ncPEPs, revealing the relationship among certain cancer biomarkers that are validated by relevant literature and motif comparison. Overall, we expect CoraL to be a useful tool to decipher the pathogenesis of cancer and provide valuable information for cancer research. The dataset and source code of our proposed method can be found at https://github.com/Johnsunnn/CoraL.


Asunto(s)
Antozoos , Neoplasias , Animales , Antozoos/genética , Neoplasias/genética , Biomarcadores de Tumor/genética , Inmunoterapia , Péptidos/genética , ARN no Traducido
4.
Comput Biol Med ; 164: 107260, 2023 09.
Artículo en Inglés | MEDLINE | ID: mdl-37557052

RESUMEN

The promoter region, positioned proximal to the transcription start sites, exerts control over the initiation of gene transcription by modulating the interaction with RNA polymerase. Consequently, the accurate recognition of promoter regions represents a critical focus within the bioinformatics domain. Although some methods leveraging pre-trained language models (PLMs) for promoter prediction have been proposed, the full potential of such PLMs remains largely untapped. In this study, we introduce PLPMpro, a model that capitalizes on prompt-learning and the pre-trained language model to enhance the prediction of promoter sequences. PLPMpro effectively harnesses the prompt learning paradigm to fully exploit the inherent capacities of the PLM, resulting in substantial improvements in prediction performance. Experiment results unequivocally demonstrate the efficacy of prompt learning in bolstering the capabilities of the pre-trained model. Consequently, PLPMpro surpasses both typical pre-trained model-based methods for promoter prediction and typical deep learning methods. Furthermore, we conduct various experiments to meticulously scrutinize the effects of different prompt learning settings and different numbers of soft modules on the model performance. More importantly, the interpretation experiment reveals that the pre-trained model captures biological semantics. Collectively, this research imparts a novel perspective on the optimal utilization of PLMs for addressing biological problems.


Asunto(s)
Biología Computacional , Semántica , Regiones Promotoras Genéticas/genética , Biología Computacional/métodos
5.
Int J Biol Macromol ; 246: 125412, 2023 Aug 15.
Artículo en Inglés | MEDLINE | ID: mdl-37327922

RESUMEN

Interleukin-6 (IL-6) is a potential therapeutic target for many diseases, and it is of great significance in accurately predicting IL-6-induced peptides for IL-6 research. However, the cost of traditional wet experiments to detect IL-6-induced peptides is huge, and the discovery and design of peptides by computer before the experimental stage have become a promising technology. In this study, we developed a deep learning model called MVIL6 for predicting IL-6-inducing peptides. Comparative results demonstrated the outstanding performance and robustness of MVIL6. Specifically, we employ a pre-trained protein language model MG-BERT and the Transformer model to process two different sequence-based descriptors and integrate them with a fusion module to improve the prediction performance. The ablation experiment demonstrated the effectiveness of our fusion strategy for the two models. In addition, to provide good interpretability of our model, we explored and visualized the amino acids considered important for IL-6-induced peptide prediction by our model. Finally, a case study presented using MVIL6 to predict IL-6-induced peptides in the SARS-CoV-2 spike protein shows that MVIL6 achieves higher performance than existing methods and can be useful for identifying potential IL-6-induced peptides in viral proteins.


Asunto(s)
COVID-19 , Interleucina-6 , Humanos , SARS-CoV-2 , Péptidos/farmacología
6.
Bioinformatics ; 39(3)2023 03 01.
Artículo en Inglés | MEDLINE | ID: mdl-36897030

RESUMEN

MOTIVATION: Plant Small Secreted Peptides (SSPs) play an important role in plant growth, development, and plant-microbe interactions. Therefore, the identification of SSPs is essential for revealing the functional mechanisms. Over the last few decades, machine learning-based methods have been developed, accelerating the discovery of SSPs to some extent. However, existing methods highly depend on handcrafted feature engineering, which easily ignores the latent feature representations and impacts the predictive performance. RESULTS: Here, we propose ExamPle, a novel deep learning model using Siamese network and multi-view representation for the explainable prediction of the plant SSPs. Benchmarking comparison results show that our ExamPle performs significantly better than existing methods in the prediction of plant SSPs. Also, our model shows excellent feature extraction ability. Importantly, by utilizing in silicomutagenesis experiment, ExamPle can discover sequential characteristics and identify the contribution of each amino acid for the predictions. The key novel principle learned by our model is that the head region of the peptide and some specific sequential patterns are strongly associated with the SSPs' functions. Thus, ExamPle is expected to be a useful tool for predicting plant SSPs and designing effective plant SSPs. AVAILABILITY AND IMPLEMENTATION: Our codes and datasets are available at https://github.com/Johnsunnn/ExamPle.


Asunto(s)
Aprendizaje Profundo , Péptidos , Aprendizaje Automático , Aminoácidos , Benchmarking
7.
Adv Sci (Weinh) ; 10(11): e2206151, 2023 04.
Artículo en Inglés | MEDLINE | ID: mdl-36794291

RESUMEN

Accurately predicting peptide secondary structures remains a challenging task due to the lack of discriminative information in short peptides. In this study, PHAT is proposed, a deep hypergraph learning framework for the prediction of peptide secondary structures and the exploration of downstream tasks. The framework includes a novel interpretable deep hypergraph multi-head attention network that uses residue-based reasoning for structure prediction. The algorithm can incorporate sequential semantic information from large-scale biological corpus and structural semantic information from multi-scale structural segmentation, leading to better accuracy and interpretability even with extremely short peptides. The interpretable models are able to highlight the reasoning of structural feature representations and the classification of secondary substructures. The importance of secondary structures in peptide tertiary structure reconstruction and downstream functional analysis is further demonstrated, highlighting the versatility of our models. To facilitate the use of the model, an online server is established which is accessible via http://inner.wei-group.net/PHAT/. The work is expected to assist in the design of functional peptides and contribute to the advancement of structural biology research.


Asunto(s)
Algoritmos , Péptidos , Estructura Secundaria de Proteína , Péptidos/química
8.
Brief Bioinform ; 24(1)2023 01 19.
Artículo en Inglés | MEDLINE | ID: mdl-36562719

RESUMEN

BACKGROUND: Cell-penetrating peptides (CPPs) have received considerable attention as a means of transporting pharmacologically active molecules into living cells without damaging the cell membrane, and thus hold great promise as future therapeutics. Recently, several machine learning-based algorithms have been proposed for predicting CPPs. However, most existing predictive methods do not consider the agreement (disagreement) between similar (dissimilar) CPPs and depend heavily on expert knowledge-based handcrafted features. RESULTS: In this study, we present SiameseCPP, a novel deep learning framework for automated CPPs prediction. SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network consisting of a transformer and gated recurrent units. Contrastive learning is used for the first time to build a CPP predictive model. Comprehensive experiments demonstrate that our proposed SiameseCPP is superior to existing baseline models for predicting CPPs. Moreover, SiameseCPP also achieves good performance on other functional peptide datasets, exhibiting satisfactory generalization ability.


Asunto(s)
Péptidos de Penetración Celular , Péptidos de Penetración Celular/metabolismo , Algoritmos , Transporte Biológico , Redes Neurales de la Computación , Aprendizaje Automático
9.
Genome Biol ; 23(1): 219, 2022 10 17.
Artículo en Inglés | MEDLINE | ID: mdl-36253864

RESUMEN

In this study, we propose iDNA-ABF, a multi-scale deep biological language learning model that enables the interpretable prediction of DNA methylations based on genomic sequences only. Benchmarking comparisons show that our iDNA-ABF outperforms state-of-the-art methods for different methylation predictions. Importantly, we show the power of deep language learning in capturing both sequential and functional semantics information from background genomes. Moreover, by integrating the interpretable analysis mechanism, we well explain what the model learns, helping us build the mapping from the discovery of important sequential determinants to the in-depth analysis of their biological functions.


Asunto(s)
Metilación de ADN , Lenguaje , Genómica , Modelos Biológicos
10.
Micromachines (Basel) ; 13(5)2022 Apr 25.
Artículo en Inglés | MEDLINE | ID: mdl-35630138

RESUMEN

In this paper, we propose an end-to-end automatic walking system for construction machinery, which uses binocular cameras to capture images of construction machinery for environmental perception, detects target information in binocular images, estimates the relative distance between the current target and cameras, and predicts the real-time control signal of construction machinery. This system consists of two parts: the binocular recognition ranging model and the control model. Objects within 5 m can be quickly detected by the recognition ranging model, and at the same time, the distance of the object can be accurately ranged to ensure the full perception of the surrounding environment of the construction machinery. The distance information of the object, the feature information of the binocular image, and the control signal of the previous stage are sent to the control model; then, the prediction of the control signal of the construction machinery can be output in the next stage. In this way, the automatic walking experiment of the construction machinery in a specific scenario is completed, which proves that the model can control the machinery to complete the walking task smoothly and safely.

11.
Brief Bioinform ; 23(1)2022 01 17.
Artículo en Inglés | MEDLINE | ID: mdl-34882225

RESUMEN

Recently, machine learning methods have been developed to identify various peptide bio-activities. However, due to the lack of experimentally validated peptides, machine learning methods cannot provide a sufficiently trained model, easily resulting in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using few samples from various functional peptides, MIMML can sufficiently learn the discriminative information amongst various functions and characterize functional differences. Experimental results show excellent performance of MIMML though using far fewer training samples as compared to the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the new functional peptide discovery. The source codes and datasets are available on https://github.com/TearsWaiting/MIMML.


Asunto(s)
Aprendizaje Automático , Péptidos , Péptidos/química , Programas Informáticos
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA