Results 1 - 8 of 8
1.
J Bioinform Comput Biol ; 22(3): 2450010, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39030668

ABSTRACT

Drugs often target specific metabolic pathways to produce a therapeutic effect. However, these pathways are complex and interconnected, making it challenging to predict a drug's potential effects on an organism's overall metabolism. Mapping drugs to the metabolic pathways they target in an organism can provide a more complete understanding of a drug's metabolic effects and help to identify potential drug-drug interactions. In this study, we propose a hybrid machine learning model, Graph Transformer Integrated Encoder (GTIE-RT), for mapping drugs to target metabolic pathways in humans. The proposed model combines a Graph Convolution Network (GCN) for graph embedding with a transformer encoder for the attention mechanism. The output of the transformer encoder is then fed into an Extremely Randomized Trees classifier to predict target metabolic pathways. Evaluation of GTIE-RT on a drug dataset demonstrates excellent performance metrics, including accuracy (>95%), recall (>92%), precision (>93%) and F1-score (>92%). Compared to other variants and machine learning methods, GTIE-RT consistently shows more reliable results.


Subject(s)
Computational Biology , Machine Learning , Metabolic Networks and Pathways , Humans , Computational Biology/methods , Pharmaceutical Preparations/metabolism , Algorithms , Models, Biological , Neural Networks, Computer , Drug Interactions
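
The graph-embedding step named in the abstract above can be illustrated with a single graph-convolution layer. This is a minimal sketch only: it assumes the commonly used symmetric normalization with self-loops, which the abstract does not confirm as the authors' choice, and the function name `gcn_layer` and the toy inputs are hypothetical.

```python
import math

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 * H * W).

    Assumption: the standard symmetric normalization; GTIE-RT's exact
    layer definition is not given in the abstract.
    """
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees in the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    f_in = len(features[0])
    f_out = len(weights[0])
    # Aggregate neighbor features (norm @ features).
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(f_in)] for i in range(n)]
    # Linear transform followed by ReLU.
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(f_in)))
             for o in range(f_out)] for i in range(n)]

# Toy molecule graph: two bonded atoms with one scalar feature each.
embedding = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
print(embedding)
```

With the toy graph, both atoms end up with the average of the two input features, which shows how a layer mixes information between bonded atoms before later stages (here, the transformer encoder and tree classifier) consume the embeddings.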
2.
Comput Biol Chem ; 111: 108106, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38833912

ABSTRACT

The bioretrosynthesis problem is to predict synthetic routes from substrates to given natural products (NPs). However, the huge number of metabolic reactions leads to a combinatorial explosion of the search space, which makes the search highly time-consuming and costly. Here, we propose a framework called BioRetro to predict bioretrosynthesis pathways using a one-step bioretrosynthesis network, termed HybridMLP, combined with an AND-OR tree heuristic search. HybridMLP predicts precursors that will produce the target NPs, while the AND-OR tree generates the iterative multi-step biosynthetic pathways. The one-step bioretrosynthesis prediction experiments are conducted on the MetaNetX dataset using HybridMLP, which achieves top-1, top-5 and top-10 accuracies of 46.5%, 74.6% and 81.6%, respectively. This performance demonstrates the effectiveness of HybridMLP in one-step bioretrosynthesis. In addition, evaluation on two benchmark datasets reveals that BioRetro can significantly improve the speed and success rate of predicting biosynthesis pathways. BioRetro is further shown to find synthetic pathways for compounds such as ginsenoside F1 that use the same substrates as reported but different enzymes, which may be novel candidate enzymes with better catalytic performance.


Subject(s)
Biological Products , Biological Products/metabolism , Biological Products/chemistry , Biosynthetic Pathways , Computational Biology
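
The top-1, top-5 and top-10 accuracies quoted in the abstract above follow the usual top-k definition: a prediction counts as correct if the true precursor appears among the k highest-ranked candidates. The sketch below illustrates that definition only; the function name `top_k_accuracy` and the toy data are hypothetical and are not taken from BioRetro.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of targets found within the first k ranked candidates."""
    if not targets or len(ranked_predictions) != len(targets):
        raise ValueError("need one ranked candidate list per non-empty target list")
    # Count targets that appear among the k best-ranked candidates.
    hits = sum(1 for candidates, target in zip(ranked_predictions, targets)
               if target in candidates[:k])
    return hits / len(targets)

# Toy example: three queries, each with candidates ranked best-first.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
truth = ["a", "a", "a"]
for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, truth, k))
```

Because the candidate window grows with k, top-k accuracy can only stay the same or increase as k increases, which matches the rising figures reported for k = 1, 5 and 10.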
3.
J Bioinform Comput Biol ; 21(4): 2350017, 2023 08.
Article in English | MEDLINE | ID: mdl-37632195

ABSTRACT

Metabolic pathways play a crucial role in understanding the biochemistry of organisms. In metabolic pathways, modules refer to clusters of interconnected reactions or sub-networks representing specific functional units or biological processes within the overall pathway. In pathway modules, compounds are the major elements: the various molecules that participate in the biochemical reactions within the modules. These molecules can include substrates, intermediates and final products. Determining the presence relation between compounds and pathway modules is essential for synthesizing new molecules and predicting hidden reactions. To date, several computational methods have been proposed to address this problem. However, these methods predict only the metabolic pathways and their types, not the pathway modules. To address this issue, we propose a novel deep learning model, DeepRT, that integrates message passing neural networks (MPNNs) and a transformer encoder. This combination allows DeepRT to effectively extract global and local structure information from the molecular graph. The model is designed to perform two tasks: first, determining the presence relation between a compound and a pathway module, and second, predicting the relation between a query compound and module classes. The proposed DeepRT model was evaluated on a dataset comprising compounds and pathway modules, and it outperforms existing approaches.


Subject(s)
Deep Learning , Metabolic Networks and Pathways , Neural Networks, Computer
4.
Front Comput Sci ; 17(5): 175903, 2023.
Article in English | MEDLINE | ID: mdl-36532946

ABSTRACT

Prediction of drug-protein binding is critical for virtual drug screening. Many deep learning methods have been proposed to predict drug-protein binding based on protein sequences and drug representation sequences. However, most existing methods extract features from protein and drug sequences separately. As a result, they cannot learn the features characterizing the drug-protein interactions. In addition, existing methods usually encode the protein (drug) sequence on the assumption that each amino acid (atom) contributes equally to the binding, ignoring the different impacts of different amino acids (atoms). In practice, drug-protein binding usually occurs between conserved residue fragments in the protein sequence and atom fragments of the drug molecule. Therefore, a more comprehensive encoding strategy is required to extract information from the conserved fragments. In this paper, we propose a novel model, named FragDPI, to predict drug-protein binding affinity. Unlike other methods, we encode the sequences based on the conserved fragments and encode the protein and drug into a unified vector. Moreover, we adopt a novel two-step training strategy to train FragDPI. The pre-training step learns the interactions between different fragments using unsupervised learning. The fine-tuning step predicts the binding affinities using supervised learning. The experimental results illustrate the superiority of FragDPI. Electronic Supplementary Material: Supplementary material is available for this article at 10.1007/s11704-022-2163-9 and is accessible for authorized users.

5.
BMC Bioinformatics ; 23(Suppl 5): 329, 2022 Sep 28.
Article in English | MEDLINE | ID: mdl-36171550

ABSTRACT

BACKGROUND: Clarifying which kinds of metabolic pathways a drug compound is involved in can help researchers understand how the drug is absorbed, distributed, metabolized, and excreted. The characteristics of a compound, such as its structure and composition, directly determine the metabolic pathways it participates in. METHODS: We developed a novel hybrid framework based on the graph attention network (GAT), named HFGAT, to predict the metabolic pathway classes that a compound is involved in by making use of its global and local characteristics. The framework mainly consists of a two-branch feature extracting layer and a fully connected (FC) layer. In the two-branch feature extracting layer, one branch extracts global features of the compound, and the other branch introduces a GAT consisting of two graph attention layers to extract local structural features of the compound. Both the global and the local features of the compound are then integrated in the FC layer, which outputs the predicted metabolic pathway categories that the compound belongs to. RESULTS: We compared the multi-class classification performance of HFGAT with six other representative methods, including five classic machine learning methods and one graph convolutional network (GCN) based deep learning method, on a benchmark dataset containing 6999 compounds belonging to 11 pathway categories. The results showed that the deep learning-based methods (HFGAT and the GCN-based method) outperformed the traditional machine learning methods in the prediction of metabolic pathways, and that our proposed HFGAT method performed better than the GCN-based method. Moreover, HFGAT achieved higher [Formula: see text] scores on 8 of 11 classes than the GCN-based method. CONCLUSIONS: Our proposed HFGAT makes use of both the global and local information of compounds to predict their metabolic pathway categories and has achieved significant performance. Compared with the GCN model, the introduction of the GAT can help our model pay more attention to substructures of the compound that are useful for the prediction task. The study provides a potential method for drug discovery that covers all types of metabolic reactions that may be involved in the decomposition and synthesis of pharmaceutical compounds in the organism.


Subject(s)
Deep Learning , Drug Discovery , Machine Learning , Metabolic Networks and Pathways , Pharmaceutical Preparations
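
The abstract above says the GAT branch lets the model "pay more attention" to useful substructures. A minimal sketch of how a graph attention layer weights an atom's neighbors is given below. It assumes the widely used formulation (a LeakyReLU-scored pair of feature vectors followed by a softmax over neighbors) and already-projected features; the names `attention_weights`, `a_src` and `a_dst`, and all numeric values, are hypothetical and not taken from HFGAT.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def leaky_relu(x, slope=0.2):
    """LeakyReLU with a small negative slope."""
    return x if x > 0 else slope * x

def attention_weights(h_i, neighbors, a_src, a_dst):
    """Softmax-normalized attention of node i over its neighbors.

    Score for neighbor j: LeakyReLU(a_src . h_i + a_dst . h_j), where the
    feature vectors are assumed to be already linearly projected.
    """
    scores = [leaky_relu(dot(a_src, h_i) + dot(a_dst, h_j)) for h_j in neighbors]
    # Subtract the maximum score for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy atom with two neighbors; the first neighbor has larger features.
weights = attention_weights([1.0, 0.0], [[2.0, 0.0], [0.0, 0.0]],
                            a_src=[0.5, 0.5], a_dst=[0.5, 0.5])
print(weights)
```

The weights sum to one, and the neighbor with the higher score receives the larger share, which is the mechanism by which an attention layer can emphasize some substructures over others.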
6.
Comput Biol Med ; 147: 105756, 2022 08.
Article in English | MEDLINE | ID: mdl-35759992

ABSTRACT

The rapid growth of metabolomics has led to an increasing focus on metabolic pathway modeling and reconstruction. In particular, reconstructing an organism's metabolic network based on its genome sequence is a key challenge in systems biology. The method used to address this problem predicts the presence or absence of metabolic pathways from known pathways in a reference database. However, this method is based on manual metabolic pathway construction and cannot be used for large genome sequencing data. To address such problems, we apply a supervised machine learning approach in which deep neural networks learn feature representations of metabolic pathways and these representations are fed into random forests to predict metabolic pathways. The supervised learning model, DeepRF, predicts all known and unknown metabolic pathways in an organism. Evaluation of DeepRF on over 318,016 instances shows that the model can predict metabolic pathways with high performance metrics: accuracy (>97%), recall (>95%), and precision (>99%). A comparison with other methods in the literature shows that DeepRF produces more reliable results.


Subject(s)
Deep Learning , Databases, Factual , Genome , Metabolic Networks and Pathways/genetics , Neural Networks, Computer
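
The accuracy, recall and precision figures in the abstract above are standard classification metrics. As a minimal sketch, assuming pathway presence is encoded as 1 and absence as 0 (an encoding chosen here for illustration, not stated in the abstract), they can be computed from true and predicted labels as follows; the function name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall for 0/1 labels (1 = pathway present)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Toy labels for eight candidate pathways.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(binary_metrics(truth, predicted))
```

Precision penalizes pathways predicted but not present, while recall penalizes present pathways that were missed, so reporting both alongside accuracy gives a fuller picture than accuracy alone.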
7.
Front Mol Biosci ; 8: 634141, 2021.
Article in English | MEDLINE | ID: mdl-34222327

ABSTRACT

Prediction and reconstruction of metabolic pathways play significant roles in many fields, such as genetic engineering, metabolic engineering and drug discovery, and are becoming some of the most active research topics in synthetic biology. With the increase of related data and the development of machine learning techniques, many machine learning based methods have been proposed for the prediction or reconstruction of metabolic pathways. Machine learning techniques are showing state-of-the-art performance in handling the rapidly increasing volume of data in synthetic biology. To support researchers in this field, we briefly review the research progress of metabolic pathway reconstruction and prediction based on machine learning. Some challenging issues in the reconstruction of metabolic pathways are also discussed in this paper.

8.
Evol Bioinform Online ; 15: 1176934318821080, 2019.
Article in English | MEDLINE | ID: mdl-30733625

ABSTRACT

Simulated alignments are alternatives to manually constructed multiple sequence alignments for evaluating the performance of multiple sequence alignment tools. The importance of simulated sequences is recognized because their true evolutionary history is known, which is very helpful for reconstructing accurate phylogenetic trees and alignments. However, generating simulated alignments requires expertise in bioinformatics tools and takes several hours even for a few hundred simulated sequences. It becomes a tedious job for an end user who needs a few datasets covering a variety of simulated sequences. Currently, there is no databank from which researchers can download simulated sequences/alignments for their studies. The major focus of our study was to develop a database of simulated protein sequences (SAliBASE) based on varying parameters such as insertion rate, deletion rate, sequence length, number of sequences, and indel size. Each dataset has a corresponding alignment as well. This repository is very useful for evaluating multiple alignment methods.
