Results 1 - 20 of 56
1.
Comput Biol Med ; 178: 108699, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38870725

ABSTRACT

Accurate prediction of drug-target binding affinity (DTA) plays a pivotal role in drug discovery and repositioning. Although deep learning methods are widely used in DTA prediction, two significant challenges persist: (i) how to effectively represent the complex structural information of proteins and drugs; (ii) how to precisely model the mutual interactions between protein binding sites and key drug substructures. To address these challenges, we propose MSFFDTA (Multi-scale feature fusion for predicting drug target affinity), a model in which multi-scale encoders are designed to capture multi-level structural information of drugs and proteins. A Selective Cross Attention (SCA) mechanism is then developed to filter out the trivial interactions between drug-protein substructure pairs and retain the important ones, which helps the proposed model focus on these key interactions and offers insights into their underlying mechanism. Experimental results on two benchmark datasets demonstrate that MSFFDTA is superior to several state-of-the-art methods across almost all comparison metrics. Finally, we provide ablation and case studies with visualizations to verify the effectiveness and the interpretability of MSFFDTA. The source code is freely available at https://github.com/whitehat32/MSFF-DTA/.
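
The abstract above does not say how the SCA mechanism decides which interactions are trivial. As a rough illustration only, the sketch below assumes a top-k rule: for each drug substructure, the `keep` highest-scoring protein substructures are retained and the rest are masked before a softmax. The function name `selective_cross_attention`, the `keep` parameter, and the top-k rule are all hypothetical and are not taken from MSFFDTA.

```python
import math

def selective_cross_attention(scores, keep):
    """Hypothetical top-k filter over cross-attention scores.

    scores[i][j] is a raw interaction score between drug substructure i
    and protein substructure j. For each row, only the `keep` (>= 1)
    largest scores survive; the softmax is taken over the survivors.
    """
    weights = []
    for row in scores:
        # Indices of the `keep` largest raw scores in this row.
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:keep]
        kept = set(top)
        # Numerically stable softmax restricted to the kept positions.
        m = max(row[j] for j in kept)
        exp = [math.exp(row[j] - m) if j in kept else 0.0 for j in range(len(row))]
        total = sum(exp)
        weights.append([e / total for e in exp])
    return weights

# Toy scores: 2 drug substructures x 3 protein substructures.
attention = selective_cross_attention([[2.0, 0.0, 2.0], [0.1, 5.0, 0.2]], keep=2)
```

Masked positions receive a weight of exactly zero, so the rows of `attention` can be read directly as the retained drug-protein substructure pairs.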

2.
Comput Struct Biotechnol J ; 23: 1978-1989, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38765608

ABSTRACT

With both the advancement of technology and the decline in costs, single-cell transcriptomics sequencing has become widespread in the biomedical area in recent years. It can reveal pathogenic characteristics at the single-cell level, which will assist clinical researchers in exploring the mechanisms of diseases. As a result, single-cell transcriptome data based on clinical samples have grown exponentially. However, there is still a lack of a comprehensive database about immunocytes in inflammatory-associated diseases. To address this deficiency, we propose a human inflammatory-associated disease-based single-cell transcriptome database, NTCdb (www.ntcdb.org.cn). NTCdb integrates the open-source data of 1,023,166 cells derived from 11 tissues of 17 inflammatory-associated diseases in a uniform pipeline. It provides a set of analysis results, including cell communication analysis, enrichment analysis, and pseudo-time analysis, to obtain various characteristics of immune cells in inflammatory-associated diseases. Taking COVID-19 as a case study, NTCdb displays important information including potentially significant functions of certain cells, genes, and signaling pathways, as well as the commonalities of specific immunocytes between different inflammatory-associated diseases.

3.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701419

ABSTRACT

It is a vital step to recognize cyanobacteria promoters on a genome-wide scale. Computational methods are promising to assist in difficult biological identification. When building recognition models, these methods rely on non-promoter generation to cope with the lack of real non-promoters. Nevertheless, the factitious significant difference between promoters and non-promoters causes over-optimistic prediction. Moreover, designed for E. coli or B. subtilis, existing methods cannot uncover novel, distinct motifs among cyanobacterial promoters. To address these issues, this work first proposes a novel non-promoter generation strategy called phantom sampling, which can eliminate the factitious difference between promoters and generated non-promoters. Furthermore, it elaborates a novel promoter prediction model based on the Siamese network (SiamProm), which can amplify the hidden difference between promoters and non-promoters through a joint characterization of global associations, upstream and downstream contexts, and neighboring associations w.r.t. k-mer tokens. The comparison with state-of-the-art methods demonstrates the superiority of our phantom sampling and SiamProm. Both comprehensive ablation studies and feature space illustrations also validate the effectiveness of the Siamese network and its components. More importantly, SiamProm, upon our phantom sampling, finds a novel cyanobacterial promoter motif ('GCGATCGC'), which is palindrome-patterned, content-conserved, but position-shifted.


Subject(s)
Cyanobacteria , Promoter Regions, Genetic , Cyanobacteria/genetics , Computational Biology/methods , Algorithms
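
The SiamProm abstract mentions k-mer tokens and a palindrome-patterned motif, 'GCGATCGC'. The sketch below shows one plausible reading of both ideas: overlapping k-mer tokenization with stride 1, and a palindrome check in the DNA sense (a sequence equal to its own reverse complement). The stride, the value k = 3, and the reverse-complement interpretation are assumptions; the helper names are hypothetical and not from the SiamProm code.

```python
def kmer_tokens(seq, k):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))

def is_palindromic(seq):
    """True when the sequence reads the same on both strands."""
    return seq == reverse_complement(seq)

motif = "GCGATCGC"  # motif reported in the abstract
print(kmer_tokens(motif, 3))   # overlapping 3-mers of the motif
print(is_palindromic(motif))   # reverse-complement palindrome check
```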
4.
Methods ; 222: 51-56, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38184219

ABSTRACT

The interaction between human microbes and drugs can significantly impact human physiological functions. It is crucial to identify potential microbe-drug associations (MDAs) before drug administration. However, conventional biological experiments to identify MDAs are plagued by drawbacks such as long turnaround times, high costs, and potential risks. In contrast, computational approaches can speed up the screening of MDAs at a low cost. Most computational models use a drug similarity matrix as the initial feature representation of drugs and stack graph neural network layers to extract the features of network nodes. However, different calculation methods result in distinct similarity matrices, and message passing in graph neural networks (GNNs) induces over-smoothing and over-squashing, thereby degrading the performance of the model. To address these issues, we propose a novel graph representation learning model, dual-modal graph learning for microbe-drug association prediction (DMGL-MDA). It comprises a dual-modal embedding module, a bipartite graph network embedding module, and a predictor module. To assess the performance of DMGL-MDA, we compared it against state-of-the-art methods using two benchmark datasets. Through cross-validation, we demonstrated the superiority of DMGL-MDA. Furthermore, we conducted ablation experiments and case studies to validate the effectiveness of the model.


Subject(s)
Benchmarking , Neural Networks, Computer , Humans , Research Design
5.
J Chem Inf Model ; 64(1): 96-109, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38132638

ABSTRACT

Detecting drug-drug interactions (DDIs) is an essential step in drug development and drug administration. Given the shortcomings of current experimental methods, the machine learning (ML) approach has become a reliable alternative, attracting extensive attention from the academic and industrial fields. With the rapid development of computational science and the growing popularity of cross-disciplinary research, a large number of DDI prediction studies based on ML methods have been published in recent years. To give an insight into the current situation and future direction of DDI prediction research, we systematically review these studies from three aspects: (1) the classic DDI databases, mainly including databases of drugs, side effects, and DDI information; (2) commonly used drug attributes, which focus on chemical, biological, and phenotypic attributes for representing drugs; (3) popular ML approaches, such as shallow learning-based, deep learning-based, recommender system-based, and knowledge graph-based methods for DDI detection. For each section, related studies are described, summarized, and compared. In the end, we summarize the research status of DDI prediction based on ML methods and point out the existing issues, future challenges, potential opportunities, and subsequent research directions.


Subject(s)
Knowledge Bases , Machine Learning , Drug Interactions , Pharmaceutical Preparations , Databases, Factual
6.
EClinicalMedicine ; 65: 102270, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38106558

ABSTRACT

Background: Prognosis is crucial for personalized treatment and surveillance suggestions for resected non-small-cell lung cancer (NSCLC) patients in stage I-III. Although the tumor-node-metastasis (TNM) staging system is a powerful predictor, it is not perfect enough to accurately distinguish all the patients, especially within the same TNM stage. In this study, we developed an intelligent prognosis evaluation system (IPES) using pre-therapy CT images to assist the traditional TNM staging system for more accurate prognosis prediction of resected NSCLC patients. Methods: 20,333 CT images of 6371 patients from June 12, 2009 to March 24, 2022 in West China Hospital of Sichuan University, Mianzhu People's Hospital, Peking University People's Hospital, Chengdu Shangjin Nanfu Hospital and Guangan Peoples' Hospital were included in this retrospective study. We developed the IPES based on self-supervised pre-training and multi-task learning, which aimed to predict an overall survival (OS) risk for each patient. We further evaluated the prognostic accuracy of the IPES and its ability to stratify NSCLC patients with the same TNM stage and with the same EGFR genotype. Findings: The IPES was able to predict OS risk for stage I-III resected NSCLC patients in the training set (C-index 0.806; 95% CI: 0.744-0.846), internal validation set (0.783; 95% CI: 0.744-0.825) and external validation set (0.817; 95% CI: 0.786-0.849). In addition, IPES performed well in early-stage (stage I) and EGFR genotype prediction. Furthermore, by adopting IPES-based survival score (IPES-score), resected NSCLC patients in the same stage or with the same EGFR genotype could be divided into low- and high-risk subgroups with good and poor prognosis, respectively (p < 0.05 for all). Interpretation: The IPES provided a non-invasive way to obtain prognosis-related information from patients. The ability of IPES to identify resected NSCLC patients with low and high prognostic risk within the same TNM stage or with the same EGFR genotype suggests that IPES has the potential to offer more personalized treatment and surveillance suggestions for NSCLC patients. Funding: This study was funded by the National Natural Science Foundation of China (grant 62272055, 92259303, 92059203), New Cornerstone Science Foundation through the XPLORER PRIZE, Young Elite Scientists Sponsorship Program by CAST (2021QNRC001), Clinical Medicine Plus X - Young Scholars Project, Peking University, the Fundamental Research Funds for the Central Universities (K.C.), Research Unit of Intelligence Diagnosis and Treatment in Early Non-small Cell Lung Cancer, Chinese Academy of Medical Sciences (2021RU002), BUPT Excellent Ph.D. Students Foundation (CX2022104).
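
The findings above are reported as C-index values. For readers unfamiliar with the metric, here is a minimal sketch of one common estimator, Harrell's concordance index, for right-censored survival data. The abstract does not state which estimator or software the study used, so this is a generic illustration; the function name and argument layout are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored survival data.

    times[i]  : observed follow-up time of patient i
    events[i] : 1 if the event (death) was observed, 0 if censored
    risks[i]  : predicted risk score (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by survival time.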

7.
iScience ; 26(11): 108285, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-38026198

ABSTRACT

It is a critical step in lead optimization to evaluate the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of drug-like compounds. Classical single-task learning (STL) has effectively predicted individual ADMET endpoints with abundant labels. Conversely, multi-task learning (MTL) can predict multiple ADMET endpoints with fewer labels, but ensuring task synergy and highlighting key molecular substructures remain challenges. To tackle these issues, this work elaborates a multi-task graph learning framework for predicting multiple ADMET properties of drug-like small molecules (MTGL-ADMET) by adopting a new paradigm of MTL, "one primary, multiple auxiliaries." It first combines status theory with maximum flow for auxiliary task selection. The subsequent phase introduces a primary-task-centric MTL model with integrated modules. MTGL-ADMET not only outperforms existing STL and MTL methods but also offers a transparent lens into crucial molecular substructures. It is anticipated that this work can promote lead compound finding and optimization in drug discovery.

8.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37572298

ABSTRACT

MOTIVATION: Metabolic stability plays a crucial role in the early stages of drug discovery and development. Accurately modeling and predicting molecular metabolic stability has great potential for the efficient screening of drug candidates as well as the optimization of lead compounds. Because wet-lab experiments are time-consuming, laborious, and expensive, in silico prediction of metabolic stability is an alternative choice. However, few computational methods have been developed to address this task. In addition, it remains a significant challenge to explain key functional groups determining metabolic stability. RESULTS: To address these issues, we develop a novel cross-modality graph contrastive learning model named CMMS-GCL for predicting the metabolic stability of drug candidates. In our framework, we design deep learning methods to extract features for molecules from two data modalities, i.e. SMILES sequence and molecule graph. In particular, for the sequence data, we design a multihead attention BiGRU-based encoder to preserve the context of symbols to learn sequence representations of molecules. For the graph data, we propose a graph contrastive learning-based encoder to learn structure representations by effectively capturing the consistencies between local and global structures. We further exploit fully connected neural networks to combine the sequence and structure representations for model training. Extensive experimental results on two datasets demonstrate that our CMMS-GCL consistently outperforms seven state-of-the-art methods. Furthermore, a collection of case studies on sequence data and statistical analyses of the graph structure module strengthens the validation of the interpretability of crucial functional groups recognized by CMMS-GCL. Overall, CMMS-GCL can serve as an effective and interpretable tool for predicting metabolic stability, identifying critical functional groups, and thus facilitating the drug discovery process and lead compound optimization. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are freely available at https://github.com/dubingxue/CMMS-GCL.


Subject(s)
Drug Discovery , Neural Networks, Computer , Research Design
9.
Biosensors (Basel) ; 13(7)2023 Jul 06.
Article in English | MEDLINE | ID: mdl-37504111

ABSTRACT

Spatial profiling technologies fill the gap left by the loss of spatial information in traditional single-cell sequencing, showing great application prospects. After just a few years of quick development, spatial profiling technologies have made great progress in resolution and simplicity. This review introduces the development of spatial omics sequencing based on microfluidic array chips and describes barcoding strategies using various microfluidic designs with simplicity and efficiency. At the same time, the pros and cons of each strategy are compared. Moreover, commercialized solutions for spatial profiling are also introduced. In the end, the future perspective of spatial omics sequencing and research directions are discussed.


Subject(s)
Microfluidics
10.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37401373

ABSTRACT

Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug due to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDI prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply advanced AI and deep learning, developers and users face various challenges such as the availability and encoding of data resources, and the design of computational methods. This review summarizes chemical structure based, network based, natural language processing based and hybrid methods, providing an updated and accessible guide for the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDI prediction.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Humans , Drug Interactions , Natural Language Processing , Drug Discovery
11.
Bioinformatics ; 39(39 Suppl 1): i326-i336, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387157

ABSTRACT

MOTIVATION: Deep learning-based molecule generation has become a new paradigm of de novo molecule design since it enables fast and directional exploration in the vast chemical space. However, it is still an open issue to generate molecules that bind to specific proteins with high binding affinities while having desired drug-like physicochemical properties. RESULTS: To address these issues, we elaborate a novel framework for controllable protein-oriented molecule generation, named CProMG, which contains a 3D protein embedding module, a dual-view protein encoder, a molecule embedding module, and a novel drug-like molecule decoder. Based on fusing the hierarchical views of proteins, it enhances the representation of protein binding pockets significantly by associating amino acid residues with their comprising atoms. Through jointly embedding molecule sequences, their drug-like properties, and binding affinities w.r.t. proteins, it autoregressively generates novel molecules having specific properties in a controllable manner by measuring the proximity of molecule tokens to protein residues and atoms. The comparison with state-of-the-art deep generative methods demonstrates the superiority of our CProMG. Furthermore, the progressive control of properties demonstrates the effectiveness of CProMG when controlling binding affinity and drug-like properties. After that, the ablation studies reveal how its crucial components contribute to the model respectively, including hierarchical protein views, Laplacian position encoding as well as property control. Last, a case study w.r.t. protein illustrates the novelty of CProMG and the ability to capture crucial interactions between protein pockets and molecules. It is anticipated that this work can boost de novo molecule design. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are freely available at https://github.com/lijianing0902/CProMG.


Subject(s)
Amino Acids , Deep Learning , Protein Engineering
12.
Phytomedicine ; 117: 154929, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37329754

ABSTRACT

BACKGROUND: Triptolide (TP) is a highly active natural medicinal ingredient with significant anticancer potential. The strong cytotoxicity of this compound suggests that it may have a wide range of targets within cells. However, further target screening is required at this stage. Traditional drug target screening methods can be significantly optimized using artificial intelligence (AI). PURPOSE: This study aimed to identify the direct protein targets and explain the multitarget action mechanism of the anti-tumor effect of TP with the help of AI. METHODS: The CCK8, scratch test, and flow cytometry analysis were used to examine cell proliferation, migration, cell cycle, and apoptosis in tumor cells treated with TP in vitro. The anti-tumor effect of TP in vivo was evaluated by constructing a tumor model in nude mice. Furthermore, we established a simplified thermal proteome analysis (TPP) method based on XGBoost (X-TPP) to rapidly screen the direct targets of TP. RESULTS: We validated the effects of TP on protein targets through RNA immunoprecipitation and on pathways by qPCR and Western blotting. TP significantly inhibited tumor cell proliferation and migration and promoted apoptosis in vitro. Continuous administration of TP to tumor-bearing mice significantly suppressed tumor tissue size. We verified that TP can affect the thermal stability of HnRNP A2/B1 and exert anti-tumor effects by inhibiting the HnRNP A2/B1-PI3K-AKT pathway. Adding siRNA to silence HnRNP A2/B1 also significantly down-regulated expression of AKT and PI3K. CONCLUSION: The X-TPP method was used to show that TP regulates tumor cell activity through its potential interaction with HnRNP A2/B1.


Subject(s)
Lung Neoplasms , Proteome , Animals , Mice , Proto-Oncogene Proteins c-akt/metabolism , Mice, Nude , Phosphatidylinositol 3-Kinases/metabolism , Artificial Intelligence , Lung Neoplasms/pathology
13.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37195815

ABSTRACT

Drug-drug interactions (DDIs) may lead to adverse reactions in the human body, and accurate prediction of DDIs can mitigate the medical risk. Currently, most computer-aided DDI prediction methods construct models based on drug-associated features or the DDI network, ignoring the potential information contained in drug-related biological entities such as targets and genes. Besides, existing DDI network-based models cannot make effective predictions for drugs without any known DDI records. To address the above limitations, we propose an attention-based cross domain graph neural network (ACDGNN) for DDI prediction, which considers different drug-related entities and propagates information through cross domain operation. Different from the existing methods, ACDGNN not only considers rich information contained in drug-related biomedical entities in a biological heterogeneous network, but also adopts cross-domain transformation to eliminate heterogeneity between different types of entities. ACDGNN can be used in the prediction of DDIs in both transductive and inductive settings. By conducting experiments on a real-world dataset, we compare the performance of ACDGNN with several state-of-the-art methods. The experimental results show that ACDGNN can effectively predict DDIs and outperforms the comparison models.


Subject(s)
Neural Networks, Computer , Humans , Drug Interactions
14.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9709-9725, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37027608

ABSTRACT

Predicting drug synergy is critical to tailoring feasible drug combination treatment regimens for cancer patients. However, most of the existing computational methods only focus on data-rich cell lines, and hardly work on data-poor cell lines. To this end, we propose a novel few-shot drug synergy prediction method (called HyperSynergy) for data-poor cell lines by designing a prior-guided Hypernetwork architecture, in which the meta-generative network based on the task embedding of each cell line generates cell line dependent parameters for the drug synergy prediction network. In the HyperSynergy model, we designed a deep Bayesian variational inference model to infer the prior distribution over the task embedding to quickly update the task embedding with a few labeled drug synergy samples, and presented a three-stage learning strategy to train HyperSynergy for quickly updating the prior distribution by a few labeled drug synergy samples of each data-poor cell line. Moreover, we proved theoretically that HyperSynergy aims to maximize the lower bound of log-likelihood of the marginal distribution over each data-poor cell line. The experimental results show that our HyperSynergy outperforms other state-of-the-art methods not only on data-poor cell lines with a few samples (e.g., 10, 5, 0), but also on data-rich cell lines.


Subject(s)
Computational Biology , Neoplasms , Humans , Computational Biology/methods , Algorithms , Bayes Theorem , Neoplasms/drug therapy
15.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642408

ABSTRACT

Current machine learning-based methods have achieved inspiring predictions in the scenarios of mono-type and multi-type drug-drug interactions (DDIs), but they all ignore enhancive and depressive pharmacological changes triggered by DDIs. In addition, these pharmacological changes are asymmetric since the roles of two drugs in an interaction are different. More importantly, these pharmacological changes imply significant topological patterns among DDIs. To address the above issues, we first leverage Balance theory and Status theory in social networks to reveal the topological patterns among directed pharmacological DDIs, which are modeled as a signed and directed network. Then, we design a novel graph representation learning model named SGRL-DDI (social theory-enhanced graph representation learning for DDI) to realize the multitask prediction of DDIs. The SGRL-DDI model can capture the task-joint information by integrating relation graph convolutional networks with Balance and Status patterns. Moreover, we utilize task-specific deep neural networks to perform two tasks, including the prediction of enhancive/depressive DDIs and the prediction of directed DDIs. Based on DDI entries collected from DrugBank, the superiority of our model is demonstrated by the comparison with other state-of-the-art methods. Furthermore, the ablation study verifies that Balance and Status patterns help characterize directed pharmacological DDIs, and that combining the two tasks provides better DDI representations than individual tasks. Last, we demonstrate the practical effectiveness of our model by a version-dependent test, where 88.47% and 81.38% of the newly added DDI entries provided by the latest release of DrugBank are validated in the two prediction tasks, respectively.


Subject(s)
Machine Learning , Neural Networks, Computer , Drug Interactions
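
The SGRL-DDI abstract relies on Balance theory from social network analysis. In its classic form, a triangle in a signed network is balanced when the product of its three edge signs is positive. The sketch below checks that condition; mapping enhancive interactions to +1 and depressive interactions to -1 is an assumption made here for illustration, and the check ignores edge direction, which the abstract handles separately through Status theory.

```python
def is_balanced_triangle(sign_ab, sign_bc, sign_ca):
    """Structural balance: a signed triangle is balanced when the
    product of its edge signs (+1 or -1) is positive."""
    return sign_ab * sign_bc * sign_ca > 0

# Assumed encoding: +1 = enhancive interaction, -1 = depressive interaction.
print(is_balanced_triangle(+1, +1, +1))  # all positive edges
print(is_balanced_triangle(+1, -1, -1))  # one positive, two negative edges
print(is_balanced_triangle(+1, +1, -1))  # two positive, one negative edge
```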
16.
IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 1854-1863, 2023.
Article in English | MEDLINE | ID: mdl-36423315

ABSTRACT

Co-administration of multiple drugs may cause adverse drug interactions and side effects that damage the body. Therefore, accurate prediction of drug-drug interaction (DDI) events is of great importance. Recently, many computational methods have been proposed for predicting DDI-associated events. However, most existing methods consider only drug-associated attribute information or topological information in the DDI network, ignoring the complementary knowledge between them. Therefore, to effectively explore the complementarity of drug attribute and topological information of the DDI network, we propose a deep learning model based on an adversarial learning strategy, named DGANDDI. In DGANDDI, we design a two-GAN architecture to deeply capture the complementary knowledge between drug attribute and topological information of the DDI network, so that more comprehensive drug representations can be learned. We conduct extensive experiments on a real-world dataset. The experimental results show that DGANDDI can effectively predict DDI occurrence and outperforms the state-of-the-art comparison models. We also perform ablation studies that demonstrate that DGANDDI is effective and that it is robust in DDI prediction tasks, even in the case of a scarcity of labeled DDIs.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Interactions
17.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35667078

ABSTRACT

Computational prediction of multiple-type drug-drug interaction (DDI) helps reduce unexpected side effects in poly-drug treatments. Although existing computational approaches achieve inspiring results, they do not study which local structures of drugs cause DDIs, and their interpretability is still weak. In this paper, by supposing that the interactions between two given drugs are caused by their local chemical structures (substructures) and their DDI types are determined by the linkages between different substructure sets, we design a novel Substructure-aware Tensor Neural Network model for DDI prediction (STNN-DDI). The proposed model learns a 3-D tensor of ⟨substructure, substructure, interaction type⟩ triplets, which characterizes a substructure-substructure interaction (SSI) space. According to a list of predefined substructures with specific chemical meanings, the mapping of drugs into this SSI space enables STNN-DDI to perform the multiple-type DDI prediction in both transductive and inductive scenarios in a unified form and in an explainable manner. The comparison with deep learning-based state-of-the-art baselines demonstrates the superiority of STNN-DDI with significant improvements in AUC, AUPR, Accuracy and Precision. More importantly, case studies illustrate its interpretability by both revealing an important substructure pair across drugs regarding a DDI type of interest and uncovering interaction type-specific substructure pairs in a given DDI. In summary, STNN-DDI provides an effective approach to predicting DDIs as well as explaining the interaction mechanisms among drugs. Source code is freely available at https://github.com/zsy-9/STNN-DDI.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Neural Networks, Computer , Data Collection , Drug Interactions , Humans , Software
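
The STNN-DDI abstract describes a 3-D tensor indexed by (substructure, substructure, interaction type) and a mapping of drugs into that space. The abstract does not give the scoring rule, so the sketch below assumes the simplest one: a drug is represented by the indices of the substructures it contains, and the score of each interaction type is the sum of the tensor entries over all substructure pairs. The sum aggregation, the toy tensor values, and the function name are hypothetical.

```python
def ddi_type_scores(tensor, subs_a, subs_b):
    """Score every interaction type for a drug pair.

    tensor[i][j][r] : learned weight linking substructure i (drug A) and
                      substructure j (drug B) to interaction type r
    subs_a, subs_b  : indices of the substructures present in each drug
    Assumption: scores are aggregated by a plain sum over pairs.
    """
    n_types = len(tensor[0][0])
    scores = [0.0] * n_types
    for i in subs_a:
        for j in subs_b:
            for r in range(n_types):
                scores[r] += tensor[i][j][r]
    return scores

# Toy tensor: 2 substructures x 2 substructures x 2 interaction types.
toy_tensor = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 0.0], [3.0, 0.0]],
]
scores = ddi_type_scores(toy_tensor, subs_a=[0], subs_b=[0, 1])
predicted_type = scores.index(max(scores))
```

Because each score decomposes into per-pair terms, the largest terms point to the substructure pairs that drive a prediction, which is the kind of interpretability the abstract highlights.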
18.
Front Microbiol ; 13: 944952, 2022.
Article in English | MEDLINE | ID: mdl-35707165

ABSTRACT

[This corrects the article DOI: 10.3389/fmicb.2022.846915.].

19.
Bioinformatics ; 38(Suppl 1): i325-i332, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758801

ABSTRACT

MOTIVATION: During lead compound optimization, it is crucial to identify pathways where a drug-like compound is metabolized. Recently, machine learning-based methods have achieved inspiring progress to predict potential metabolic pathways for drug-like compounds. However, they neglect the knowledge that metabolic pathways are dependent on each other. Moreover, they are inadequate to elucidate why compounds participate in specific pathways. RESULTS: To address these issues, we propose a novel Multi-Label Graph Learning framework of Metabolic Pathway prediction boosted by pathway interdependence, called MLGL-MP, which contains a compound encoder, a pathway encoder and a multi-label predictor. The compound encoder learns compound embedding representations by graph neural networks. After constructing a pathway dependence graph by re-trained word embeddings and pathway co-occurrences, the pathway encoder learns pathway embeddings by graph convolutional networks. Moreover, after adapting the compound embedding space into the pathway embedding space, the multi-label predictor measures the proximity of two spaces to discriminate which pathways a compound participates in. The comparison with state-of-the-art methods on KEGG pathways demonstrates the superiority of our MLGL-MP. Also, the ablation studies reveal how its three components contribute to the model, including the pathway dependence, the adapter between compound embeddings and pathway embeddings, as well as the pre-training strategy. Furthermore, a case study illustrates the interpretability of MLGL-MP by indicating crucial substructures in a compound, which are significantly associated with the attending metabolic pathways. It is anticipated that this work can boost metabolic pathway predictions in drug discovery. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are freely available at https://github.com/dubingxue/MLGL-MP.


Subject(s)
Machine Learning , Neural Networks, Computer , Drug Discovery , Metabolic Networks and Pathways , Software
20.
Front Microbiol ; 13: 846915, 2022.
Article in English | MEDLINE | ID: mdl-35479616

ABSTRACT

Many drugs can be metabolized by human microbes; the drug metabolites can significantly alter pharmacological effects and result in low therapeutic efficacy for patients. Hence, it is crucial to identify potential drug-microbe associations (DMAs) before drug administration. Nevertheless, traditional DMA determination cannot be applied widely because of the tremendous number of microbe species, the high costs, and the time required. Thus, predicting possible DMAs computationally is an essential topic. Inspired by other issues addressed by deep learning, we designed a deep learning-based model named Nearest Neighbor Attention Network (NNAN). The proposed model consists of four components, namely, a similarity network constructor, a nearest-neighbor aggregator, a feature attention block, and a predictor. In brief, the similarity block contains a microbe similarity network and a drug similarity network. The nearest-neighbor aggregator generates the embedding representations of drug-microbe pairs by integrating drug neighbors and microbe neighbors of each drug-microbe pair in the network. The feature attention block evaluates the importance of each dimension of drug-microbe pair embedding by a set of ordinary multi-layer neural networks. The predictor is an ordinary fully-connected deep neural network that functions as a binary classifier to distinguish potential DMAs among unlabeled drug-microbe pairs. Several experiments on two benchmark databases are performed to evaluate the performance of NNAN. First, the comparison with state-of-the-art baseline approaches demonstrates the superiority of NNAN under cross-validation in terms of predictive performance. Moreover, the interpretability inspection reveals that a drug tends to associate with a microbe if its top-l most similar neighbors associate with the microbe.
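
The last sentence of the abstract above describes a neighbor-based intuition: a drug is likely to associate with a microbe when its top-l most similar drugs do. The sketch below turns that intuition into a simple vote. It is not the NNAN model, which uses learned embeddings and attention; the data layout, the function name, and the use of a plain fraction as the score are assumptions for illustration.

```python
def neighbor_vote(drug, microbe, similarity, known, l):
    """Fraction of the drug's l most similar drugs that have a known
    association with the microbe.

    similarity[d1][d2] : similarity between drugs d1 and d2
    known              : set of (drug, microbe) pairs with known associations
    """
    others = [d for d in similarity[drug] if d != drug]
    top = sorted(others, key=lambda d: similarity[drug][d], reverse=True)[:l]
    if not top:
        return 0.0
    return sum((d, microbe) in known for d in top) / len(top)

# Toy data with hypothetical drug and microbe identifiers.
similarity = {"d1": {"d2": 0.9, "d3": 0.8, "d4": 0.1}}
known = {("d2", "m1"), ("d4", "m1")}
vote = neighbor_vote("d1", "m1", similarity, known, l=2)
```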
