Results 1 - 20 of 154
1.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38269610

ABSTRACT

MOTIVATION: The human microbiome may impact the effectiveness of drugs by modulating their activities and toxicities. Predicting candidate microbes for drugs can facilitate the exploration of the therapeutic effects of drugs. Most recent methods concentrate on constructing prediction models based on graph reasoning. They fail to sufficiently exploit the topology and position information, the heterogeneity of multiple types of nodes and connections, and the long-distance correlations among nodes in the microbe-drug heterogeneous graph. RESULTS: We propose a new microbe-drug association prediction model, NGMDA, to encode the position and topological features of microbe (drug) nodes, and fuse the different types of features from neighbors and the whole heterogeneous graph. First, we formulate the position and topology features of microbe (drug) nodes by t-step random walks; these features reveal the topological neighborhoods at multiple scales and the position of each node. Second, as the features of nodes are high-dimensional and sparse, we design an embedding enhancement strategy based on supervised fully connected autoencoders to form embeddings with representative features and more discriminative node distributions. Third, we propose an adaptive neighbor feature fusion module, which fuses features of neighbors using the constructed position- and topology-sensitive heterogeneous graph neural networks. A novel self-attention mechanism is developed to estimate the importance of the position and topology of each neighbor to a target node. Finally, a heterogeneous graph feature fusion module is constructed to learn the long-distance correlations among the nodes in the whole heterogeneous graph with a relationship-aware graph transformer. The relationship-aware graph transformer includes a strategy for encoding the connection relationship types among the nodes, which helps integrate the diverse semantics of these connections.
Extensive comparison experiments demonstrate NGMDA's superior performance over five state-of-the-art prediction methods. The ablation experiments show the contributions of the multi-scale topology and position feature learning, the embedding enhancement strategy, the neighbor feature fusion, and the heterogeneous graph feature fusion. Case studies on three drugs further indicate that NGMDA can discover potential drug-related microbes. AVAILABILITY AND IMPLEMENTATION: Source codes and Supplementary Material are available at https://github.com/pingxuan-hlju/NGMDA.
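
The t-step random-walk features mentioned in this abstract can be illustrated with a toy sketch. This is not the NGMDA implementation (the authors' repository is the reference for that). It assumes that a node's multi-scale feature is simply its row of each transition-matrix power P^1 .. P^t, concatenated. The helper names `row_normalize`, `matmul` and `random_walk_features` are hypothetical.

```python
def row_normalize(adj):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def random_walk_features(adj, max_steps):
    """Concatenate each node's rows of P^1 .. P^max_steps (assumed feature layout)."""
    p = row_normalize(adj)
    power = p
    feats = [list(row) for row in p]
    for _ in range(max_steps - 1):
        power = matmul(power, p)          # next power of the transition matrix
        for f, row in zip(feats, power):
            f.extend(row)                 # append this scale's distribution
    return feats


# Toy path graph 0 - 1 - 2 (hypothetical data).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = random_walk_features(adj, 2)
```

Each block of the resulting vector is a probability distribution over nodes, so small step counts describe the local neighborhood while larger ones describe where the node sits in the graph as a whole.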


Subjects
Neural Networks, Computer , Semantics , Humans , Software
2.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38561176

ABSTRACT

MOTIVATION: Understanding the intermolecular interactions of ligand-target pairs is key to guiding the optimization of drug research on cancers, which can greatly reduce the workload of wet labs. Several improved computational methods have been introduced and exhibit promising performance for these identification tasks, but two pitfalls restrict their practical application: (i) existing methods do not sufficiently consider how multigranular molecule representations influence interaction patterns between proteins and compounds; and (ii) existing methods seldom explicitly model the binding sites when an interaction occurs to enable better prediction and interpretation, which may create unexpected obstacles for biological researchers. RESULTS: To address these issues, we present DrugMGR, a deep multigranular drug representation model capable of predicting binding affinities and regions for each ligand-target pair. We conduct consistent experiments on three benchmark datasets using existing methods and introduce a new specific dataset to better validate the prediction of binding sites. For practical application, target-specific compound identification tasks are also carried out to validate the capability for real-world compound screening. Moreover, the visualization of some practical interaction scenarios provides interpretable insights from the results of the predictions. The proposed DrugMGR achieves excellent overall performance on these datasets, exhibiting its advantages over state-of-the-art methods. Thus, DrugMGR can be fine-tuned on the downstream task of identifying potential compounds that target proteins for clinical treatment. AVAILABILITY AND IMPLEMENTATION: https://github.com/lixiaokun2020/DrugMGR.


Subjects
Proteins , Ligands , Proteins/chemistry , Binding Sites
3.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35514190

ABSTRACT

MOTIVATION: Accurate identification of proteins that interact with drugs helps reduce the time and cost of drug development. Most previous methods focused on integrating multisource data about drugs and proteins for predicting drug-target interactions (DTIs). There are both similarity connections and interaction connections between two drugs, and these connections reflect their relationships from different perspectives. Similarly, two proteins have various connections from multiple perspectives. However, most previous methods failed to deeply integrate these connections. In addition, multiple drug-protein heterogeneous networks can be constructed based on multiple kinds of connections. The diverse topological structures of these networks are still not completely exploited. RESULTS: We propose a novel model to extract and integrate multi-type neighbor topology information, diverse similarities and interactions related to drugs and proteins. First, multiple drug-protein heterogeneous networks are constructed according to multiple kinds of connections among drugs and among proteins. The multi-type neighbor node sequences of a drug node (or a protein node) are formed by random walks on each network, and they reflect the hidden neighbor topological structure of the node. Second, a module based on a graph neural network (GNN) is proposed to learn the multi-type neighbor topologies of each node. We propose attention mechanisms at the neighbor node level and at the neighbor type level to learn more informative neighbor nodes and neighbor types. A network-level attention is also designed to enhance the context dependency among multiple neighbor topologies of a pair of drug and protein nodes. Finally, the attribute embedding of the drug-protein pair is formulated by a proposed embedding strategy, and the embedding covers the similarities and interactions of the pair.
A module based on three-dimensional convolutional neural networks (CNNs) is constructed to deeply integrate pairwise attributes. Extensive experiments indicate that the proposed model, GCDTI, outperforms several state-of-the-art prediction methods. The recall rate estimation over the top-ranked candidates and case studies on five drugs further demonstrate GCDTI's ability to discover potential drug-protein interactions.
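
Neighbor-node-level attention, as named in this abstract, can be pictured as a softmax-weighted pooling of neighbor vectors. The sketch below is only an illustration under stated assumptions: it scores each neighbor by a plain dot product with the target node, whereas the published model presumably uses learned attention parameters. The names `softmax` and `attend` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(target, neighbors):
    """Weight each neighbor by its dot-product similarity to the target (assumed scoring)."""
    scores = [sum(t * n for t, n in zip(target, nb)) for nb in neighbors]
    weights = softmax(scores)
    dim = len(target)
    pooled = [sum(w * nb[i] for w, nb in zip(weights, neighbors)) for i in range(dim)]
    return weights, pooled


# Hypothetical toy vectors: the first neighbor matches the target exactly.
weights, pooled = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The weights always sum to one, so the pooled vector stays on the same scale as the neighbor vectors regardless of how many neighbors a node has.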


Subjects
Algorithms , Neural Networks, Computer , Drug Development , Drug Interactions , Learning
4.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35470853

ABSTRACT

MOTIVATION: Computerized methods for drug-related side effect identification can help reduce costs and speed up drug development. Multisource data about drugs and side effects are widely used to predict potential drug-related side effects. Heterogeneous graphs are commonly used to associate multisource data of drugs and side effects, which can reflect similarities of the drugs from different perspectives. Effective integration and formulation of diverse similarities, however, are challenging. In addition, the specific topology of each heterogeneous graph and the common topology of multiple graphs are neglected. RESULTS: We propose a drug-side effect association prediction model, GCRS, to encode and integrate specific topologies, common topologies and pairwise attributes of drugs and side effects. First, multiple drug-side effect heterogeneous graphs are constructed using various kinds of similarities and associations related to drugs and side effects. As each heterogeneous graph has its specific topology, we establish a separate module based on a graph convolutional autoencoder (GCA) to learn the specific topology representation of each drug node and each side effect node. Since multiple graphs reflect the complex relationships among the drug and side effect nodes and contain common topologies, we construct a module based on a GCA with shared parameters to learn the common topology representations of each node. Afterwards, we design an attention mechanism to obtain more informative topology representations at the representation level. Finally, multi-layer convolutional neural networks with attribute-level attention are constructed to deeply integrate the similarity and association attributes of a pair of drug-side effect nodes. Comprehensive experiments show that GCRS's prediction performance is superior to that of other state-of-the-art methods for predicting drug-side effect associations.
The recall rates for top-ranked candidates and case studies on five drugs further demonstrate GCRS's ability to discover potential drug-related side effects. CONTACT: zhang@hlju.edu.cn.
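
A graph convolutional autoencoder of the kind this abstract refers to can be reduced to a small forward pass. The sketch below is a generic textbook form, not the GCRS architecture: a single graph-convolution encoder layer using the symmetrically normalized adjacency with self-loops, followed by an inner-product decoder. The weight matrix is hypothetical and untrained, and all function names are assumptions.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, a common graph-convolution propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def encode(adj, feats, weight):
    """One graph-convolution layer with ReLU: Z = ReLU(A_norm X W)."""
    hidden = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def decode(z):
    """Inner-product decoder: sigmoid(Z Z^T), read as edge probabilities."""
    zt = [list(col) for col in zip(*z)]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in matmul(z, zt)]


adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]                       # toy graph
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # one-hot node features
weight = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.6]]               # hypothetical, untrained
z = encode(adj, feats, weight)
recon = decode(z)
```

Sharing parameters across graphs, as described for the common-topology module, would amount to calling `encode` with the same `weight` on each graph's adjacency matrix.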


Subjects
Algorithms , Neural Networks, Computer , Drug Development/methods
5.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35362511

ABSTRACT

Since abnormal expression of long noncoding RNAs (lncRNAs) is often closely related to various human diseases, identifying disease-associated lncRNAs is helpful for exploring their complex pathogenesis. Most recent methods concentrate on exploiting multiple kinds of data related to lncRNAs and diseases to predict candidate disease-related lncRNAs. These methods, however, fail to deeply integrate the topology information from the meta-paths that are composed of lncRNA, disease and microRNA (miRNA) nodes. We propose a new method based on fully connected autoencoders and convolutional neural networks, called ACLDA, for inferring potential disease-related lncRNA candidates. A heterogeneous graph that consists of lncRNA, disease and miRNA nodes is first constructed to integrate the similarities, associations and interactions among them. A fully connected autoencoder-based module is established to extract the low-dimensional features of lncRNA, disease and miRNA nodes in the heterogeneous graph. We design attention mechanisms at the node feature level and at the meta-path level to learn more informative features and meta-paths. A module based on convolutional neural networks is constructed to encode the local topologies of lncRNA and disease nodes from multiple meta-path perspectives. Comprehensive experimental results demonstrate that ACLDA outperforms several state-of-the-art prediction methods. Case studies on breast, lung and colon cancers demonstrate that ACLDA is able to discover potential disease-related lncRNAs.


Subjects
MicroRNAs , RNA, Long Noncoding , Algorithms , Computational Biology/methods , Humans , MicroRNAs/genetics , Neural Networks, Computer , RNA, Long Noncoding/genetics
6.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35136910

ABSTRACT

MOTIVATION: Identifying new therapeutic effects for approved drugs helps reduce the cost and time of drug development. Most recent computational methods concentrate on exploiting multiple kinds of information about drugs and diseases to predict candidate associations between drugs and diseases. However, the drug and disease nodes have neighboring topologies at multiple scales, and previous methods did not fully exploit and deeply integrate these topologies. RESULTS: We present a prediction method, multi-scale topology learning for drug-disease (MTRD), to integrate and learn multi-scale neighboring topologies and the attributes of a pair of drug and disease nodes. First, a separate drug-disease heterogeneous network is constructed for each of multiple kinds of drug similarities to integrate the similarities and associations related to drugs and diseases. Moreover, each heterogeneous network has its specific topology structure, which is helpful for learning the corresponding specific topology representation. We formulate the topology embeddings for each drug node and disease node by random walks on each heterogeneous network, and the embeddings cover neighboring topologies with different scopes. Because the multi-scale topology embeddings have context relationships, we construct a bi-directional long short-term memory-based module to encode these embeddings and their relationships and learn the neighboring topology representation. We also design attention mechanisms at the feature level and at the scale level to obtain more informative pairwise features and topology embeddings. A module based on multi-layer convolutional networks is constructed to learn the representative attributes of the drug-disease node pair according to their related similarity and association information.
Comprehensive experimental results indicate that MTRD outperforms several state-of-the-art methods for predicting drug-disease associations. MTRD also retrieves more actual drug-disease associations among the top-ranked candidates of the prediction results. Case studies on five drugs further demonstrate MTRD's ability to discover potential candidate diseases for the drugs of interest.


Subjects
Algorithms , Neural Networks, Computer , Drug Development
7.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35108355

ABSTRACT

MOTIVATION: Disease-related long non-coding RNAs (lncRNAs) can be used as biomarkers for disease diagnosis and treatment. The development of effective computational approaches to predict lncRNA-disease associations (LDAs) can provide insights into the pathogenesis of complex human diseases and reduce experimental costs. However, few of the existing methods use microRNA (miRNA) information or consider the complex relationship between the inter-graph and the intra-graph in a complex-graph to assist prediction. RESULTS: In this paper, the relationships between nodes of the same type and nodes of different types in the complex-graph are introduced. We propose a multi-channel graph attention autoencoder model to predict LDAs, called MGATE. First, an lncRNA-miRNA-disease complex-graph is established based on the similarity and correlation among lncRNAs, miRNAs and diseases to integrate the complex associations among them. Second, in order to fully extract the comprehensive information of the nodes, we use graph autoencoder networks to learn multiple representations from the complex-graph, inter-graph and intra-graph. Third, a graph-level attention integration module is adopted to adaptively merge the three representations, and a combined training strategy is used to optimize the whole model to ensure complementarity and consistency among the multi-graph embedding representations. Finally, multiple classifiers are explored, and Random Forest is used to predict the association score between lncRNA and disease. Experimental results on the public dataset show that the area under the receiver operating characteristic curve and the area under the precision-recall curve of MGATE are 0.964 and 0.413, respectively. MGATE significantly outperformed seven state-of-the-art methods. Furthermore, case studies of three cancers demonstrate the ability of MGATE to identify potential disease-correlated candidate lncRNAs.
The source code and supplementary data are available at https://github.com/sheng-n/MGATE. CONTACT: huanglan@jlu.edu.cn, wy6868@jlu.edu.cn.
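
The two metrics reported in this abstract can be computed from a ranked list of scores. The sketch below uses the pairwise definition of the area under the ROC curve and approximates the area under the precision-recall curve with average precision, which is one common estimator and may not be the one the authors used. The toy labels and scores are hypothetical, and both functions assume at least one positive and one negative label.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical imbalanced toy set: one positive among ten candidates.
labels = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
auc = auroc(labels, scores)
ap = average_precision(labels, scores)
```

On this toy set a single negative ranked above the only positive leaves the ROC area near 0.89 but halves the average precision. That kind of gap is typical when positives are rare, which is one plausible reading (an assumption, not a statement from the abstract) of the two reported figures.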


Subjects
MicroRNAs , RNA, Long Noncoding , Algorithms , Computational Biology/methods , Humans , MicroRNAs/genetics , Neural Networks, Computer , RNA, Long Noncoding/genetics
8.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35108362

ABSTRACT

MOTIVATION: Effective computational methods to predict drug-protein interactions (DPIs) are vital for drug discovery in reducing the time and cost of drug development. Recent DPI prediction methods mainly exploit graph data composed of multiple kinds of connections among drugs and proteins. Each node in the graph usually has topological structures with multiple scales formed by its first-order neighbors and multi-order neighbors. However, most of the previous methods do not consider the topological structures of multi-order neighbors. In addition, deep integration of the multi-modality similarities of drugs and proteins is also a challenging task. RESULTS: We propose a model called ALDPI to adaptively learn the multi-scale topologies and multi-modality similarities with various significance levels. We first construct a drug-protein heterogeneous graph, which is composed of the interactions and the similarities with multiple modalities among drugs and proteins. An adaptive graph learning module is then designed to learn important kinds of connections in heterogeneous graph and generate new topology graphs. A module based on graph convolutional autoencoders is established to learn multiple representations, which imply the node attributes and multiple-scale topologies composed of one-order and multi-order neighbors, respectively. We also design an attention mechanism at neighbor topology level to distinguish the importance of these representations. Finally, since each similarity modality has its specific features, we construct a multi-layer convolutional neural network-based module to learn and fuse multi-modality features to obtain the attribute representation of each drug-protein node pair. Comprehensive experimental results show ALDPI's superior performance over six state-of-the-art methods. 
The recall rates for top-ranked candidates and case studies on five drugs further demonstrate the ability of ALDPI to discover potential drug-related protein candidates. CONTACT: zhang@hlju.edu.cn.
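
The recall rate over top-ranked candidates, cited in this and several neighboring abstracts, is the share of all true associations that appear among the k highest-scoring candidates. A minimal sketch follows, with hypothetical data and a hypothetical function name; it assumes binary labels with at least one positive.

```python
def recall_at_k(labels, scores, k):
    """Share of all positives found among the k highest-scoring candidates."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    found = sum(label for _, label in ranked[:k])
    return found / sum(labels)


# Hypothetical candidate list for one drug: three true interactions out of five.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.2, 0.1]
top2 = recall_at_k(labels, scores, 2)
top3 = recall_at_k(labels, scores, 3)
```

The measure matters in this setting because wet-lab validation usually starts from the top of the ranked list, so associations recovered only at low ranks are of little practical use.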


Subjects
Algorithms , Neural Networks, Computer , Drug Development/methods , Drug Interactions , Proteins
9.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34634106

ABSTRACT

Identifying disease-related microRNAs (miRNAs) assists the understanding of disease pathogenesis. Existing research methods integrate multiple kinds of data related to miRNAs and diseases to infer candidate disease-related miRNAs. The attributes of miRNA nodes, including their family and cluster membership information, however, have not been deeply integrated. Besides, learning the neighbor topology representation of a miRNA-disease pair is challenging. We present a disease-related miRNA prediction method that encodes and integrates multiple representations of miRNA and disease nodes learned from a generative and adversarial perspective. We first construct a bilayer heterogeneous network of miRNA and disease nodes; it contains multiple types of connections among these nodes, which reflect the neighbor topology of miRNA-disease pairs, and the attributes of miRNA nodes, especially miRNA-related families and clusters. To learn enhanced pairwise neighbor topology, we propose a generative and adversarial model with a convolutional autoencoder-based generator to encode the low-dimensional topological representation of the miRNA-disease pair and a multi-layer convolutional neural network-based discriminator to discriminate between true and false neighbor topology embeddings. Besides, we design a novel feature category-level attention mechanism to learn the varying importance of different features for final adaptive fusion and prediction. Comparison with five miRNA-disease association methods demonstrated the superior performance of our model and its technical contributions in terms of area under the receiver operating characteristic curve and area under the precision-recall curve. The recall rates confirmed that our model can find more actual miRNA-disease associations among top-ranked candidates. Case studies on three cancers further demonstrated its ability to detect potential candidate miRNAs.


Subjects
MicroRNAs , Algorithms , Computational Biology/methods , Humans , MicroRNAs/genetics , Neural Networks, Computer , ROC Curve
10.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35393616

ABSTRACT

MOTIVATION: Identifying new uses of approved drugs is an effective way to reduce the time and cost of drug development. Recent computational approaches for predicting drug-disease associations have integrated multi-sourced data on drugs and diseases. However, neighboring topologies of various scales in multiple heterogeneous drug-disease networks have yet to be exploited and fully integrated. RESULTS: We propose a novel method for drug-disease association prediction, called MGPred, used to encode and learn multi-scale neighboring topologies of drug and disease nodes and pairwise attributes from heterogeneous networks. First, we constructed three heterogeneous networks based on multiple kinds of drug similarities. Each network comprises drug and disease nodes and edges created based on node-wise similarities and associations that reflect specific topological structures. We also propose an embedding mechanism to formulate topologies that cover different ranges of neighbors. To encode the embeddings and derive multi-scale neighboring topology representations of drug and disease nodes, we propose a module based on graph convolutional autoencoders with shared parameters for each heterogeneous network. We also propose scale-level attention to obtain an adaptive fusion of informative topological representations at different scales. Finally, a learning module based on a convolutional neural network with various receptive fields is proposed to learn multi-view attribute representations of a pair of drug and disease nodes. Comprehensive experiment results demonstrate that MGPred outperforms other state-of-the-art methods in comparison to drug-related disease prediction, and the recall rates for the top-ranked candidates and case studies on five drugs further demonstrate the ability of MGPred to retrieve potential drug-disease associations.


Subjects
Algorithms , Neural Networks, Computer , Drug Development/methods
11.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34850815

ABSTRACT

MOTIVATION: The development process of a new drug is time-consuming and costly. Thus, identifying new uses for approved drugs, named drug repositioning, is helpful for speeding up the drug development process and reducing development costs. Existing drug-related disease prediction methods mainly focus on single or multiple drug-disease heterogeneous networks. However, heterogeneous networks, and drug subnets and disease subnet contained in heterogeneous networks cover the common topology information between drug and disease nodes, the specific information between drug nodes and the specific information between disease nodes, respectively. RESULTS: We design a novel model, CTST, to extract and integrate common and specific topologies in multiple heterogeneous networks and subnets. Multiple heterogeneous networks composed of drug and disease nodes are established to integrate multiple kinds of similarities and associations among drug and disease nodes. These heterogeneous networks contain multiple drug subnets and a disease subnet. For multiple heterogeneous networks and subnets, we then define the common and specific representations of drug and disease nodes. The common representations of drug and disease nodes are encoded by a graph convolutional autoencoder with sharing parameters and they integrate the topological relationships of all nodes in heterogeneous networks. The specific representations of nodes are learned by specific graph convolutional autoencoders, respectively, and they fuse the topology and attributes of the nodes in each subnet. We then propose attention mechanisms at common representation level and specific representation level to learn more informative common and specific representations, respectively. Finally, an integration module with representation feature level attention is built to adaptively integrate these two representations for final association prediction. Extensive experimental results confirm the effectiveness of CTST. 
Comparisons with six recent methods and case studies on five drugs further verify that CTST can discover potential candidate diseases.


Subjects
Algorithms , Neural Networks, Computer , Drug Repositioning/methods
12.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34718408

ABSTRACT

MOTIVATION: Identifying proteins that interact with drugs plays an important role in the initial period of drug development and helps to reduce development cost and time. Recent methods for predicting drug-protein interactions mainly focus on exploiting various data about drugs and proteins. These methods fail to completely learn and integrate the attribute information of a pair of drug and protein nodes and their attribute distribution. RESULTS: We present a new prediction method, GVDTI, to encode multiple pairwise representations, including attention-enhanced topological representation, attribute representation and attribute distribution. First, a framework based on a graph convolutional autoencoder is constructed to learn an attention-enhanced topological embedding that integrates the topology structure of a drug-protein network for each drug and protein node. The topological embeddings of each drug and each protein are then combined and fused by multi-layer convolutional neural networks to obtain the pairwise topological representation, which reveals the hidden topological relationships between drug and protein nodes. The proposed attribute-wise attention mechanism learns and adjusts the importance of individual attributes in each topological embedding of drug and protein nodes. Second, a tri-layer heterogeneous network composed of drug, protein and disease nodes is created to associate the similarities, interactions and associations across the heterogeneous nodes. The attribute distribution of the drug-protein node pair is encoded by a variational autoencoder. The pairwise attribute representation is learned via a multi-layer convolutional neural network to deeply integrate the attributes of drug and protein nodes. Finally, the three pairwise representations are fused by convolutional and fully connected neural networks for drug-protein interaction prediction.
The experimental results show that GVDTI outperformed seven other state-of-the-art methods. The improved recall rates indicate that GVDTI retrieved more actual drug-protein interactions among the top-ranked candidates than conventional methods. Case studies on five drugs further confirm GVDTI's ability to discover potential candidate drug-related proteins. CONTACT: zhang@hlju.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.
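
Encoding an attribute distribution with a variational autoencoder, as this abstract describes, rests on two standard pieces: sampling a latent vector with the reparameterization trick and penalizing its divergence from a standard normal prior. The sketch below shows only those two pieces in their textbook form. The encoder outputs `mu` and `logvar` are hypothetical numbers rather than values from the paper, and the function names are assumptions.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


rng = random.Random(0)                  # seeded for repeatability
mu, logvar = [0.5, -1.0], [0.0, -2.0]   # hypothetical encoder outputs
z = reparameterize(mu, logvar, rng)
kl = kl_to_standard_normal(mu, logvar)
```

In a full model the KL term is added to a reconstruction loss, so the latent vector both describes the drug-protein pair and stays close to a distribution that can be sampled.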


Subjects
Neural Networks, Computer , Proteins , Drug Interactions
13.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36088549

ABSTRACT

MOTIVATION: Long noncoding RNAs (lncRNAs) play an important role in the occurrence and development of diseases. Predicting disease-related lncRNAs can help to deepen the understanding of disease pathogenesis. Existing methods mainly rely on multi-source data related to lncRNAs and diseases when predicting the associations between lncRNAs and diseases. There are interdependencies among node attributes in a heterogeneous graph composed of all lncRNAs, diseases and microRNAs. The meta-paths composed of various connections between them also contain rich semantic information. However, existing methods neglect to integrate the attribute information of intermediate nodes in meta-paths. RESULTS: We propose a novel association prediction model, GSMV, to learn and deeply integrate the global dependencies, semantic information of meta-paths and node-pair multi-view features related to lncRNAs and diseases. First, we formulate the global representations of the lncRNA and disease nodes by establishing a self-attention mechanism to capture and learn the global dependencies among node attributes. Second, starting from the lncRNA and disease nodes, respectively, multiple meta-paths are established to reveal different semantic information. Considering that each meta-path contains specific semantics and has multiple meta-path instances that contribute differently to revealing meta-path semantics, we design a graph neural network-based module that consists of a meta-path instance encoding strategy and two novel attention mechanisms. The proposed meta-path instance encoding strategy is used to learn the contextual connections between nodes within a meta-path instance. One of the two new attention mechanisms is at the meta-path instance level and learns rich and informative meta-path instances. The other attention mechanism integrates various semantic information from multiple meta-paths to learn the semantic representation of lncRNA and disease nodes.
Finally, a dilated convolution-based learning module with adjustable receptive fields is proposed to learn multi-view features of lncRNA-disease node pairs. The experimental results show that our method outperforms seven state-of-the-art methods for lncRNA-disease association prediction. Ablation experiments demonstrate the contributions of the proposed global representation learning, semantic information learning, pairwise multi-view feature learning and the meta-path instance encoding strategy. Case studies on three cancers further demonstrate our method's ability to discover potential disease-related lncRNA candidates. CONTACT: zhang@hlju.edu.cn or peiliangwu@ysu.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.
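
The "adjustable receptive fields" of a dilated convolution come from spacing the kernel taps apart. The sketch below is a one-dimensional, unpadded, single-channel illustration; the published module presumably operates on multi-channel pairwise feature maps, so this shows the mechanism only. The function name `dilated_conv1d` is hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D cross-correlation with kernel taps spaced `dilation` apart."""
    span = (len(kernel) - 1) * dilation   # receptive field minus one
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6]
dense = dilated_conv1d(signal, [1, 1], dilation=1)   # sums adjacent elements
wide = dilated_conv1d(signal, [1, 1], dilation=2)    # sums elements two apart
```

With the same two-tap kernel, the receptive field grows from 2 to 3 positions as the dilation goes from 1 to 2, without adding any parameters.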


Subjects
RNA, Long Noncoding , Algorithms , Computational Biology/methods , Neural Networks, Computer , RNA, Long Noncoding/genetics , Semantics
14.
J Chem Inf Model ; 64(8): 3569-3578, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38523267

ABSTRACT

As long non-coding RNAs (lncRNAs) play important roles in the occurrence and development of various human diseases, identifying disease-related lncRNAs can help clarify the pathogenesis of diseases. Most recent lncRNA-disease association prediction methods utilize multi-source data about lncRNAs and diseases. A single lncRNA may participate in multiple disease processes, and multiple lncRNAs are usually involved in the same disease process synergistically. However, previous methods did not completely exploit these biological characteristics to construct informative prediction models. We construct a prediction model based on adaptive hypergraph and gated convolution for lncRNA-disease association prediction (AGLDA) to embed and encode the biological characteristics of lncRNA-disease associations, the topological features from the entire heterogeneous graph perspective, and the gated enhanced pairwise features. First, the strategy for constructing hyperedges is designed to reflect the biological characteristic that multiple lncRNAs are involved in multiple disease processes. Furthermore, each hyperedge has its own biological perspective, and multiple hyperedges are beneficial for revealing the diverse relationships among multiple lncRNAs and diseases. Second, we encode the biological features of each lncRNA (disease) node using a strategy based on dynamic hypergraph convolutional networks. The strategy can adaptively learn the features of the hyperedges and formulate the dynamically evolving hypergraph topological structure. Third, a group convolutional network is established to integrate the entire heterogeneous topological structure and multiple types of node attributes within an lncRNA-disease-miRNA graph. Finally, a gated convolutional strategy is proposed to enhance the informative features of the lncRNA-disease node pairs. The comparison experiments indicate that AGLDA outperforms seven advanced prediction methods.
The ablation studies confirm the effectiveness of major innovations, and the case studies validate AGLDA's ability in application for discovering potential disease-related lncRNA candidates.
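
The hyperedge idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify how AGLDA builds its hyperedges, so the rule used here (one hyperedge per disease, joining that disease with every lncRNA associated with it) is an assumption, as are the function names `build_incidence` and `hypergraph_smooth`. The second function is a plain, weight-free smoothing step (node to hyperedge to node averaging), not the learned dynamic hypergraph convolution the authors describe.

```python
from collections import defaultdict


def build_incidence(associations, lncrnas, diseases):
    """Build a node-by-hyperedge incidence matrix H (lists of 0/1).

    Assumption for illustration: each disease with at least one known
    association defines one hyperedge containing the disease itself and
    all lncRNAs associated with it.
    """
    nodes = list(lncrnas) + list(diseases)
    index = {name: i for i, name in enumerate(nodes)}
    members = defaultdict(set)
    for lnc, dis in associations:
        members[dis].update((lnc, dis))
    edges = [d for d in diseases if d in members]
    H = [[0] * len(edges) for _ in nodes]
    for j, dis in enumerate(edges):
        for name in members[dis]:
            H[index[name]][j] = 1
    return nodes, edges, H


def hypergraph_smooth(H, X):
    """One unweighted propagation step over the hypergraph.

    Each hyperedge takes the mean feature of its member nodes, then each
    node takes the mean of the hyperedges it belongs to. Nodes that belong
    to no hyperedge keep their own features. Every hyperedge is assumed to
    have at least one member.
    """
    n_nodes, n_edges, dim = len(H), len(H[0]), len(X[0])
    edge_feat = []
    for j in range(n_edges):
        mem = [i for i in range(n_nodes) if H[i][j]]
        edge_feat.append([sum(X[i][k] for i in mem) / len(mem) for k in range(dim)])
    out = []
    for i in range(n_nodes):
        own = [j for j in range(n_edges) if H[i][j]]
        if not own:
            out.append(list(X[i]))
        else:
            out.append([sum(edge_feat[j][k] for j in own) / len(own) for k in range(dim)])
    return out
```

Because a hyperedge can join any number of nodes, a single column of `H` captures a many-to-many relationship that an ordinary pairwise edge cannot.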


Subjects
RNA, Long Noncoding , RNA, Long Noncoding/genetics , Humans , Computational Biology/methods , Genetic Predisposition to Disease , Disease/genetics , Machine Learning
15.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32444875

ABSTRACT

As the abnormalities of long non-coding RNAs (lncRNAs) are closely related to various human diseases, identifying disease-related lncRNAs is important for understanding the pathogenesis of complex diseases. Most current data-driven methods for disease-related lncRNA candidate prediction are based on diseases and lncRNAs. Those methods, however, fail to consider the deeply embedded node attributes of lncRNA-disease pairs, which contain multiple relations and representations across lncRNAs, diseases and miRNAs. Moreover, the low-dimensional feature distribution at the pairwise level has not been taken into account. We propose a prediction model, VADLP, to extract, encode and adaptively integrate multi-level representations. First, a triple-layer heterogeneous graph is constructed with weighted inter-layer and intra-layer edges to integrate the similarities and correlations among lncRNAs, diseases and miRNAs. We then define three representations: node attributes, pairwise topology and feature distribution. Node attributes are derived from the graph by an embedding strategy to represent the lncRNA-disease associations, which are inferred via their common lncRNAs, diseases and miRNAs. Pairwise topology is formulated by a random walk algorithm and encoded by a convolutional autoencoder to represent the hidden topological structural relations between an lncRNA and a disease. The new feature distribution is modeled by a variance autoencoder to reveal the underlying lncRNA-disease relationship. Finally, an attentional representation-level integration module is constructed to adaptively fuse the three representations for lncRNA-disease association prediction. The proposed model is tested on a public dataset with a comprehensive set of evaluations. Our model outperforms six state-of-the-art lncRNA-disease prediction models with statistical significance. The ablation study showed the important contributions of the three representations. In particular, the improved recall rates under different top-k values demonstrate that our model is powerful in discovering true disease-related lncRNAs among the top-ranked candidates. Case studies of three cancers further demonstrated the capacity of our model to discover potential disease-related lncRNAs.
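
The recall rate under a top-k cutoff mentioned in this abstract is a standard ranking metric and can be sketched in a few lines. The function name `recall_at_k` and the dictionary-based input format are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def recall_at_k(scores, positives, k):
    """Fraction of true associations that appear among the k top-scored candidates.

    scores    -- dict mapping each candidate (e.g. an lncRNA) to its predicted score
    positives -- collection of candidates that are truly associated
    k         -- size of the top-ranked list to inspect
    """
    positives = set(positives)
    if not positives:
        raise ValueError("positives must not be empty")
    # Rank candidates from highest to lowest predicted score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits = positives.intersection(ranked[:k])
    return len(hits) / len(positives)
```

A higher recall at small k means that more of the true associations are concentrated at the top of the ranked candidate list, which is the property the abstract emphasizes.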


Subjects
RNA, Long Noncoding/metabolism , Algorithms , Computational Biology/methods , Datasets as Topic , Deep Learning , Humans , Neoplasms/pathology , Neural Networks, Computer
16.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33839743

ABSTRACT

MOTIVATION: Identifying the proteins that interact with drugs can reduce the cost and time of drug development. Existing computerized methods focus on integrating drug-related and protein-related data from multiple sources to predict candidate drug-target interactions (DTIs). However, multi-scale neighbouring node sequences and various kinds of drug and protein similarities are neither fully explored nor considered in decision making. RESULTS: We propose a drug-target interaction prediction method, DTIP, to encode and integrate multi-scale neighbouring topologies and multiple kinds of similarities, associations and interactions related to drugs and proteins. First, we construct a three-layer heterogeneous network to represent interactions and associations across drug, protein and disease nodes, and propose a learning framework based on a fully-connected autoencoder to learn the nodes' low-dimensional feature representations within the heterogeneous network. Second, multi-scale neighbouring sequences of drug and protein nodes are formulated by random walks. A module based on a bidirectional gated recurrent unit is designed to learn the neighbouring sequential information and integrate the low-dimensional features of nodes. Finally, we propose attention mechanisms at the feature level, the neighbouring topological level and the similarity level to learn more informative features, topologies and similarities. The prediction results are obtained by integrating neighbouring topologies, similarities and feature attributes using a multi-layer CNN. Comprehensive experimental results on a public dataset demonstrated the effectiveness of our innovative features and modules. Comparison with other state-of-the-art methods and case studies of five drugs further validated DTIP's ability to discover potential candidate drug-related proteins.
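
The step in which multi-scale neighbouring sequences are formulated by random walks can be sketched as follows. This is a uniform random walk on a toy adjacency list, given as a minimal illustration; the function names, the choice of walk lengths as the "scales", and the fixed seed are assumptions, and the abstract does not say how DTIP weights or restarts its walks.

```python
import random


def random_walk(adj, start, length, rng):
    """Walk `length` steps from `start`, choosing the next node uniformly at random.

    adj -- dict mapping each node to a list of neighbouring nodes
    The walk stops early if it reaches a node with no neighbours.
    """
    walk = [start]
    for _ in range(length):
        neighbours = adj.get(walk[-1], [])
        if not neighbours:
            break
        walk.append(rng.choice(neighbours))
    return walk


def multi_scale_sequences(adj, start, scales, seed=0):
    """Return one node sequence per scale (walk length), keyed by the scale."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {t: random_walk(adj, start, t, rng) for t in scales}
```

Each returned sequence is an ordered list of nodes, which is the kind of input a sequence model such as a bidirectional GRU can consume.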


Subjects
Algorithms , Computational Biology/methods , Machine Learning , Models, Theoretical , Pharmaceutical Preparations/metabolism , Proteins/metabolism , Drug Development/methods , Humans , Pharmaceutical Preparations/chemistry , Protein Binding , Proteins/chemistry , Reproducibility of Results , Support Vector Machine
17.
J Chem Inf Model ; 63(21): 6947-6958, 2023 11 13.
Article in English | MEDLINE | ID: mdl-37906529

ABSTRACT

An increasing number of studies have shown that dysregulation of lncRNAs is related to the occurrence of various diseases. Most previous methods, however, are designed based on the homogeneity assumption that the representation of a target lncRNA (or disease) node should be updated by aggregating the attributes of its neighbor nodes. This assumption ignores the affinity nodes that are far from the target node. We present a novel prediction method, GAIRD, to fully leverage the heterogeneous information in the network and the decoupled node features. The first major innovation is a random walk strategy based on breadth-first search and depth-first search. Unlike previous methods that focus only on homogeneous information, our new strategy learns both the homogeneous information within local neighborhoods and the heterogeneous information within higher-order neighborhoods. The second innovation is a representation decoupling module to extract the purer attributes and the purer topologies. Third, a module based on group convolution and deep separable convolution is developed to promote the pairwise intrachannel and interchannel feature learning. The experimental results show that GAIRD outperforms the compared state-of-the-art methods, and the ablation studies demonstrate the contributions of the major innovations. We also performed case studies on three diseases to further demonstrate the effectiveness of the GAIRD model in applications.
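
The contrast between breadth-first and depth-first exploration that underlies the walk strategy in this abstract can be shown with a small deterministic sketch. It is not GAIRD's random walk, whose details the abstract does not give; the function names and the fixed `size` budget are assumptions. Breadth-first sampling stays in the local neighborhood of the target node, while depth-first sampling reaches higher-order neighbors with the same budget.

```python
from collections import deque


def bfs_neighbourhood(adj, start, size):
    """Return up to `size` nodes in breadth-first order (nearest nodes first)."""
    seen, order, queue = {start}, [], deque([start])
    while queue and len(order) < size:
        node = queue.popleft()
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                order.append(nbr)
                queue.append(nbr)
                if len(order) == size:
                    break
    return order


def dfs_neighbourhood(adj, start, size):
    """Return up to `size` nodes in depth-first order (follows one path outward)."""
    seen, order, stack = set(), [], [start]
    while stack and len(order) < size:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node != start:
            order.append(node)
        # Push in reverse so the first-listed neighbour is explored first.
        for nbr in reversed(adj.get(node, [])):
            if nbr not in seen:
                stack.append(nbr)
    return order
```

On a toy graph where `A` has two branches, the breadth-first sample collects both direct neighbors before going deeper, whereas the depth-first sample follows a single branch away from `A`.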


Subjects
RNA, Long Noncoding , RNA, Long Noncoding/genetics , Learning , Algorithms
18.
Molecules ; 28(18)2023 Sep 09.
Article in English | MEDLINE | ID: mdl-37764319

ABSTRACT

Since side-effects of drugs are one of the primary reasons for their failure in clinical trials, predicting their side-effects can help reduce drug development costs. We proposed a method based on a heterogeneous graph transformer and capsule networks for side-effect-drug-association prediction (TCSD). The method encodes and integrates attributes from multiple types of neighbor nodes, connection semantics, and multi-view pairwise information. In each drug-side-effect heterogeneous graph, a target node has two types of neighbor nodes: drug nodes and side-effect nodes. We proposed a new heterogeneous graph transformer-based context representation learning module. The module is able to encode the specific topology and the contextual relations among multiple kinds of nodes. There are similarity and association connections between the target node and its various types of neighbor nodes, and these connections imply semantic diversity. Therefore, we designed a new strategy to measure the importance of a neighboring node to the target node and to incorporate the different semantics of the connections between the target node and its multi-type neighbors. Furthermore, we designed attention mechanisms at the neighbor node type level and at the graph level to obtain enhanced informative neighbor node features and multi-graph features, respectively. Finally, a pairwise multi-view feature learning module based on capsule networks was built to learn the pairwise attributes from the heterogeneous graphs. Our prediction model was evaluated using a public dataset, and the cross-validation results showed that it achieved superior performance to several state-of-the-art methods. Ablation experiments demonstrated the effectiveness of the heterogeneous graph transformer-based context encoding, the position-enhanced pairwise attribute learning, and the neighborhood node category-level attention. Case studies on five drugs further showed TCSD's ability to retrieve potential drug-related side-effect candidates, and TCSD inferred the candidate side-effects for 708 drugs.
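
The idea of measuring how important each neighboring node is to a target node can be sketched with generic dot-product attention. This is a textbook formulation offered only as an illustration; the abstract does not disclose TCSD's scoring function, and the name `attention_aggregate` and the plain-list vector format are assumptions.

```python
import math


def attention_aggregate(target, neighbours):
    """Weight neighbour vectors by softmax of their dot product with the target.

    target     -- feature vector of the target node (list of floats)
    neighbours -- non-empty list of neighbour feature vectors of the same length
    Returns (weights, fused), where `fused` is the weighted sum of neighbours.
    """
    scores = [sum(t * n for t, n in zip(target, nb)) for nb in neighbours]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        sum(w * nb[k] for w, nb in zip(weights, neighbours))
        for k in range(len(target))
    ]
    return weights, fused
```

Neighbors whose features align more closely with the target receive larger weights, so they contribute more to the fused representation.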


Subjects
Drug-Related Side Effects and Adverse Reactions , Semantics , Humans , Learning , Drug Development , Electric Power Supplies
19.
Int J Mol Sci ; 23(7)2022 Mar 31.
Article in English | MEDLINE | ID: mdl-35409235

ABSTRACT

Identifying new disease indications for existing drugs can help facilitate drug development and reduce development costs. Previous drug-disease association prediction methods focused on data about drugs and diseases from multiple sources. However, they did not deeply integrate the neighbor topological information of drug and disease nodes from various meta-path perspectives. We propose a prediction method called NAPred to encode and integrate meta-path-level neighbor topologies, multiple kinds of drug attributes, and drug-related and disease-related similarities and associations. The multiple kinds of drug similarities reflect how similar two drugs are from different perspectives. Therefore, we constructed three drug-disease heterogeneous networks, one for each kind of drug similarity. A learning framework based on fully connected neural networks and a convolutional neural network with an attention mechanism is proposed to learn information about the neighbor nodes of a pair of drug and disease nodes. Multiple neighbor sets composed of different kinds of nodes were formed based on meta-paths with different semantics and different scales. We established attention mechanisms at the neighbor-scale level and at the neighbor topology level to learn enhanced neighbor feature representations and enhanced neighbor topological representations. A convolutional-autoencoder-based module is proposed to encode the attributes of the drug-disease pair in the three heterogeneous networks. Extensive experimental results indicated that NAPred outperformed several state-of-the-art methods for drug-disease association prediction, and the improved recall rates demonstrated that NAPred was able to retrieve more actual drug-disease associations from the top-ranked candidates. Case studies on five drugs further demonstrated the ability of NAPred to identify potential drug-related disease candidates.
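
Forming neighbor sets from meta-paths, as described in this abstract, can be sketched on a toy heterogeneous graph. The function name `metapath_neighbours`, the type labels and the node names in the usage note are hypothetical; the abstract does not list the meta-paths NAPred actually uses.

```python
def metapath_neighbours(adj, node_type, start, metapath):
    """Collect the nodes reachable from `start` by following a typed meta-path.

    adj       -- dict mapping each node to a list of neighbouring nodes
    node_type -- dict mapping each node to its type label (e.g. "drug")
    metapath  -- sequence of type labels; the first must match the start node
    The start node itself is excluded from the result.
    """
    if node_type[start] != metapath[0]:
        raise ValueError("meta-path must begin with the type of the start node")
    frontier = {start}
    for wanted in metapath[1:]:
        # Keep only neighbours whose type matches the next step of the meta-path.
        frontier = {
            nbr
            for node in frontier
            for nbr in adj.get(node, [])
            if node_type[nbr] == wanted
        }
    frontier.discard(start)
    return frontier
```

For example, the meta-path `("drug", "disease", "drug")` returns the drugs that share at least one disease with the start drug, while a longer meta-path yields a neighbor set at a larger scale.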


Subjects
Algorithms , Neural Networks, Computer , Computational Biology/methods , Drug Development/methods , Mental Recall
20.
Cell Biol Int ; 45(8): 1644-1653, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33760350

ABSTRACT

Overexpression of breast cancer resistance protein (BCRP) plays a crucial role in the acquired multidrug resistance (MDR) in breast cancer. The elucidation of molecular events that confer BCRP-mediated MDR is of major therapeutic importance in breast cancer. Epithelial cell adhesion molecule (EpCAM) has been implicated in tumor progression and drug resistance in various types of cancers, including breast cancer. However, the role of EpCAM in BCRP-mediated MDR in breast cancer remains unknown. In the present study, we revealed that EpCAM expression was upregulated in BCRP-overexpressing breast cancer MCF-7/MX cells, and EpCAM knockdown using siRNA reduced BCRP expression and increased the sensitivity of MCF-7/MX cells to mitoxantrone (MX). The epithelial-mesenchymal transition (EMT) promoted BCRP-mediated MDR in breast cancer cells, and EpCAM knockdown partially suppressed EMT progression in MCF-7/MX cells. In addition, Wnt/β-catenin signaling was activated in MCF-7/MX cells, and the inhibition of this signaling attenuated EpCAM and BCRP expression and partially reversed EMT. Together, this study illustrates that EpCAM upregulation by Wnt/β-catenin signaling induces partial EMT to promote BCRP-mediated MDR in breast cancer cells. EpCAM may be a potential therapeutic target for overcoming BCRP-mediated resistance in human breast cancer.


Subjects
ATP Binding Cassette Transporter, Subfamily G, Member 2/biosynthesis , Breast Neoplasms/metabolism , Drug Resistance, Multiple/physiology , Drug Resistance, Neoplasm/physiology , Epithelial Cell Adhesion Molecule/biosynthesis , Epithelial-Mesenchymal Transition/physiology , Neoplasm Proteins/biosynthesis , ATP Binding Cassette Transporter, Subfamily G, Member 2/genetics , Antineoplastic Agents/pharmacology , Breast Neoplasms/genetics , Cell Survival/drug effects , Cell Survival/physiology , Dose-Response Relationship, Drug , Drug Resistance, Multiple/drug effects , Drug Resistance, Neoplasm/drug effects , Epithelial Cell Adhesion Molecule/antagonists & inhibitors , Epithelial Cell Adhesion Molecule/genetics , Epithelial-Mesenchymal Transition/drug effects , Female , Humans , MCF-7 Cells , Mitoxantrone/pharmacology , Neoplasm Proteins/genetics , RNA, Small Interfering/administration & dosage