Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 81
Filtrar
1.
Brief Bioinform ; 25(1)2023 11 22.
Artigo em Inglês | MEDLINE | ID: mdl-38084923

RESUMO

The stability of the gut microenvironment is inextricably linked to human health, with the onset of many diseases accompanied by dysbiosis of the gut microbiota. It has been reported that there are differences in the microbial community composition between patients and healthy individuals, and many microbes are considered potential biomarkers. Accurately identifying these biomarkers can lead to more precise and reliable clinical decision-making. To improve the accuracy of microbial biomarker identification, this study introduces WSGMB, a computational framework that uses the relative abundance of microbial taxa and health status as inputs. This method has two main contributions: (1) viewing the microbial co-occurrence network as a weighted signed graph and applying graph convolutional neural network techniques for graph classification; (2) designing a new architecture to compute the role transitions of each microbial taxon between health and disease networks, thereby identifying disease-related microbial biomarkers. The weighted signed graph neural network enhances the quality of graph embeddings; quantifying the importance of microbes in different co-occurrence networks better identifies those microbes critical to health. Microbes are ranked according to their importance change scores, and when this score exceeds a set threshold, the microbe is considered a biomarker. This framework's identification performance is validated by comparing the biomarkers identified by WSGMB with actual microbial biomarkers associated with specific diseases from public literature databases. The study tests the proposed computational framework using actual microbial community data from colorectal cancer and Crohn's disease samples. It compares it with the most advanced microbial biomarker identification methods. The results show that the WSGMB method outperforms similar approaches in the accuracy of microbial biomarker identification.


Assuntos
Doença de Crohn , Microbioma Gastrointestinal , Microbiota , Humanos , Redes Neurais de Computação , Biomarcadores
2.
Brief Bioinform ; 24(6)2023 09 22.
Artigo em Inglês | MEDLINE | ID: mdl-37742053

RESUMO

Identifying the potential bacteriophages (phage) candidate to treat bacterial infections plays an essential role in the research of human pathogens. Computational approaches are recognized as a valid way to predict bacteria and target phages. However, most of the current methods only utilize lower-order biological information without considering the higher-order connectivity patterns, which helps to improve the predictive accuracy. Therefore, we developed a novel microbial heterogeneous interaction network (MHIN)-based model called PTBGRP to predict new phages for bacterial hosts. Specifically, PTBGRP first constructs an MHIN by integrating phage-bacteria interaction (PBI) and six bacteria-bacteria interaction networks with their biological attributes. Then, different representation learning methods are deployed to extract higher-level biological features and lower-level topological features from MHIN. Finally, PTBGRP employs a deep neural network as the classifier to predict unknown PBI pairs based on the fused biological information. Experiment results demonstrated that PTBGRP achieves the best performance on the corresponding ESKAPE pathogens and PBI dataset when compared with state-of-art methods. In addition, case studies of Klebsiella pneumoniae and Staphylococcus aureus further indicate that the consideration of rich heterogeneous information enables PTBGRP to accurately predict PBI from a more comprehensive perspective. The webserver of the PTBGRP predictor is freely available at http://120.77.11.78/PTBGRP/.


Assuntos
Bacteriófagos , Infecções Estafilocócicas , Humanos , Aprendizagem , Bactérias , Redes Neurais de Computação
3.
Brief Bioinform ; 24(1)2023 01 19.
Artigo em Inglês | MEDLINE | ID: mdl-36642408

RESUMO

Current machine learning-based methods have achieved inspiring predictions in the scenarios of mono-type and multi-type drug-drug interactions (DDIs), but they all ignore enhancive and depressive pharmacological changes triggered by DDIs. In addition, these pharmacological changes are asymmetric since the roles of two drugs in an interaction are different. More importantly, these pharmacological changes imply significant topological patterns among DDIs. To address the above issues, we first leverage Balance theory and Status theory in social networks to reveal the topological patterns among directed pharmacological DDIs, which are modeled as a signed and directed network. Then, we design a novel graph representation learning model named SGRL-DDI (social theory-enhanced graph representation learning for DDI) to realize the multitask prediction of DDIs. SGRL-DDI model can capture the task-joint information by integrating relation graph convolutional networks with Balance and Status patterns. Moreover, we utilize task-specific deep neural networks to perform two tasks, including the prediction of enhancive/depressive DDIs and the prediction of directed DDIs. Based on DDI entries collected from DrugBank, the superiority of our model is demonstrated by the comparison with other state-of-the-art methods. Furthermore, the ablation study verifies that Balance and Status patterns help characterize directed pharmacological DDIs, and that the joint of two tasks provides better DDI representations than individual tasks. Last, we demonstrate the practical effectiveness of our model by a version-dependent test, where 88.47 and 81.38% DDI out of newly added entries provided by the latest release of DrugBank are validated in two predicting tasks respectively.


Assuntos
Aprendizado de Máquina , Redes Neurais de Computação , Interações Medicamentosas
4.
BMC Biol ; 22(1): 156, 2024 Jul 18.
Artigo em Inglês | MEDLINE | ID: mdl-39020316

RESUMO

BACKGROUND: Identification of potential drug-target interactions (DTIs) with high accuracy is a key step in drug discovery and repositioning, especially concerning specific drug targets. Traditional experimental methods for identifying the DTIs are arduous, time-intensive, and financially burdensome. In addition, robust computational methods have been developed for predicting the DTIs and are widely applied in drug discovery research. However, advancing more precise algorithms for predicting DTIs is essential to meet the stringent standards demanded by drug discovery. RESULTS: We proposed a novel method called GSRF-DTI, which integrates networks with a deep learning algorithm to identify DTIs. Firstly, GSRF-DTI learned the embedding representation of drugs and targets by integrating multiple drug association information and target association information, respectively. Then, GSRF-DTI considered the influence of drug-target pair (DTP) association on DTI prediction to construct a drug-target pair network (DTP-NET). Next, we utilized GraphSAGE on DTP-NET to learn the potential features of the network and applied random forest (RF) to predict the DTIs. Furthermore, we conducted ablation experiments to validate the necessity of integrating different types of network features for identifying DTIs. It is worth noting that GSRF-DTI proposed three novel DTIs. CONCLUSIONS: GSRF-DTI not only considered the influence of the interaction relationship between drug and target but also considered the impact of DTP association relationship on DTI prediction. We initially use GraphSAGE to aggregate the neighbor information of nodes for better identification. Experimental analysis on Luo's dataset and the newly constructed dataset revealed that the GSRF-DTI framework outperformed several state-of-the-art methods significantly.


Assuntos
Descoberta de Drogas , Descoberta de Drogas/métodos , Aprendizado Profundo , Biologia Computacional/métodos , Algoritmos , Preparações Farmacêuticas
5.
Brief Bioinform ; 23(1)2022 01 17.
Artigo em Inglês | MEDLINE | ID: mdl-34471921

RESUMO

Graph is a natural data structure for describing complex systems, which contains a set of objects and relationships. Ubiquitous real-life biomedical problems can be modeled as graph analytics tasks. Machine learning, especially deep learning, succeeds in vast bioinformatics scenarios with data represented in Euclidean domain. However, rich relational information between biological elements is retained in the non-Euclidean biomedical graphs, which is not learning friendly to classic machine learning methods. Graph representation learning aims to embed graph into a low-dimensional space while preserving graph topology and node properties. It bridges biomedical graphs and modern machine learning methods and has recently raised widespread interest in both machine learning and bioinformatics communities. In this work, we summarize the advances of graph representation learning and its representative applications in bioinformatics. To provide a comprehensive and structured analysis and perspective, we first categorize and analyze both graph embedding methods (homogeneous graph embedding, heterogeneous graph embedding, attribute graph embedding) and graph neural networks. Furthermore, we summarize their representative applications from molecular level to genomics, pharmaceutical and healthcare systems level. Moreover, we provide open resource platforms and libraries for implementing these graph representation learning methods and discuss the challenges and opportunities of graph representation learning in bioinformatics. This work provides a comprehensive survey of emerging graph representation learning algorithms and their applications in bioinformatics. It is anticipated that it could bring valuable insights for researchers to contribute their knowledge to graph representation learning and future-oriented bioinformatics studies.


Assuntos
Biologia Computacional , Redes Neurais de Computação , Algoritmos , Biologia Computacional/métodos , Conhecimento , Aprendizado de Máquina
6.
Brief Bioinform ; 23(1)2022 01 17.
Artigo em Inglês | MEDLINE | ID: mdl-34891172

RESUMO

Identifying new indications for drugs plays an essential role at many phases of drug research and development. Computational methods are regarded as an effective way to associate drugs with new indications. However, most of them complete their tasks by constructing a variety of heterogeneous networks without considering the biological knowledge of drugs and diseases, which are believed to be useful for improving the accuracy of drug repositioning. To this end, a novel heterogeneous information network (HIN) based model, namely HINGRL, is proposed to precisely identify new indications for drugs based on graph representation learning techniques. More specifically, HINGRL first constructs a HIN by integrating drug-disease, drug-protein and protein-disease biological networks with the biological knowledge of drugs and diseases. Then, different representation strategies are applied to learn the features of nodes in the HIN from the topological and biological perspectives. Finally, HINGRL adopts a Random Forest classifier to predict unknown drug-disease associations based on the integrated features of drugs and diseases obtained in the previous step. Experimental results demonstrate that HINGRL achieves the best performance on two real datasets when compared with state-of-the-art models. Besides, our case studies indicate that the simultaneous consideration of network topology and biological knowledge of drugs and diseases allows HINGRL to precisely predict drug-disease associations from a more comprehensive perspective. The promising performance of HINGRL also reveals that the utilization of rich heterogeneous information provides an alternative view for HINGRL to identify novel drug-disease associations especially for new diseases.


Assuntos
Serviços de Informação , Aprendizado de Máquina , Preparações Farmacêuticas , Algoritmos , Biologia Computacional/métodos , Doença , Reposicionamento de Medicamentos/métodos , Humanos , Modelos Teóricos , Redes Neurais de Computação
7.
Brief Bioinform ; 23(2)2022 03 10.
Artigo em Inglês | MEDLINE | ID: mdl-35062020

RESUMO

Accurate prediction of atomic partial charges with high-level quantum mechanics (QM) methods suffers from high computational cost. Numerous feature-engineered machine learning (ML)-based predictors with favorable computability and reliability have been developed as alternatives. However, extensive expertise effort was needed for feature engineering of atom chemical environment, which may consequently introduce domain bias. In this study, SuperAtomicCharge, a data-driven deep graph learning framework, was proposed to predict three important types of partial charges (i.e. RESP, DDEC4 and DDEC78) derived from high-level QM calculations based on the structures of molecules. SuperAtomicCharge was designed to simultaneously exploit the 2D and 3D structural information of molecules, which was proved to be an effective way to improve the prediction accuracy of the model. Moreover, a simple transfer learning strategy and a multitask learning strategy based on self-supervised descriptors were also employed to further improve the prediction accuracy of the proposed model. Compared with the latest baselines, including one GNN-based predictor and two ML-based predictors, SuperAtomicCharge showed better performance on all the three external test sets and had better usability and portability. Furthermore, the QM partial charges of new molecules predicted by SuperAtomicCharge can be efficiently used in drug design applications such as structure-based virtual screening, where the predicted RESP and DDEC4 charges of new molecules showed more robust scoring and screening power than the commonly used partial charges. Finally, two tools including an online server (http://cadd.zju.edu.cn/deepchargepredictor) and the source code command lines (https://github.com/zjujdj/SuperAtomicCharge) were developed for the easy access of the SuperAtomicCharge services.


Assuntos
Aprendizado Profundo , Desenho de Fármacos , Aprendizado de Máquina , Reprodutibilidade dos Testes , Software
8.
Methods ; 211: 31-41, 2023 03.
Artigo em Inglês | MEDLINE | ID: mdl-36792041

RESUMO

Self-supervised learning has shown superior performance on graph-related tasks in recent years. The most advanced methods are based on contrast learning, which severely limited by structured data augmentation techniques and complex training methods. Generative self-supervised learning, especially graph autoencoders (GAEs), can prevent the above dependence and has been demonstrated as an effective approach. In addition, most previous works only reconstruct the graph topological structure or node features. Few works consider both and combine them together to obtain their complementary information. To overcome these problems, we propose a generative self-supervised graph representation learning methodology named Multi-View Dual-decoder Graph Autoencoder (MDGA). Specifically, we first design a multi-sample graph learning strategy which benefits the generalization of the dual-decoder graph autoencoder. Moreover, the proposed model reconstructs the graph topological structure with a traditional GAE and extracts node attributes by masked feature reconstruction. Experimental results on five public benchmark datasets demonstrate that MDGA outperforms state-of-the-art methods in both node classification and link prediction tasks.


Assuntos
Benchmarking , Galactosamina , Compostos de Dansil
9.
J Biomed Inform ; 151: 104616, 2024 03.
Artigo em Inglês | MEDLINE | ID: mdl-38423267

RESUMO

OBJECTIVE: This study aims to comprehensively review the use of graph neural networks (GNNs) for clinical risk prediction based on electronic health records (EHRs). The primary goal is to provide an overview of the state-of-the-art of this subject, highlighting ongoing research efforts and identifying existing challenges in developing effective GNNs for improved prediction of clinical risks. METHODS: A search was conducted in the Scopus, PubMed, ACM Digital Library, and Embase databases to identify relevant English-language papers that used GNNs for clinical risk prediction based on EHR data. The study includes original research papers published between January 2009 and May 2023. RESULTS: Following the initial screening process, 50 articles were included in the data collection. A significant increase in publications from 2020 was observed, with most selected papers focusing on diagnosis prediction (n = 36). The study revealed that the graph attention network (GAT) (n = 19) was the most prevalent architecture, and MIMIC-III (n = 23) was the most common data resource. CONCLUSION: GNNs are relevant tools for predicting clinical risk by accounting for the relational aspects among medical events and entities and managing large volumes of EHR data. Future studies in this area may address challenges such as EHR data heterogeneity, multimodality, and model interpretability, aiming to develop more holistic GNN models that can produce more accurate predictions, be effectively implemented in clinical settings, and ultimately improve patient care.


Assuntos
Registros Eletrônicos de Saúde , Idioma , Humanos , Coleta de Dados , Bases de Dados Factuais , Redes Neurais de Computação
10.
Sensors (Basel) ; 24(7)2024 Mar 26.
Artigo em Inglês | MEDLINE | ID: mdl-38610318

RESUMO

Sound classification plays a crucial role in enhancing the interpretation, analysis, and use of acoustic data, leading to a wide range of practical applications, of which environmental sound analysis is one of the most important. In this paper, we explore the representation of audio data as graphs in the context of sound classification. We propose a methodology that leverages pre-trained audio models to extract deep features from audio files, which are then employed as node information to build graphs. Subsequently, we train various graph neural networks (GNNs), specifically graph convolutional networks (GCNs), GraphSAGE, and graph attention networks (GATs), to solve multi-class audio classification problems. Our findings underscore the effectiveness of employing graphs to represent audio data. Moreover, they highlight the competitive performance of GNNs in sound classification endeavors, with the GAT model emerging as the top performer, achieving a mean accuracy of 83% in classifying environmental sounds and 91% in identifying the land cover of a site based on its audio recording. In conclusion, this study provides novel insights into the potential of graph representation learning techniques for analyzing audio data.

11.
BMC Bioinformatics ; 24(1): 162, 2023 Apr 21.
Artigo em Inglês | MEDLINE | ID: mdl-37085750

RESUMO

BACKGROUND: The identification of disease-related genes is of great significance for the diagnosis and treatment of human disease. Most studies have focused on developing efficient and accurate computational methods to predict disease-causing genes. Due to the sparsity and complexity of biomedical data, it is still a challenge to develop an effective multi-feature fusion model to identify disease genes. RESULTS: This paper proposes an approach to predict the pathogenic gene based on multi-head attention fusion (MHAGP). Firstly, the heterogeneous biological information networks of disease genes are constructed by integrating multiple biomedical knowledge databases. Secondly, two graph representation learning algorithms are used to capture the feature vectors of gene-disease pairs from the network, and the features are fused by introducing multi-head attention. Finally, multi-layer perceptron model is used to predict the gene-disease association. CONCLUSIONS: The MHAGP model outperforms all of other methods in comparative experiments. Case studies also show that MHAGP is able to predict genes potentially associated with diseases. In the future, more biological entity association data, such as gene-drug, disease phenotype-gene ontology and so on, can be added to expand the information in heterogeneous biological networks and achieve more accurate predictions. In addition, MHAGP with strong expansibility can be used for potential tasks such as gene-drug association and drug-disease association prediction.


Assuntos
Biologia Computacional , Redes Neurais de Computação , Humanos , Biologia Computacional/métodos , Algoritmos , Conhecimento
12.
Brief Bioinform ; 22(4)2021 07 20.
Artigo em Inglês | MEDLINE | ID: mdl-33276376

RESUMO

Disease-gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases needs statistical analysis of associations. Considering the huge number of possible mutations, in addition to its high cost, another important drawback of GWAS analysis is the large number of false positives. Thus, researchers search for more evidence to cross-check their results through different sources. To provide the researchers with alternative and complementary low-cost disease-gene association evidence, computational approaches come into play. Since molecular networks are able to capture complex interplay among molecules in diseases, they become one of the most extensively used data for disease-gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis on 14 state-of-the-art methods. To summarize, we first elucidate the task definition for disease gene prediction. Secondly, we categorize existing network-based efforts into network diffusion methods, traditional machine learning methods with handcrafted graph features and graph representation learning methods. Thirdly, an empirical analysis is conducted to evaluate the performance of the selected methods across seven diseases. We also provide distinguishing findings about the discussed methods based on our empirical analysis. Finally, we highlight potential research directions for future studies on disease gene prediction.


Assuntos
Bases de Dados de Ácidos Nucleicos , Doença/genética , Aprendizado de Máquina , Polimorfismo de Nucleotídeo Único , Estudo de Associação Genômica Ampla , Humanos
13.
Sensors (Basel) ; 23(8)2023 Apr 21.
Artigo em Inglês | MEDLINE | ID: mdl-37112507

RESUMO

Graphs are data structures that effectively represent relational data in the real world. Graph representation learning is a significant task since it could facilitate various downstream tasks, such as node classification, link prediction, etc. Graph representation learning aims to map graph entities to low-dimensional vectors while preserving graph structure and entity relationships. Over the decades, many models have been proposed for graph representation learning. This paper aims to show a comprehensive picture of graph representation learning models, including traditional and state-of-the-art models on various graphs in different geometric spaces. First, we begin with five types of graph embedding models: graph kernels, matrix factorization models, shallow models, deep-learning models, and non-Euclidean models. In addition, we also discuss graph transformer models and Gaussian embedding models. Second, we present practical applications of graph embedding models, from constructing graphs for specific domains to applying models to solve tasks. Finally, we discuss challenges for existing models and future research directions in detail. As a result, this paper provides a structured overview of the diversity of graph embedding models.

14.
Entropy (Basel) ; 25(4)2023 Mar 26.
Artigo em Inglês | MEDLINE | ID: mdl-37190356

RESUMO

The graph autoencoder (GAE) is a powerful graph representation learning tool in an unsupervised learning manner for graph data. However, most existing GAE-based methods typically focus on preserving the graph topological structure by reconstructing the adjacency matrix while ignoring the preservation of the attribute information of nodes. Thus, the node attributes cannot be fully learned and the ability of the GAE to learn higher-quality representations is weakened. To address the issue, this paper proposes a novel GAE model that preserves node attribute similarity. The structural graph and the attribute neighbor graph, which is constructed based on the attribute similarity between nodes, are integrated as the encoder input using an effective fusion strategy. In the encoder, the attributes of the nodes can be aggregated both in their structural neighborhood and by their attribute similarity in their attribute neighborhood. This allows performing the fusion of the structural and node attribute information in the node representation by sharing the same encoder. In the decoder module, the adjacency matrix and the attribute similarity matrix of the nodes are reconstructed using dual decoders. The cross-entropy loss of the reconstructed adjacency matrix and the mean-squared error loss of the reconstructed node attribute similarity matrix are used to update the model parameters and ensure that the node representation preserves the original structural and node attribute similarity information. Extensive experiments on three citation networks show that the proposed method outperforms state-of-the-art algorithms in link prediction and node clustering tasks.

15.
Entropy (Basel) ; 25(2)2023 Jan 31.
Artigo em Inglês | MEDLINE | ID: mdl-36832623

RESUMO

Understanding the evolutionary patterns of real-world complex systems such as human interactions, biological interactions, transport networks, and computer networks is important for our daily lives. Predicting future links among the nodes in these dynamic networks has many practical implications. This research aims to enhance our understanding of the evolution of networks by formulating and solving the link-prediction problem for temporal networks using graph representation learning as an advanced machine learning approach. Learning useful representations of nodes in these networks provides greater predictive power with less computational complexity and facilitates the use of machine learning methods. Considering that existing models fail to consider the temporal dimensions of the networks, this research proposes a novel temporal network-embedding algorithm for graph representation learning. This algorithm generates low-dimensional features from large, high-dimensional networks to predict temporal patterns in dynamic networks. The proposed algorithm includes a new dynamic node-embedding algorithm that exploits the evolving nature of the networks by considering a simple three-layer graph neural network at each time step and extracting node orientation by using Given's angle method. Our proposed temporal network-embedding algorithm, TempNodeEmb, is validated by comparing it to seven state-of-the-art benchmark network-embedding models. These models are applied to eight dynamic protein-protein interaction networks and three other real-world networks, including dynamic email networks, online college text message networks, and human real contact datasets. To improve our model, we have considered time encoding and proposed another extension to our model, TempNodeEmb++. The results show that our proposed models outperform the state-of-the-art models in most cases based on two evaluation metrics.

16.
Entropy (Basel) ; 25(2)2023 Feb 10.
Artigo em Inglês | MEDLINE | ID: mdl-36832697

RESUMO

Topological Data Analysis (TDA) is an approach to analyzing the shape of data using techniques from algebraic topology. The staple of TDA is Persistent Homology (PH). Recent years have seen a trend of combining PH and Graph Neural Networks (GNNs) in an end-to-end manner to capture topological features from graph data. Though effective, these methods are limited by the shortcomings of PH: incomplete topological information and irregular output format. Extended Persistent Homology (EPH), as a variant of PH, addresses these problems elegantly. In this paper, we propose a plug-in topological layer for GNNs, termed Topological Representation with Extended Persistent Homology (TREPH). Taking advantage of the uniformity of EPH, a novel aggregation mechanism is designed to collate topological features of different dimensions to the local positions determining their living processes. The proposed layer is provably differentiable and more expressive than PH-based representations, which in turn is strictly stronger than message-passing GNNs in expressive power. Experiments on real-world graph classification tasks demonstrate the competitiveness of TREPH compared with the state-of-the-art approaches.

17.
BMC Bioinformatics ; 23(Suppl 4): 560, 2022 Dec 23.
Artigo em Inglês | MEDLINE | ID: mdl-36564705

RESUMO

BACKGROUND: Anticancer peptide (ACP) inhibits and kills tumor cells. Research on ACP is of great significance for the development of new drugs, and the prediction of ACPs and non-ACPs is the new hotspot. RESULTS: We propose a new machine learning-based method named GCNCPR-ACPs (a Graph Convolutional Neural Network Method based on collapse pooling and residual network to predict the ACPs), which automatically and accurately predicts ACPs using residual graph convolution networks, differentiable graph pooling, and features extracted using peptide sequence information extraction. The GCNCPR-ACPs method can effectively capture different levels of node attributes for amino acid node representation learning, GCNCPR-ACPs uses node2vec and one-hot embedding methods to extract initial amino acid features for ACP prediction. CONCLUSIONS: Experimental results of ten-fold cross-validation and independent validation based on different metrics showed that GCNCPR-ACPs significantly outperformed state-of-the-art methods. Specifically, the evaluation indicators of Matthews Correlation Coefficient (MCC) and AUC of our predicator were 69.5% and 90%, respectively, which were 4.3% and 2% higher than those of the other predictors, respectively, in ten-fold cross-validation. And in the independent test, the scores of MCC and SP were 69.6% and 93.9%, respectively, which were 37.6% and 5.5% higher than those of the other predictors, respectively. The overall results showed that the GCNCPR-ACPs method proposed in the current paper can effectively predict ACPs.


Assuntos
Aminoácidos , Projetos de Pesquisa , Sequência de Aminoácidos , Benchmarking , Armazenamento e Recuperação da Informação
18.
BMC Bioinformatics ; 23(1): 9, 2022 Jan 04.
Artigo em Inglês | MEDLINE | ID: mdl-34983364

RESUMO

BACKGROUND: Drug-disease associations (DDAs) can provide important information for exploring the potential efficacy of drugs. However, up to now, there are still few DDAs verified by experiments. Previous evidence indicates that the combination of information would be conducive to the discovery of new DDAs. How to integrate different biological data sources and identify the most effective drugs for a certain disease based on drug-disease coupled mechanisms is still a challenging problem. RESULTS: In this paper, we proposed a novel computation model for DDA predictions based on graph representation learning over multi-biomolecular network (GRLMN). More specifically, we firstly constructed a large-scale molecular association network (MAN) by integrating the associations among drugs, diseases, proteins, miRNAs, and lncRNAs. Then, a graph embedding model was used to learn vector representations for all drugs and diseases in MAN. Finally, the combined features were fed to a random forest (RF) model to predict new DDAs. The proposed model was evaluated on the SCMFDD-S data set using five-fold cross-validation. Experiment results showed that GRLMN model was very accurate with the area under the ROC curve (AUC) of 87.9%, which outperformed all previous works in terms of both accuracy and AUC in benchmark dataset. To further verify the high performance of GRLMN, we carried out two case studies for two common diseases. As a result, in the ranking of drugs that were predicted to be related to certain diseases (such as kidney disease and fever), 15 of the top 20 drugs have been experimentally confirmed. CONCLUSIONS: The experimental results show that our model has good performance in the prediction of DDA. GRLMN is an effective prioritization tool for screening the reliable DDAs for follow-up studies concerning their participation in drug reposition.


Assuntos
MicroRNAs , Preparações Farmacêuticas , RNA Longo não Codificante , Área Sob a Curva , Humanos , MicroRNAs/metabolismo , Proteínas , RNA Longo não Codificante/metabolismo
19.
BMC Bioinformatics ; 23(1): 516, 2022 Dec 01.
Artigo em Inglês | MEDLINE | ID: mdl-36456957

RESUMO

BACKGROUND: Drug repositioning is a very important task that provides critical information for exploring the potential efficacy of drugs. Yet developing computational models that can effectively predict drug-disease associations (DDAs) is still a challenging task. Previous studies suggest that the accuracy of DDA prediction can be improved by integrating different types of biological features. But how to conduct an effective integration remains a challenging problem for accurately discovering new indications for approved drugs. METHODS: In this paper, we propose a novel meta-path based graph representation learning model, namely RLFDDA, to predict potential DDAs on heterogeneous biological networks. RLFDDA first calculates drug-drug similarities and disease-disease similarities as the intrinsic biological features of drugs and diseases. A heterogeneous network is then constructed by integrating DDAs, disease-protein associations and drug-protein associations. With such a network, RLFDDA adopts a meta-path random walk model to learn the latent representations of drugs and diseases, which are concatenated to construct joint representations of drug-disease associations. As the last step, we employ the random forest classifier to predict potential DDAs with their joint representations. RESULTS: To demonstrate the effectiveness of RLFDDA, we have conducted a series of experiments on two benchmark datasets by following a ten-fold cross-validation scheme. The results show that RLFDDA yields the best performance in terms of AUC and F1-score when compared with several state-of-the-art DDAs prediction models. We have also conducted a case study on two common diseases, i.e., paclitaxel and lung tumors, and found that 7 out of top-10 diseases and 8 out of top-10 drugs have already been validated for paclitaxel and lung tumors respectively with literature evidence. Hence, the promising performance of RLFDDA may provide a new perspective for novel DDAs discovery over heterogeneous networks.


Assuntos
Aprendizagem , Neoplasias Pulmonares , Humanos , Benchmarking , Descoberta de Drogas , Paclitaxel
20.
BMC Bioinformatics ; 23(1): 429, 2022 Oct 17.
Artigo em Inglês | MEDLINE | ID: mdl-36245002

RESUMO

BACKGROUND: Gene expression is regulated at different molecular levels, including chromatin accessibility, transcription, RNA maturation, and transport. These regulatory mechanisms have strong connections with cellular metabolism. In order to study the cellular system and its functioning, omics data at each molecular level can be generated and efficiently integrated. Here, we propose BRANENET, a novel multi-omics integration framework for multilayer heterogeneous networks. BRANENET is an expressive, scalable, and versatile method to learn node embeddings, leveraging random walk information within a matrix factorization framework. Our goal is to efficiently integrate multi-omics data to study different regulatory aspects of multilayered processes that occur in organisms. We evaluate our framework using multi-omics data of Saccharomyces cerevisiae, a well-studied yeast model organism. RESULTS: We test BRANENET on transcriptomics (RNA-seq) and targeted metabolomics (NMR) data for wild-type yeast strain during a heat-shock time course of 0, 20, and 120 min. Our framework learns features for differentially expressed bio-molecules showing heat stress response. We demonstrate the applicability of the learned features for targeted omics inference tasks: transcription factor (TF)-target prediction, integrated omics network (ION) inference, and module identification. The performance of BRANENET is compared to existing network integration methods. Our model outperforms baseline methods by achieving high prediction scores for a variety of downstream tasks.


Assuntos
RNA , Saccharomyces cerevisiae , Cromatina , RNA-Seq , Saccharomyces cerevisiae/genética , Saccharomyces cerevisiae/metabolismo , Fatores de Transcrição/genética , Fatores de Transcrição/metabolismo
SELEÇÃO DE REFERÊNCIAS
Detalhe da pesquisa