Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 264
Filtrar
Mais filtros

Base de dados
País/Região como assunto
Tipo de documento
Intervalo de ano de publicação
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38426324

RESUMO

Emerging clinical evidence suggests that sophisticated associations with circular ribonucleic acids (RNAs) (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor of various pathological processes and play a critical role in most intricate human diseases. Nonetheless, the above correlations via wet experiments are error-prone and labor-intensive, and the underlying novel circRNA-miRNA association (CMA) has been validated by numerous existing computational methods that rely only on single correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features and interaction behavior features by Word2vec and two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization, respectively. Multitudinous comprehensive experimental analysis revealed that BGF-CMAP successfully predicted the complex relationship between circRNAs and miRNAs with an accuracy of 82.90% and an area under receiver operating characteristic of 0.9075. Furthermore, 23 of the top 30 miRNA-associated circRNAs of the studies on data were confirmed in relevant experiences, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model to provide a scientific theoretical basis for the study of CMA prediction.


Assuntos
MicroRNAs , Humanos , MicroRNAs/genética , RNA Circular/genética , Curva ROC , Aprendizado de Máquina , Algoritmos , Biologia Computacional/métodos
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38324624

RESUMO

Connections between circular RNAs (circRNAs) and microRNAs (miRNAs) assume a pivotal position in the onset, evolution, diagnosis and treatment of diseases and tumors. Selecting the most potential circRNA-related miRNAs and taking advantage of them as the biological markers or drug targets could be conducive to dealing with complex human diseases through preventive strategies, diagnostic procedures and therapeutic approaches. Compared to traditional biological experiments, leveraging computational models to integrate diverse biological data in order to infer potential associations proves to be a more efficient and cost-effective approach. This paper developed a model of Convolutional Autoencoder for CircRNA-MiRNA Associations (CA-CMA) prediction. Initially, this model merged the natural language characteristics of the circRNA and miRNA sequence with the features of circRNA-miRNA interactions. Subsequently, it utilized all circRNA-miRNA pairs to construct a molecular association network, which was then fine-tuned by labeled samples to optimize the network parameters. Finally, the prediction outcome is obtained by utilizing the deep neural networks classifier. This model innovatively combines the likelihood objective that preserves the neighborhood through optimization, to learn the continuous feature representation of words and preserve the spatial information of two-dimensional signals. During the process of 5-fold cross-validation, CA-CMA exhibited exceptional performance compared to numerous prior computational approaches, as evidenced by its mean area under the receiver operating characteristic curve of 0.9138 and a minimal SD of 0.0024. Furthermore, recent literature has confirmed the accuracy of 25 out of the top 30 circRNA-miRNA pairs identified with the highest CA-CMA scores during case studies. The results of these experiments highlight the robustness and versatility of our model.


Assuntos
MicroRNAs , Neoplasias , Humanos , MicroRNAs/genética , RNA Circular/genética , Funções Verossimilhança , Redes Neurais de Computação , Neoplasias/genética , Biologia Computacional/métodos
3.
Brief Bioinform ; 24(1)2023 01 19.
Artigo em Inglês | MEDLINE | ID: mdl-36445194

RESUMO

piRNA and PIWI proteins have been confirmed for disease diagnosis and treatment as novel biomarkers due to its abnormal expression in various cancers. However, the current research is not strong enough to further clarify the functions of piRNA in cancer and its underlying mechanism. Therefore, how to provide large-scale and serious piRNA candidates for biological research has grown up to be a pressing issue. In this study, a novel computational model based on the structural perturbation method is proposed to predict potential disease-associated piRNAs, called SPRDA. Notably, SPRDA belongs to positive-unlabeled learning, which is unaffected by negative examples in contrast to previous approaches. In the 5-fold cross-validation, SPRDA shows high performance on the benchmark dataset piRDisease, with an AUC of 0.9529. Furthermore, the predictive performance of SPRDA for 10 diseases shows the robustness of the proposed method. Overall, the proposed approach can provide unique insights into the pathogenesis of the disease and will advance the field of oncology diagnosis and treatment.


Assuntos
Neoplasias , RNA de Interação com Piwi , Humanos , RNA Interferente Pequeno/genética , RNA Interferente Pequeno/metabolismo , Neoplasias/genética , Neoplasias/metabolismo
4.
Brief Bioinform ; 24(3)2023 05 19.
Artigo em Inglês | MEDLINE | ID: mdl-36971393

RESUMO

MOTIVATION: A large number of studies have shown that circular RNA (circRNA) affects biological processes by competitively binding miRNA, providing a new perspective for the diagnosis, and treatment of human diseases. Therefore, exploring the potential circRNA-miRNA interactions (CMIs) is an important and urgent task at present. Although some computational methods have been tried, their performance is limited by the incompleteness of feature extraction in sparse networks and the low computational efficiency of lengthy data. RESULTS: In this paper, we proposed JSNDCMI, which combines the multi-structure feature extraction framework and Denoising Autoencoder (DAE) to meet the challenge of CMI prediction in sparse networks. In detail, JSNDCMI integrates functional similarity and local topological structure similarity in the CMI network through the multi-structure feature extraction framework, then forces the neural network to learn the robust representation of features through DAE and finally uses the Gradient Boosting Decision Tree classifier to predict the potential CMIs. JSNDCMI produces the best performance in the 5-fold cross-validation of all data sets. In the case study, seven of the top 10 CMIs with the highest score were verified in PubMed. AVAILABILITY: The data and source code can be found at https://github.com/1axin/JSNDCMI.


Assuntos
MicroRNAs , Humanos , MicroRNAs/genética , RNA Circular , Redes Neurais de Computação , Software , Biologia Computacional/métodos
5.
BMC Bioinformatics ; 25(1): 6, 2024 Jan 02.
Artigo em Inglês | MEDLINE | ID: mdl-38166644

RESUMO

According to the expression of miRNA in pathological processes, miRNAs can be divided into oncogenes or tumor suppressors. Prediction of the regulation relations between miRNAs and small molecules (SMs) becomes a vital goal for miRNA-target therapy. But traditional biological approaches are laborious and expensive. Thus, there is an urgent need to develop a computational model. In this study, we proposed a computational model to predict whether the regulatory relationship between miRNAs and SMs is up-regulated or down-regulated. Specifically, we first use the Large-scale Information Network Embedding (LINE) algorithm to construct the node features from the self-similarity networks, then use the General Attributed Multiplex Heterogeneous Network Embedding (GATNE) algorithm to extract the topological information from the attribute network, and finally utilize the Light Gradient Boosting Machine (LightGBM) algorithm to predict the regulatory relationship between miRNAs and SMs. In the fivefold cross-validation experiment, the average accuracies of the proposed model on the SM2miR dataset reached 79.59% and 80.37% for up-regulation pairs and down-regulation pairs, respectively. In addition, we compared our model with another published model. Moreover, in the case study for 5-FU, 7 of 10 candidate miRNAs are confirmed by related literature. Therefore, we believe that our model can promote the research of miRNA-targeted therapy.


Assuntos
MicroRNAs , MicroRNAs/genética , MicroRNAs/metabolismo , Biologia Computacional , Algoritmos , Oncogenes
6.
Brief Bioinform ; 23(6)2022 11 19.
Artigo em Inglês | MEDLINE | ID: mdl-36384071

RESUMO

Emerging evidence suggests that circular RNA (circRNA) is an important regulator of a variety of pathological processes and serves as a promising biomarker for many complex human diseases. Nevertheless, there are relatively few known circRNA-disease associations, and uncovering new circRNA-disease associations by wet-lab methods is time consuming and costly. Considering the limitations of existing computational methods, we propose a novel approach named MNMDCDA, which combines high-order graph convolutional networks (high-order GCNs) and deep neural networks to infer associations between circRNAs and diseases. Firstly, we computed different biological attribute information of circRNA and disease separately and used them to construct multiple multi-source similarity networks. Then, we used the high-order GCN algorithm to learn feature embedding representations with high-order mixed neighborhood information of circRNA and disease from the constructed multi-source similarity networks, respectively. Finally, the deep neural network classifier was implemented to predict associations of circRNAs with diseases. The MNMDCDA model obtained AUC scores of 95.16%, 94.53%, 89.80% and 91.83% on four benchmark datasets, i.e., CircR2Disease, CircAtlas v2.0, Circ2Disease and CircRNADisease, respectively, using the 5-fold cross-validation approach. Furthermore, 25 of the top 30 circRNA-disease pairs with the best scores of MNMDCDA in the case study were validated by recent literature. Numerous experimental results indicate that MNMDCDA can be used as an effective computational tool to predict circRNA-disease associations and can provide the most promising candidates for biological experiments.


Assuntos
Redes Neurais de Computação , RNA Circular , Humanos , Algoritmos
7.
Brief Bioinform ; 23(1)2022 01 17.
Artigo em Inglês | MEDLINE | ID: mdl-34471921

RESUMO

Graph is a natural data structure for describing complex systems, which contains a set of objects and relationships. Ubiquitous real-life biomedical problems can be modeled as graph analytics tasks. Machine learning, especially deep learning, succeeds in vast bioinformatics scenarios with data represented in Euclidean domain. However, rich relational information between biological elements is retained in the non-Euclidean biomedical graphs, which is not learning friendly to classic machine learning methods. Graph representation learning aims to embed graph into a low-dimensional space while preserving graph topology and node properties. It bridges biomedical graphs and modern machine learning methods and has recently raised widespread interest in both machine learning and bioinformatics communities. In this work, we summarize the advances of graph representation learning and its representative applications in bioinformatics. To provide a comprehensive and structured analysis and perspective, we first categorize and analyze both graph embedding methods (homogeneous graph embedding, heterogeneous graph embedding, attribute graph embedding) and graph neural networks. Furthermore, we summarize their representative applications from molecular level to genomics, pharmaceutical and healthcare systems level. Moreover, we provide open resource platforms and libraries for implementing these graph representation learning methods and discuss the challenges and opportunities of graph representation learning in bioinformatics. This work provides a comprehensive survey of emerging graph representation learning algorithms and their applications in bioinformatics. It is anticipated that it could bring valuable insights for researchers to contribute their knowledge to graph representation learning and future-oriented bioinformatics studies.


Assuntos
Biologia Computacional , Redes Neurais de Computação , Algoritmos , Biologia Computacional/métodos , Conhecimento , Aprendizado de Máquina
8.
Brief Bioinform ; 23(1)2022 01 17.
Artigo em Inglês | MEDLINE | ID: mdl-34891172

RESUMO

Identifying new indications for drugs plays an essential role at many phases of drug research and development. Computational methods are regarded as an effective way to associate drugs with new indications. However, most of them complete their tasks by constructing a variety of heterogeneous networks without considering the biological knowledge of drugs and diseases, which are believed to be useful for improving the accuracy of drug repositioning. To this end, a novel heterogeneous information network (HIN) based model, namely HINGRL, is proposed to precisely identify new indications for drugs based on graph representation learning techniques. More specifically, HINGRL first constructs a HIN by integrating drug-disease, drug-protein and protein-disease biological networks with the biological knowledge of drugs and diseases. Then, different representation strategies are applied to learn the features of nodes in the HIN from the topological and biological perspectives. Finally, HINGRL adopts a Random Forest classifier to predict unknown drug-disease associations based on the integrated features of drugs and diseases obtained in the previous step. Experimental results demonstrate that HINGRL achieves the best performance on two real datasets when compared with state-of-the-art models. Besides, our case studies indicate that the simultaneous consideration of network topology and biological knowledge of drugs and diseases allows HINGRL to precisely predict drug-disease associations from a more comprehensive perspective. The promising performance of HINGRL also reveals that the utilization of rich heterogeneous information provides an alternative view for HINGRL to identify novel drug-disease associations especially for new diseases.


Assuntos
Serviços de Informação , Aprendizado de Máquina , Preparações Farmacêuticas , Algoritmos , Biologia Computacional/métodos , Doença , Reposicionamento de Medicamentos/métodos , Humanos , Modelos Teóricos , Redes Neurais de Computação
9.
Brief Bioinform ; 23(2)2022 03 10.
Artigo em Inglês | MEDLINE | ID: mdl-35079767

RESUMO

Numerous experiments have demonstrated that abnormal expression of microRNAs (miRNAs) in organisms is often accompanied by the emergence of specific diseases. The research of miRNAs can promote the prevention and drug research of specific diseases. However, there are still many undiscovered links between miRNAs and diseases, which greatly limits the research of miRNAs. Therefore, for exploring the unknown miRNA-disease associations, we combine the graph random propagation network based on DropFeature with attention network to propose a novel deep learning model to predict the miRNA-disease associations (GRPAMDA). Specifically, we firstly construct the miRNA-disease heterogeneous graph based on miRNA-disease association information. Secondly, we adopt DropFeature to randomly delete the features of nodes in the graph and then perform propagation operations to enhance the features of miRNA and disease nodes. Thirdly, we employ the attention mechanism to fuse the features of random propagation by aggregating the enhanced neighbor features of miRNA and disease nodes. Finally, miRNA-disease association scores are generated by a fully connected layer. The average area under the curve of GRPAMDA model based on 5-fold cross-validation is 93.46% on HMDD v2.0. Case studies of esophageal tumors, lymphomas and prostate tumors show that 48, 47 and 46 of the top 50 miRNAs associated with these diseases are confirmed by dbDEMC and miR2Disease database, respectively. In short, the GRPAMDA model can be used as a valuable method to study miRNA-disease associations.


Assuntos
MicroRNAs , Neoplasias da Próstata , Algoritmos , Biologia Computacional/métodos , Redes Reguladoras de Genes , Predisposição Genética para Doença , Humanos , Masculino , MicroRNAs/genética , MicroRNAs/metabolismo , Neoplasias da Próstata/genética
10.
Brief Bioinform ; 23(6)2022 11 19.
Artigo em Inglês | MEDLINE | ID: mdl-36198846

RESUMO

PIWI proteins and Piwi-Interacting RNAs (piRNAs) are commonly detected in human cancers, especially in germline and somatic tissues, and correlate with poorer clinical outcomes, suggesting that they play a functional role in cancer. As the problem of combinatorial explosions between ncRNA and disease exposes gradually, new bioinformatics methods for large-scale identification and prioritization of potential associations are therefore of interest. However, in the real world, the network of interactions between molecules is enormously intricate and noisy, which poses a problem for efficient graph mining. Line graphs can extend many heterogeneous networks to replace dichotomous networks. In this study, we present a new graph neural network framework, line graph attention networks (LGAT). And we apply it to predict PiRNA disease association (GAPDA). In the experiment, GAPDA performs excellently in 5-fold cross-validation with an AUC of 0.9038. Not only that, it still has superior performance compared with methods based on collaborative filtering and attribute features. The experimental results show that GAPDA ensures the prospect of the graph neural network on such problems and can be an excellent supplement for future biomedical research.


Assuntos
Proteínas Argonautas , Neoplasias , Humanos , RNA Interferente Pequeno/genética , RNA Interferente Pequeno/metabolismo , Proteínas Argonautas/genética , Proteínas Argonautas/metabolismo , Neoplasias/genética
11.
Brief Bioinform ; 23(3)2022 05 13.
Artigo em Inglês | MEDLINE | ID: mdl-35511108

RESUMO

MOTIVATION: Interaction between transcription factor (TF) and its target genes establishes the knowledge foundation for biological researches in transcriptional regulation, the number of which is, however, still limited by biological techniques. Existing computational methods relevant to the prediction of TF-target interactions are mostly proposed for predicting binding sites, rather than directly predicting the interactions. To this end, we propose here a graph attention-based autoencoder model to predict TF-target gene interactions using the information of the known TF-target gene interaction network combined with two sequential and chemical gene characters, considering that the unobserved interactions between transcription factors and target genes can be predicted by learning the pattern of the known ones. To the best of our knowledge, the proposed model is the first attempt to solve this problem by learning patterns from the known TF-target gene interaction network. RESULTS: In this paper, we formulate the prediction task of TF-target gene interactions as a link prediction problem on a complex knowledge graph and propose a deep learning model called GraphTGI, which is composed of a graph attention-based encoder and a bilinear decoder. We evaluated the prediction performance of the proposed method on a real dataset, and the experimental results show that the proposed model yields outstanding performance with an average AUC value of 0.8864 +/- 0.0057 in the 5-fold cross-validation. It is anticipated that the GraphTGI model can effectively and efficiently predict TF-target gene interactions on a large scale. AVAILABILITY: Python code and the datasets used in our studies are made available at https://github.com/YanghanWu/GraphTGI.


Assuntos
Redes Neurais de Computação
12.
Brief Bioinform ; 23(3)2022 05 13.
Artigo em Inglês | MEDLINE | ID: mdl-35323894

RESUMO

While the technologies of ribonucleic acid-sequence (RNA-seq) and transcript assembly analysis have continued to improve, a novel topology of RNA transcript was uncovered in the last decade and is called circular RNA (circRNA). Recently, researchers have revealed that they compete with messenger RNA (mRNA) and long noncoding for combining with microRNA in gene regulation. Therefore, circRNA was assumed to be associated with complex disease and discovering the relationship between them would contribute to medical research. However, the work of identifying the association between circRNA and disease in vitro takes a long time and usually without direction. During these years, more and more associations were verified by experiments. Hence, we proposed a computational method named identifying circRNA-disease association based on graph representation learning (iGRLCDA) for the prediction of the potential association of circRNA and disease, which utilized a deep learning model of graph convolution network (GCN) and graph factorization (GF). In detail, iGRLCDA first derived the hidden feature of known associations between circRNA and disease using the Gaussian interaction profile (GIP) kernel combined with disease semantic information to form a numeric descriptor. After that, it further used the deep learning model of GCN and GF to extract hidden features from the descriptor. Finally, the random forest classifier is introduced to identify the potential circRNA-disease association. The five-fold cross-validation of iGRLCDA shows strong competitiveness in comparison with other excellent prediction models at the gold standard data and achieved an average area under the receiver operating characteristic curve of 0.9289 and an area under the precision-recall curve of 0.9377. On reviewing the prediction results from the relevant literature, 22 of the top 30 predicted circRNA-disease associations were noted in recent published papers. These exceptional results make us believe that iGRLCDA can provide reliable circRNA-disease associations for medical research and reduce the blindness of wet-lab experiments.


Assuntos
MicroRNAs , RNA Circular , Algoritmos , Biologia Computacional/métodos , MicroRNAs/genética , Curva ROC
13.
Brief Bioinform ; 23(5)2022 09 20.
Artigo em Inglês | MEDLINE | ID: mdl-36070624

RESUMO

Drug-drug interactions (DDIs) prediction is a challenging task in drug development and clinical application. Due to the extremely large complete set of all possible DDIs, computer-aided DDIs prediction methods are getting lots of attention in the pharmaceutical industry and academia. However, most existing computational methods only use single perspective information and few of them conduct the task based on the biomedical knowledge graph (BKG), which can provide more detailed and comprehensive drug lateral side information flow. To this end, a deep learning framework, namely DeepLGF, is proposed to fully exploit BKG fusing local-global information to improve the performance of DDIs prediction. More specifically, DeepLGF first obtains chemical local information on drug sequence semantics through a natural language processing algorithm. Then a model of BFGNN based on graph neural network is proposed to extract biological local information on drug through learning embedding vector from different biological functional spaces. The global feature information is extracted from the BKG by our knowledge graph embedding method. In DeepLGF, for fusing local-global features well, we designed four aggregating methods to explore the most suitable ones. Finally, the advanced fusing feature vectors are fed into deep neural network to train and predict. To evaluate the prediction performance of DeepLGF, we tested our method in three prediction tasks and compared it with state-of-the-art models. In addition, case studies of three cancer-related and COVID-19-related drugs further demonstrated DeepLGF's superior ability for potential DDIs prediction. The webserver of the DeepLGF predictor is freely available at http://120.77.11.78/DeepLGF/.


Assuntos
Tratamento Farmacológico da COVID-19 , Reconhecimento Automatizado de Padrão , Interações Medicamentosas , Humanos , Bases de Conhecimento , Redes Neurais de Computação
14.
Brief Bioinform ; 23(5)2022 09 20.
Artigo em Inglês | MEDLINE | ID: mdl-36088547

RESUMO

A large amount of clinical evidence began to mount, showing that circular ribonucleic acids (RNAs; circRNAs) perform a very important function in complex diseases by participating in transcription and translation regulation of microRNA (miRNA) target genes. However, with strict high-throughput techniques based on traditional biological experiments and the conditions and environment, the association between circRNA and miRNA can be discovered to be labor-intensive, expensive, time-consuming, and inefficient. In this paper, we proposed a novel computational model based on Word2vec, Structural Deep Network Embedding (SDNE), Convolutional Neural Network and Deep Neural Network, which predicts the potential circRNA-miRNA associations, called Word2vec, SDNE, Convolutional Neural Network and Deep Neural Network (WSCD). Specifically, the WSCD model extracts attribute feature and behaviour feature by word embedding and graph embedding algorithm, respectively, and ultimately feed them into a feature fusion model constructed by combining Convolutional Neural Network and Deep Neural Network to deduce potential circRNA-miRNA interactions. The proposed method is proved on dataset and obtained a prediction accuracy and an area under the receiver operating characteristic curve of 81.61% and 0.8898, respectively, which is shown to have much higher accuracy than the state-of-the-art models and classifier models in prediction. In addition, 23 miRNA-related circular RNAs (circRNAs) from the top 30 were confirmed in relevant experiences. In these works, all results represent that WSCD would be a helpful supplementary reliable method for predicting potential miRNA-circRNA associations compared to wet laboratory experiments.


Assuntos
MicroRNAs , RNA Circular , Algoritmos , MicroRNAs/genética , Redes Neurais de Computação , Curva ROC
15.
Bioinformatics ; 39(2)2023 02 03.
Artigo em Inglês | MEDLINE | ID: mdl-36661313

RESUMO

MOTIVATION: In single-cell transcriptomics applications, effective identification of cell types in multicellular organisms and in-depth study of the relationships between genes has become one of the main goals of bioinformatics research. However, data heterogeneity and random noise pose significant difficulties for scRNA-seq data analysis. RESULTS: We have proposed an adversarial dense graph convolutional network architecture for single-cell classification. Specifically, to enhance the representation of higher-order features and the organic combination between features, dense connectivity mechanism and attention-based feature aggregation are introduced for feature learning in convolutional neural networks. To preserve the features of the original data, we use a feature reconstruction module to assist the goal of single-cell classification. In addition, HNNVAT uses virtual adversarial training to improve the generalization and robustness. Experimental results show that our model outperforms the existing classical methods in terms of classification accuracy on benchmark datasets. AVAILABILITY AND IMPLEMENTATION: The source code of HNNVAT is available at https://github.com/DisscLab/HNNVAT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Redes Neurais de Computação , Software , Benchmarking , Análise de Célula Única
16.
Bioinformatics ; 39(8)2023 08 01.
Artigo em Inglês | MEDLINE | ID: mdl-37505483

RESUMO

MOTIVATION: The task of predicting drug-target interactions (DTIs) plays a significant role in facilitating the development of novel drug discovery. Compared with laboratory-based approaches, computational methods proposed for DTI prediction are preferred due to their high-efficiency and low-cost advantages. Recently, much attention has been attracted to apply different graph neural network (GNN) models to discover underlying DTIs from heterogeneous biological information network (HBIN). Although GNN-based prediction methods achieve better performance, they are prone to encounter the over-smoothing simulation when learning the latent representations of drugs and targets with their rich neighborhood information in HBIN, and thereby reduce the discriminative ability in DTI prediction. RESULTS: In this work, an improved graph representation learning method, namely iGRLDTI, is proposed to address the above issue by better capturing more discriminative representations of drugs and targets in a latent feature space. Specifically, iGRLDTI first constructs an HBIN by integrating the biological knowledge of drugs and targets with their interactions. After that, it adopts a node-dependent local smoothing strategy to adaptively decide the propagation depth of each biomolecule in HBIN, thus significantly alleviating over-smoothing by enhancing the discriminative ability of feature representations of drugs and targets. Finally, a Gradient Boosting Decision Tree classifier is used by iGRLDTI to predict novel DTIs. Experimental results demonstrate that iGRLDTI yields better performance that several state-of-the-art computational methods on the benchmark dataset. Besides, our case study indicates that iGRLDTI can successfully identify novel DTIs with more distinguishable features of drugs and targets. AVAILABILITY AND IMPLEMENTATION: Python codes and dataset are available at https://github.com/stevejobws/iGRLDTI/.


Assuntos
Descoberta de Drogas , Redes Neurais de Computação , Simulação por Computador , Descoberta de Drogas/métodos , Interações Medicamentosas
17.
PLoS Comput Biol ; 19(6): e1011207, 2023 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-37339154

RESUMO

Interactions between transcription factor and target gene form the main part of gene regulation network in human, which are still complicating factors in biological research. Specifically, for nearly half of those interactions recorded in established database, their interaction types are yet to be confirmed. Although several computational methods exist to predict gene interactions and their type, there is still no method available to predict them solely based on topology information. To this end, we proposed here a graph-based prediction model called KGE-TGI and trained in a multi-task learning manner on a knowledge graph that we specially constructed for this problem. The KGE-TGI model relies on topology information rather than being driven by gene expression data. In this paper, we formulate the task of predicting interaction types of transcript factor and target genes as a multi-label classification problem for link types on a heterogeneous graph, coupled with solving another link prediction problem that is inherently related. We constructed a ground truth dataset as benchmark and evaluated the proposed method on it. As a result of the 5-fold cross experiments, the proposed method achieved average AUC values of 0.9654 and 0.9339 in the tasks of link prediction and link type classification, respectively. In addition, the results of a series of comparison experiments also prove that the introduction of knowledge information significantly benefits to the prediction and that our methodology achieve state-of-the-art performance in this problem.


Assuntos
Reconhecimento Automatizado de Padrão , Fatores de Transcrição , Humanos , Bases de Dados Factuais , Fatores de Transcrição/genética , Redes Reguladoras de Genes , Proteoma , Algoritmos , Biologia de Sistemas , Ontologia Genética
18.
J Chem Inf Model ; 64(1): 238-249, 2024 Jan 08.
Artigo em Inglês | MEDLINE | ID: mdl-38103039

RESUMO

Drug repositioning plays a key role in disease treatment. With the large-scale chemical data increasing, many computational methods are utilized for drug-disease association prediction. However, most of the existing models neglect the positive influence of non-Euclidean data and multisource information, and there is still a critical issue for graph neural networks regarding how to set the feature diffuse distance. To solve the problems, we proposed SiSGC, which makes full use of the biological knowledge information as initial features and learns the structure information from the constructed heterogeneous graph with the adaptive selection of the information diffuse distance. Then, the structural features are fused with the denoised similarity information and fed to the advanced classifier of CatBoost to make predictions. Three different data sets are used to confirm the robustness and generalization of SiSGC under two splitting strategies. Experiment results demonstrate that the proposed model achieves superior performance compared with the six leading methods and four variants. Our case study on breast neoplasms further indicates that SiSGC is trustworthy and robust yet simple. We also present four drugs for breast cancer treatment with high confidence and further give an explanation for demonstrating the rationality. There is no doubt that SiSGC can be used as a beneficial supplement for drug repositioning.


Assuntos
Reposicionamento de Medicamentos , Redes Neurais de Computação
19.
BMC Bioinformatics ; 24(1): 188, 2023 May 08.
Artigo em Inglês | MEDLINE | ID: mdl-37158823

RESUMO

BACKGROUND: The limited knowledge of miRNA-lncRNA interactions is considered as an obstruction of revealing the regulatory mechanism. Accumulating evidence on Human diseases indicates that the modulation of gene expression has a great relationship with the interactions between miRNAs and lncRNAs. However, such interaction validation via crosslinking-immunoprecipitation and high-throughput sequencing (CLIP-seq) experiments that inevitably costs too much money and time but with unsatisfactory results. Therefore, more and more computational prediction tools have been developed to offer many reliable candidates for a better design of further bio-experiments. METHODS: In this work, we proposed a novel link prediction model based on Gaussian kernel-based method and linear optimization algorithm for inferring miRNA-lncRNA interactions (GKLOMLI). Given an observed miRNA-lncRNA interaction network, the Gaussian kernel-based method was employed to output two similarity matrixes of miRNAs and lncRNAs. Based on the integrated matrix combined with similarity matrixes and the observed interaction network, a linear optimization-based link prediction model was trained for inferring miRNA-lncRNA interactions. RESULTS: To evaluate the performance of our proposed method, k-fold cross-validation (CV) and leave-one-out CV were implemented, in which each CV experiment was carried out 100 times on a training set generated randomly. The high area under the curves (AUCs) at 0.8623 ± 0.0027 (2-fold CV), 0.9053 ± 0.0017 (5-fold CV), 0.9151 ± 0.0013 (10-fold CV), and 0.9236 (LOO-CV), illustrated the precision and reliability of our proposed method. CONCLUSION: GKLOMLI with high performance is anticipated to be used to reveal underlying interactions between miRNA and their target lncRNAs, and deciphers the potential mechanisms of the complex diseases.


Assuntos
MicroRNAs , RNA Longo não Codificante , Humanos , RNA Longo não Codificante/genética , Reprodutibilidade dos Testes , Projetos de Pesquisa , Algoritmos , MicroRNAs/genética
20.
Brief Bioinform ; 22(4)2021 07 20.
Artigo em Inglês | MEDLINE | ID: mdl-34293850

RESUMO

Emerging evidence indicates that the abnormal expression of miRNAs involves in the evolution and progression of various human complex diseases. Identifying disease-related miRNAs as new biomarkers can promote the development of disease pathology and clinical medicine. However, designing biological experiments to validate disease-related miRNAs is usually time-consuming and expensive. Therefore, it is urgent to design effective computational methods for predicting potential miRNA-disease associations. Inspired by the great progress of graph neural networks in link prediction, we propose a novel graph auto-encoder model, named GAEMDA, to identify the potential miRNA-disease associations in an end-to-end manner. More specifically, the GAEMDA model applies a graph neural networks-based encoder, which contains aggregator function and multi-layer perceptron for aggregating nodes' neighborhood information, to generate the low-dimensional embeddings of miRNA and disease nodes and realize the effective fusion of heterogeneous information. Then, the embeddings of miRNA and disease nodes are fed into a bilinear decoder to identify the potential links between miRNA and disease nodes. The experimental results indicate that GAEMDA achieves the average area under the curve of $93.56\pm 0.44\%$ under 5-fold cross-validation. Besides, we further carried out case studies on colon neoplasms, esophageal neoplasms and kidney neoplasms. As a result, 48 of the top 50 predicted miRNAs associated with these diseases are confirmed by the database of differentially expressed miRNAs in human cancers and microRNA deregulation in human disease database, respectively. The satisfactory prediction performance suggests that GAEMDA model could serve as a reliable tool to guide the following researches on the regulatory role of miRNAs. Besides, the source codes are available at https://github.com/chimianbuhetang/GAEMDA.


Assuntos
Bases de Dados Genéticas , Regulação Neoplásica da Expressão Gênica , MicroRNAs , Modelos Genéticos , Neoplasias , Redes Neurais de Computação , RNA Neoplásico , Software , Humanos , MicroRNAs/biossíntese , MicroRNAs/genética , Neoplasias/genética , Neoplasias/metabolismo , RNA Neoplásico/biossíntese , RNA Neoplásico/genética
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA