Results 1 - 20 of 221
1.
Article in English | MEDLINE | ID: mdl-38917286

ABSTRACT

Uncovering novel drug-drug interactions (DDIs) plays a pivotal role in advancing drug development and improving clinical treatment. The outstanding effectiveness of graph neural networks (GNNs) has garnered significant interest in the field of DDI prediction. Consequently, there has been a notable surge in the development of network-based computational approaches for predicting DDIs. However, current approaches face limitations in capturing the spatial relationships between neighboring nodes and their higher-level features during the aggregation of neighbor representations. To address this issue, this study introduces a novel model, KGCNN, designed to comprehensively tackle DDI prediction tasks by considering spatial relationships between molecules within the biomedical knowledge graph (BKG). KGCNN is built upon a message-passing GNN framework, consisting of propagation and aggregation. In the context of the BKG, KGCNN governs the propagation of information based on semantic relationships, which determine the flow and exchange of information between different molecules. In contrast to traditional linear aggregators, KGCNN introduces a spatial-aware capsule aggregator, which effectively captures the spatial relationships among neighboring molecules and their higher-level features within the graph structure. The ultimate goal is to leverage these learned drug representations to predict potential DDIs. To evaluate the effectiveness of KGCNN, it undergoes testing on two datasets. Extensive experimental results demonstrate its superiority in DDI predictions and quantified performance.
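
The propagation and aggregation steps named in the abstract above can be illustrated with a minimal sketch. It shows only the traditional linear (weighted-mean) aggregator that KGCNN is contrasted with, not the capsule aggregator itself. The function name, the per-relation weights, and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def propagate_and_aggregate(features, triples, relation_weight):
    """One message-passing layer with a linear (weighted-mean) aggregator.

    features: dict node -> list[float]
    triples: iterable of (head, relation, tail) edges of the knowledge graph
    relation_weight: dict relation -> float (hypothetical per-relation score)
    """
    dim = len(next(iter(features.values())))
    summed = defaultdict(lambda: [0.0] * dim)
    total = defaultdict(float)

    # Propagation: each tail node sends its features to the head node,
    # scaled by the weight of the connecting relation.
    for head, relation, tail in triples:
        w = relation_weight[relation]
        for k, value in enumerate(features[tail]):
            summed[head][k] += w * value
        total[head] += w

    # Aggregation: weighted mean of the incoming messages, then averaged
    # with the node's own features. Nodes without messages are unchanged.
    updated = {}
    for node, own in features.items():
        if total[node] > 0:
            neigh = [s / total[node] for s in summed[node]]
            updated[node] = [(a + b) / 2 for a, b in zip(own, neigh)]
        else:
            updated[node] = list(own)
    return updated


# Toy example (assumed data): two drugs and one protein.
features = {"drugA": [1.0, 0.0], "drugB": [0.0, 1.0], "protein": [1.0, 1.0]}
triples = [("drugA", "targets", "protein"), ("drugA", "interacts", "drugB")]
weights = {"targets": 1.0, "interacts": 3.0}
layer_output = propagate_and_aggregate(features, triples, weights)
```

A linear aggregator of this kind collapses all neighbors into one averaged vector, which is the loss of spatial information the abstract says the capsule aggregator is meant to avoid.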

2.
Comput Biol Med ; 177: 108642, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38820777

ABSTRACT

BACKGROUND: Drug-drug interaction events influence the effectiveness of drug combinations and can lead to unexpected side effects or exacerbate underlying diseases, jeopardizing patient prognosis. Most existing methods are restricted to predicting whether two drugs interact or the type of drug-drug interactions, while very few studies endeavor to predict the specific risk levels of side effects of drug combinations. METHODS: In this study, we propose MathEagle, a novel approach to predict accurate risk levels of drug combinations based on multi-head attention and heterogeneous attribute graph learning. Initially, we model drugs and three distinct risk levels between drugs as a heterogeneous information graph. Subsequently, behavioral and chemical structure features of drugs are utilized by message passing neural networks and graph embedding algorithms, respectively. Ultimately, MathEagle employs heterogeneous graph convolution and multi-head attention mechanisms to learn efficient latent representations of drug nodes and estimates the risk levels of pairwise drugs in an end-to-end manner. RESULTS: To assess the effectiveness and robustness of the model, five-fold cross-validation, ablation experiments, and case studies were conducted. MathEagle achieved an accuracy of 85.85 % and an AUC of 0.9701 on the drug risk level prediction task and is superior to all comparative models. The MathEagle predictor is freely accessible at http://120.77.11.78/MathEagle/. CONCLUSIONS: The experimental results indicate that MathEagle can function as an effective tool for predicting accurate risk of drug combinations, aiding in guiding clinical medication, and enhancing patient outcomes.


Subject(s)
Drug Interactions , Humans , Algorithms , Neural Networks, Computer , Drug-Related Side Effects and Adverse Reactions , Machine Learning
3.
IEEE J Biomed Health Inform ; 28(7): 4281-4294, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38557614

ABSTRACT

As post-transcriptional regulators of gene expression, micro-ribonucleic acids (miRNAs) are regarded as potential biomarkers for a variety of diseases. Hence, the prediction of miRNA-disease associations (MDAs) is of great significance for an in-depth understanding of disease pathogenesis and progression. Existing prediction models concentrate mainly on incorporating different sources of biological information to perform the MDA prediction task while failing to consider the full potential utility of MDA network information at the motif level. To overcome this problem, we propose a novel motif-aware MDA prediction model, namely MotifMDA, by fusing a variety of high- and low-order structural information. In particular, we first design several motifs of interest considering their ability to characterize how miRNAs are associated with diseases through different network structural patterns. Then, MotifMDA adopts a two-layer hierarchical attention to identify novel MDAs. Specifically, the first attention layer learns high-order motif preferences based on their occurrences in the given MDA network, while the second one learns the final embeddings of miRNAs and diseases by coupling high- and low-order preferences. Experimental results on two benchmark datasets have demonstrated the superior performance of MotifMDA over several state-of-the-art prediction models. This strongly indicates that accurate MDA prediction can be achieved by relying solely on MDA network information. Furthermore, our case studies indicate that the incorporation of motif-level structure information allows MotifMDA to discover novel MDAs from different perspectives.


Subject(s)
Computational Biology , MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Genetic Predisposition to Disease/genetics , Algorithms
4.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426324

ABSTRACT

Emerging clinical evidence suggests that sophisticated associations between circular ribonucleic acids (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor in various pathological processes and play a critical role in most intricate human diseases. Nonetheless, identifying these associations through wet-lab experiments is error-prone and labor-intensive, and numerous existing computational methods for validating circRNA-miRNA associations (CMAs) rely only on single correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features with Word2vec, and interaction behavior features with two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization. Comprehensive experimental analysis showed that BGF-CMAP predicted the complex relationship between circRNAs and miRNAs with an accuracy of 82.90% and an area under the receiver operating characteristic curve of 0.9075. Furthermore, 23 of the top 30 predicted miRNA-associated circRNAs were confirmed in relevant experiments, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model to provide a scientific theoretical basis for the study of CMA prediction.


Subject(s)
MicroRNAs , Humans , MicroRNAs/genetics , RNA, Circular/genetics , ROC Curve , Machine Learning , Algorithms , Computational Biology/methods
5.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38555472

ABSTRACT

Predicting interactions between microbes and hosts plays critical roles in microbiome population genetics and microbial ecology and evolution. How to systematically characterize the sophisticated mechanisms and signal interplay between microbes and hosts is a significant challenge for global health risks. Identifying microbe-host interactions (MHIs) can not only provide helpful insights into their fundamental regulatory mechanisms, but also facilitate the development of targeted therapies for microbial infections. In recent years, computational methods have become an appealing alternative due to the high risk and cost of wet-lab experiments. Therefore, in this study, we utilized rich microbial metagenomic information to construct a novel heterogeneous microbial network (HMN)-based model named KGVHI to predict candidate microbes for target hosts. Specifically, KGVHI first built a HMN by integrating human proteins, viruses and pathogenic bacteria with their biological attributes. Then KGVHI adopted a knowledge graph embedding strategy to capture the global topological structure information of the whole network. A natural language processing algorithm is used to extract the local biological attribute information from the nodes in HMN. Finally, we combined the local and global information and fed it into a blended deep neural network (DNN) for training and prediction. Compared to state-of-the-art methods, the comprehensive experimental results show that our model can obtain excellent results on the corresponding three MHI datasets. Furthermore, we also conducted two pathogenic bacteria case studies to further indicate that KGVHI has excellent predictive capabilities for potential MHI pairs.


Subject(s)
Deep Learning , Humans , Pattern Recognition, Automated , Neural Networks, Computer , Algorithms , Bacteria
6.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38324624

ABSTRACT

Connections between circular RNAs (circRNAs) and microRNAs (miRNAs) assume a pivotal position in the onset, evolution, diagnosis and treatment of diseases and tumors. Selecting the most potential circRNA-related miRNAs and taking advantage of them as the biological markers or drug targets could be conducive to dealing with complex human diseases through preventive strategies, diagnostic procedures and therapeutic approaches. Compared to traditional biological experiments, leveraging computational models to integrate diverse biological data in order to infer potential associations proves to be a more efficient and cost-effective approach. This paper developed a model of Convolutional Autoencoder for CircRNA-MiRNA Associations (CA-CMA) prediction. Initially, this model merged the natural language characteristics of the circRNA and miRNA sequence with the features of circRNA-miRNA interactions. Subsequently, it utilized all circRNA-miRNA pairs to construct a molecular association network, which was then fine-tuned by labeled samples to optimize the network parameters. Finally, the prediction outcome is obtained by utilizing the deep neural networks classifier. This model innovatively combines the likelihood objective that preserves the neighborhood through optimization, to learn the continuous feature representation of words and preserve the spatial information of two-dimensional signals. During the process of 5-fold cross-validation, CA-CMA exhibited exceptional performance compared to numerous prior computational approaches, as evidenced by its mean area under the receiver operating characteristic curve of 0.9138 and a minimal SD of 0.0024. Furthermore, recent literature has confirmed the accuracy of 25 out of the top 30 circRNA-miRNA pairs identified with the highest CA-CMA scores during case studies. The results of these experiments highlight the robustness and versatility of our model.


Subject(s)
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , RNA, Circular/genetics , Likelihood Functions , Neural Networks, Computer , Neoplasms/genetics , Computational Biology/methods
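
The evaluation reported in the abstract above (a mean AUC with its SD over 5-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, the AUC is computed by direct pairwise comparison, and the sample standard deviation is used because the abstract does not say which SD estimator was applied.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_results):
    """Return (mean AUC, sample SD) over held-out folds.

    fold_results: list of (labels, scores) pairs, one pair per fold.
    """
    aucs = [roc_auc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), stdev(aucs)


# Toy example with two folds (assumed data, not from the paper).
folds = [
    ([1, 0], [0.8, 0.2]),
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
]
mean_auc, sd_auc = summarize_folds(folds)
```

A small SD across folds, as reported in the abstract, indicates that the score does not depend strongly on which samples were held out.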
7.
BMC Bioinformatics ; 25(1): 6, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38166644

ABSTRACT

According to the expression of miRNA in pathological processes, miRNAs can be divided into oncogenes or tumor suppressors. Prediction of the regulation relations between miRNAs and small molecules (SMs) becomes a vital goal for miRNA-target therapy. But traditional biological approaches are laborious and expensive. Thus, there is an urgent need to develop a computational model. In this study, we proposed a computational model to predict whether the regulatory relationship between miRNAs and SMs is up-regulated or down-regulated. Specifically, we first use the Large-scale Information Network Embedding (LINE) algorithm to construct the node features from the self-similarity networks, then use the General Attributed Multiplex Heterogeneous Network Embedding (GATNE) algorithm to extract the topological information from the attribute network, and finally utilize the Light Gradient Boosting Machine (LightGBM) algorithm to predict the regulatory relationship between miRNAs and SMs. In the fivefold cross-validation experiment, the average accuracies of the proposed model on the SM2miR dataset reached 79.59% and 80.37% for up-regulation pairs and down-regulation pairs, respectively. In addition, we compared our model with another published model. Moreover, in the case study for 5-FU, 7 of 10 candidate miRNAs are confirmed by related literature. Therefore, we believe that our model can promote the research of miRNA-targeted therapy.


Subject(s)
MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , Computational Biology , Algorithms , Oncogenes
8.
J Chem Inf Model ; 64(1): 238-249, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38103039

ABSTRACT

Drug repositioning plays a key role in disease treatment. With the growth of large-scale chemical data, many computational methods are used for drug-disease association prediction. However, most of the existing models neglect the positive influence of non-Euclidean data and multisource information, and how to set the feature diffusion distance remains a critical issue for graph neural networks. To solve these problems, we propose SiSGC, which makes full use of biological knowledge as initial features and learns structural information from the constructed heterogeneous graph with adaptive selection of the information diffusion distance. Then, the structural features are fused with the denoised similarity information and fed to the CatBoost classifier to make predictions. Three different data sets are used to confirm the robustness and generalization of SiSGC under two splitting strategies. Experimental results demonstrate that the proposed model achieves superior performance compared with six leading methods and four variants. Our case study on breast neoplasms further indicates that SiSGC is trustworthy and robust yet simple. We also present four drugs for breast cancer treatment with high confidence and explain the rationale for them. There is no doubt that SiSGC can be used as a beneficial supplement for drug repositioning.


Subject(s)
Drug Repositioning , Neural Networks, Computer
9.
IEEE J Biomed Health Inform ; 28(3): 1742-1751, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38127594

ABSTRACT

A growing number of studies reveal that circular RNAs (circRNAs) are broadly engaged in the physiological processes of cell proliferation, differentiation, aging, and apoptosis, and are closely associated with the pathogenesis of numerous diseases. Clarifying the correlation between diseases and circRNAs is of great clinical importance for providing new therapeutic strategies for complex diseases. However, previous circRNA-disease association (CDA) prediction methods rely excessively on the graph network, and model performance is dramatically reduced when noisy connections occur in the graph structure. To address this problem, this paper proposes an unsupervised deep graph structure learning method, GSLCDA, to predict potential CDAs. Concretely, we first integrate circRNA and disease multi-source data to constitute the CDA heterogeneous network. Then the network topology is learned using the graph structure, and the original graph is enhanced in an unsupervised manner by maximizing the inter information of the learned and original graphs to uncover their essential features. Finally, a graph space sensitive k-nearest neighbor (KNN) algorithm is employed to search for latent CDAs. On the benchmark dataset, GSLCDA obtained 92.67% accuracy with 0.9279 AUC. GSLCDA also exhibits exceptional performance on independent datasets. Furthermore, 14, 12 and 14 of the top 16 circRNAs with the highest GSLCDA prediction scores were confirmed in the relevant literature in the breast cancer, colorectal cancer and lung cancer case studies, respectively. These results demonstrate that GSLCDA can reliably reveal underlying CDAs and offer new perspectives for the diagnosis and therapy of complex human diseases.


Subject(s)
Breast Neoplasms , Lung Neoplasms , Humans , Female , RNA, Circular/genetics , Breast Neoplasms/genetics , Algorithms , Aging , Computational Biology/methods
10.
IEEE J Biomed Health Inform ; 28(3): 1752-1761, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38145538

ABSTRACT

A growing body of evidence establishes that circular RNAs (circRNAs) are widely present in eukaryotic cells and contribute significantly to the occurrence and development of many complex human diseases. Disease-associated circRNAs can serve as clinical diagnostic biomarkers and therapeutic targets, providing novel ideas for biopharmaceutical research. However, available computational methods for predicting circRNA-disease associations (CDAs) do not sufficiently consider the contextual information of biological network nodes, which limits their performance. In this work, we propose a multi-hop attention graph neural network-based approach, MAGCDA, to infer potential CDAs. Specifically, we first construct a multi-source attribute heterogeneous network of circRNAs and diseases, then use a multi-hop strategy of graph nodes to deeply aggregate node context information through attention diffusion, thus enhancing topological structure information and mining hidden data features, and finally use random forest to accurately infer potential CDAs. On the four gold standard data sets, MAGCDA achieved prediction accuracies of 92.58%, 91.42%, 83.46% and 91.12%, respectively. MAGCDA also performed strongly in ablation experiments and in comparisons with other models. Additionally, 18 and 17 of the 20 circRNAs with the highest MAGCDA prediction scores were confirmed in case studies of the complex diseases breast cancer and Alzheimer's disease, respectively. These results suggest that MAGCDA can be a practical tool to explore potential disease-associated circRNAs and provide a theoretical basis for disease diagnosis and treatment.


Subject(s)
Breast Neoplasms , RNA, Circular , Humans , Female , RNA, Circular/genetics , Neural Networks, Computer , Biomarkers , Computational Biology/methods
11.
Commun Biol ; 6(1): 1268, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38097699

ABSTRACT

Recent developments in single-cell technology have enabled the exploration of cellular heterogeneity at an unprecedented level, providing invaluable insights into various fields, including medicine and disease research. Cell type annotation is an essential step in single-cell omics research. The mainstream approach is to use well-annotated single-cell data for supervised learning to annotate the cell types of new single-cell data. However, existing methods lack good generalization and robustness in cell annotation tasks, partly because of difficulties in dealing with technical differences between datasets and partly because they do not consider the heterogeneous associations of genes at the level of regulatory mechanisms. Here, we propose the scPML model, which utilizes various gene signaling pathway data to partition the genetic features of cells, thus characterizing different interaction maps between cells. Extensive experiments demonstrate that scPML performs better in cell type annotation and detection of unknown cell types from different species, platforms, and tissues.


Subject(s)
Medicine , Single-Cell Gene Expression Analysis , Signal Transduction , Technology
12.
Comput Biol Med ; 165: 107421, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37672925

ABSTRACT

MOTIVATION: Accumulating clinical evidence shows that circular RNA (circRNA) plays an important regulatory role in the occurrence and development of human diseases, which is expected to provide a new perspective for the diagnosis and treatment of related diseases. Using computational methods can provide high probability preselection for wet experiments to save resources. However, due to the lack of neighborhood structure in sparse biological networks, the model based on network embedding and graph embedding is difficult to achieve ideal results. RESULTS: In this paper, we propose BioDGW-CMI, which combines biological text mining and wavelet diffusion-based sparse network structure embedding to predict circRNA-miRNA interaction (CMI). In detail, BioDGW-CMI first uses the Bidirectional Encoder Representations from Transformers (BERT) for biological text mining to mine hidden features in RNA sequences, then constructs a CMI network, obtains the topological structure embedding of nodes in the network through heat wavelet diffusion patterns. Next, the Denoising autoencoder organically combines the structural features and Gaussian kernel similarity, finally, the feature is sent to lightGBM for training and prediction. BioDGW-CMI achieves the highest prediction performance in all three datasets in the field of CMI prediction. In the case study, all the 8 pairs of CMI based on circ-ITCH were successfully predicted. AVAILABILITY: The data and source code can be found at https://github.com/1axin/BioDGW-CMI-model.

13.
Brief Bioinform ; 24(6)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37742053

ABSTRACT

Identifying the potential bacteriophages (phage) candidate to treat bacterial infections plays an essential role in the research of human pathogens. Computational approaches are recognized as a valid way to predict bacteria and target phages. However, most of the current methods only utilize lower-order biological information without considering the higher-order connectivity patterns, which helps to improve the predictive accuracy. Therefore, we developed a novel microbial heterogeneous interaction network (MHIN)-based model called PTBGRP to predict new phages for bacterial hosts. Specifically, PTBGRP first constructs an MHIN by integrating phage-bacteria interaction (PBI) and six bacteria-bacteria interaction networks with their biological attributes. Then, different representation learning methods are deployed to extract higher-level biological features and lower-level topological features from MHIN. Finally, PTBGRP employs a deep neural network as the classifier to predict unknown PBI pairs based on the fused biological information. Experiment results demonstrated that PTBGRP achieves the best performance on the corresponding ESKAPE pathogens and PBI dataset when compared with state-of-art methods. In addition, case studies of Klebsiella pneumoniae and Staphylococcus aureus further indicate that the consideration of rich heterogeneous information enables PTBGRP to accurately predict PBI from a more comprehensive perspective. The webserver of the PTBGRP predictor is freely available at http://120.77.11.78/PTBGRP/.


Subject(s)
Bacteriophages , Staphylococcal Infections , Humans , Learning , Bacteria , Neural Networks, Computer
14.
Brief Funct Genomics ; 2023 Aug 03.
Article in English | MEDLINE | ID: mdl-37539561

ABSTRACT

Recently, the role of competing endogenous RNAs in regulating gene expression through the interaction of microRNAs has been closely associated with the expression of circular RNAs (circRNAs) in various biological processes such as reproduction and apoptosis. While the number of confirmed circRNA-miRNA interactions (CMIs) continues to increase, the conventional in vitro approaches for discovery are expensive, labor intensive, and time consuming. Therefore, there is an urgent need for effective prediction of potential CMIs through appropriate data modeling and prediction based on known information. In this study, we proposed a novel model, called DeepCMI, that utilizes multi-source information on circRNA/miRNA to predict potential CMIs. Comprehensive evaluations on the CMI-9905 and CMI-9589 datasets demonstrated that DeepCMI successfully infers potential CMIs. Specifically, DeepCMI achieved AUC values of 90.54% and 94.8% on the CMI-9905 and CMI-9589 datasets, respectively. These results suggest that DeepCMI is an effective model for predicting potential CMIs and has the potential to significantly reduce the need for downstream in vitro studies. To facilitate the use of our trained model and data, we have constructed a computational platform, which is available at http://120.77.11.78/DeepCMI/. The source code and datasets used in this work are available at https://github.com/LiYuechao1998/DeepCMI.

15.
J Chem Inf Model ; 63(16): 5384-5394, 2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37535872

ABSTRACT

More and more evidence suggests that circRNA plays a vital role in generating and treating diseases by interacting with miRNA. Therefore, accurate prediction of potential circRNA-miRNA interaction (CMI) has become urgent. However, traditional wet experiments are time-consuming and costly, and the results will be affected by objective factors. In this paper, we propose a computational model BCMCMI, which combines three features to predict CMI. Specifically, BCMCMI utilizes the bidirectional encoding capability of the BERT algorithm to extract sequence features from the semantic information of circRNA and miRNA. Then, a heterogeneous network is constructed based on cosine similarity and known CMI information. The Metapath2vec is employed to conduct random walks following meta-paths in the network to capture topological features, including similarity features. Finally, potential CMIs are predicted using the XGBoost classifier. BCMCMI achieves superior results compared to other state-of-the-art models on two benchmark datasets for CMI prediction. We also utilize t-SNE to visually observe the distribution of the extracted features on a randomly selected dataset. The remarkable prediction results show that BCMCMI can serve as a valuable complement to the wet experiment process.


Subject(s)
MicroRNAs , MicroRNAs/genetics , RNA, Circular , Semantics , Algorithms , Computational Biology/methods
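
The meta-path-guided random walk mentioned in the abstract above (the sampling step behind Metapath2vec) can be sketched as follows. This is a simplified illustration, not the BCMCMI implementation: the function name, the cyclic encoding of the meta-path, and the toy network are assumptions, and the embedding training that normally follows the walks is omitted.

```python
import random


def metapath_walk(neighbors, node_type, start, metapath, walk_length, rng):
    """Random walk that only steps to neighbors whose type matches the meta-path.

    neighbors: dict node -> list of adjacent nodes
    node_type: dict node -> type label, e.g. "circRNA" or "miRNA"
    metapath: cyclic type pattern, e.g. ["circRNA", "miRNA"] for
              circRNA-miRNA-circRNA-...
    rng: a random.Random instance, passed in so walks are reproducible
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node must match the first meta-path type")
    walk = [start]
    while len(walk) < walk_length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in neighbors.get(walk[-1], []) if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


# Toy heterogeneous network (assumed data): two circRNAs and two miRNAs.
neighbors = {
    "c1": ["m1", "m2"],
    "c2": ["m1"],
    "m1": ["c1", "c2"],
    "m2": ["c1"],
}
node_type = {"c1": "circRNA", "c2": "circRNA", "m1": "miRNA", "m2": "miRNA"}
walk = metapath_walk(neighbors, node_type, "c1", ["circRNA", "miRNA"], 5, random.Random(0))
```

In Metapath2vec, many such walks are collected and treated like sentences for a skip-gram model, so that nodes appearing in similar walk contexts receive similar embeddings.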
16.
iScience ; 26(8): 107478, 2023 Aug 18.
Article in English | MEDLINE | ID: mdl-37583550

ABSTRACT

Circular RNA (circRNA) plays an important role in the diagnosis, treatment, and prognosis of human diseases. The discovery of potential circRNA-miRNA interactions (CMI) is of guiding significance for subsequent biological experiments. Limited by the small amount of experimentally supported data and high randomness, existing models are difficult to accomplish the CMI prediction task based on real cases. In this paper, we propose KS-CMI, a novel method for effectively accomplishing CMI prediction in real cases. KS-CMI enriches the 'behavior relationships' of molecules by constructing circRNA-miRNA-cancer (CMCI) networks and extracts the behavior relationship attribute of molecules based on balance theory. Next, the denoising autoencoder (DAE) is used to enhance the feature representation of molecules. Finally, the CatBoost classifier was used for prediction. KS-CMI achieved the most reliable prediction results in real cases and achieved competitive performance in all datasets in the CMI prediction.

17.
Comput Struct Biotechnol J ; 21: 3404-3413, 2023.
Article in English | MEDLINE | ID: mdl-37397626

ABSTRACT

Emerging evidence suggests that due to the misuse of antibiotics, bacteriophage (phage) therapy has been recognized as one of the most promising strategies for treating human diseases infected by antibiotic-resistant bacteria. Identification of phage-host interactions (PHIs) can help to explore the mechanisms of bacterial response to phages and provide new insights into effective therapeutic approaches. Compared to conventional wet-lab experiments, computational models for predicting PHIs can not only save time and cost, but also be more efficient and economical. In this study, we developed a deep learning predictive framework called GSPHI to identify potential phage and target bacterium pairs through DNA and protein sequence information. More specifically, GSPHI first initialized the node representations of phages and target bacterial hosts via a natural language processing algorithm. Then a graph embedding algorithm structural deep network embedding (SDNE) was utilized to extract local and global information from the interaction network, and finally, a deep neural network (DNN) was applied to accurately detect the interactions between phages and their bacterial hosts. In the drug-resistant bacteria dataset ESKAPE, GSPHI achieved a prediction accuracy of 86.65 % and AUC of 0.9208 under the 5-fold cross-validation technique, significantly better than other methods. In addition, case studies in Gram-positive and negative bacterial species demonstrated that GSPHI is competent in detecting potential Phage-host interactions. Taken together, these results indicate that GSPHI can provide reasonable candidate sensitive bacteria to phages for biological experiments. The webserver of the GSPHI predictor is freely available at http://120.77.11.78/GSPHI/.

18.
Bioinformatics ; 39(8)2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37505483

ABSTRACT

MOTIVATION: The task of predicting drug-target interactions (DTIs) plays a significant role in facilitating novel drug discovery. Compared with laboratory-based approaches, computational methods proposed for DTI prediction are preferred due to their high efficiency and low cost. Recently, much attention has been paid to applying different graph neural network (GNN) models to discover underlying DTIs from a heterogeneous biological information network (HBIN). Although GNN-based prediction methods achieve better performance, they are prone to over-smoothing when learning the latent representations of drugs and targets from their rich neighborhood information in the HBIN, which reduces their discriminative ability in DTI prediction. RESULTS: In this work, an improved graph representation learning method, namely iGRLDTI, is proposed to address the above issue by better capturing more discriminative representations of drugs and targets in a latent feature space. Specifically, iGRLDTI first constructs an HBIN by integrating the biological knowledge of drugs and targets with their interactions. After that, it adopts a node-dependent local smoothing strategy to adaptively decide the propagation depth of each biomolecule in the HBIN, thus significantly alleviating over-smoothing by enhancing the discriminative ability of the feature representations of drugs and targets. Finally, a Gradient Boosting Decision Tree classifier is used by iGRLDTI to predict novel DTIs. Experimental results demonstrate that iGRLDTI yields better performance than several state-of-the-art computational methods on the benchmark dataset. Besides, our case study indicates that iGRLDTI can successfully identify novel DTIs with more distinguishable features of drugs and targets. AVAILABILITY AND IMPLEMENTATION: Python codes and dataset are available at https://github.com/stevejobws/iGRLDTI/.


Subject(s)
Drug Discovery , Neural Networks, Computer , Computer Simulation , Drug Discovery/methods , Drug Interactions
19.
PLoS Comput Biol ; 19(6): e1011207, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37339154

ABSTRACT

Interactions between transcription factors and target genes form the main part of the human gene regulatory network, yet they remain a complicating factor in biological research. Specifically, for nearly half of the interactions recorded in established databases, the interaction type has yet to be confirmed. Although several computational methods exist to predict gene interactions and their types, no method is available that predicts them based solely on topology information. To this end, we propose a graph-based prediction model called KGE-TGI, trained in a multi-task learning manner on a knowledge graph that we constructed specifically for this problem. The KGE-TGI model relies on topology information rather than being driven by gene expression data. In this paper, we formulate the task of predicting interaction types between transcription factors and target genes as a multi-label classification problem for link types on a heterogeneous graph, coupled with solving another, inherently related link prediction problem. We constructed a ground-truth dataset as a benchmark and evaluated the proposed method on it. In 5-fold cross-validation experiments, the proposed method achieved average AUC values of 0.9654 and 0.9339 on the tasks of link prediction and link type classification, respectively. In addition, a series of comparison experiments shows that introducing knowledge information significantly benefits prediction and that our methodology achieves state-of-the-art performance on this problem.
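To make the multi-label formulation concrete, the sketch below shows how a single transcription factor-target gene link can carry more than one interaction type at once. It assumes the standard multi-label setup (an independent sigmoid per label, trained with binary cross-entropy); the abstract does not state the loss or the label set, so the label names, the threshold, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over independent labels for one link."""
    tiny = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = min(max(sigmoid(z), tiny), 1.0 - tiny)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(logits)


def predict_types(logits, type_names, threshold=0.5):
    """Return every label whose probability reaches the threshold."""
    return [name for z, name in zip(logits, type_names) if sigmoid(z) >= threshold]


# Hypothetical label set; a link may be assigned zero, one, or both labels.
TYPES = ["activation", "repression"]
link_logits = [2.0, -1.0]
print(predict_types(link_logits, TYPES))
print(multilabel_bce(link_logits, [1, 0]))
```

Unlike a softmax over link types, this setup does not force the labels to compete, which is what distinguishes multi-label from multi-class classification.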


Subject(s)
Pattern Recognition, Automated , Transcription Factors , Humans , Databases, Factual , Transcription Factors/genetics , Gene Regulatory Networks , Proteome , Algorithms , Systems Biology , Gene Ontology
20.
BMC Bioinformatics ; 24(1): 188, 2023 May 08.
Article in English | MEDLINE | ID: mdl-37158823

ABSTRACT

BACKGROUND: The limited knowledge of miRNA-lncRNA interactions is considered an obstacle to revealing regulatory mechanisms. Accumulating evidence on human diseases indicates that the modulation of gene expression is closely related to the interactions between miRNAs and lncRNAs. However, validating such interactions via crosslinking-immunoprecipitation and high-throughput sequencing (CLIP-seq) experiments inevitably costs considerable money and time and yields unsatisfactory results. Therefore, more and more computational prediction tools have been developed to offer reliable candidates for better design of further bio-experiments. METHODS: In this work, we propose a novel link prediction model based on a Gaussian kernel-based method and a linear optimization algorithm for inferring miRNA-lncRNA interactions (GKLOMLI). Given an observed miRNA-lncRNA interaction network, the Gaussian kernel-based method was employed to output two similarity matrices, one for miRNAs and one for lncRNAs. Based on the integrated matrix combining the similarity matrices and the observed interaction network, a linear optimization-based link prediction model was trained for inferring miRNA-lncRNA interactions. RESULTS: To evaluate the performance of our proposed method, k-fold cross-validation (CV) and leave-one-out CV were implemented, in which each CV experiment was carried out 100 times on randomly generated training sets. The high areas under the curve (AUCs) of 0.8623 ± 0.0027 (2-fold CV), 0.9053 ± 0.0017 (5-fold CV), 0.9151 ± 0.0013 (10-fold CV), and 0.9236 (LOO-CV) illustrate the precision and reliability of our proposed method. CONCLUSION: GKLOMLI, with its high performance, is anticipated to be used to reveal underlying interactions between miRNAs and their target lncRNAs and to decipher the potential mechanisms of complex diseases.
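The similarity step in METHODS can be sketched as follows. The abstract does not give the kernel's exact form, so this assumes the widely used Gaussian interaction profile kernel, in which each entity's row (or column) of the interaction matrix is its profile and the bandwidth is normalized by the mean squared profile norm. The function name, `gamma_prime`, and the toy matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile similarity between all pairs of rows.

    K[i][j] = exp(-gamma * ||profile_i - profile_j||^2), where
    gamma = gamma_prime / mean(||profile_i||^2).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq
    return [
        [
            math.exp(
                -gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy interaction matrix: rows are miRNAs, columns are lncRNAs (1 = observed link).
interactions = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]

mirna_similarity = gip_kernel(interactions)
# Transposing the matrix gives the lncRNA profiles.
lncrna_similarity = gip_kernel([list(col) for col in zip(*interactions)])

print(mirna_similarity[0][1])  # identical profiles give similarity 1.0
```

The two resulting matrices correspond to the miRNA and lncRNA similarity matrices that the abstract says are combined with the observed network; how that integration and the linear optimization are performed is not specified in the abstract and is not shown here.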


Subject(s)
MicroRNAs , RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Reproducibility of Results , Research Design , Algorithms , MicroRNAs/genetics