1.
J Chem Inf Model ; 64(18): 7163-7172, 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39231016

ABSTRACT

Circular RNA (circRNA)-microRNA (miRNA) interaction (CMI) plays crucial roles in cellular regulation, offering promising perspectives for disease diagnosis and therapy. Therefore, it is necessary to employ computational methods for the rapid and cost-effective prediction of potential circRNA-miRNA interactions. However, the existing methods are limited by incomplete data; therefore, it is difficult to model molecules with different attributes on a large scale, which greatly hinders the efficiency and performance of prediction. In this study, we propose an effective method for predicting circRNA-miRNA interactions, called RBNE-CMI, and introduce a framework that can embed incomplete multiattribute CMI heterogeneous networks. By combining the proposed method, we integrate different data sets in the CMI prediction field into one incomplete network for modeling, achieving superior performance in 5-fold cross-validation. Moreover, in the prediction task based on complete data, the proposed method still achieves better performance than the known model. In addition, in the case study, we successfully predicted 18 of the 20 potential cancer biomarkers. The data and source code can be found at https://github.com/1axin/RBNE-CMI.


Subjects
MicroRNAs, Circular RNA, Circular RNA/genetics, Circular RNA/metabolism, MicroRNAs/genetics, MicroRNAs/metabolism, Humans, Computational Biology/methods, Tumor Biomarkers/genetics
2.
J Chem Inf Model ; 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39289839

ABSTRACT

Current studies have demonstrated that microbe-host interactions (MHIs) play important roles in human public health. Therefore, identifying the interactions between microbes and hosts is beneficial to understanding the role of the microbiome and their underlying mechanisms. However, traditional wet-lab experimental approaches are insufficient for large-scale exploration of candidate microbes, as they are costly, laborious, and time-consuming. Thus, it is critical to prioritize microbe-interacting hosts by computational approaches for further biological experimental validation. In this work, we proposed a novel deep learning-based method called MHIPM, to predict MHIs by utilizing multisource biological information. Specifically, we first constructed a heterogeneous microbial network that consisted of human proteins, viruses, bacteriophages (phages), and pathogenic bacteria. Next, we used one of the largest protein language models, ESM-2, and a document embedding model, doc2vec, combined with a self-attention mechanism to extract the interview features from protein sequences. Then, an inductive learning-based model, GraphSAGE, was used to capture the intraview features from the heterogeneous network. Experimental results on three prediction tasks indicated that the MHIPM model consistently achieved better performance than seven baseline algorithms and its four variants. In addition, case studies and molecular docking experiments for two human proteins further confirmed the effectiveness of our model. In conclusion, MHIPM is an efficient and robust method in predicting MHIs and provides plausible candidate microbes for biological experiments. MHIPM is available at https://github.com/JIENWU/MHIPM.
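
The neighborhood aggregation that GraphSAGE is known for can be sketched roughly as follows. This is a minimal illustration, not the MHIPM implementation: the node names, feature values, and the function `sage_mean_layer` are hypothetical, and the learned weight matrices and nonlinearity of a real GraphSAGE layer are omitted.

```python
from statistics import fmean


def sage_mean_layer(features, neighbors):
    """One GraphSAGE-style step: concatenate each node's own vector
    with the element-wise mean of its neighbours' vectors."""
    updated = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            pooled = [fmean(features[n][d] for n in nbrs) for d in range(len(own))]
        else:
            pooled = [0.0] * len(own)  # isolated node: nothing to aggregate
        updated[node] = list(own) + pooled
    return updated


# Toy heterogeneous network; names and numbers are made up for illustration.
features = {
    "protein_A": [1.0, 0.0],
    "virus_X": [0.0, 2.0],
    "phage_Y": [4.0, 2.0],
}
neighbors = {
    "protein_A": ["virus_X", "phage_Y"],
    "virus_X": ["protein_A"],
    "phage_Y": ["protein_A"],
}
embeddings = sage_mean_layer(features, neighbors)
```

Because the step only needs a node's current neighbors, it can be applied to nodes that were not seen during training, which is the sense in which the approach is inductive.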

3.
Article in English | MEDLINE | ID: mdl-39264774

ABSTRACT

Advancements in high-throughput technologies have yielded large-scale human gut microbiota profiles, sparking considerable interest in exploring the relationship between the gut microbiome and complex human diseases. Through extracting and integrating knowledge from complex microbiome data, existing machine learning (ML)-based studies have demonstrated their effectiveness in the precise identification of high-risk individuals. However, these approaches struggle to address the heterogeneity and sparsity of microbial features and explore the intrinsic relatedness among human diseases. In this work, we reframe human gut microbiome-based disease detection as a multilabel classification (MLC) problem and integrate a range of innovative techniques within the proposed MLC framework, aptly named GutMLC. Specifically, the entity semantic similarity as a priori knowledge is incorporated into multilabel feature selection and loss functions by capturing the shared attributes and inherent associations among diseases and microbes. To tackle the issue of label imbalance, both within and between labels, we adapt the focal loss (FL) function for MLC using debiased inverse weighting. Extensive experimental results consistently demonstrate the competitive performance of GutMLC in comparison with commonly used MLC and single-label classification (SLC) algorithms. This work seeks to unlock the potential of gut microbiota as robust biomarkers for multiple disease prediction.
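
A focal loss adapted to multilabel data might look roughly like the sketch below. It assumes a plain inverse-frequency weight on positive labels; the "debiased" weighting used by GutMLC is not specified in the abstract and is not reproduced here. The function names and the default `gamma` are illustrative choices.

```python
import math


def inverse_frequency_weights(label_matrix):
    """Per-label weight for positives: number of samples / number of positives.
    (Assumed scheme; not the debiased variant described in the abstract.)"""
    n_samples = len(label_matrix)
    n_labels = len(label_matrix[0])
    weights = []
    for j in range(n_labels):
        positives = sum(row[j] for row in label_matrix)
        weights.append(n_samples / max(positives, 1))
    return weights


def multilabel_focal_loss(probs, labels, pos_weights, gamma=2.0):
    """Mean focal loss over every (sample, label) cell.
    probs[i][j] is the predicted probability that label j applies to sample i."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y, w in zip(p_row, y_row, pos_weights):
            p_t = p if y == 1 else 1.0 - p       # probability given to the true class
            alpha = w if y == 1 else 1.0         # up-weight rare positives only
            p_t = min(max(p_t, 1e-12), 1.0)      # guard against log(0)
            total += -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
            count += 1
    return total / count
```

With `gamma=0` and unit weights the expression reduces to ordinary binary cross-entropy; larger `gamma` values shrink the contribution of cells the model already predicts confidently.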

4.
Genome Biol ; 25(1): 207, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39103856

ABSTRACT

Cell type identification is an indispensable analytical step in single-cell data analyses. To address the high noise stemming from gene expression data, existing computational methods often overlook the biologically meaningful relationships between genes, opting to reduce all genes to a unified data space. We assume that such relationships can aid in characterizing cell type features and improving cell type recognition accuracy. To this end, we introduce scPriorGraph, a dual-channel graph neural network that integrates multi-level gene biosemantics. Experimental results demonstrate that scPriorGraph effectively aggregates feature values of similar cells using high-quality graphs, achieving state-of-the-art performance in cell type identification.


Subjects
Single-Cell Analysis, Single-Cell Analysis/methods, Humans, Neural Networks (Computer), RNA-Seq/methods, Computational Biology/methods, Algorithms, Software, Single-Cell Gene Expression Analysis
5.
Article in English | MEDLINE | ID: mdl-39102330

ABSTRACT

Extensive research indicates that microRNAs (miRNAs) play a crucial role in the analysis of complex human diseases. Recently, numerous methods utilizing graph neural networks have been developed to investigate the complex relationships between miRNAs and diseases. However, these methods often face challenges in terms of overall effectiveness and are sensitive to node positioning. To address these issues, the researchers introduce DARSFormer, an advanced deep learning model that integrates dynamic attention mechanisms with a spectral graph Transformer effectively. In the DARSFormer model, a miRNA-disease heterogeneous network is constructed initially. This network undergoes spectral decomposition into eigenvalues and eigenvectors, with the eigenvalue scalars being mapped into a vector space subsequently. An orthogonal graph neural network is employed to refine the parameter matrix. The enhanced features are then input into a graph Transformer, which utilizes a dynamic attention mechanism to amalgamate features by aggregating the enhanced neighbor features of miRNA and disease nodes. A projection layer is subsequently utilized to derive the association scores between miRNAs and diseases. The performance of DARSFormer in predicting miRNA-disease associations is exemplary. It achieves an AUC of 94.18% in a five-fold cross-validation on the HMDD v2.0 database. Similarly, on HMDD v3.2, it records an AUC of 95.27%. Case studies involving colorectal, esophageal, and prostate tumors confirm 27, 28, and 26 of the top 30 associated miRNAs against the dbDEMC and miR2Disease databases, respectively. The code and data for DARSFormer are accessible at https://github.com/baibaibaialone/DARSFormer.

6.
BMC Bioinformatics ; 25(1): 264, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39127625

ABSTRACT

Circular RNA (circRNA)-microRNA (miRNA) interaction (CMI) is an important model for the regulation of biological processes by non-coding RNA (ncRNA), which provides a new perspective for the study of complex human diseases. However, the existing CMI prediction models mainly rely on the nearest neighbor structure in the biological network, ignoring the molecular network topology, so it is difficult to improve the prediction performance. In this paper, we propose a new CMI prediction method, BEROLECMI, which uses molecular sequence attributes, molecular self-similarity, and biological network topology to define the specific role feature representation for molecules to infer new CMIs. BEROLECMI effectively makes up for the lack of network topology in the CMI prediction model and achieves the highest prediction performance in three commonly used data sets. In the case study, 14 of the 15 pairs of unknown CMIs were correctly predicted.


Subjects
Computational Biology, MicroRNAs, Circular RNA, MicroRNAs/genetics, MicroRNAs/metabolism, MicroRNAs/chemistry, Circular RNA/genetics, Circular RNA/metabolism, Humans, Computational Biology/methods, RNA/chemistry, RNA/genetics, RNA/metabolism, Algorithms, Gene Regulatory Networks
7.
Article in English | MEDLINE | ID: mdl-38917286

ABSTRACT

Uncovering novel drug-drug interactions (DDIs) plays a pivotal role in advancing drug development and improving clinical treatment. The outstanding effectiveness of graph neural networks (GNNs) has garnered significant interest in the field of DDI prediction. Consequently, there has been a notable surge in the development of network-based computational approaches for predicting DDIs. However, current approaches face limitations in capturing the spatial relationships between neighboring nodes and their higher-level features during the aggregation of neighbor representations. To address this issue, this study introduces a novel model, KGCNN, designed to comprehensively tackle DDI prediction tasks by considering spatial relationships between molecules within the biomedical knowledge graph (BKG). KGCNN is built upon a message-passing GNN framework, consisting of propagation and aggregation. In the context of the BKG, KGCNN governs the propagation of information based on semantic relationships, which determine the flow and exchange of information between different molecules. In contrast to traditional linear aggregators, KGCNN introduces a spatial-aware capsule aggregator, which effectively captures the spatial relationships among neighboring molecules and their higher-level features within the graph structure. The ultimate goal is to leverage these learned drug representations to predict potential DDIs. To evaluate the effectiveness of KGCNN, it undergoes testing on two datasets. Extensive experimental results demonstrate its superiority in DDI predictions and quantified performance.

8.
Comput Biol Med ; 177: 108642, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38820777

ABSTRACT

BACKGROUND: Drug-drug interaction events influence the effectiveness of drug combinations and can lead to unexpected side effects or exacerbate underlying diseases, jeopardizing patient prognosis. Most existing methods are restricted to predicting whether two drugs interact or the type of drug-drug interactions, while very few studies endeavor to predict the specific risk levels of side effects of drug combinations. METHODS: In this study, we propose MathEagle, a novel approach to predict accurate risk levels of drug combinations based on multi-head attention and heterogeneous attribute graph learning. Initially, we model drugs and three distinct risk levels between drugs as a heterogeneous information graph. Subsequently, behavioral and chemical structure features of drugs are utilized by message passing neural networks and graph embedding algorithms, respectively. Ultimately, MathEagle employs heterogeneous graph convolution and multi-head attention mechanisms to learn efficient latent representations of drug nodes and estimates the risk levels of pairwise drugs in an end-to-end manner. RESULTS: To assess the effectiveness and robustness of the model, five-fold cross-validation, ablation experiments, and case studies were conducted. MathEagle achieved an accuracy of 85.85 % and an AUC of 0.9701 on the drug risk level prediction task and is superior to all comparative models. The MathEagle predictor is freely accessible at http://120.77.11.78/MathEagle/. CONCLUSIONS: The experimental results indicate that MathEagle can function as an effective tool for predicting accurate risk of drug combinations, aiding in guiding clinical medication, and enhancing patient outcomes.


Subjects
Drug Interactions, Humans, Algorithms, Neural Networks (Computer), Drug-Related Side Effects and Adverse Reactions, Machine Learning
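
The multi-head attention mentioned in the MathEagle abstract (record 8) can be illustrated with the generic sketch below. It is a simplified stand-in rather than the authors' model: the learned query, key, value, and output projections of a real attention layer are left out, each head simply attends over its own slice of the input features, and all function names are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def multi_head_self_attention(x, n_heads):
    """Split each feature vector into n_heads slices, run self-attention
    within each slice, and concatenate the per-head outputs."""
    dim = len(x[0])
    if dim % n_heads:
        raise ValueError("feature dimension must be divisible by n_heads")
    size = dim // n_heads
    heads = []
    for h in range(n_heads):
        part = [row[h * size:(h + 1) * size] for row in x]
        heads.append(attention(part, part, part))
    return [[v for head in heads for v in head[i]] for i in range(len(x))]
```

Running several heads lets each slice of the representation weight its neighbors differently before the results are merged.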
9.
IEEE J Biomed Health Inform ; 28(7): 4281-4294, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38557614

ABSTRACT

As post-transcriptional regulators of gene expression, micro-ribonucleic acids (miRNAs) are regarded as potential biomarkers for a variety of diseases. Hence, the prediction of miRNA-disease associations (MDAs) is of great significance for an in-depth understanding of disease pathogenesis and progression. Existing prediction models are mainly concentrated on incorporating different sources of biological information to perform the MDA prediction task while failing to consider the fully potential utility of MDA network information at the motif-level. To overcome this problem, we propose a novel motif-aware MDA prediction model, namely MotifMDA, by fusing a variety of high- and low-order structural information. In particular, we first design several motifs of interest considering their ability to characterize how miRNAs are associated with diseases through different network structural patterns. Then, MotifMDA adopts a two-layer hierarchical attention to identify novel MDAs. Specifically, the first attention layer learns high-order motif preferences based on their occurrences in the given MDA network, while the second one learns the final embeddings of miRNAs and diseases through coupling high- and low-order preferences. Experimental results on two benchmark datasets have demonstrated the superior performance of MotifMDA over several state-of-the-art prediction models. This strongly indicates that accurate MDA prediction can be achieved by relying solely on MDA network information. Furthermore, our case studies indicate that the incorporation of motif-level structure information allows MotifMDA to discover novel MDAs from different perspectives.


Subjects
Computational Biology, MicroRNAs, MicroRNAs/genetics, Humans, Computational Biology/methods, Genetic Predisposition to Disease/genetics, Algorithms
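
To make the idea of a network motif in the MotifMDA abstract (record 9) concrete, the sketch below counts one simple pattern in a bipartite miRNA-disease network: the "butterfly", in which two miRNAs are both associated with the same two diseases. The abstract does not say which motifs MotifMDA uses, so this choice is an assumption, and the node names are placeholders rather than real associations.

```python
from itertools import combinations


def count_butterflies(associations):
    """Count 4-cycles (two miRNAs sharing two diseases) in a bipartite
    network given as (mirna, disease) pairs."""
    diseases_of = {}
    for mirna, disease in associations:
        diseases_of.setdefault(mirna, set()).add(disease)
    total = 0
    for a, b in combinations(sorted(diseases_of), 2):
        shared = len(diseases_of[a] & diseases_of[b])
        total += shared * (shared - 1) // 2  # ways to pick 2 shared diseases
    return total


# Placeholder network; identifiers are hypothetical, not curated data.
associations = [
    ("miRNA_1", "disease_a"), ("miRNA_1", "disease_b"),
    ("miRNA_2", "disease_a"), ("miRNA_2", "disease_b"),
    ("miRNA_2", "disease_c"), ("miRNA_3", "disease_c"),
]
n_butterflies = count_butterflies(associations)
```

Counts like this one, computed per node or per edge, are the kind of high-order structural signal that a motif-aware model can attend over.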
10.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38555472

ABSTRACT

Predicting interactions between microbes and hosts plays critical roles in microbiome population genetics and microbial ecology and evolution. How to systematically characterize the sophisticated mechanisms and signal interplay between microbes and hosts is a significant challenge for global health risks. Identifying microbe-host interactions (MHIs) can not only provide helpful insights into their fundamental regulatory mechanisms, but also facilitate the development of targeted therapies for microbial infections. In recent years, computational methods have become an appealing alternative due to the high risk and cost of wet-lab experiments. Therefore, in this study, we utilized rich microbial metagenomic information to construct a novel heterogeneous microbial network (HMN)-based model named KGVHI to predict candidate microbes for target hosts. Specifically, KGVHI first built a HMN by integrating human proteins, viruses and pathogenic bacteria with their biological attributes. Then KGVHI adopted a knowledge graph embedding strategy to capture the global topological structure information of the whole network. A natural language processing algorithm is used to extract the local biological attribute information from the nodes in HMN. Finally, we combined the local and global information and fed it into a blended deep neural network (DNN) for training and prediction. Compared to state-of-the-art methods, the comprehensive experimental results show that our model can obtain excellent results on the corresponding three MHI datasets. Furthermore, we also conducted two pathogenic bacteria case studies to further indicate that KGVHI has excellent predictive capabilities for potential MHI pairs.


Subjects
Deep Learning, Humans, Automated Pattern Recognition, Neural Networks (Computer), Algorithms, Bacteria
11.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426324

ABSTRACT

Emerging clinical evidence suggests that sophisticated associations between circular ribonucleic acids (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor in various pathological processes and play a critical role in most intricate human diseases. Nonetheless, establishing these associations through wet-lab experiments is error-prone and labor-intensive, and the numerous existing computational methods used to validate underlying novel circRNA-miRNA associations (CMAs) rely only on single correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features and interaction behavior features by Word2vec and two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization, respectively. Comprehensive experimental analysis revealed that BGF-CMAP successfully predicted the complex relationship between circRNAs and miRNAs with an accuracy of 82.90% and an area under the receiver operating characteristic curve of 0.9075. Furthermore, 23 of the top 30 predicted miRNA-associated circRNAs were confirmed by relevant experiments, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model to provide a scientific theoretical basis for the study of CMA prediction.


Subjects
MicroRNAs, Humans, MicroRNAs/genetics, Circular RNA/genetics, ROC Curve, Machine Learning, Algorithms, Computational Biology/methods
12.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38324624

ABSTRACT

Connections between circular RNAs (circRNAs) and microRNAs (miRNAs) assume a pivotal position in the onset, evolution, diagnosis and treatment of diseases and tumors. Selecting the most potential circRNA-related miRNAs and taking advantage of them as the biological markers or drug targets could be conducive to dealing with complex human diseases through preventive strategies, diagnostic procedures and therapeutic approaches. Compared to traditional biological experiments, leveraging computational models to integrate diverse biological data in order to infer potential associations proves to be a more efficient and cost-effective approach. This paper developed a model of Convolutional Autoencoder for CircRNA-MiRNA Associations (CA-CMA) prediction. Initially, this model merged the natural language characteristics of the circRNA and miRNA sequence with the features of circRNA-miRNA interactions. Subsequently, it utilized all circRNA-miRNA pairs to construct a molecular association network, which was then fine-tuned by labeled samples to optimize the network parameters. Finally, the prediction outcome is obtained by utilizing the deep neural networks classifier. This model innovatively combines the likelihood objective that preserves the neighborhood through optimization, to learn the continuous feature representation of words and preserve the spatial information of two-dimensional signals. During the process of 5-fold cross-validation, CA-CMA exhibited exceptional performance compared to numerous prior computational approaches, as evidenced by its mean area under the receiver operating characteristic curve of 0.9138 and a minimal SD of 0.0024. Furthermore, recent literature has confirmed the accuracy of 25 out of the top 30 circRNA-miRNA pairs identified with the highest CA-CMA scores during case studies. The results of these experiments highlight the robustness and versatility of our model.


Subjects
MicroRNAs, Neoplasms, Humans, MicroRNAs/genetics, Circular RNA/genetics, Likelihood Functions, Neural Networks (Computer), Neoplasms/genetics, Computational Biology/methods
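
Several records in this list, including CA-CMA (record 12), report a mean area under the ROC curve and its standard deviation over 5-fold cross-validation. The sketch below shows one way such a summary could be computed. It assumes the scores are already held-out predictions (a real cross-validation retrains the model for each fold), and the helper names and the interleaved fold assignment are illustrative choices.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k interleaved folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def summarize_folds(labels, scores, k=5):
    """Return (mean AUC, sample standard deviation) across k folds."""
    aucs = []
    for fold in k_fold_indices(len(labels), k):
        aucs.append(roc_auc([labels[i] for i in fold], [scores[i] for i in fold]))
    return mean(aucs), stdev(aucs)
```

A small standard deviation across folds, as reported in the abstract, indicates that performance does not depend strongly on which samples are held out.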
13.
BMC Bioinformatics ; 25(1): 6, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38166644

ABSTRACT

According to the expression of miRNA in pathological processes, miRNAs can be divided into oncogenes or tumor suppressors. Prediction of the regulation relations between miRNAs and small molecules (SMs) becomes a vital goal for miRNA-target therapy. But traditional biological approaches are laborious and expensive. Thus, there is an urgent need to develop a computational model. In this study, we proposed a computational model to predict whether the regulatory relationship between miRNAs and SMs is up-regulated or down-regulated. Specifically, we first use the Large-scale Information Network Embedding (LINE) algorithm to construct the node features from the self-similarity networks, then use the General Attributed Multiplex Heterogeneous Network Embedding (GATNE) algorithm to extract the topological information from the attribute network, and finally utilize the Light Gradient Boosting Machine (LightGBM) algorithm to predict the regulatory relationship between miRNAs and SMs. In the fivefold cross-validation experiment, the average accuracies of the proposed model on the SM2miR dataset reached 79.59% and 80.37% for up-regulation pairs and down-regulation pairs, respectively. In addition, we compared our model with another published model. Moreover, in the case study for 5-FU, 7 of 10 candidate miRNAs are confirmed by related literature. Therefore, we believe that our model can promote the research of miRNA-targeted therapy.


Subjects
MicroRNAs, MicroRNAs/genetics, MicroRNAs/metabolism, Computational Biology, Algorithms, Oncogenes
14.
IEEE J Biomed Health Inform ; 28(3): 1742-1751, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38127594

ABSTRACT

A growing number of studies reveal that circular RNAs (circRNAs) are broadly engaged in the physiological processes of cell proliferation, differentiation, aging, and apoptosis and are closely associated with the pathogenesis of numerous diseases. Clarification of the correlation among diseases and circRNAs is of great clinical importance to provide new therapeutic strategies for complex diseases. However, previous circRNA-disease association (CDA) prediction methods rely excessively on the graph network, and the model performance is dramatically reduced when noisy connections occur in the graph structure. To address this problem, this paper proposes an unsupervised deep graph structure learning method, GSLCDA, to predict potential CDAs. Concretely, we first integrate circRNA and disease multi-source data to constitute the CDA heterogeneous network. Then the network topology is learned using the graph structure, and the original graph is enhanced in an unsupervised manner by maximizing the inter information of the learned and original graphs to uncover their essential features. Finally, a graph-space-sensitive k-nearest neighbor (KNN) algorithm is employed to search for latent CDAs. On the benchmark dataset, GSLCDA obtained 92.67% accuracy with 0.9279 AUC. GSLCDA also exhibits exceptional performance on independent datasets. Furthermore, 14, 12 and 14 of the top 16 circRNAs with the highest GSLCDA prediction scores were confirmed in the relevant literature in the breast cancer, colorectal cancer and lung cancer case studies, respectively. These results demonstrate that GSLCDA can validly reveal underlying CDAs and offer new perspectives for the diagnosis and therapy of complex human diseases.


Subjects
Breast Neoplasms, Lung Neoplasms, Humans, Female, Circular RNA/genetics, Breast Neoplasms/genetics, Algorithms, Aging, Computational Biology/methods
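
The final k-nearest neighbor step named in the GSLCDA abstract (record 14) can be pictured with the plain version below. It is only a baseline sketch: it uses ordinary Euclidean distance and majority voting, not the graph-space-sensitive variant the authors describe, and the function name and toy points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(query, points, labels, k=3):
    """Label a query vector by majority vote among its k nearest points
    (Euclidean distance)."""
    ranked = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy embedded pairs: label 1 = associated, 0 = not associated (made-up values).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 0, 1, 1]
near_origin = knn_predict((0.2, 0.2), points, labels, k=3)
near_cluster = knn_predict((5.5, 5.0), points, labels, k=3)
```

In a pipeline like the one described, the vectors would be the learned embeddings of circRNA-disease pairs rather than hand-written coordinates.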
15.
IEEE J Biomed Health Inform ; 28(3): 1752-1761, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38145538

ABSTRACT

A growing body of evidence establishes that circular RNAs (circRNAs) are widely present in eukaryotic cells and contribute significantly to the occurrence and development of many complex human diseases. Disease-associated circRNAs can serve as clinical diagnostic biomarkers and therapeutic targets, providing novel ideas for biopharmaceutical research. However, available computational methods for predicting circRNA-disease associations (CDAs) do not sufficiently consider the contextual information of biological network nodes, which limits their performance. In this work, we propose a multi-hop attention graph neural network-based approach, MAGCDA, to infer potential CDAs. Specifically, we first construct a multi-source attribute heterogeneous network of circRNAs and diseases, then use a multi-hop strategy of graph nodes to deeply aggregate node context information through attention diffusion, thus enhancing topological structure information and mining data hidden features, and finally use random forest to accurately infer potential CDAs. On four gold standard data sets, MAGCDA achieved prediction accuracy of 92.58%, 91.42%, 83.46% and 91.12%, respectively. MAGCDA also performed strongly in ablation experiments and in comparisons with other models. Additionally, 18 and 17 of the 20 circRNAs with the highest MAGCDA prediction scores were confirmed in case studies of the complex diseases breast cancer and Alzheimer's disease, respectively. These results suggest that MAGCDA can be a practical tool to explore potential disease-associated circRNAs and provide a theoretical basis for disease diagnosis and treatment.


Subjects
Breast Neoplasms, Circular RNA, Humans, Female, Circular RNA/genetics, Neural Networks (Computer), Biomarkers, Computational Biology/methods
16.
J Chem Inf Model ; 64(1): 238-249, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38103039

ABSTRACT

Drug repositioning plays a key role in disease treatment. As large-scale chemical data accumulate, many computational methods are utilized for drug-disease association prediction. However, most of the existing models neglect the positive influence of non-Euclidean data and multisource information, and there is still a critical issue for graph neural networks regarding how to set the feature diffusion distance. To solve these problems, we propose SiSGC, which makes full use of the biological knowledge information as initial features and learns the structure information from the constructed heterogeneous graph with the adaptive selection of the information diffusion distance. Then, the structural features are fused with the denoised similarity information and fed to the advanced classifier of CatBoost to make predictions. Three different data sets are used to confirm the robustness and generalization of SiSGC under two splitting strategies. Experimental results demonstrate that the proposed model achieves superior performance compared with the six leading methods and four variants. Our case study on breast neoplasms further indicates that SiSGC is trustworthy and robust yet simple. We also present four drugs for breast cancer treatment with high confidence and further explain the rationale for these predictions. There is no doubt that SiSGC can be used as a beneficial supplement for drug repositioning.


Subjects
Drug Repositioning, Neural Networks (Computer)
17.
Commun Biol ; 6(1): 1268, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38097699

ABSTRACT

Recent developments in single-cell technology have enabled the exploration of cellular heterogeneity at an unprecedented level, providing invaluable insights into various fields, including medicine and disease research. Cell type annotation is an essential step in single-cell omics research. The mainstream approach is to utilize well-annotated single-cell data for supervised learning to annotate the cell types of new single-cell data. However, existing methods lack good generalization and robustness in cell annotation tasks, partially due to difficulties in dealing with technical differences between datasets, as well as not considering the heterogeneous associations of genes at the level of regulatory mechanisms. Here, we propose the scPML model, which utilizes various gene signaling pathway data to partition the genetic features of cells, thus characterizing different interaction maps between cells. Extensive experiments demonstrate that scPML performs better in cell type annotation and detection of unknown cell types from different species, platforms, and tissues.


Subjects
Medicine, Single-Cell Gene Expression Analysis, Signal Transduction, Technology
18.
Comput Biol Med ; 165: 107421, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37672925

ABSTRACT

MOTIVATION: Accumulating clinical evidence shows that circular RNA (circRNA) plays an important regulatory role in the occurrence and development of human diseases, which is expected to provide a new perspective for the diagnosis and treatment of related diseases. Computational methods can provide high-probability preselection for wet-lab experiments to save resources. However, due to the lack of neighborhood structure in sparse biological networks, models based on network embedding and graph embedding struggle to achieve ideal results. RESULTS: In this paper, we propose BioDGW-CMI, which combines biological text mining and wavelet diffusion-based sparse network structure embedding to predict circRNA-miRNA interaction (CMI). In detail, BioDGW-CMI first uses the Bidirectional Encoder Representations from Transformers (BERT) for biological text mining to mine hidden features in RNA sequences, then constructs a CMI network and obtains the topological structure embedding of nodes in the network through heat wavelet diffusion patterns. Next, a denoising autoencoder organically combines the structural features and Gaussian kernel similarity. Finally, the features are sent to LightGBM for training and prediction. BioDGW-CMI achieves the highest prediction performance in all three datasets in the field of CMI prediction. In the case study, all 8 pairs of CMI based on circ-ITCH were successfully predicted. AVAILABILITY: The data and source code can be found at https://github.com/1axin/BioDGW-CMI-model.
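
The Gaussian kernel similarity mentioned in the BioDGW-CMI abstract is commonly computed from interaction profiles, as in the sketch below. This is an assumed formulation (the widely used Gaussian interaction profile kernel with a bandwidth normalized by the mean squared profile norm), not code from the BioDGW-CMI repository; the function name and the toy profiles are hypothetical.

```python
import math


def gaussian_kernel_similarity(profiles):
    """Pairwise similarity exp(-gamma * ||p_i - p_j||^2) between rows of a
    binary interaction matrix, with gamma = 1 / mean squared row norm."""
    n = len(profiles)
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    gamma = 1.0 / mean_sq if mean_sq > 0 else 1.0  # fallback for an all-zero matrix
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * d2)
    return sim


# Rows: circRNAs; columns: miRNAs; 1 marks a known interaction (toy values).
profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
similarity = gaussian_kernel_similarity(profiles)
```

Molecules with identical interaction profiles receive a similarity of 1, and the value decays toward 0 as their profiles diverge.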

19.
Brief Bioinform ; 24(6)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37742053

ABSTRACT

Identifying potential bacteriophage (phage) candidates to treat bacterial infections plays an essential role in the research of human pathogens. Computational approaches are recognized as a valid way to predict bacteria and target phages. However, most of the current methods only utilize lower-order biological information without considering the higher-order connectivity patterns, which can help improve predictive accuracy. Therefore, we developed a novel microbial heterogeneous interaction network (MHIN)-based model called PTBGRP to predict new phages for bacterial hosts. Specifically, PTBGRP first constructs an MHIN by integrating phage-bacteria interaction (PBI) and six bacteria-bacteria interaction networks with their biological attributes. Then, different representation learning methods are deployed to extract higher-level biological features and lower-level topological features from MHIN. Finally, PTBGRP employs a deep neural network as the classifier to predict unknown PBI pairs based on the fused biological information. Experimental results demonstrated that PTBGRP achieves the best performance on the corresponding ESKAPE pathogens and PBI dataset when compared with state-of-the-art methods. In addition, case studies of Klebsiella pneumoniae and Staphylococcus aureus further indicate that the consideration of rich heterogeneous information enables PTBGRP to accurately predict PBI from a more comprehensive perspective. The webserver of the PTBGRP predictor is freely available at http://120.77.11.78/PTBGRP/.


Subjects
Bacteriophages, Staphylococcal Infections, Humans, Learning, Bacteria, Neural Networks (Computer)
20.
iScience ; 26(8): 107478, 2023 Aug 18.
Article in English | MEDLINE | ID: mdl-37583550

ABSTRACT

Circular RNA (circRNA) plays an important role in the diagnosis, treatment, and prognosis of human diseases. The discovery of potential circRNA-miRNA interactions (CMI) is of guiding significance for subsequent biological experiments. Limited by the small amount of experimentally supported data and high randomness, existing models struggle to accomplish the CMI prediction task based on real cases. In this paper, we propose KS-CMI, a novel method for effectively accomplishing CMI prediction in real cases. KS-CMI enriches the 'behavior relationships' of molecules by constructing circRNA-miRNA-cancer (CMCI) networks and extracts the behavior relationship attribute of molecules based on balance theory. Next, the denoising autoencoder (DAE) is used to enhance the feature representation of molecules. Finally, the CatBoost classifier is used for prediction. KS-CMI achieved the most reliable prediction results in real cases and achieved competitive performance on all datasets in CMI prediction.
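
The balance theory referred to in the KS-CMI abstract is, in its textbook form, a rule about signed triangles: a triangle is balanced when the product of its three edge signs is positive. The sketch below counts balanced and unbalanced triangles in a small signed network. How KS-CMI turns this into molecular features is not described in the abstract, so the code is only an illustration of the underlying rule, and the node names and signs are hypothetical.

```python
from itertools import combinations


def triangle_balance(signs):
    """Count (balanced, unbalanced) triangles.
    signs maps frozenset({u, v}) to +1 or -1 for each signed edge."""
    nodes = sorted({n for edge in signs for n in edge})
    balanced = unbalanced = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset(pair) for pair in ((a, b), (b, c), (a, c))]
        if all(e in signs for e in edges):  # only complete triangles count
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            if product > 0:
                balanced += 1
            else:
                unbalanced += 1
    return balanced, unbalanced


# Hypothetical signed circRNA-miRNA-cancer network (signs are made up).
signs = {
    frozenset({"circ_1", "mir_1"}): +1,
    frozenset({"mir_1", "cancer_1"}): -1,
    frozenset({"circ_1", "cancer_1"}): -1,
    frozenset({"circ_1", "cancer_2"}): +1,
    frozenset({"mir_1", "cancer_2"}): -1,
}
counts = triangle_balance(signs)
```

Here the first triangle has two negative edges and is balanced, while the second has a single negative edge and is not.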
