Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 5 de 5
Filter
Add more filters

Database
Language
Publication year range
1.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39154194

ABSTRACT

Understanding the genetic basis of disease is a fundamental aspect of medical research, as genes are the classic units of heredity and play a crucial role in biological function. Identifying associations between genes and diseases is critical for diagnosis, prevention, prognosis, and drug development. Genes that encode proteins with similar sequences are often implicated in related diseases, as proteins causing identical or similar diseases tend to show limited variation in their sequences. Predicting gene-disease association (GDA) requires time-consuming and expensive experiments on a large number of potential candidate genes. Although methods have been proposed to predict associations between genes and diseases using traditional machine learning algorithms and graph neural networks, these approaches struggle to capture the deep semantic information within the genes and diseases and are dependent on training data. To alleviate this issue, we propose a novel GDA prediction model named FusionGDA, which utilizes a pre-training phase with a fusion module to enrich the gene and disease semantic representations encoded by pre-trained language models. Multi-modal representations are generated by the fusion module, which includes rich semantic information about two heterogeneous biomedical entities: protein sequences and disease descriptions. Subsequently, the pooling aggregation strategy is adopted to compress the dimensions of the multi-modal representation. In addition, FusionGDA employs a pre-training phase leveraging a contrastive learning loss to extract potential gene and disease features by training on a large public GDA dataset. To rigorously evaluate the effectiveness of the FusionGDA model, we conduct comprehensive experiments on five datasets and compare our proposed model with five competitive baseline models on the DisGeNet-Eval dataset. Notably, our case study further demonstrates the ability of FusionGDA to discover hidden associations effectively. The complete code and datasets of our experiments are available at https://github.com/ZhaohanM/FusionGDA.


Subject(s)
Machine Learning , Humans , Computational Biology/methods , Genetic Predisposition to Disease , Semantics , Algorithms , Genetic Association Studies , Neural Networks, Computer
2.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37651605

ABSTRACT

MicroRNAs (miRNAs) silence genes by binding to messenger RNAs, whereas long non-coding RNAs (lncRNAs) act as competitive endogenous RNAs (ceRNAs) that can relieve miRNA silencing effects and upregulate target gene expression. The ceRNA association between lncRNAs and miRNAs has been a research hotspot due to its medical importance, but it is challenging to verify experimentally. In this paper, we propose a novel deep learning scheme, i.e. sequence pre-training-based graph neural network (SPGNN), that combines pre-training and fine-tuning stages to predict lncRNA-miRNA associations from RNA sequences and the existing interactions represented as a graph. First, we utilize a sequence-to-vector technique to generate pre-trained embeddings based on the sequences of all RNAs during the pre-training stage. In the fine-tuning stage, we use Graph Neural Network to learn node representations from the heterogeneous graph constructed using lncRNA-miRNA association information. We evaluate our proposed scheme SPGNN on our newly collected animal lncRNA-miRNA association dataset and demonstrate that combining the $k$-mer technique and Doc2vec model for pre-training with the Simple Graph Convolution Network for fine-tuning is effective in predicting lncRNA-miRNA associations. Our approach outperforms state-of-the-art baselines across various evaluation metrics. We also conduct an ablation study and hyperparameter analysis to verify the effectiveness of each component and parameter of our scheme. The complete code and dataset are available on GitHub: https://github.com/zixwang/SPGNN.


Subject(s)
MicroRNAs , RNA, Long Noncoding , Animals , MicroRNAs/genetics , RNA, Long Noncoding/genetics , Benchmarking , Neural Networks, Computer , RNA, Messenger
3.
BMC Bioinformatics ; 24(1): 335, 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37697297

ABSTRACT

Circular RNA (CircRNA) is a type of non-coding RNAs in which both ends are covalently linked. Researchers have demonstrated that many circRNAs can act as biomarkers of diseases. However, traditional experimental methods for circRNA-disease associations identification are labor-intensive. In this work, we propose a novel method based on the heterogeneous graph neural network and metapaths for circRNA-disease associations prediction termed as HMCDA. First, a heterogeneous graph consisting of circRNA-disease associations, circRNA-miRNA associations, miRNA-disease associations and disease-disease associations are constructed. Then, six metapaths are defined and generated according to the biomedical pathways. Afterwards, the entity content transformation, intra-metapath and inter-metapath aggregation are implemented to learn the embeddings of circRNA and disease entities. Finally, the learned embeddings are used to predict novel circRNA-disase associations. In particular, the result of extensive experiments demonstrates that HMCDA outperforms four state-of-the-art models in fivefold cross validation. In addition, our case study indicates that HMCDA has the ability to identify novel circRNA-disease associations.


Subject(s)
MicroRNAs , RNA, Circular , Research Design , Learning , MicroRNAs/genetics , Neural Networks, Computer
4.
Article in English | MEDLINE | ID: mdl-35900994

ABSTRACT

In this article, we study the problem of embedding temporal attributed networks, with the goal of which is to learn dynamic low-dimensional representations over time for temporal attributed networks. Existing temporal network embedding methods only learn the representations for nodes, which are unable to capture the dynamic affinities between nodes and attributes. Moreover, existing co-embedding methods that learn the static embeddings of both nodes and attributes cannot be naturally utilized to obtain their dynamic embeddings for temporal attributed networks. To address these issues, we propose the dynamic co-embedding model for temporal attributed networks (DCTANs) based on the dynamic stochastic state-space framework. Our model captures the dynamics of a temporal attributed network by modeling the abstract belief states representing the condition of the nodes and attributes of current time step, and predicting the transitions between temporal abstract states of two successive time steps. Our model is able to learn embeddings for both nodes and attributes based on their belief states at each time step of the temporal attributed network, while the state transition tendency for predicting the future network can be tracked and the affinities between nodes and attributes can be preserved. Experimental results on real-world networks demonstrate that our model achieves substantial performance gains in several static and dynamic graph mining applications compared with the state-of-the-art static and dynamic models.

5.
Comput Intell Neurosci ; 2021: 4296247, 2021.
Article in English | MEDLINE | ID: mdl-34354743

ABSTRACT

A number of literature reports have shown that multi-view clustering can acquire a better performance on complete multi-view data. However, real-world data usually suffers from missing some samples in each view and has a small number of labeled samples. Additionally, almost all existing multi-view clustering models do not execute incomplete multi-view data well and fail to fully utilize the labeled samples to reduce computational complexity, which precludes them from practical application. In view of these problems, this paper proposes a novel framework called Semi-supervised Multi-View Clustering with Weighted Anchor Graph Embedding (SMVC_WAGE), which is conceptually simple and efficiently generates high-quality clustering results in practice. Specifically, we introduce a simple and effective anchor strategy. Based on selected anchor points, we can exploit the intrinsic and extrinsic view information to bridge all samples and capture more reliable nonlinear relations, which greatly enhances efficiency and improves stableness. Meanwhile, we construct the global fused graph compatibly across multiple views via a parameter-free graph fusion mechanism which directly coalesces the view-wise graphs. To this end, the proposed method can not only deal with complete multi-view clustering well but also be easily extended to incomplete multi-view cases. Experimental results clearly show that our algorithm surpasses some state-of-the-art competitors in clustering ability and time cost.


Subject(s)
Algorithms , Cluster Analysis
SELECTION OF CITATIONS
SEARCH DETAIL