1.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39175132

ABSTRACT

Numerous studies have demonstrated that microRNAs (miRNAs) are critically important for the prediction, diagnosis, and characterization of diseases. However, identifying miRNA-disease associations through traditional biological experiments is both costly and time-consuming. To further explore these associations, we proposed a model based on hybrid high-order moments combined with element-level attention mechanisms (HHOMR). This model fuses hybrid higher-order statistical information with structural and community information. Specifically, we first constructed a heterogeneous graph based on existing associations between miRNAs and diseases. HHOMR employs a structural fusion layer to capture structure-level embeddings and leverages a hybrid high-order moments encoder layer to enhance features. Element-level attention mechanisms are then used to adaptively integrate the features of these hybrid moments. Finally, a multi-layer perceptron is utilized to calculate the association scores between miRNAs and diseases. Through five-fold cross-validation on HMDD v2.0, we achieved a mean AUC of 93.28%. Compared with four state-of-the-art models, HHOMR exhibited superior performance. Additionally, case studies on three diseases (esophageal neoplasms, lymphoma, and prostate neoplasms) were conducted. Among the top 50 miRNAs with high disease association scores, 46, 47, and 45 associated with these diseases, respectively, were confirmed by the dbDEMC and miR2Disease databases. Our results demonstrate that HHOMR not only outperforms existing models but also shows significant potential in predicting miRNA-disease associations.


Subjects
MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Genetic Predisposition to Disease , Algorithms , Prostatic Neoplasms/genetics , Genetic Models
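
Note on the evaluation protocol: several abstracts in this listing report a mean AUC under k-fold cross-validation. The following is a minimal sketch of how such a figure can be computed, not the code of any listed paper. The names `auc`, `k_fold_indices`, `mean_cv_auc` and the `score_fn` callback are hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve: the probability that a random positive
    is scored above a random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5, seed=None):
    """Split indices 0..n-1 into k disjoint folds.
    If seed is given, the indices are shuffled first."""
    idx = list(range(n))
    if seed is not None:
        random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mean_cv_auc(labels, score_fn, k=5, seed=None):
    """Mean AUC over k folds.

    score_fn(train_idx, test_idx) is a hypothetical callback that trains
    on train_idx and returns one score per element of test_idx.
    Assumes every fold contains both classes (no stratification here).
    """
    folds = k_fold_indices(len(labels), k, seed)
    aucs = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores = score_fn(train, test)
        aucs.append(auc([labels[j] for j in test], scores))
    return sum(aucs) / len(aucs)
```

In practice the folds would be drawn over known miRNA-disease pairs plus sampled negatives, and a stratified split would avoid folds that lack one of the classes.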
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581419

ABSTRACT

Piwi-interacting RNAs (piRNAs) play a crucial role in various biological processes and are implicated in disease. Consequently, there is an escalating demand for computational tools to predict piRNA-disease interactions. Although computational methods have been proposed for detecting piRNA-disease associations, imbalanced and sparse datasets make it challenging to capture the complex relationships between piRNAs and diseases. In response, we have developed a novel computational architecture, denoted PUTransGCN, which uses heterogeneous graph convolutional networks to uncover potential piRNA-disease associations. Additionally, an attention mechanism was used to automatically adjust the weight parameters for aggregating heterogeneous node features. To tackle the imbalanced dataset problem, a combined positive unlabelled learning (PUL) method comprising PU bagging, two-step and spy techniques was applied to select reliable negative associations. PUTransGCN derives the features of piRNAs and diseases from three distinct biological sources: piRNA sequence information, semantic terms related to diseases and the existing network of piRNA-disease associations. In 5-fold cross-validation, PUTransGCN achieved AUCs of 0.93 and 0.95 on two datasets, respectively, outperforming six other state-of-the-art models. We compared three different PUL methods, and the results of the ablation experiment indicate that the combined PUL method yields the best results. PUTransGCN could serve as a valuable piRNA-disease prediction tool for upcoming studies in the biomedical field. The code for PUTransGCN is available at https://github.com/chenqiuhao/PUTransGCN.


Subjects
Piwi-Interacting RNA
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38517693

ABSTRACT

A growing number of investigations indicate the significance of microRNA (miRNA) in human diseases. Hence, uncovering associations between miRNAs and diseases can contribute to precise diagnosis and effective treatment of medical conditions. The detection of miRNA-disease associations (MDAs) via computational techniques utilizing biological information has emerged as a cost-effective and highly efficient approach. Here, we introduced a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA). This method constructs a homogeneous graph convolutional network with a regular graph structure (ReHoGCN) encompassing the disease similarity network, the miRNA similarity network and the known MDA network, and was then tested on four experimental tasks. A random edge sampler strategy was utilized to expedite processes and diminish training complexity. Experimental results demonstrate that the proposed ReHoGCNES-MDA method outperforms both the homogeneous graph convolutional network and the heterogeneous graph convolutional network with non-regular graph structure in all four tasks, which suggests that a steady degree distribution of the graph plays an important role in enhancing model performance. Moreover, ReHoGCNES-MDA is superior to several machine learning algorithms and state-of-the-art methods on MDA prediction. Furthermore, three case studies were conducted to further demonstrate the predictive ability of ReHoGCNES. As a result, 93.3% (breast neoplasms), 90% (prostate neoplasms) and 93.3% (prostate neoplasms) of the top 30 predicted miRNAs were validated by public databases. Hence, ReHoGCNES-MDA might serve as a dependable and beneficial model for predicting possible MDAs.


Subjects
MicroRNAs , Prostatic Neoplasms , Humans , Male , Algorithms , Computational Biology/methods , Genetic Databases , MicroRNAs/genetics , Prospective Studies , Prostatic Neoplasms/genetics , Female
4.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980370

ABSTRACT

RepurposeDrugs (https://repurposedrugs.org/) is a comprehensive web-portal that combines a unique drug indication database with a machine learning (ML) predictor to discover new drug-indication associations for approved as well as investigational mono and combination therapies. The platform provides detailed information on treatment status, disease indications and clinical trials across 25 indication categories, including neoplasms and cardiovascular conditions. The current version comprises 4314 compounds (approved, terminated or investigational) and 161 drug combinations linked to 1756 indications/conditions, totaling 28 148 drug-disease pairs. By leveraging data on both approved and failed indications, RepurposeDrugs provides ML-based predictions for the approval potential of new drug-disease indications, both for mono- and combinatorial therapies, demonstrating high predictive accuracy in cross-validation. The validity of the ML predictor is validated through a number of real-world case studies, demonstrating its predictive power to accurately identify repurposing candidates with a high likelihood of future approval. To our knowledge, RepurposeDrugs web-portal is the first integrative database and ML-based predictor for interactive exploration and prediction of both single-drug and combination approval likelihood across indications. Given its broad coverage of indication areas and therapeutic options, we expect it accelerates many future drug repurposing projects.


Subjects
Drug Repositioning , Machine Learning , Drug Repositioning/methods , Humans , Internet , Combination Drug Therapy , Pharmaceutical Databases , Factual Databases
5.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605642

ABSTRACT

MicroRNAs (miRNAs) synergize with various biomolecules in human cells, resulting in diverse functions in regulating a wide range of biological processes. Predicting potential disease-associated miRNAs as valuable biomarkers contributes to the treatment of human diseases. However, few previous methods take a holistic perspective; most concentrate only on isolated miRNA and disease objects, thereby ignoring that human cells are responsible for multiple relationships. In this work, we first constructed a multi-view graph based on the relationships between miRNAs and various biomolecules, and then utilized a graph attention neural network to learn the graph topology features of miRNAs and diseases for each view. Next, we added a further attention mechanism and developed a multi-scale feature fusion module, aiming to determine the optimal fusion results for the multi-view topology features of miRNAs and diseases. In addition, the prior attribute knowledge of miRNAs and diseases was simultaneously added to achieve better prediction results and solve the cold start problem. Finally, the learned miRNA and disease representations were concatenated and fed into a multi-layer perceptron for end-to-end training and prediction of potential miRNA-disease associations. To assess the efficacy of our model (called MUSCLE), we performed 5- and 10-fold cross-validation (CV), which yielded average areas under the ROC curve of 0.966 ± 0.0102 and 0.973 ± 0.0135, respectively, outperforming most current state-of-the-art models. We then examined the impact of crucial parameters on prediction performance and performed ablation experiments on the feature combination and model architecture. Furthermore, the case studies on colon cancer, lung cancer and breast cancer also demonstrate the good inductive capability of MUSCLE. Our data and code are freely available at a public GitHub repository: https://github.com/zht-code/MUSCLE.git.


Subjects
Colonic Neoplasms , Lung Neoplasms , MicroRNAs , Humans , Muscles , Learning , MicroRNAs/genetics , Algorithms , Computational Biology
6.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36790856

ABSTRACT

Potential miRNA-disease associations (MDA) play an important role in the discovery of complex human disease etiology. Therefore, MDA prediction is an attractive research topic in the field of biomedical machine learning. Recently, several models have been proposed for this task, but their performance is limited by over-reliance on relevant network information with noisy graph structure connections. However, the application of self-supervised graph structure learning to MDA tasks remains unexplored. Our study is the first to use multi-view self-supervised contrastive learning (MSGCL) for MDA prediction. Specifically, we generated a learner view without association labels of miRNAs and diseases as input, and utilized the known association network to generate an anchor view that provides guiding signals for the learner view. The graph structure was optimized by designing a contrastive loss to maximize the consistency between the anchor and learner views. Our model is similar to a pre-trained model that continuously optimizes upstream tasks for high-quality association graph topology, thereby enhancing the latent representation of association predictions. The experimental results show that our proposed method outperforms state-of-the-art methods by 2.79% and 3.20% in area under the receiver operating characteristic curve (AUC) and area under the precision/recall curve (AUPR), respectively.


Subjects
Machine Learning , MicroRNAs , Humans , Area Under Curve , MicroRNAs/genetics , ROC Curve
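
Note on the contrastive objective: the abstract above states only that a contrastive loss maximizes consistency between the anchor and learner views. It does not give the exact form. The sketch below assumes an InfoNCE-style loss with cosine similarity and a temperature `tau`, which is a common choice and may differ from the paper's formulation. All function and parameter names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, learner, tau=0.5):
    """InfoNCE-style loss between two views of the same n nodes.

    anchor[i] and learner[i] are embeddings of node i in each view.
    The matching pair (i, i) is the positive; every (i, j != i) pair
    acts as a negative. Lower loss means the views agree more.
    """
    n = len(anchor)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchor[i], learner[j]) / tau for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n
```

With this form, two views whose node embeddings line up give a smaller loss than views in which the embeddings are permuted, which is the behavior the abstract describes as maximizing consistency.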
7.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36611256

ABSTRACT

Accumulating evidence demonstrates that circular RNA (circRNA) plays an important role in human diseases. Identification of circRNA-disease associations can help in the diagnosis of human diseases, while the traditional method based on biological experiments is time-consuming. To address this limitation, a series of computational methods have been proposed in recent years. However, few works have summarized these methods or compared their performance. In this paper, we divided the existing methods into three categories: information propagation, traditional machine learning and deep learning. Then, the baseline methods in each category are introduced in detail. Further, 5 different datasets are collected, and 14 representative methods of each category are selected and compared in 5-fold and 10-fold cross-validation and in the de novo experiment. To further evaluate the effectiveness of these methods, six common cancers are selected to compare the number of correctly identified circRNA-disease associations in the top-10, top-20, top-50, top-100 and top-200. In addition, based on the results, observations about the robustness and characteristics of these methods are summarized. Finally, the future directions and challenges are discussed.


Subjects
Neoplasms , Circular RNA , Humans , Circular RNA/genetics , Benchmarking , Machine Learning , Neoplasms/genetics , Computational Biology/methods
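
Note on top-k evaluation: the case studies in this listing count how many of the k highest-scoring candidates for a disease are confirmed by a reference database. A minimal sketch of that count follows. The function name `top_k_hits` and the toy identifiers in the usage note are hypothetical, not taken from any listed paper.

```python
def top_k_hits(scores, known, ks=(10, 20, 50, 100, 200)):
    """Count confirmed candidates among the k highest-scoring ones.

    scores: dict mapping candidate id -> predicted association score
    known:  set of candidate ids confirmed by a reference database
    Returns a dict mapping each k to the number of confirmed candidates.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {k: sum(1 for c in ranked[:k] if c in known) for k in ks}
```

For example, with scores `{"a": 0.9, "b": 0.8, "d": 0.7, "c": 0.1}` and confirmed set `{"a", "d"}`, the top-3 ranking is a, b, d, so two of the three candidates count as hits.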
8.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36528809

ABSTRACT

MOTIVATION: Exploring potential long noncoding RNA (lncRNA)-disease associations (LDAs) plays a critical role in understanding disease etiology and pathogenesis. Given the high cost of biological experiments, developing a computational method is a practical necessity to effectively accelerate the experimental screening of candidate LDAs. However, given the high sparsity of LDA datasets, many computational models hardly exploit enough knowledge to learn comprehensive patterns of node representations. Moreover, although the metapath-based GNN has recently been introduced into LDA prediction, it discards intermediate nodes along the meta-path, which results in information loss. RESULTS: This paper presents a new multi-view contrastive heterogeneous graph attention network (GAT) for lncRNA-disease association prediction, MCHNLDA for brevity. Specifically, MCHNLDA first leverages rich biological data sources of lncRNA, gene and disease to construct two-view graphs: a feature structural graph for the feature schema view and a lncRNA-gene-disease heterogeneous graph for the network topology view. Then, we design a cross-contrastive learning task to collaboratively guide graph embeddings of the two views without relying on any labels. In this way, we can pull closer the nodes with similar features and network topology, and push other nodes away. Furthermore, we propose a heterogeneous contextual GAT, where a long short-term memory network is incorporated into the attention mechanism to effectively capture sequential structure information along the meta-path. Extensive experimental comparisons against several state-of-the-art methods show the effectiveness of the proposed framework. The code and data of the proposed framework are freely available at https://github.com/zhaoxs686/MCHNLDA.


Subjects
Long Noncoding RNA , Long Noncoding RNA/genetics , Learning
9.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36545805

ABSTRACT

Discovering the relationships between long non-coding RNAs (lncRNAs) and diseases is significant in the treatment, diagnosis and prevention of diseases. However, current identified lncRNA-disease associations are not enough because of the expensive and heavy workload of wet laboratory experiments. Therefore, it is greatly important to develop an efficient computational method for predicting potential lncRNA-disease associations. Previous methods showed that combining the prediction results of the lncRNA-disease associations predicted by different classification methods via Learning to Rank (LTR) algorithm can be effective for predicting potential lncRNA-disease associations. However, when the classification results are incorrect, the ranking results will inevitably be affected. We propose the GraLTR-LDA predictor based on biological knowledge graphs and ranking framework for predicting potential lncRNA-disease associations. Firstly, homogeneous graph and heterogeneous graph are constructed by integrating multi-source biological information. Then, GraLTR-LDA integrates graph auto-encoder and attention mechanism to extract embedded features from the constructed graphs. Finally, GraLTR-LDA incorporates the embedded features into the LTR via feature crossing statistical strategies to predict priority order of diseases associated with query lncRNAs. Experimental results demonstrate that GraLTR-LDA outperforms the other state-of-the-art predictors and can effectively detect potential lncRNA-disease associations. Availability and implementation: Datasets and source codes are available at http://bliulab.net/GraLTR-LDA.


Subjects
Neoplasms , Long Noncoding RNA , Humans , Long Noncoding RNA/genetics , Computational Biology/methods , Algorithms , Software
10.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36526276

ABSTRACT

Increasing studies have proved that microRNAs (miRNAs) are critical biomarkers in the development of human complex diseases. Identifying disease-related miRNAs is beneficial to disease prevention, diagnosis and remedy. Based on the assumption that similar miRNAs tend to associate with similar diseases, various computational methods have been developed to predict novel miRNA-disease associations (MDAs). However, selecting proper features for similarity calculation is a challenging task because of data deficiencies in biomedical science. In this study, we propose a deep learning-based computational method named MAGCN to predict potential MDAs without using any similarity measurements. Our method predicts novel MDAs based on known lncRNA-miRNA interactions via graph convolution networks with multichannel attention mechanism and convolutional neural network combiner. Extensive experiments show that the average area under the receiver operating characteristic values obtained by our method under 2-fold, 5-fold and 10-fold cross-validations are 0.8994, 0.9032 and 0.9044, respectively. When compared with five state-of-the-art methods, MAGCN shows improvement in terms of prediction accuracy. In addition, we conduct case studies on three diseases to discover their related miRNAs, and find that all the top 50 predictions for all the three diseases have been supported by established databases. The comprehensive results demonstrate that our method is a reliable tool in detecting new disease-related miRNAs.


Subjects
MicroRNAs , Long Noncoding RNA , Humans , Algorithms , Computational Biology/methods , Genetic Databases , MicroRNAs/genetics , Long Noncoding RNA/genetics , Deep Learning
11.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37406190

ABSTRACT

Studies have confirmed that the occurrence of many complex diseases in the human body is closely related to the microbial community, and microbes can affect tumorigenesis and metastasis by regulating the tumor microenvironment. However, there are still large gaps in the clinical observation of the microbiota in disease. Although biological experiments are accurate in identifying disease-associated microbes, they are also time-consuming and expensive. Computational models for effective identification of disease-related microbes can shorten this process and reduce capital and time costs. Based on this, in this paper, a model named DSAE_RF is presented to predict latent microbe-disease associations by combining multi-source features and deep learning. DSAE_RF calculates four similarities between microbes and diseases, which are then used as feature vectors for the disease-microbe pairs. Next, reliable negative samples are screened by k-means clustering, and a deep sparse autoencoder neural network is further used to extract effective features of the disease-microbe pairs. On this foundation, a random forest classifier is used to predict the associations between microbes and diseases. To assess the performance of the model in this paper, 10-fold cross-validation is implemented on the same dataset. As a result, the AUC and AUPR of the model are 0.9448 and 0.9431, respectively. Furthermore, we also conduct a variety of experiments, including comparison of negative sample selection methods, comparison with different models and classifiers, Kolmogorov-Smirnov test and t-test, ablation experiments, robustness analysis, and case studies on COVID-19 and colorectal cancer. The results demonstrate the reliability and availability of our model.


Subjects
COVID-19 , Deep Learning , Microbiota , Humans , Reproducibility of Results , Algorithms , Computational Biology/methods
12.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36592062

ABSTRACT

Recent studies have revealed that long noncoding RNAs (lncRNAs) are closely linked to several human diseases, providing new opportunities for their use in detection and therapy. Many graph propagation and similarity fusion approaches can be used for predicting potential lncRNA-disease associations. However, existing similarity fusion approaches suffer from noise and self-similarity loss in the fusion process. To address these problems, a new prediction approach, termed SSMF-BLNP, based on organically combining selective similarity matrix fusion (SSMF) and bidirectional linear neighborhood label propagation (BLNP), is proposed in this paper to predict lncRNA-disease associations. In SSMF, self-similarity networks of lncRNAs and diseases are obtained by selective preprocessing and nonlinear iterative fusion. The fusion process assigns weights to each initial similarity network and introduces a unit matrix that can reduce noise and compensate for the loss of self-similarity. In BLNP, the initial lncRNA-disease associations are employed in both lncRNA and disease directions as label information for linear neighborhood label propagation. The propagation is then performed on the self-similarity network obtained from SSMF to derive the scoring matrix for predicting the relationships between lncRNAs and diseases. Experimental results showed that SSMF-BLNP performed better than seven other state-of-the-art approaches. Furthermore, a case study demonstrated up to 100% and 80% accuracy in 10 lncRNAs associated with hepatocellular carcinoma and 10 lncRNAs associated with renal cell carcinoma, respectively. The source code and datasets used in this paper are available at: https://github.com/RuiBingo/SSMF-BLNP.


Subjects
Long Noncoding RNA , Humans , Algorithms , Computational Biology/methods , Long Noncoding RNA/genetics , Software , Hepatocellular Carcinoma/genetics , Renal Cell Carcinoma/genetics , Liver Neoplasms/genetics , Kidney Neoplasms/genetics
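
Note on bidirectional label propagation: the abstract above describes propagating known associations in both the lncRNA and the disease direction over similarity networks. The sketch below uses the generic update F ← αWF + (1 − α)Y in each direction and averages the two results. The averaging step, the parameter values and all names are assumptions for illustration. The similarity matrices `W_l` and `W_d` are taken as given and row-normalized, so the linear neighborhood similarity and the SSMF fusion step are not reproduced here.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(M):
    return [list(row) for row in zip(*M)]


def propagate(W, Y, alpha=0.5, iters=50):
    """Iterate F <- alpha * W F + (1 - alpha) * Y, starting from F = Y.

    W: row-normalized similarity matrix (n x n)
    Y: initial label (association) matrix (n x m)
    """
    F = [row[:] for row in Y]
    for _ in range(iters):
        WF = matmul(W, F)
        F = [[alpha * wf + (1 - alpha) * y for wf, y in zip(wf_row, y_row)]
             for wf_row, y_row in zip(WF, Y)]
    return F


def bidirectional_scores(W_l, W_d, Y, alpha=0.5, iters=50):
    """Propagate in the lncRNA direction (rows of Y) and in the disease
    direction (columns of Y), then average the two score matrices.
    Averaging is an assumption made for this sketch."""
    F_l = propagate(W_l, Y, alpha, iters)
    F_d = transpose(propagate(W_d, transpose(Y), alpha, iters))
    return [[(a + b) / 2 for a, b in zip(row_l, row_d)]
            for row_l, row_d in zip(F_l, F_d)]
```

With two equally similar lncRNAs and one disease known only for the first, the second lncRNA receives a non-zero score through its neighbor, which is the effect that makes unobserved pairs rankable.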
13.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37482409

ABSTRACT

Numerous biological studies have shown that considering disease-associated micro RNAs (miRNAs) as potential biomarkers or therapeutic targets offers new avenues for the diagnosis of complex diseases. Computational methods have gradually been introduced to reveal disease-related miRNAs. Considering that previous models have not fused sufficiently diverse similarities, that their inappropriate fusion methods may lead to poor quality of the comprehensive similarity network and that their results are often limited by insufficiently known associations, we propose a computational model called Generative Adversarial Matrix Completion Network based on Multi-source Data Fusion (GAMCNMDF) for miRNA-disease association prediction. We create a diverse network connecting miRNAs and diseases, which is then represented using a matrix. The main task of GAMCNMDF is to complete the matrix and obtain the predicted results. The main innovations of GAMCNMDF are reflected in two aspects: GAMCNMDF integrates diverse data sources and employs a nonlinear fusion approach to update the similarity networks of miRNAs and diseases. Also, some additional information is provided to GAMCNMDF in the form of a 'hint' so that GAMCNMDF can work successfully even when complete data are not available. Compared with other methods, the outcomes of 10-fold cross-validation on two distinct databases validate the superior performance of GAMCNMDF with statistically significant results. It is worth mentioning that we apply GAMCNMDF in the identification of underlying small molecule-related miRNAs, yielding outstanding performance results in this specific domain. In addition, two case studies about two important neoplasms show that GAMCNMDF is a promising prediction method.


Subjects
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , Algorithms , Computational Biology/methods , Neoplasms/genetics , Genetic Databases , Genetic Predisposition to Disease
14.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37985451

ABSTRACT

Non-coding RNAs (ncRNAs) play a critical role in the occurrence and development of numerous human diseases. Consequently, studying the associations between ncRNAs and diseases has garnered significant attention from researchers in recent years. Various computational methods have been proposed to explore ncRNA-disease relationships, with Graph Neural Network (GNN) emerging as a state-of-the-art approach for ncRNA-disease association prediction. In this survey, we present a comprehensive review of GNN-based models for ncRNA-disease associations. Firstly, we provide a detailed introduction to ncRNAs and GNNs. Next, we delve into the motivations behind adopting GNNs for predicting ncRNA-disease associations, focusing on data structure, high-order connectivity in graphs and sparse supervision signals. Subsequently, we analyze the challenges associated with using GNNs in predicting ncRNA-disease associations, covering graph construction, feature propagation and aggregation, and model optimization. We then present a detailed summary and performance evaluation of existing GNN-based models in the context of ncRNA-disease associations. Lastly, we explore potential future research directions in this rapidly evolving field. This survey serves as a valuable resource for researchers interested in leveraging GNNs to uncover the complex relationships between ncRNAs and diseases.


Subjects
Computer Neural Networks , Untranslated RNA , Humans , Untranslated RNA/genetics , Research Personnel
15.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37587836

ABSTRACT

Recent studies have demonstrated the significant role that circRNA plays in the progression of human diseases. Identifying circRNA-disease associations (CDA) in an efficient manner can offer crucial insights into disease diagnosis. While traditional biological experiments can be time-consuming and labor-intensive, computational methods have emerged as a viable alternative in recent years. However, these methods are often limited by data sparsity and their inability to explore high-order information. In this paper, we introduce a novel method named Knowledge Graph Encoder from Transformer for predicting CDA (KGETCDA). Specifically, KGETCDA first integrates more than 10 databases to construct a large heterogeneous non-coding RNA dataset, which contains multiple relationships between circRNA, miRNA, lncRNA and disease. Then, a biological knowledge graph is created based on this dataset and Transformer-based knowledge representation learning and attentive propagation layers are applied to obtain high-quality embeddings with accurately captured high-order interaction information. Finally, multilayer perceptron is utilized to predict the matching scores of CDA based on their embeddings. Our empirical results demonstrate that KGETCDA significantly outperforms other state-of-the-art models. To enhance user experience, we have developed an interactive web-based platform named HNRBase that allows users to visualize, download data and make predictions using KGETCDA with ease. The code and datasets are publicly available at https://github.com/jinyangwu/KGETCDA.


Subjects
Circular RNA , Long Noncoding RNA , Humans , Automated Pattern Recognition , Learning , Factual Databases , Knowledge Bases , Computational Biology
16.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37287133

ABSTRACT

MicroRNAs (miRNAs) are a family of non-coding RNA molecules with vital roles in regulating gene expression. Although researchers have recognized the importance of miRNAs in the development of human diseases, it is very resource-consuming to use experimental methods for identifying which dysregulated miRNA is associated with a specific disease. To reduce the cost of human effort, a growing body of studies has leveraged computational methods for predicting the potential miRNA-disease associations. However, the extant computational methods usually ignore the crucial mediating role of genes and suffer from the data sparsity problem. To address this limitation, we introduce the multi-task learning technique and develop a new model called MTLMDA (Multi-Task Learning model for predicting potential MicroRNA-Disease Associations). Different from existing models that only learn from the miRNA-disease network, our MTLMDA model exploits both miRNA-disease and gene-disease networks for improving the identification of miRNA-disease associations. To evaluate model performance, we compare our model with competitive baselines on a real-world dataset of experimentally supported miRNA-disease associations. Empirical results show that our model performs best using various performance metrics. We also examine the effectiveness of model components via ablation study and further showcase the predictive power of our model for six types of common cancers. The data and source code are available from https://github.com/qwslle/MTLMDA.


Subjects
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Algorithms , Computational Biology/methods , Neoplasms/genetics , Software
17.
Hum Genomics ; 18(1): 31, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38523305

ABSTRACT

PURPOSE: Coding mutations in the Transthyretin (TTR) gene cause a hereditary form of amyloidosis characterized by a complex genotype-phenotype correlation with limited information regarding differences among worldwide populations. METHODS: We compared 676 diverse individuals carrying TTR amyloidogenic mutations (rs138065384, Phe44Leu; rs730881165, Ala81Thr; rs121918074, His90Asn; rs76992529, Val122Ile) to 12,430 non-carriers matched by age, sex, and genetically-inferred ancestry to assess their clinical presentations across 1,693 outcomes derived from electronic health records in UK Biobank. RESULTS: In individuals of African descent (AFR), the Val122Ile mutation was linked to multiple outcomes related to the circulatory system (fold-enrichment = 2.96, p = 0.002) with the strongest associations being cardiac congenital anomalies (phecode 747.1, p = 0.003), endocarditis (phecode 420.3, p = 0.006), and cardiomyopathy (phecode 425, p = 0.007). In individuals of Central-South Asian descent (CSA), the His90Asn mutation was associated with dermatologic outcomes (fold-enrichment = 28, p = 0.001). The same TTR mutation was linked to neoplasms in European-descent individuals (EUR, fold-enrichment = 3.09, p = 0.003). In EUR, Ala81Thr showed multiple associations with respiratory outcomes (fold-enrichment = 3.61, p = 0.002), but the strongest association was with atrioventricular block (phecode 426.2, p = 2.81 × 10⁻⁴). Additionally, the same mutation in East Asians (EAS) showed associations with endocrine-metabolic traits (fold-enrichment = 4.47, p = 0.003). In the cross-ancestry meta-analysis, the Val122Ile mutation was associated with peripheral nerve disorders (phecode 351, p = 0.004) in addition to cardiac congenital anomalies (fold-enrichment = 6.94, p = 0.003). CONCLUSIONS: Overall, these findings highlight that TTR amyloidogenic mutations present ancestry-specific and ancestry-convergent associations related to a range of health domains. This supports the need to increase awareness regarding the range of outcomes associated with TTR mutations across worldwide populations to reduce misdiagnosis and delayed diagnosis of TTR-related amyloidosis.


Subjects
Amyloidosis , Prealbumin , Humans , Prealbumin/genetics , Mutation , Amyloidosis/diagnosis , Amyloidosis/genetics , Phenotype , Genetics, Population
18.
Methods ; 229: 71-81, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38909974

ABSTRACT

Identifying miRNA-disease associations (MDAs) is crucial for improving the diagnosis and treatment of various diseases. However, biological experiments can be time-consuming and expensive. To overcome these challenges, computational approaches have been developed, with graph convolutional networks (GCNs) showing promising results in MDA prediction. The success of GCN-based methods relies on learning a meaningful spatial operator to extract effective node feature representations. To enhance the inference of MDAs, we propose a novel method called PGCNMDA, which employs graph convolutional networks with a graph spatial operator learned from paths. This approach enables the generation of meaningful spatial convolutions from paths in the GCN, leading to improved prediction performance. On HMDD v2.0, PGCNMDA obtains a mean AUC of 0.9229 and an AUPRC of 0.9206 under 5-fold cross-validation (5-CV), and a mean AUC of 0.9235 and an AUPRC of 0.9212 under 10-fold cross-validation (10-CV). Additionally, the AUC of PGCNMDA reaches 0.9238 under global leave-one-out cross-validation (GLOOCV). On HMDD v3.2, PGCNMDA obtains a mean AUC of 0.9413 and an AUPRC of 0.9417 under 5-CV, and a mean AUC of 0.9419 and an AUPRC of 0.9425 under 10-CV. Furthermore, the AUC of PGCNMDA reaches 0.9415 under GLOOCV. The results show that PGCNMDA is superior to the other compared methods. In addition, case studies on pancreatic neoplasms, thyroid neoplasms, and leukemia show that 50, 50, and 48 of the top 50 predicted miRNAs linked to these diseases are confirmed, respectively. This further validates the effectiveness and feasibility of PGCNMDA in practical applications.
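The mean AUC under k-fold cross-validation reported above can be illustrated with a minimal sketch. It is not the PGCNMDA evaluation code: the function names (`auc`, `stratified_folds`, `mean_fold_auc`) and the toy data are hypothetical, and for brevity the scores are fixed in advance, whereas a real experiment would retrain the model on the remaining folds before scoring each held-out fold.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney rank-sum statistic; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1  # extend the group of tied scores
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        rank_sum_pos += avg_rank * sum(pairs[t][1] for t in range(i, j + 1))
        i = j + 1
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, spreading each class evenly (hypothetical helper)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def mean_fold_auc(labels, scores, k=5, seed=0):
    """Average the AUC computed separately on each held-out fold."""
    folds = stratified_folds(labels, k, seed)
    per_fold = [auc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]
    return sum(per_fold) / k


if __name__ == "__main__":
    # Toy data: 1 = known miRNA-disease association, 0 = unlabeled pair (illustrative only).
    toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.5, 0.2, 0.1, 0.35]
    print(mean_fold_auc(toy_labels, toy_scores, k=5))
```

Averaging per-fold AUCs, rather than pooling all predictions into one curve, matches the usual meaning of "mean AUC" in this setting; either convention should be stated when results are compared.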


Subjects
MicroRNAs , Humans , MicroRNAs/genetics , Computational Biology/methods , Neural Networks, Computer , Genetic Predisposition to Disease , Area Under Curve , Pancreatic Neoplasms/genetics , Algorithms
19.
Methods ; 221: 73-81, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38123109

ABSTRACT

Research indicates that miRNAs present in herbal medicines are crucial for identifying disease markers, advancing gene therapy, and facilitating drug delivery. These miRNAs remain stable in the extracellular environment, making them viable tools for disease diagnosis. They can withstand the digestive processes in the gastrointestinal tract, positioning them as potential carriers for specific oral drug delivery. By engineering plants to generate effective, non-toxic miRNA interference sequences, it is possible to broaden their applicability, including the treatment of diseases such as hepatitis C. Consequently, investigating the miRNA-disease associations (MDAs) within herbal medicines holds considerable promise for diagnosing and addressing miRNA-related diseases. We propose the SGAE-MDA model, which combines a graph autoencoder (GAE) with a semi-supervised approach to uncover potential MDAs in herbal medicines more effectively. Leveraging the GAE framework, the SGAE-MDA model integrates the inherent feature vectors of miRNA and disease nodes with the regulatory data in the miRNA-disease network. Additionally, the proposed semi-supervised learning approach randomly hides part of the structure of the miRNA-disease network and then reconstructs it within the GAE framework. This technique effectively minimizes network noise interference. In comparisons against other leading deep learning models, the proposed SGAE-MDA model consistently showed superior performance. Our code and dataset are available at: https://github.com/22n9n23/SGAE-MDA.
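The step of randomly hiding part of the miRNA-disease network before reconstruction can be sketched as follows. This is an illustration under stated assumptions, not the SGAE-MDA implementation: the function name `mask_associations`, the `hide_fraction` parameter, and the toy adjacency matrix are all hypothetical, and the autoencoder that would reconstruct the hidden entries is omitted.

```python
import random


def mask_associations(adj, hide_fraction=0.2, seed=0):
    """Hide a random fraction of known associations (entries equal to 1).

    Returns a masked copy of the matrix and the list of hidden (row, col)
    positions, which can serve as reconstruction targets. The input matrix
    is left unchanged.
    """
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(adj) for j, v in enumerate(row) if v == 1]
    n_hide = int(len(known) * hide_fraction)  # rounds down
    hidden = rng.sample(known, n_hide)
    masked = [row[:] for row in adj]  # copy each row so adj is not modified
    for i, j in hidden:
        masked[i][j] = 0
    return masked, hidden


if __name__ == "__main__":
    # Toy 3 miRNAs x 4 diseases association matrix (illustrative values only).
    toy_adj = [
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [1, 0, 0, 1],
    ]
    masked, hidden = mask_associations(toy_adj, hide_fraction=0.4, seed=1)
    print(masked)
    print(hidden)
```

Returning the hidden positions separately keeps the training signal explicit: a model sees only `masked` as input and is scored on how well it recovers the entries listed in `hidden`.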


Subjects
MicroRNAs , MicroRNAs/genetics , Algorithms , Computational Biology/methods , Supervised Machine Learning , Plant Extracts
20.
BMC Bioinformatics ; 25(1): 46, 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38287236

ABSTRACT

BACKGROUND: Many biological studies have shown that lncRNAs regulate the expression of epigenetically related genes. The study of lncRNAs has helped to deepen our understanding of the pathogenesis of complex diseases at the molecular level. Because of the large number of lncRNAs and the complex and time-consuming nature of biological experiments, applying computational techniques to predict potential lncRNA-disease associations is very effective. To explore information in complex network structures, existing methods rely mainly on lncRNA and disease information. Metapaths have been applied to network models as an effective method for exploring information in heterogeneous graphs. However, existing methods are dominated by lncRNA or disease nodes and tend to ignore the paths provided by intermediate nodes. METHODS: We propose a deep learning model based on hierarchical graphical attention networks to predict unknown lncRNA-disease associations using multiple types of metapaths to extract features. We named this model MMHGAN. First, the model constructs a lncRNA-disease-miRNA heterogeneous graph based on known associations, together with two homogeneous graphs of lncRNAs and diseases. Second, for the homogeneous graphs, the features of neighboring nodes are aggregated using a multihead attention mechanism. Third, for the heterogeneous graph, metapaths through different intermediate nodes are selected to construct subgraphs, and the importance of different types of metapaths is calculated and aggregated to obtain the final embedded features. Finally, the features are reconstructed using a fully connected layer to obtain the prediction results. RESULTS: Using fivefold cross-validation, we obtained an average AUC of 96.07% and an average AUPR of 93.23%. Additionally, ablation experiments demonstrated the role of the homogeneous graphs and of the different intermediate node path weights. In addition, we studied lung cancer, esophageal carcinoma, and breast cancer. Among the 15 lncRNAs associated with these diseases, 15, 12, and 14 lncRNAs were validated by the lncRNA Disease Database and the Lnc2Cancer Database, respectively. CONCLUSION: We compared the MMHGAN model with six existing high-performing models, and the case study demonstrated that the model was effective in predicting potential lncRNA-disease associations.
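The third step above, weighting metapath types by importance and aggregating them into a final embedding, can be illustrated with a small sketch. It assumes a softmax over scalar importance scores followed by a weighted sum, which is a common form of semantic-level attention and not necessarily the exact MMHGAN formulation; the function names and toy vectors are hypothetical, and in practice the importance scores would be learned rather than supplied by hand.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_metapaths(embeddings, importance):
    """Fuse per-metapath node embeddings with attention weights.

    embeddings: one vector per metapath type (all of the same length).
    importance: one raw importance score per metapath type (assumed given).
    Returns the fused embedding and the normalized weights.
    """
    if len(embeddings) != len(importance):
        raise ValueError("need exactly one importance score per metapath")
    weights = softmax(importance)
    dim = len(embeddings[0])
    fused = [sum(w * emb[d] for w, emb in zip(weights, embeddings)) for d in range(dim)]
    return fused, weights


if __name__ == "__main__":
    # Toy embeddings of one lncRNA node under two metapath types (illustrative only).
    via_mirna = [1.0, 0.0, 2.0]
    via_disease = [0.0, 1.0, 4.0]
    fused, weights = aggregate_metapaths([via_mirna, via_disease], [0.3, 1.2])
    print(weights)
    print(fused)
```

Normalizing with softmax keeps the weights positive and summing to one, so the fused embedding stays on the same scale as its inputs regardless of how many metapath types are combined.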


Subjects
Breast Neoplasms , Lung Neoplasms , MicroRNAs , RNA, Long Noncoding , Humans , Female , RNA, Long Noncoding/genetics , Computational Biology/methods , MicroRNAs/genetics , Algorithms