Results 1 - 20 of 215
1.
Front Microbiol ; 15: 1438942, 2024.
Article in English | MEDLINE | ID: mdl-39355422

ABSTRACT

Background: Clinical studies have demonstrated that microbes play a crucial role in human health and disease. Identifying microbe-disease interactions can provide insights into pathogenesis and promote the diagnosis, treatment, and prevention of disease. Although many computational methods have been designed to screen for novel microbe-disease associations, accurate and efficient methods are still lacking because of data inconsistency, underutilization of prior information, and limited model performance. Methods: In this study, we propose an improved deep learning-based framework, named GIMMDA, to identify latent microbe-disease associations; it is based on a graph autoencoder and inductive matrix completion. By co-training on information from the microbe and disease spaces, the framework learns new representations of microbes and diseases and uses them to reconstruct microbe-disease associations end to end. In particular, a similarity fusion strategy is used to improve prediction performance. Results: The experimental results show that GIMMDA is competitive with existing state-of-the-art methods on 3 datasets (i.e., HMDAD, Disbiome, and multiMDA). In particular, it performs best, with areas under the receiver operating characteristic curve (AUC) of 0.9735, 0.9156, and 0.9396 on these 3 datasets, respectively. The results also confirm that different similarity fusions can improve prediction performance. Furthermore, case studies on two diseases, asthma and obesity, validate the effectiveness and reliability of the proposed model. Conclusion: The proposed GIMMDA model shows a strong capability in predicting microbe-disease associations. We expect that GIMMDA will help identify potential microbe-related diseases in the future.
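
The AUC values quoted in this abstract (and in most entries of this listing) summarize how well predicted scores rank known associations above unknown ones. As a minimal sketch, assuming binary labels and one score per microbe-disease pair, the AUC can be computed as the fraction of positive/negative pairs that are ranked correctly. The function name and the toy data are illustrative assumptions, not taken from the GIMMDA paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = known association, 0 = unknown pair.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 5 of the 6 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples; it is written for clarity, and a rank-based formulation would be preferable on real association matrices.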

2.
BMC Genomics ; 25(1): 885, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39304826

ABSTRACT

MicroRNAs (miRNAs) have been demonstrated to be closely related to human diseases. Studying the potential associations between miRNAs and diseases contributes to our understanding of disease pathogenic mechanisms. As traditional biological experiments are costly and time-consuming, computational models can serve as effective complementary tools. In this study, we propose a novel model of robust orthogonal non-negative matrix tri-factorization (NMTF) with self-paced learning and dual hypergraph regularization, named SPLHRNMTF, to predict miRNA-disease associations. More specifically, SPLHRNMTF first uses a non-linear fusion method to obtain comprehensive miRNA and disease similarities. Subsequently, the miRNA-disease association matrix is reformulated based on weighted k-nearest neighbor profiles to correct false-negative associations. In addition, we use the L2,1 norm in place of the Frobenius norm to calculate the residual error, alleviating the impact of noise and outliers on prediction performance. Then, we integrate self-paced learning into NMTF to keep the model from falling into bad local optima by gradually including samples from easy to complex. Finally, hypergraph regularization is introduced to capture high-order complex relations from hypergraphs related to miRNAs and diseases. In five repetitions of 5-fold cross-validation, SPLHRNMTF obtains higher average AUC values than other baseline models. Moreover, case studies on breast neoplasms and lung neoplasms further demonstrate the accuracy of SPLHRNMTF, and the potential associations discovered are of biological significance.


Subjects
Computational Biology , MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Algorithms , Genetic Predisposition to Disease , Machine Learning , Lung Neoplasms/genetics
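
The abstract above replaces the Frobenius norm with the L2,1 norm when measuring residual error. A minimal sketch of the difference, assuming the common row-wise convention (the L2,1 norm is the sum of the Euclidean norms of the rows; some papers sum over columns instead). The function names and the toy residual matrix are illustrative assumptions.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of all squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows (row-wise L2,1 convention)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


# Hypothetical residual matrix: the third row plays the role of an outlier sample.
residual = [[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]
fro = frobenius_norm(residual)  # sqrt(9 + 16 + 36 + 64) = sqrt(125)
l21 = l21_norm(residual)        # 5 + 0 + 10 = 15
```

In a squared Frobenius loss each row contributes its squared norm, so a large outlier row dominates; under the L2,1 norm each row contributes its norm linearly, which is the usual argument for its robustness to noisy samples.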
3.
J Cell Mol Med ; 28(18): e70071, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39300612

ABSTRACT

The use of matrix completion methods to predict associations between microbes and diseases can effectively improve treatment efficiency. However, the similarity measures used in existing methods are often influenced by factors such as neighbourhood size, the choice of similarity metric, or the multiple parameters needed for similarity fusion, which makes similarity measurement challenging. Additionally, matrix completion is currently limited by the sparsity of the initial association matrix, which restricts its predictive performance. To address these problems, we propose a matrix completion method based on adaptive neighbourhood similarity and sparse constraints (ANS-SCMC) for predicting potential microbe-disease associations. Adaptive neighbourhood similarity learning dynamically uses the decomposition results as effective information for the next learning iteration by simultaneously performing local manifold structure learning and decomposition. This approach effectively preserves fine local structure information and avoids the influence of weight parameters directly involved in similarity measurement. Additionally, the sparse constraint-based matrix completion approach can better handle the sparsity of the association matrix. In validation, the proposed algorithm achieved significantly higher predictive performance than several commonly used prediction methods. Furthermore, in the case study, the prediction algorithm achieved an accuracy of up to 80% for the top 10 microbes associated with type 1 diabetes and 100% for Crohn's disease.


Subjects
Algorithms , Humans , Computational Biology/methods , Microbiota , Crohn Disease/microbiology
4.
Evol Bioinform Online ; 20: 11769343241272414, 2024.
Article in English | MEDLINE | ID: mdl-39279816

ABSTRACT

The identification of potential interactions and relationships between diseases and drugs is significant in public health care and drug discovery. Determining drug-disease interactions experimentally is very expensive in both time and money, and many potential drug-disease associations remain undiscovered. The development of computational methods to explore the relationship between drugs and diseases is therefore essential. Many computational methods for predicting drug-disease associations have been developed that learn potential interactions of unknown drug-disease pairs from known interactions. In this paper, we propose 3 new main groups of meta-paths based on the heterogeneous biological network of drug-protein-disease objects. For each meta-path, we design a machine learning model, and these models are then combined into an integrated learning method. We evaluated our approach on 3 standard datasets: DrugBank, OMIM, and Gottlieb's dataset. Experimental results demonstrate that the proposed method outperforms some recent methods, such as EMP-SVD, LRSSL, MBiRW, MPG-DDA, and SCMFDD, on some measures such as AUC, AUPR, and F1-score.

5.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39175132

ABSTRACT

Numerous studies have demonstrated that microRNAs (miRNAs) are critically important for the prediction, diagnosis, and characterization of diseases. However, identifying miRNA-disease associations through traditional biological experiments is both costly and time-consuming. To further explore these associations, we proposed a model based on hybrid high-order moments combined with element-level attention mechanisms (HHOMR). This model innovatively fused hybrid higher-order statistical information along with structural and community information. Specifically, we first constructed a heterogeneous graph based on existing associations between miRNAs and diseases. HHOMR employs a structural fusion layer to capture structure-level embeddings and leverages a hybrid high-order moments encoder layer to enhance features. Element-level attention mechanisms are then used to adaptively integrate the features of these hybrid moments. Finally, a multi-layer perceptron is utilized to calculate the association scores between miRNAs and diseases. Through five-fold cross-validation on HMDD v2.0, we achieved a mean AUC of 93.28%. Compared with four state-of-the-art models, HHOMR exhibited superior performance. Additionally, case studies on three diseases-esophageal neoplasms, lymphoma, and prostate neoplasms-were conducted. Among the top 50 miRNAs with high disease association scores, 46, 47, and 45 associated with these diseases were confirmed by the dbDEMC and miR2Disease databases, respectively. Our results demonstrate that HHOMR not only outperforms existing models but also shows significant potential in predicting miRNA-disease associations.


Subjects
MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Genetic Predisposition to Disease , Algorithms , Prostatic Neoplasms/genetics , Models, Genetic
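
The case-study figures in the abstract above (46, 47, and 45 of the top 50 predicted miRNAs confirmed) are hits-at-k counts. A minimal sketch of that bookkeeping follows; the helper name and the placeholder miRNA identifiers are hypothetical, and only the three confirmed counts come from the abstract.

```python
def hits_at_k(ranked_candidates, confirmed, k):
    """Count how many of the k highest-ranked candidates appear in the confirmed set."""
    return sum(1 for candidate in ranked_candidates[:k] if candidate in confirmed)


# Hypothetical ranking (best first) and a hypothetical set of database-confirmed miRNAs.
ranked = ["miR-a", "miR-b", "miR-c", "miR-d", "miR-e"]
confirmed = {"miR-a", "miR-c", "miR-d"}
hits = hits_at_k(ranked, confirmed, 4)

# Confirmed counts reported in the abstract, each out of the top 50 predictions.
reported = {"esophageal neoplasms": 46, "lymphoma": 47, "prostate neoplasms": 45}
rates = {disease: count / 50 for disease, count in reported.items()}
```

Here `rates` maps each disease to the fraction of its top 50 predictions that were confirmed, which makes case studies with different cut-offs (top 10, top 30, top 50) directly comparable.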
6.
Front Microbiol ; 15: 1435408, 2024.
Article in English | MEDLINE | ID: mdl-39144226

ABSTRACT

Introduction: Accumulating evidence shows that human health and disease are closely related to the microbes in the human body. Methods: In this manuscript, a new computational model based on graph attention networks and sparse autoencoders, called GCANCAE, was proposed for inferring possible microbe-disease associations. In GCANCAE, we first constructed a heterogeneous network by combining known microbe-disease relationships, disease similarity, and microbial similarity. Then, we adopted the improved GCN and the CSAE to extract neighbor relations in the adjacency matrix and novel feature representations in heterogeneous networks. After that, in order to estimate the likelihood of a potential microbe associated with a disease, we integrated these two types of representations to create unique eigenmatrices for diseases and microbes, respectively, and obtained predicted scores for potential microbe-disease associations by calculating the inner product of these two types of eigenmatrices. Results and discussion: Based on the baseline databases such as the HMDAD and the Disbiome, intensive experiments were conducted to evaluate the prediction ability of GCANCAE, and the experimental results demonstrated that GCANCAE achieved better performance than state-of-the-art competitive methods under the frameworks of both 2-fold and 5-fold CV. Furthermore, case studies of three categories of common diseases, such as asthma, irritable bowel syndrome (IBS), and type 2 diabetes (T2D), confirmed the efficiency of GCANCAE.
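
The final scoring step described above, taking the inner product of microbe and disease representations, can be sketched in a few lines. This is a minimal illustration assuming each microbe and each disease has already been embedded as a vector of the same length; the embedding values are made up, and the GCN and sparse-autoencoder stages that would produce them are not shown.

```python
def score_matrix(microbe_embeddings, disease_embeddings):
    """Return scores[i][j] = inner product of microbe i and disease j embeddings."""
    return [
        [sum(m * d for m, d in zip(microbe_vec, disease_vec))
         for disease_vec in disease_embeddings]
        for microbe_vec in microbe_embeddings
    ]


# Hypothetical 2-dimensional embeddings for 2 microbes and 3 diseases.
microbes = [[1.0, 0.0], [0.5, 0.5]]
diseases = [[0.2, 0.8], [1.0, 0.0], [0.0, 1.0]]
scores = score_matrix(microbes, diseases)
```

A higher entry in `scores` is read as a more likely association; in practice the raw inner products are often passed through a sigmoid, which is a design choice the abstract does not specify.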

7.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980370

ABSTRACT

RepurposeDrugs (https://repurposedrugs.org/) is a comprehensive web-portal that combines a unique drug indication database with a machine learning (ML) predictor to discover new drug-indication associations for approved as well as investigational mono and combination therapies. The platform provides detailed information on treatment status, disease indications and clinical trials across 25 indication categories, including neoplasms and cardiovascular conditions. The current version comprises 4314 compounds (approved, terminated or investigational) and 161 drug combinations linked to 1756 indications/conditions, totaling 28 148 drug-disease pairs. By leveraging data on both approved and failed indications, RepurposeDrugs provides ML-based predictions for the approval potential of new drug-disease indications, both for mono- and combinatorial therapies, demonstrating high predictive accuracy in cross-validation. The validity of the ML predictor is validated through a number of real-world case studies, demonstrating its predictive power to accurately identify repurposing candidates with a high likelihood of future approval. To our knowledge, RepurposeDrugs web-portal is the first integrative database and ML-based predictor for interactive exploration and prediction of both single-drug and combination approval likelihood across indications. Given its broad coverage of indication areas and therapeutic options, we expect it accelerates many future drug repurposing projects.


Subjects
Drug Repositioning , Machine Learning , Drug Repositioning/methods , Humans , Internet , Drug Therapy, Combination , Databases, Pharmaceutical , Databases, Factual
9.
Math Biosci Eng ; 21(4): 4814-4834, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38872515

ABSTRACT

Long non-coding RNA (lncRNA) is considered to be a crucial regulator involved in various human biological processes, including the regulation of tumor immune checkpoint proteins. It has great potential as both a cancer biomolecular biomarker and therapeutic target. Nevertheless, conventional biological experimental techniques are both resource-intensive and laborious, making it essential to develop an accurate and efficient computational method to facilitate the discovery of potential links between lncRNAs and diseases. In this study, we proposed HRGCNLDA, a computational approach utilizing hierarchical refinement of graph convolutional neural networks for forecasting lncRNA-disease potential associations. This approach effectively addresses the over-smoothing problem that arises from stacking multiple layers of graph convolutional neural networks. Specifically, HRGCNLDA enhances the layer representation during message propagation and node updates, thereby amplifying the contribution of hidden layers that resemble the ego layer while reducing discrepancies. The results of the experiments showed that HRGCNLDA achieved the highest AUC-ROC (area under the receiver operating characteristic curve, AUC for short) and AUC-PR (area under the precision versus recall curve, AUPR for short) values compared to other methods. Finally, to further demonstrate the reliability and efficacy of our approach, we performed case studies on the case of three prevalent human diseases, namely, breast cancer, lung cancer and gastric cancer.


Subjects
Algorithms , Area Under Curve , Computational Biology , Neural Networks, Computer , RNA, Long Noncoding , ROC Curve , RNA, Long Noncoding/genetics , Humans , Computational Biology/methods , Neoplasms/genetics , Lung Neoplasms/genetics , Breast Neoplasms/genetics , Biomarkers, Tumor/genetics , Female , Forecasting
10.
BMC Bioinformatics ; 25(1): 214, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38877401

ABSTRACT

BACKGROUND: The exploration of gene-disease associations is crucial for understanding the mechanisms underlying disease onset and progression, with significant implications for prevention and treatment strategies. Advances in high-throughput biotechnology have generated a wealth of data linking diseases to specific genes. While graph representation learning has recently introduced groundbreaking approaches for predicting novel associations, existing studies have overlooked the cumulative impact of functional modules such as protein complexes and the incompleteness of important data such as protein interactions, which limits detection performance. RESULTS: To address these limitations, we introduce a deep learning framework called ModulePred for predicting disease-gene associations. ModulePred performs graph augmentation on the protein interaction network using L3 link prediction algorithms. It builds a heterogeneous module network by integrating disease-gene associations, protein complexes and augmented protein interactions, and develops a novel graph embedding for the heterogeneous module network. Subsequently, a graph neural network is constructed to learn node representations by collectively aggregating information from the topological structure, and gene prioritization is carried out using the disease and gene embeddings obtained from the graph neural network. Experimental results underscore the superiority of ModulePred, showcasing the effectiveness of incorporating functional modules and graph augmentation in predicting disease-gene associations. This research introduces innovative ideas and directions, enhancing the understanding and prediction of gene-disease relationships.


Subjects
Algorithms , Deep Learning , Humans , Computational Biology/methods , Protein Interaction Maps/genetics , Genetic Predisposition to Disease/genetics , Neural Networks, Computer , Genetic Association Studies/methods
11.
Sci Rep ; 14(1): 12761, 2024 06 04.
Article in English | MEDLINE | ID: mdl-38834687

ABSTRACT

Abundant research has consistently illustrated the crucial role of microRNAs (miRNAs) in a wide array of essential biological processes. Furthermore, miRNAs have been validated as promising therapeutic targets for addressing complex diseases. Given the costly and time-consuming nature of traditional biological experimental validation, it is imperative to develop computational methods. In this work, we developed a novel approach named efficient matrix completion (EMCMDA) for predicting miRNA-disease associations. First, we calculated similarities across multiple sources for miRNA/disease pairs and combined this information into a holistic miRNA/disease similarity measure. Second, we used this biological information to create a heterogeneous network and established a target matrix derived from this network. Lastly, we framed miRNA-disease association prediction as a low-rank matrix completion problem, which was solved by minimizing the truncated Schatten p-norm of the matrix. Notably, we improved the conventional singular value contraction algorithm by using a weighted singular value contraction technique. This technique dynamically adjusts the degree of contraction based on the significance of each singular value, ensuring that the physical meaning of these singular values is fully considered. We evaluated the performance of EMCMDA with two distinct cross-validation experiments on two diverse databases, and the outcomes were statistically significant. In addition, we carried out comprehensive case studies on two prevalent human diseases, namely lung cancer and breast cancer. Following prediction and multiple validations, it was evident that EMCMDA proficiently predicts previously unreported disease-related miRNAs. These results underscore the robustness and efficacy of EMCMDA in miRNA-disease association prediction.


Subjects
Algorithms , Computational Biology , Genetic Predisposition to Disease , MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Breast Neoplasms/genetics
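
The abstract above describes a weighted singular value contraction in which the amount of shrinkage depends on the significance of each singular value. The exact weighting used by EMCMDA is not given in the abstract, so the sketch below assumes one common choice from the weighted nuclear norm literature: each singular value is shrunk by tau divided by its own magnitude, so large (significant) values are barely changed and small ones are driven to zero. The function name, `tau`, and `eps` are illustrative assumptions, and the SVD that would produce the singular values is omitted.

```python
def weighted_shrink(singular_values, tau, eps=1e-8):
    """Shrink each singular value s by tau / (s + eps), clipping at zero.

    Assumed weighting (not taken from the paper): weight = 1 / (s + eps),
    so more significant singular values receive less contraction.
    """
    return [max(s - tau / (s + eps), 0.0) for s in singular_values]


# Hypothetical singular values of a target matrix, largest first.
sigma = [10.0, 2.0, 0.1]
shrunk = weighted_shrink(sigma, tau=1.0)
```

By contrast, conventional singular value thresholding subtracts the same `tau` from every singular value, which penalizes the dominant components as heavily as the noisy ones.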
12.
Methods ; 229: 71-81, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38909974

ABSTRACT

Identifying miRNA-disease associations (MDAs) is crucial for improving the diagnosis and treatment of various diseases. However, biological experiments can be time-consuming and expensive. To overcome these challenges, computational approaches have been developed, with Graph Convolutional Network (GCN) showing promising results in MDA prediction. The success of GCN-based methods relies on learning a meaningful spatial operator to extract effective node feature representations. To enhance the inference of MDAs, we propose a novel method called PGCNMDA, which employs graph convolutional networks with a learning graph spatial operator from paths. This approach enables the generation of meaningful spatial convolutions from paths in GCN, leading to improved prediction performance. On HMDD v2.0, PGCNMDA obtains a mean AUC of 0.9229 and an AUPRC of 0.9206 under 5-fold cross-validation (5-CV), and a mean AUC of 0.9235 and an AUPRC of 0.9212 under 10-fold cross-validation (10-CV), respectively. Additionally, the AUC of PGCNMDA also reaches 0.9238 under global leave-one-out cross-validation (GLOOCV). On HMDD v3.2, PGCNMDA obtains a mean AUC of 0.9413 and an AUPRC of 0.9417 under 5-CV, and a mean AUC of 0.9419 and an AUPRC of 0.9425 under 10-CV, respectively. Furthermore, the AUC of PGCNMDA also reaches 0.9415 under GLOOCV. The results show that PGCNMDA is superior to other compared methods. In addition, the case studies on pancreatic neoplasms, thyroid neoplasms and leukemia show that 50, 50 and 48 of the top 50 predicted miRNAs linked to these diseases are confirmed, respectively. It further validates the effectiveness and feasibility of PGCNMDA in practical applications.


Subjects
MicroRNAs , Humans , MicroRNAs/genetics , Computational Biology/methods , Neural Networks, Computer , Genetic Predisposition to Disease , Area Under Curve , Pancreatic Neoplasms/genetics , Algorithms
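
The 5-fold and 10-fold cross-validation protocols reported above partition the known samples into k folds and hold out each fold once for testing. A minimal sketch of the index bookkeeping, assuming samples are identified by integer positions; the function name and the fixed seed are illustrative assumptions, and how negative samples are drawn for each fold is left out because the abstract does not describe it.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical example: 10 known associations split into 5 folds.
splits = list(k_fold_indices(10, 5))
```

Setting `k` equal to the number of samples turns the same routine into leave-one-out cross-validation, the limiting case of the GLOOCV protocol mentioned in the abstract.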
13.
BMC Bioinformatics ; 25(1): 187, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741200

ABSTRACT

MOTIVATION: Long non-coding RNAs (lncRNAs) are a class of molecules involved in important biological processes. Extensive efforts have been made to gain a deeper understanding of disease mechanisms at the lncRNA level, guiding the detection of biomarkers for disease diagnosis, treatment, prognosis and prevention. Unfortunately, because of cost and time requirements, the number of disease-related lncRNAs verified by traditional biological experiments is very limited. Computational approaches for predicting disease-lncRNA associations make it possible to identify the most promising candidates to be verified in the laboratory, reducing cost and time. RESULTS: We propose novel approaches for the prediction of lncRNA-disease associations, all sharing the idea of exploring associations among lncRNAs, other intermediate molecules (e.g., miRNAs) and diseases, suitably represented by tripartite graphs. Indeed, while only a few lncRNA-disease associations are known so far, plenty of interactions between lncRNAs and other molecules, as well as associations of the latter with diseases, are available. A first approach presented here, NGH, relies on neighborhood analysis performed on a tripartite graph built upon lncRNAs, miRNAs and diseases. A second approach (CF) relies on collaborative filtering; a third approach (NGH-CF) is obtained by boosting NGH with collaborative filtering. The proposed approaches have been validated on both synthetic and real data, and compared against other methods from the literature. Neighborhood analysis outperforms competitors, and when it is combined with collaborative filtering the prediction accuracy improves further, reaching an AUC of 0.966. AVAILABILITY: Source code and sample datasets are available at: https://github.com/marybonomo/LDAsPredictionApproaches.git.


Subjects
Computational Biology , RNA, Long Noncoding , RNA, Long Noncoding/genetics , Humans , Computational Biology/methods , Algorithms , MicroRNAs/genetics , MicroRNAs/metabolism , Genetic Predisposition to Disease/genetics
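
The tripartite-graph idea above links lncRNAs to diseases through shared intermediate molecules such as miRNAs. The abstract does not give the NGH scoring formula, so the sketch below substitutes a simple stand-in: the Jaccard overlap between the miRNAs that interact with a lncRNA and the miRNAs associated with a disease. All identifiers and the toy graph are hypothetical.

```python
def shared_neighbor_score(lnc_to_mirna, disease_to_mirna, lncrna, disease):
    """Jaccard overlap of the miRNA neighborhoods of a lncRNA and a disease.

    Illustrative stand-in only; this is not the NGH score from the paper.
    """
    lnc_neighbors = lnc_to_mirna.get(lncrna, set())
    disease_neighbors = disease_to_mirna.get(disease, set())
    union = lnc_neighbors | disease_neighbors
    if not union:
        return 0.0
    return len(lnc_neighbors & disease_neighbors) / len(union)


# Hypothetical tripartite graph stored as two bipartite adjacency maps.
lnc_to_mirna = {"lncA": {"m1", "m2", "m3"}, "lncB": {"m4"}}
disease_to_mirna = {"D1": {"m2", "m3", "m5"}}

score_a = shared_neighbor_score(lnc_to_mirna, disease_to_mirna, "lncA", "D1")
score_b = shared_neighbor_score(lnc_to_mirna, disease_to_mirna, "lncB", "D1")
```

Ranking all lncRNAs by such a score for a fixed disease gives a candidate list that needs no known lncRNA-disease associations as input, which is the practical appeal of going through the intermediate layer.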
14.
Anal Biochem ; 692: 115554, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38710353

ABSTRACT

A series of biological experiments has demonstrated that circular RNAs play a crucial regulatory role in cellular processes and may be associated with diseases. Uncovering these connections helps in understanding potential disease mechanisms and advancing the development of treatment strategies. However, traditional biological experiments face limitations in efficiency and cost, especially when enumerating possible associations. To address these limitations, several computational methods have been proposed, but existing methods measure similarity only from a nodal perspective and cannot capture structural similarities between edges. In this study, we introduce an advanced computational method called SATPIC2CD for analyzing potential associations between circular RNAs and diseases. Specifically, we first employ a Structure-Aware Graph Transformer (SAT), which extracts five predefined metapath representations before calculating attention. This adaptive network integrates structural information into the original self-attention by aggregating information within and between paths. Subsequently, we use Path Integral Convolutional Networks (PACN) to integrate feature information for all path weights between two nodes. Afterward, we complement the network node features with feature loss and feature smoothing using Gated Recurrent Units (GRU) and node centrality. Finally, a Multi-Layer Perceptron (MLP) is employed to obtain the final prediction score for each circular RNA-disease pair. SATPIC2CD performs remarkably well, reaching an Area Under the Curve (AUC) of up to 0.9715 in 5-fold cross-validation and surpassing the other models compared. Case studies further emphasize the high precision of our method in identifying circular RNA-disease associations, laying a solid foundation for guiding future biological research efforts.


Subjects
RNA, Circular , RNA, Circular/genetics , Humans , Computational Biology/methods , Neural Networks, Computer , Algorithms
15.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605642

ABSTRACT

MicroRNAs (miRNAs) synergize with various biomolecules in human cells, resulting in diverse functions in regulating a wide range of biological processes. Predicting potential disease-associated miRNAs as valuable biomarkers contributes to the treatment of human diseases. However, few previous methods take a holistic perspective; most concentrate only on isolated miRNA and disease objects, thereby ignoring the multiple relationships at work in human cells. In this work, we first constructed a multi-view graph based on the relationships between miRNAs and various biomolecules, and then used a graph attention neural network to learn the graph topology features of miRNAs and diseases for each view. Next, we added a further attention mechanism and developed a multi-scale feature fusion module, aiming to determine the optimal fusion results for the multi-view topology features of miRNAs and diseases. In addition, prior attribute knowledge of miRNAs and diseases was added to achieve better prediction results and solve the cold start problem. Finally, the learned miRNA and disease representations were concatenated and fed into a multi-layer perceptron for end-to-end training and prediction of potential miRNA-disease associations. To assess the efficacy of our model (called MUSCLE), we performed 5- and 10-fold cross-validation (CV), which yielded average areas under the ROC curve of 0.966 ± 0.0102 and 0.973 ± 0.0135, respectively, outperforming most current state-of-the-art models. We then examined the impact of crucial parameters on prediction performance and performed ablation experiments on the feature combination and model architecture. Furthermore, the case studies on colon cancer, lung cancer and breast cancer also demonstrate the good inductive capability of MUSCLE. Our data and code are freely available at a public GitHub repository: https://github.com/zht-code/MUSCLE.git.


Subjects
Colonic Neoplasms , Lung Neoplasms , MicroRNAs , Humans , Muscles , Learning , MicroRNAs/genetics , Algorithms , Computational Biology
16.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581419

ABSTRACT

Piwi-interacting RNAs (piRNAs) play a crucial role in various biological processes and are implicated in disease. Consequently, there is an escalating demand for computational tools to predict piRNA-disease interactions. Although there have been computational methods proposed for the detection of piRNA-disease associations, the problem of imbalanced and sparse dataset has brought great challenges to capture the complex relationships between piRNAs and diseases. In response to this necessity, we have developed a novel computational architecture, denoted as PUTransGCN, which uses heterogeneous graph convolutional networks to uncover potential piRNA-disease associations. Additionally, the attention mechanism was used to adjust the weight parameters of aggregation heterogeneous node features automatically. For tackling the imbalanced dataset problem, the combined positive unlabelled learning (PUL) method comprising PU bagging, two-step and spy technique was applied to select reliable negative associations. The features of piRNAs and diseases were derived from three distinct biological sources by PUTransGCN, including information on piRNA sequences, semantic terms related to diseases and the existing network of piRNA-disease associations. In the experiment, PUTransGCN performs in 5-fold cross-validation with an AUC of 0.93 and 0.95 on two datasets, respectively, which outperforms the other six state-of-the-art models. We compared three different PUL methods, and the results of the ablation experiment indicate that the combined PUL method yields the best results. The PUTransGCN could serve as a valuable piRNA-disease prediction tool for upcoming studies in the biomedical field. The code for PUTransGCN is available at https://github.com/chenqiuhao/PUTransGCN.


Subjects
Piwi-Interacting RNA
17.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38517693

ABSTRACT

Numerous investigations increasingly indicate the significance of microRNA (miRNA) in human diseases. Hence, uncovering associations between miRNAs and diseases can contribute to precise diagnosis and effective treatment of medical conditions. The detection of miRNA-disease linkages via computational techniques that use biological information has emerged as a cost-effective and highly efficient approach. Here, we introduce a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA). This method constructs a homogeneous graph convolutional network with a regular graph structure (ReHoGCN) encompassing the disease similarity network, the miRNA similarity network and the known MDA network, and was tested on four experimental tasks. A random edge sampler strategy was used to speed up processing and reduce training complexity. Experimental results demonstrate that the proposed ReHoGCNES-MDA method outperforms both the homogeneous graph convolutional network and the heterogeneous graph convolutional network with non-regular graph structure in all four tasks, which implicitly reveals that a steady degree distribution of the graph plays an important role in enhancing model performance. In addition, ReHoGCNES-MDA is superior to several machine learning algorithms and state-of-the-art methods on MDA prediction. Furthermore, three case studies were conducted to further demonstrate the predictive ability of ReHoGCNES. Consequently, 93.3% (breast neoplasms), 90% (prostate neoplasms) and 93.3% (prostate neoplasms) of the top 30 forecasted miRNAs were validated by public databases. Hence, ReHoGCNES-MDA might serve as a dependable and beneficial model for predicting possible MDAs.


Subjects
MicroRNAs , Prostatic Neoplasms , Humans , Male , Algorithms , Computational Biology/methods , Databases, Genetic , MicroRNAs/genetics , Prospective Studies , Prostatic Neoplasms/genetics , Female
18.
BMC Bioinformatics ; 25(1): 139, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553698

ABSTRACT

BACKGROUND: MicroRNA (miRNA) has been shown to play a key role in the occurrence and progression of diseases, making uncovering miRNA-disease associations vital for disease prevention and therapy. However, traditional laboratory methods for detecting these associations are slow, strenuous, expensive, and uncertain. Although numerous advanced algorithms have emerged, it is still a challenge to develop more effective methods to explore underlying miRNA-disease associations. RESULTS: In the study, we designed a novel approach on the basis of deep autoencoder and combined feature representation (DAE-CFR) to predict possible miRNA-disease associations. We began by creating integrated similarity matrices of miRNAs and diseases, performing a logistic function transformation, balancing positive and negative samples with k-means clustering, and constructing training samples. Then, deep autoencoder was used to extract low-dimensional feature from two kinds of feature representations for miRNAs and diseases, namely, original association information-based and similarity information-based. Next, we combined the resulting features for each miRNA-disease pair and used a logistic regression (LR) classifier to infer all unknown miRNA-disease interactions. Under five and tenfold cross-validation (CV) frameworks, DAE-CFR not only outperformed six popular algorithms and nine classifiers, but also demonstrated superior performance on an additional dataset. Furthermore, case studies on three diseases (myocardial infarction, hypertension and stroke) confirmed the validity of DAE-CFR in practice. CONCLUSIONS: DAE-CFR achieved outstanding performance in predicting miRNA-disease associations and can provide evidence to inform biological experiments and clinical therapy.


Subjects
MicroRNAs , Humans , MicroRNAs/genetics , Computational Biology/methods , Algorithms , Genetic Predisposition to Disease
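
The sample-balancing step mentioned in the DAE-CFR abstract above (k-means clustering to balance positive and negative samples) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not give the number of clusters, the features used, or the selection rule, so the function names `kmeans` and `balanced_negatives`, the parameter `k`, and the round-robin selection are all assumptions made here.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def balanced_negatives(unlabeled, n_positive, k=3, seed=0):
    """Pick up to n_positive unlabeled samples, spread evenly over k clusters.

    Assumption: unlabeled miRNA-disease pairs are treated as candidate
    negatives and sampled round-robin from each cluster.
    """
    labels = kmeans(unlabeled, k, seed=seed)
    rng = random.Random(seed)
    clusters = [[i for i, l in enumerate(labels) if l == c] for c in range(k)]
    for members in clusters:
        rng.shuffle(members)
    chosen = []
    # Round-robin so that each cluster contributes roughly equally.
    while len(chosen) < n_positive and any(clusters):
        for members in clusters:
            if members and len(chosen) < n_positive:
                chosen.append(members.pop())
    return sorted(chosen)
```

Calling `balanced_negatives(features, n_positive=len(known_pairs))` returns indices of unlabeled pairs to use as negatives, so that the training set has as many negatives as positives; `features` and `known_pairs` are hypothetical names for the caller's data.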
19.
Hum Genomics ; 18(1): 31, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38523305

ABSTRACT

PURPOSE: Coding mutations in the Transthyretin (TTR) gene cause a hereditary form of amyloidosis characterized by a complex genotype-phenotype correlation with limited information regarding differences among worldwide populations. METHODS: We compared 676 diverse individuals carrying TTR amyloidogenic mutations (rs138065384, Phe44Leu; rs730881165, Ala81Thr; rs121918074, His90Asn; rs76992529, Val122Ile) to 12,430 non-carriers matched by age, sex, and genetically-inferred ancestry to assess their clinical presentations across 1,693 outcomes derived from electronic health records in UK Biobank. RESULTS: In individuals of African descent (AFR), the Val122Ile mutation was linked to multiple outcomes related to the circulatory system (fold-enrichment = 2.96, p = 0.002), with the strongest associations being cardiac congenital anomalies (phecode 747.1, p = 0.003), endocarditis (phecode 420.3, p = 0.006), and cardiomyopathy (phecode 425, p = 0.007). In individuals of Central-South Asian descent (CSA), the His90Asn mutation was associated with dermatologic outcomes (fold-enrichment = 28, p = 0.001). The same TTR mutation was linked to neoplasms in European-descent individuals (EUR, fold-enrichment = 3.09, p = 0.003). In EUR, Ala81Thr showed multiple associations with respiratory outcomes (fold-enrichment = 3.61, p = 0.002), but the strongest association was with atrioventricular block (phecode 426.2, p = 2.81 × 10⁻⁴). Additionally, the same mutation in East Asians (EAS) showed associations with endocrine-metabolic traits (fold-enrichment = 4.47, p = 0.003). In the cross-ancestry meta-analysis, the Val122Ile mutation was associated with peripheral nerve disorders (phecode 351, p = 0.004) in addition to cardiac congenital anomalies (fold-enrichment = 6.94, p = 0.003). CONCLUSIONS: Overall, these findings highlight that TTR amyloidogenic mutations present ancestry-specific and ancestry-convergent associations related to a range of health domains. This supports the need to increase awareness regarding the range of outcomes associated with TTR mutations across worldwide populations to reduce misdiagnosis and delayed diagnosis of TTR-related amyloidosis.


Subjects
Amyloidosis , Prealbumin , Humans , Prealbumin/genetics , Mutation , Amyloidosis/diagnosis , Amyloidosis/genetics , Phenotype , Genetics, Population
20.
Interdiscip Sci ; 16(2): 345-360, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38436840

ABSTRACT

Computational approaches for predicting potential microbe-disease associations often rely on similarity information between microbes and diseases. Therefore, it is important to obtain reliable similarity information by integrating multiple types of similarity information. However, existing similarity fusion methods do not consider multi-order fusion of similarity networks. To address this problem, a novel method of linear neighborhood label propagation with multi-order similarity fusion learning (MOSFL-LNP) is proposed to predict potential microbe-disease associations. Multi-order fusion learning comprises two parts: low-order global learning and high-order feature learning. Low-order global learning is used to obtain common latent features from multiple similarity sources. High-order feature learning relies on the interactions between neighboring nodes to identify high-order similarities and learn deeper interactive network structures. Coefficients are assigned to different high-order feature learning modules to balance the similarities learned from different orders and enhance the robustness of the fusion network. Overall, by combining low-order global learning with high-order feature learning, multi-order fusion learning can capture both the shared and unique features of different similarity networks, leading to more accurate predictions of microbe-disease associations. In comparison to six other advanced methods, MOSFL-LNP exhibits superior prediction performance in the leave-one-out cross-validation and 5-fold cross-validation frameworks. In the case study, the 10 microbes predicted to be associated with asthma and type 1 diabetes have accuracy rates of up to 90% and 100%, respectively.


Subjects
Algorithms , Humans , Computational Biology/methods , Machine Learning
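
The label propagation idea behind MOSFL-LNP can be illustrated with a generic sketch. It is not the method from the abstract: the linear neighborhood weights and the multi-order similarity fusion are replaced here by a plain row-normalized similarity matrix, and the function names, the damping factor `alpha`, and the iteration count are assumptions chosen for illustration. The update rule is the standard F ← αWF + (1 − α)Y, where Y holds the known associations.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def label_propagation(sim, labels, alpha=0.5, iters=100):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y.

    sim    : n x n similarity matrix between microbes (stand-in for the
             fused similarity network; an assumption of this sketch)
    labels : n x m matrix Y of known microbe-disease associations (0 or 1)
    Returns an n x m matrix of association scores.
    """
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    f = [row[:] for row in labels]
    for _ in range(iters):
        f = [
            [
                alpha * sum(w[i][k] * f[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return f


# Three microbes, one disease: microbe 0 has a known association,
# microbe 1 is very similar to microbe 0, microbe 2 is not.
sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.1], [0.1, 0.1, 1.0]]
known = [[1.0], [0.0], [0.0]]
scores = label_propagation(sim, known)
```

In this toy example microbe 1 receives a higher score than microbe 2 because it is more similar to the microbe with the known association, which is the behavior that similarity-based propagation relies on.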