Results 1 - 20 of 38
1.
BMC Genomics ; 25(1): 175, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38350848

ABSTRACT

BACKGROUND: Brain diseases pose a significant threat to human health, and various network-based methods have been proposed for identifying gene biomarkers associated with these diseases. However, the brain is a complex system, and extracting topological semantics from different brain networks is necessary yet challenging for identifying pathogenic genes for brain diseases. RESULTS: In this study, we present a multi-network representation learning framework called M-GBBD for the identification of gene biomarkers in brain diseases. Specifically, we collected multi-omics data to construct eleven networks from different perspectives. M-GBBD extracts the spatial distributions of features from these networks and iteratively optimizes them using Kullback-Leibler divergence to fuse the networks into a common semantic space that represents the gene network for the brain. Subsequently, a graph consisting of both gene and large-scale disease proximity networks learns representations through graph convolution techniques and predicts whether a gene is associated with brain diseases while providing association scores. Experimental results demonstrate that M-GBBD outperforms several baseline methods. Furthermore, our analysis, supported by bioinformatics, revealed CAMP as a gene significantly associated with Alzheimer's disease identified by M-GBBD. CONCLUSION: Collectively, M-GBBD provides valuable insights into identifying gene biomarkers for brain diseases and serves as a promising framework for brain network representation learning.
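
The abstract names Kullback-Leibler divergence as the quantity optimized during network fusion but does not state the objective. The following is a minimal sketch of the divergence itself, not of M-GBBD's loss; it assumes discrete distributions stored as Python lists, and the function name `kl_divergence` and the `eps` guard are illustrative choices.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total

# Two toy feature distributions (hypothetical values).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q))
print(kl_divergence(q, p))
```

The two printed values differ because the divergence is not symmetric, which is why the direction of comparison matters when distributions from different networks are aligned.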


Subject(s)
Alzheimer Disease , Semantics , Humans , Brain/diagnostic imaging , Alzheimer Disease/genetics , Genetic Markers , Learning
2.
IEEE J Biomed Health Inform ; 28(3): 1742-1751, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38127594

ABSTRACT

A growing number of studies reveal that circular RNAs (circRNAs) are broadly engaged in the physiological processes of cell proliferation, differentiation, aging, and apoptosis, and are closely associated with the pathogenesis of numerous diseases. Clarifying the correlation between diseases and circRNAs is of great clinical importance for providing new therapeutic strategies for complex diseases. However, previous circRNA-disease association (CDA) prediction methods rely excessively on the graph network, and model performance is dramatically reduced when noisy connections occur in the graph structure. To address this problem, this paper proposes an unsupervised deep graph structure learning method, GSLCDA, to predict potential CDAs. Concretely, we first integrate circRNA and disease multi-source data to constitute the CDA heterogeneous network. Then the network topology is learned using the graph structure, and the original graph is enhanced in an unsupervised manner by maximizing the inter information of the learned and original graphs to uncover their essential features. Finally, a graph space sensitive k-nearest neighbor (KNN) algorithm is employed to search for latent CDAs. On the benchmark dataset, GSLCDA obtained 92.67% accuracy with 0.9279 AUC. GSLCDA also exhibits exceptional performance on independent datasets. Furthermore, 14, 12 and 14 of the top 16 circRNAs with the highest GSLCDA prediction scores were confirmed in the relevant literature in the breast cancer, colorectal cancer and lung cancer case studies, respectively. These results demonstrate that GSLCDA can validly reveal underlying CDAs and offer new perspectives for the diagnosis and therapy of complex human diseases.


Subject(s)
Breast Neoplasms , Lung Neoplasms , Humans , Female , RNA, Circular/genetics , Breast Neoplasms/genetics , Algorithms , Aging , Computational Biology/methods
3.
IEEE J Biomed Health Inform ; 28(3): 1752-1761, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38145538

ABSTRACT

A growing body of evidence establishes that circular RNAs (circRNAs) are widespread in eukaryotic cells and contribute significantly to the occurrence and development of many complex human diseases. Disease-associated circRNAs can serve as clinical diagnostic biomarkers and therapeutic targets, providing novel ideas for biopharmaceutical research. However, available computational methods for predicting circRNA-disease associations (CDAs) do not sufficiently consider the contextual information of biological network nodes, which limits their performance. In this work, we propose a multi-hop attention graph neural network-based approach, MAGCDA, to infer potential CDAs. Specifically, we first construct a multi-source attribute heterogeneous network of circRNAs and diseases, then use a multi-hop strategy of graph nodes to deeply aggregate node context information through attention diffusion, thus enhancing topological structure information and mining hidden features of the data, and finally use random forest to accurately infer potential CDAs. On the four gold standard data sets, MAGCDA achieved prediction accuracy of 92.58%, 91.42%, 83.46% and 91.12%, respectively. MAGCDA also performed prominently in ablation experiments and in comparisons with other models. Additionally, 18 and 17 of the 20 circRNAs with the highest MAGCDA prediction scores were confirmed in case studies of the complex diseases breast cancer and Alzheimer's disease, respectively. These results suggest that MAGCDA can be a practical tool to explore potential disease-associated circRNAs and provide a theoretical basis for disease diagnosis and treatment.


Subject(s)
Breast Neoplasms , RNA, Circular , Humans , Female , RNA, Circular/genetics , Neural Networks, Computer , Biomarkers , Computational Biology/methods
4.
ACS Omega ; 8(30): 27386-27397, 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37546619

ABSTRACT

Computationally identifying noncoding RNA (ncRNA)-drug resistance associations would have a marked effect on understanding ncRNA molecular function and drug target mechanisms and on reducing the screening cost of the corresponding wet-lab experiments. Although graph neural network-based methods have been developed and have facilitated the detection of ncRNAs related to drug resistance, building a highly trustworthy ncRNA-drug resistance association prediction framework remains a challenge because of inevitable noisy edges originating from batch effects and experimental errors. Herein, we propose a framework, referred to as RDRGSE (RDR association prediction by using graph skeleton extraction and attentional feature fusion), for detecting ncRNA-drug resistance associations. Specifically, starting with the construction of the original ncRNA-drug resistance associations as a bipartite graph, RDRGSE uses a bi-view skeleton extraction strategy to obtain two types of skeleton views, followed by a graph neural network-based estimator that iteratively optimizes the skeleton views to jointly learn high-quality ncRNA-drug resistance edge embeddings and the optimal graph skeleton structure. Then, RDRGSE adopts adaptive attentional feature fusion to obtain the final edge embedding and identifies potential RDRAs in an end-to-end manner. Comprehensive experiments were conducted, and the results indicated the significant advantage of a skeleton structure for ncRNA-drug resistance association discovery. Compared with state-of-the-art approaches, RDRGSE improved prediction performance by 6.7% in terms of AUC and 6.1% in terms of AUPR. Also, ablation-like analysis and independent case studies corroborated RDRGSE's generalization ability and robustness. Overall, RDRGSE provides a powerful computational method for ncRNA-drug resistance association prediction, which can also serve as a screening tool for drug resistance biomarkers.

5.
Front Genet ; 14: 1084482, 2023.
Article in English | MEDLINE | ID: mdl-37274787

ABSTRACT

Identification of long non-coding RNAs (lncRNAs) associated with common diseases is crucial for patient self-diagnosis and monitoring of health conditions using artificial intelligence (AI) technology at home. LncRNAs have gained significant attention due to their crucial roles in the pathogenesis of complex human diseases, and identifying their associations with diseases can aid in developing diagnostic biomarkers at the molecular level. Computational methods for predicting lncRNA-disease associations (LDAs) have become necessary due to the time-consuming and labor-intensive nature of wet biological experiments in hospitals, enabling patients to access LDAs through their AI terminal devices at any time. Here, we have developed a predictive tool, LDAGRL, for identifying potential LDAs using a bridge heterogeneous information network (BHnet) constructed via Structural Deep Network Embedding (SDNE). The BHnet consists of three types of molecules as bridge nodes to implicitly link the lncRNA nodes with disease nodes, and the SDNE is used to learn high-quality node representations and make LDA predictions in a unified graph space. To assess the feasibility and performance of LDAGRL, extensive experiments were conducted, including 5-fold cross-validation, comparison with state-of-the-art methods, comparison of different classifiers and comparison of different node feature combinations. The results showed that LDAGRL achieved satisfactory prediction performance, indicating its potential as an effective LDA prediction tool for family medicine and primary care.

6.
BMC Bioinformatics ; 24(1): 188, 2023 May 08.
Article in English | MEDLINE | ID: mdl-37158823

ABSTRACT

BACKGROUND: The limited knowledge of miRNA-lncRNA interactions is considered an obstacle to revealing the regulatory mechanism. Accumulating evidence on human diseases indicates that the modulation of gene expression is closely related to the interactions between miRNAs and lncRNAs. However, validating such interactions via crosslinking-immunoprecipitation and high-throughput sequencing (CLIP-seq) experiments inevitably costs too much money and time and yields unsatisfactory results. Therefore, more and more computational prediction tools have been developed to offer reliable candidates for a better design of further bio-experiments. METHODS: In this work, we proposed a novel link prediction model based on a Gaussian kernel-based method and a linear optimization algorithm for inferring miRNA-lncRNA interactions (GKLOMLI). Given an observed miRNA-lncRNA interaction network, the Gaussian kernel-based method was employed to output two similarity matrices, one for miRNAs and one for lncRNAs. Based on the integrated matrix combining the similarity matrices and the observed interaction network, a linear optimization-based link prediction model was trained for inferring miRNA-lncRNA interactions. RESULTS: To evaluate the performance of our proposed method, k-fold cross-validation (CV) and leave-one-out CV were implemented, in which each CV experiment was carried out 100 times on a training set generated randomly. The high areas under the curve (AUCs) of 0.8623 ± 0.0027 (2-fold CV), 0.9053 ± 0.0017 (5-fold CV), 0.9151 ± 0.0013 (10-fold CV), and 0.9236 (LOO-CV) illustrated the precision and reliability of our proposed method. CONCLUSION: GKLOMLI, with its high performance, is anticipated to be used to reveal underlying interactions between miRNAs and their target lncRNAs and to decipher the potential mechanisms of complex diseases.
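
The METHODS passage derives two similarity matrices from the observed interaction network with a Gaussian kernel. One common way to do this is the Gaussian interaction profile (GIP) kernel; whether GKLOMLI uses exactly this bandwidth normalization is an assumption, and the function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are illustrative.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile similarity between the rows of a 0/1 matrix."""
    n = len(profiles)
    # Normalize the kernel bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim

# Toy interaction matrix: rows are 3 hypothetical miRNAs, columns are 4 hypothetical lncRNAs.
A = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
mirna_sim = gip_kernel(A)
# Transpose the matrix to obtain the lncRNA-side profiles.
lncrna_sim = gip_kernel([list(col) for col in zip(*A)])
```

In this toy case the first two miRNAs share an identical profile and therefore receive similarity 1, while the third, which shares no partner with them, receives a much lower score.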


Subject(s)
MicroRNAs , RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Reproducibility of Results , Research Design , Algorithms , MicroRNAs/genetics
7.
J Vasc Surg Venous Lymphat Disord ; 11(4): 774-782.e1, 2023 07.
Article in English | MEDLINE | ID: mdl-37028512

ABSTRACT

OBJECTIVE: Obesity is highly prevalent and a major risk factor for deep vein thrombosis (DVT) and chronic venous disease. It can also technically limit duplex ultrasound evaluations for lower extremity DVT. We compared the rates and results of repeat lower extremity venous duplex ultrasound (LEVDUS) after an initial incomplete and negative (IIN) LEVDUS in overweight (body mass index [BMI] 25-30 kg/m2) and obese (BMI ≥30 kg/m2) patients with those of patients with a BMI <25 kg/m2 to evaluate whether increasing the rate of follow-up examinations in overweight and obese patients might facilitate improved patient care. METHODS: We performed a retrospective review of 617 patients with an IIN LEVDUS study from December 31, 2017 to December 31, 2020. Demographic and imaging data of the patients with an IIN LEVDUS and the frequency of repeat studies performed within 2 weeks were abstracted from the electronic medical records. The patients were divided into three BMI-based groups: normal (BMI <25 kg/m2), overweight (BMI 25-30 kg/m2), and obese (BMI ≥30 kg/m2). RESULTS: Of the 617 patients with an IIN LEVDUS, 213 (34.5%) were normal weight, 177 (29%) were overweight, and 227 (37%) were obese. The repeat LEVDUS rates were significantly different across the three weight groups (P < .001). After an IIN LEVDUS, the rate of repeat LEVDUS for the normal weight, overweight, and obese groups was 46% (98 of 213), 28% (50 of 177), and 32% (73 of 227), respectively. The overall rates of thrombosis (both DVT and superficial vein thrombosis) in the repeat LEVDUS examinations were not significantly different among the normal weight (14%), overweight (11%), and obese (18%) patients (P = .431). CONCLUSIONS: Overweight and obese patients (BMI ≥25 kg/m2) received fewer follow-up examinations after an IIN LEVDUS. Follow-up LEVDUS examinations of overweight and obese patients after an IIN LEVDUS study have similar rates of venous thrombosis compared with normal weight patients. Targeting improved use of follow-up LEVDUS studies through quality improvement efforts for all patients with an IIN LEVDUS, but especially for those who are overweight or obese, could help minimize missed diagnoses of venous thrombosis and improve the quality of patient care.
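
The percentages in RESULTS follow from the reported counts. As a quick check, the sketch below recomputes them, assuming each group's repeat rate is taken over that group's own size (213, 177, and 227 patients); the names `groups` and `summarize` are illustrative.

```python
# Counts as reported in the abstract: (patients in group, repeat studies performed).
groups = {
    "normal weight": (213, 98),
    "overweight": (177, 50),
    "obese": (227, 73),
}

def summarize(groups):
    """Return {group: (share of cohort in %, repeat rate in %)}."""
    total = sum(n for n, _ in groups.values())
    return {
        name: (100 * n / total, 100 * repeats / n)
        for name, (n, repeats) in groups.items()
    }

for name, (share, repeat_rate) in summarize(groups).items():
    print(f"{name}: {share:.1f}% of cohort, repeat rate {repeat_rate:.0f}%")
```

The group sizes sum to the 617 patients in the cohort, and the rounded repeat rates come out as 46%, 28%, and 32%.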


Subject(s)
Thrombosis , Venous Thrombosis , Humans , Body Mass Index , Overweight/complications , Overweight/diagnostic imaging , Follow-Up Studies , Venous Thrombosis/diagnostic imaging , Venous Thrombosis/therapy , Obesity/complications , Retrospective Studies
8.
IEEE J Biomed Health Inform ; 27(1): 573-582, 2023 01.
Article in English | MEDLINE | ID: mdl-36301791

ABSTRACT

Identifying protein targets for drugs establishes an indispensable knowledge foundation for drug repurposing and drug development. Though expensive and time-consuming, in vitro trials are widely employed to discover drug targets, and the existing computational algorithms still cannot satisfy the demands of real applications in drug R&D with regard to prediction accuracy and efficiency, both of which urgently need improvement. To this end, we propose the PPAEDTI model, which uses the graph personalized propagation technique to predict drug-target interactions from the known interaction network. To evaluate prediction performance, six benchmark datasets were used for testing, and several state-of-the-art methods were compared. Using 5-fold cross-validation, the proposed PPAEDTI model achieves average AUCs>90% on 5 collected datasets. We also manually checked the top-20 prediction lists for 2 proteins (hsa:775 and hsa:779) and one drug (D00618), and successfully confirmed 18, 17, and 20 items from the public datasets, respectively. The experimental results indicate that, given known drug-target interactions, the PPAEDTI model can provide accurate predictions for new ones, and it is anticipated to serve as a useful tool for pharmacology research. Using the proposed model trained with the collected datasets, we have built a computational platform that is accessible at http://120.77.11.78/PPAEDTI/, and the corresponding code and datasets are also released.
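
The abstract does not spell out the propagation rule behind "graph personalized propagation". Personalized PageRank is one standard form of personalized propagation, so the sketch below shows it as an assumption rather than as PPAEDTI's actual layer; the node names, `alpha`, and the function `personalized_pagerank` are hypothetical.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power iteration for personalized PageRank on an undirected graph.

    adj  : dict mapping node -> list of neighbours (every node must be a key)
    seed : node that receives the restart (teleport) mass
    alpha: restart probability
    """
    nodes = list(adj)
    scores = {v: (1.0 if v == seed else 0.0) for v in nodes}
    for _ in range(iters):
        nxt = {v: (alpha if v == seed else 0.0) for v in nodes}
        for v in nodes:
            if not adj[v]:
                # Dangling node: return its mass to the seed.
                nxt[seed] += (1 - alpha) * scores[v]
                continue
            share = (1 - alpha) * scores[v] / len(adj[v])
            for u in adj[v]:
                nxt[u] += share
        scores = nxt
    return scores

# Toy bipartite drug-target graph with hypothetical identifiers.
graph = {
    "drug_A": ["target_1", "target_2"],
    "drug_B": ["target_2", "target_3"],
    "target_1": ["drug_A"],
    "target_2": ["drug_A", "drug_B"],
    "target_3": ["drug_B"],
}
scores = personalized_pagerank(graph, seed="drug_A")
```

Here `target_3` has no known link to `drug_A` yet still receives a nonzero score through the shared `target_2`, which is the kind of signal a link predictor can rank.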


Subject(s)
Algorithms , Drug Repositioning , Humans , Drug Interactions , Area Under Curve , Proteins/metabolism
9.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 2610-2618, 2023.
Article in English | MEDLINE | ID: mdl-35675235

ABSTRACT

Accumulating evidence shows that circular RNAs (circRNAs) play an important role in regulating gene expression and are involved in many complex human diseases. Identifying associations of circRNAs with diseases helps to understand the pathogenesis, treatment and diagnosis of complex diseases. Since inferring circRNA-disease associations by biological experiments is costly and time-consuming, there is an urgent need to develop a computational model to identify the associations between them. In this paper, we propose a novel method named KNN-NMF, which combines K nearest neighbors with nonnegative matrix factorization to infer associations between circRNAs and diseases. First, we compute the Gaussian Interaction Profile (GIP) kernel similarity of circRNAs and diseases and the semantic similarity of diseases. Then, new circRNA-disease interaction profiles are established using weighted K nearest neighbors to reduce the impact of false negative associations on prediction performance. Finally, nonnegative matrix factorization is implemented to predict associations of circRNAs with diseases. The experimental results indicate that KNN-NMF outperforms the competing methods under five-fold cross-validation. Moreover, case studies of two common diseases further show that KNN-NMF can identify potential circRNA-disease associations effectively.
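
The final step factorizes the association matrix into nonnegative factors whose product scores unobserved pairs. A minimal sketch of plain nonnegative matrix factorization with the classic multiplicative updates is given below; it assumes the unregularized Frobenius objective, whereas KNN-NMF's actual objective may carry extra terms, and all names and the toy matrix are illustrative.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
    return w, h

# Toy circRNA-by-disease association matrix (hypothetical values).
V = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
W, H = nmf(V, rank=2)
scores = matmul(W, H)  # entries at positions where V is 0 act as prediction scores
```

Because the updates only multiply by nonnegative ratios, `W` and `H` stay nonnegative throughout, which is what makes the factors interpretable as additive parts.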

10.
J Transl Med ; 20(1): 552, 2022 12 03.
Article in English | MEDLINE | ID: mdl-36463215

ABSTRACT

BACKGROUND: Associations of drugs with diseases provide important information for expediting drug development. The number of known drug-disease associations is still insufficient, and inferring associations through traditional in vitro experiments is time-consuming and costly. Therefore, more accurate and reliable computational methods urgently need to be developed to predict potential associations of drugs with diseases. METHODS: In this study, we present a model called weighted graph regularized collaborative non-negative matrix factorization for drug-disease association prediction (WNMFDDA). More specifically, we first calculated the drug similarity and disease similarity based on the chemical structures of drugs and the medical description information of diseases, respectively. Then, to extend the model to work for new drugs and diseases, weighted [Formula: see text] nearest neighbor was used as a preprocessing step to reconstruct the interaction score profiles of drugs with diseases. Finally, a graph regularized non-negative matrix factorization model was used to identify potential associations between drugs and diseases. RESULTS: WNMFDDA achieved AUC values of 0.939 and 0.952 on Fdataset and Cdataset, respectively, under ten-fold cross-validation, outperforming other competing prediction methods. Moreover, case studies for several drugs and diseases were carried out to further verify the predictive performance of WNMFDDA. As a result, 13 (doxorubicin), 13 (amiodarone), 12 (obesity) and 12 (asthma) of the top 15 corresponding candidate diseases or drugs were confirmed by existing databases. CONCLUSIONS: The experimental results adequately demonstrated that WNMFDDA is a very effective method for drug-disease association prediction. We believe that WNMFDDA is helpful for relevant biomedical researchers in follow-up studies.
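
The preprocessing step rebuilds interaction profiles from similar neighbours so that drugs or diseases without known associations still carry a signal. The sketch below is a simplified, one-sided version of a commonly used weighted nearest-neighbour recipe (geometric decay by rank, normalization by summed similarity, elementwise maximum with the original matrix); these specifics, the parameters `k` and `decay`, and the toy data are assumptions, not details taken from WNMFDDA.

```python
def weighted_knn_profiles(y, sim, k=2, decay=0.5):
    """Fill in the row profiles of y from the k most similar other rows.

    y   : association matrix (rows = drugs, columns = diseases), entries in [0, 1]
    sim : square row-similarity matrix aligned with the rows of y
    """
    n = len(y)
    out = []
    for i in range(n):
        # Rank the other rows by similarity to row i, most similar first.
        neighbours = sorted(
            (j for j in range(n) if j != i), key=lambda j: sim[i][j], reverse=True
        )[:k]
        norm = sum(sim[i][j] for j in neighbours)
        est = [0.0] * len(y[0])
        if norm > 0:
            for rank, j in enumerate(neighbours):
                weight = (decay ** rank) * sim[i][j]  # closer neighbours count more
                for c, val in enumerate(y[j]):
                    est[c] += weight * val
            est = [e / norm for e in est]
        # Keep known associations; only raise entries that were lower.
        out.append([max(a, b) for a, b in zip(y[i], est)])
    return out

# Row 1 is a "new" drug with no known associations (hypothetical data).
Y = [[1, 0, 1], [0, 0, 0], [0, 1, 0]]
S = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.3], [0.1, 0.3, 1.0]]
Y_filled = weighted_knn_profiles(Y, S)
```

After this step the empty row inherits most of its profile from its closest neighbour (row 0), which gives the subsequent factorization something to work with for previously unseen drugs.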


Subject(s)
Algorithms , Asthma , Humans , Cluster Analysis , Databases, Factual , Research Design
11.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36070867

ABSTRACT

Circular RNAs (circRNAs) are involved in the regulatory mechanisms of multiple complex diseases, and the identification of their associations is critical to the diagnosis and treatment of diseases. In recent years, many computational methods have been designed to predict circRNA-disease associations. However, most of the existing methods rely on single correlation data. Here, we propose a machine learning framework for circRNA-disease association prediction, called MLCDA, which effectively fuses multiple sources of heterogeneous information including circRNA sequences and disease ontology. Comprehensive evaluation in the gold standard dataset showed that MLCDA can successfully capture the complex relationships between circRNAs and diseases and accurately predict their potential associations. In addition, the results of case studies on real data show that MLCDA significantly outperforms other existing methods. MLCDA can serve as a useful tool for circRNA-disease association prediction, providing mechanistic insights for disease research and thus facilitating the progress of disease treatment.


Subject(s)
Machine Learning , RNA, Circular , Computational Biology/methods
12.
IEEE J Biomed Health Inform ; 26(10): 5075-5084, 2022 10.
Article in English | MEDLINE | ID: mdl-35976848

ABSTRACT

Increasing evidence suggests that circRNA, as one of the most promising emerging biomarkers, has a very close relationship with diseases. Exploring the relationship between circRNAs and diseases can provide a novel perspective on disease diagnosis and pathogenesis. The existing circRNA-disease association (CDA) prediction models, however, generally treat the data attributes equally, do not pay special attention to the attributes with more significant influence, and do not make full use of the correlation and symbiosis between attributes to dig into the latent semantic information of the data. Therefore, in response to the above problems, this paper proposes a natural semantic enhancement method, NSECDA, to predict CDAs. In practical terms, we first treat the circRNA sequence as a biological language and analyze its natural semantic properties through natural language understanding theory; we then integrate it with disease attributes and with the Gaussian Interaction Profile (GIP) kernel attributes of circRNAs and diseases, and use a Graph Attention Network (GAT) to focus on the influential attributes so as to mine deeply hidden features; finally, the Rotation Forest (RoF) classifier is used to accurately determine CDAs. On the gold standard data set CircR2Disease, NSECDA achieved 92.49% accuracy with a 0.9225 AUC score. In comparison with the non-natural semantic enhancement model and other classifier models, NSECDA also shows competitive performance. Additionally, 25 of the CDA pairs with unknown associations in the top 30 prediction scores of NSECDA have been confirmed by newly reported studies. These achievements suggest that NSECDA is an effective model for predicting CDAs, which can provide credible candidates for subsequent wet experiments, thus significantly reducing the scope of investigations.


Subject(s)
RNA, Circular , Semantics , Algorithms , Computational Biology/methods , Humans , RNA, Circular/genetics
13.
Biomedicines ; 10(7)2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35884848

ABSTRACT

Protein is the basic organic substance that constitutes the cell and is the material condition for the life activity and the guarantee of the biological function activity. Elucidating the interactions and functions of proteins is a central task in exploring the mysteries of life. As an important protein interaction, self-interacting protein (SIP) has a critical role. The fast growth of high-throughput experimental techniques among biomolecules has led to a massive influx of available SIP data. How to conduct scientific research using the massive amount of SIP data has become a new challenge that is being faced in related research fields such as biology and medicine. In this work, we design an SIP prediction method SIPGCN using a deep learning graph convolutional network (GCN) based on protein sequences. First, protein sequences are characterized using a position-specific scoring matrix, which is able to describe the biological evolutionary message, then their hidden features are extracted by the deep learning method GCN, and, finally, the random forest is utilized to predict whether there are interrelationships between proteins. In the cross-validation experiment, SIPGCN achieved 93.65% accuracy and 99.64% specificity in the human data set. SIPGCN achieved 90.69% and 99.08% of these two indicators in the yeast data set, respectively. Compared with other feature models and previous methods, SIPGCN showed excellent results. These outcomes suggest that SIPGCN may be a suitable instrument for predicting SIP and may be a reliable candidate for future wet experiments.

14.
Biology (Basel) ; 11(5)2022 May 13.
Article in English | MEDLINE | ID: mdl-35625468

ABSTRACT

The key to new drug discovery and development is first and foremost the search for molecular targets of drugs, which advances drug discovery and drug repositioning. However, traditional identification of drug-target interactions (DTIs) is a costly, lengthy, high-risk, and low-success-rate systems project. Therefore, more and more pharmaceutical companies are trying to use computational technologies to screen existing drug molecules and mine new drugs, thereby accelerating new drug development. In the current study, we designed a deep learning computational model, MSPEDTI, based on Molecular Structure and Protein Evolutionary information to predict potential DTIs. The model first fuses protein evolutionary information and drug structure information, then uses a deep learning convolutional neural network (CNN) to mine its hidden features, and finally accurately predicts the associated DTIs with an extreme learning machine (ELM). In cross-validation experiments, MSPEDTI achieved 94.19%, 90.95%, 87.95%, and 86.11% prediction accuracy on the gold-standard datasets for enzymes, ion channels, G-protein-coupled receptors (GPCRs), and nuclear receptors, respectively. MSPEDTI was competitive in ablation experiments and in comparison with previous excellent methods. Additionally, 7 of 10 potential DTIs predicted by MSPEDTI were substantiated by the classical database. These excellent outcomes demonstrate the ability of MSPEDTI to provide reliable drug candidate targets and strongly facilitate drug repositioning and drug development.

15.
Biology (Basel) ; 11(5)2022 May 13.
Article in English | MEDLINE | ID: mdl-35625469

ABSTRACT

As the basis for screening drug candidates, the identification of drug-target interactions (DTIs) plays a crucial role in innovative drug research. However, due to the inherent constraints of small-scale and time-consuming wet experiments, DTI recognition is usually difficult to carry out. In the present study, we developed a computational approach called RoFDT to predict DTIs by combining feature-weighted Rotation Forest (FwRF) with protein sequences. In particular, we first encode protein sequences as numerical matrices using the Position-Specific Score Matrix (PSSM), then extract their features using the Pseudo Position-Specific Score Matrix (PsePSSM) and combine them with drug structure information (molecular fingerprints), and finally feed them into the FwRF classifier and validate the performance of RoFDT on the Enzyme, GPCR, Ion Channel and Nuclear Receptor datasets. On these datasets, RoFDT achieved 91.68%, 84.72%, 88.11% and 78.33% accuracy, respectively. RoFDT shows excellent performance in comparison with support vector machine models and previous superior approaches. Furthermore, 7 of the top 10 DTIs ranked by RoFDT estimate scores were confirmed by the relevant database. These results demonstrate that RoFDT can be employed as a powerful predictive approach for DTIs to provide theoretical support for innovative drug discovery.

16.
Appl Soft Comput ; 111: 107831, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34456656

ABSTRACT

COVID-19 has now spread all over the world and imposes a huge burden on public health and the world economy. Drug repositioning has become a promising treatment strategy in the COVID-19 crisis because it can shorten the drug development process, reduce pharmaceutical costs and reposition approved drugs. Existing computational methods focus only on a single type of information, such as drug and virus similarity or drug-virus network features, which is not sufficient to predict potential drugs. In this paper, a sequence combined attentive network embedding model, SANE, is proposed for identifying drugs based on sequence features and network features. On the one hand, drug SMILES and virus sequence features are extracted by an encoder-decoder in SANE as the initial node embeddings in the drug-virus network. On the other hand, SANE obtains fields for each node by attention-based Depth-First-Search (DFS) to reduce noise and improve efficiency in representation learning, and adopts a bottom-up aggregation strategy to learn node network representations from the selected fields. Finally, a feedforward neural network is used for classification. Experimental results show that SANE achieved 81.98% accuracy and a 0.8961 AUC value and outperformed state-of-the-art baselines. A further case study on COVID-19 indicates that SANE has strong predictive ability, since 25 of the top 40 (62.5%) drugs are verified by valuable datasets and literature. Therefore, SANE is a powerful tool for repositioning drugs for COVID-19 and provides a new perspective for drug repositioning.

17.
Front Genet ; 12: 657182, 2021.
Article in English | MEDLINE | ID: mdl-34054920

ABSTRACT

Drug repositioning is an application-based solution based on mining existing drugs to find new targets, quickly discovering new drug-disease associations, and reducing the risk of drug discovery in traditional medicine and biology. Therefore, it is of great significance to design a computational model with high efficiency and accuracy. In this paper, we propose a novel computational method MGRL to predict drug-disease associations based on multi-graph representation learning. More specifically, MGRL first uses the graph convolution network to learn the graph representation of drugs and diseases from their self-attributes. Then, the graph embedding algorithm is used to represent the relationships between drugs and diseases. Finally, the two kinds of graph representation learning features were put into the random forest classifier for training. To the best of our knowledge, this is the first work to construct a multi-graph to extract the characteristics of drugs and diseases to predict drug-disease associations. The experiments show that the MGRL can achieve a higher AUC of 0.8506 based on five-fold cross-validation, which is significantly better than other existing methods. Case study results show the reliability of the proposed method, which is of great significance for practical applications.

18.
iScience ; 24(6): 102455, 2021 Jun 25.
Article in English | MEDLINE | ID: mdl-34041455

ABSTRACT

Predicting microRNA-disease associations by using computational methods is conducive to improving the efficiency of costly and laborious traditional bio-experiments. In this study, we propose a computational machine learning-based method (DANE-MDA) that preserves integrated structure and attribute features via deep attributed network embedding to predict potential miRNA-disease associations. Specifically, the integrated features are extracted by using a deep stacked auto-encoder on the diverse orders of matrices containing structure and attribute information and are then used to train a random forest classifier. Under 5-fold cross-validation experiments, DANE-MDA yielded average accuracy, sensitivity, and AUC of 85.59%, 84.23%, and 0.9264 on the HMDD v3.0 dataset, and 83.21%, 80.39%, and 0.9113 on the HMDD v2.0 dataset, respectively. Additionally, case studies on breast, colon, and lung neoplasms show that 47, 47, and 46 of the top 50 miRNAs can be predicted and retrieved in the other database.

19.
Cancers (Basel) ; 13(9), 2021 Apr 27.
Article in English | MEDLINE | ID: mdl-33925568

ABSTRACT

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates quickly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI captures both the local and the global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train the model and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. We also compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
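
DeepWalk, named in this abstract, learns high-order neighborhood information from truncated random walks over the graph. The sketch below covers only the walk-generation step, with a hypothetical toy drug-target graph and hypothetical parameter names; the subsequent skip-gram training on the walks is omitted.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated random walks; adjacency maps node -> list of neighbours."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        # Visit the start nodes in a fresh random order on every pass.
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:
                    break  # isolated node: the walk cannot continue
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


if __name__ == "__main__":
    # Hypothetical bipartite graph: drugs d1, d2 and targets t1, t2.
    graph = {
        "d1": ["t1", "t2"],
        "d2": ["t2"],
        "t1": ["d1"],
        "t2": ["d1", "d2"],
    }
    for walk in random_walks(graph, walk_length=5, walks_per_node=2):
        print(" ".join(walk))
```

Each walk is then treated like a sentence, so nodes that co-occur within a window of the walk end up with similar embeddings.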

20.
BMC Bioinformatics ; 21(1): 401, 2020 Sep 10.
Article in English | MEDLINE | ID: mdl-32912137

ABSTRACT

BACKGROUND: As an important non-coding RNA, microRNA (miRNA) plays a significant role in a series of life processes and is closely associated with a variety of human diseases. Hence, identifying potential miRNA-disease associations can contribute greatly to the research and treatment of human diseases. However, to our knowledge, many existing computational methods use only a single type of known association information between miRNAs and diseases to predict their potential associations, without considering their interactions or associations with other types of molecules. RESULTS: In this paper, we propose a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. First, a heterogeneous network is constructed by integrating known associations among miRNAs, proteins, and diseases, and the network representation method Learning Graph Representations with Global Structural Information (GraRep) is applied to learn the behavior information of miRNAs and diseases in the network. Then, the behavior information of miRNAs and diseases is combined with their attribute information to represent miRNA-disease association pairs. Finally, the prediction model is built with the Random Forest algorithm. Under five-fold cross-validation, the proposed NEMPD model achieved an average prediction accuracy of 85.41% and a sensitivity of 80.96%, with an AUC of 91.58%. Furthermore, the performance of NEMPD is also validated by case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms), and 47 (lung neoplasms) were confirmed by two other databases. CONCLUSIONS: The proposed NEMPD model performs well in predicting potential associations between miRNAs and diseases and has great potential in the field of miRNA-disease association prediction.
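
GraRep, the embedding method named in this abstract, builds its representations from k-step transition probabilities between nodes. The sketch below computes only those k-step matrices for a hypothetical toy graph; the log transformation and matrix factorisation that GraRep applies afterwards are omitted, and the function names are illustrative.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalise(adj):
    """Turn an adjacency matrix into a 1-step transition matrix (rows sum to 1)."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def k_step_transitions(adj, max_k):
    """Return [A^1, A^2, ..., A^max_k], where A is the 1-step transition matrix."""
    a = row_normalise(adj)
    powers = [a]
    for _ in range(max_k - 1):
        powers.append(matmul(powers[-1], a))
    return powers


if __name__ == "__main__":
    # Hypothetical path graph: node 0 - node 1 - node 2.
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    for k, matrix in enumerate(k_step_transitions(adj, 2), start=1):
        print(k, matrix)
```

Larger values of k expose relationships between nodes that are not directly connected, which is the global structural information the method's name refers to.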


Subject(s)
Breast Neoplasms/diagnosis , Colonic Neoplasms/diagnosis , Computational Biology/methods , Lung Neoplasms/diagnosis , MicroRNAs/metabolism , Algorithms , Area Under Curve , Breast Neoplasms/genetics , Colonic Neoplasms/genetics , Female , Humans , Lung Neoplasms/genetics , MicroRNAs/genetics , ROC Curve
...