ABSTRACT
BACKGROUND: Predicting potential drug-target interactions (DTIs) not only improves our understanding of biological processes but is also critical for identifying new drugs. However, because traditional experiments are expensive and time-consuming, only a small fraction of the interactions between drugs and targets in databases have been verified experimentally. It is therefore important to develop new, well-performing computational methods for DTI prediction. Many existing computational methods use only a single type of interaction, between drugs and proteins, and ignore associations with and influences from other types of molecules. METHODS: In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential drug-target interactions. First, a heterogeneous multi-molecular information network is built by combining the known associations among proteins, drugs, lncRNAs, diseases, and miRNAs. Second, the Large-scale Information Network Embedding (LINE) model is used to learn the behavior information (associations with other nodes) of drugs and proteins in the network. Each known drug-protein interaction pair can thus be represented as a combination of the attribute information (e.g., protein sequence information and drug molecular fingerprints) and the behavior information of its drug and protein. Third, a Random Forest classifier is used for training and prediction. RESULTS: Under five-fold cross-validation, our method obtained a prediction accuracy of 85.83%, a sensitivity of 80.47%, and an AUC of 92.33%. Moreover, in case studies of three common drugs, 8 (caffeine), 7 (clozapine), and 6 (pioglitazone) of the top 10 candidate targets were verified to be associated with the corresponding drug.
CONCLUSIONS: In short, these results indicate that our method can be a powerful tool for predicting potential drug-target interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.
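The three figures reported in RESULTS (accuracy, sensitivity, AUC) can be illustrated with a minimal sketch. It assumes binary labels (1 = interacting pair, 0 = non-interacting pair) and a classifier that outputs a score per pair; the function names, the 0.5 threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def accuracy_and_sensitivity(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (accuracy, sensitivity) for binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Sensitivity (recall) is the fraction of true interactions that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three interacting and three non-interacting pairs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={acc:.4f} sensitivity={sens:.4f} auc={roc_auc(y_true, scores):.4f}")
```

The pairwise AUC shown here is quadratic in the number of samples, which is fine for illustration; a rank-based computation would be the usual choice for large test folds.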
Subjects
MicroRNAs , Pharmaceutical Preparations , Long Noncoding RNA , Algorithms , Amino Acid Sequence , Proteins
ABSTRACT
BACKGROUND: In drug development, computational drug repositioning is effective and resource-saving because of its important role in identifying new drug-disease associations. Recent years have witnessed great progress in the field of data mining with the advent of deep learning, and an increasing number of deep learning-based techniques have been proposed for building computational tools in bioinformatics. METHODS: Along this promising direction, we propose a computational drug repositioning method that combines a Sigmoid Kernel with a Convolutional Neural Network (SKCNN) and is able to learn, via its hidden layers, new features that effectively represent drug-disease associations. Specifically, we first construct a similarity metric for drugs from drug sigmoid similarity and drug structural similarity, and one for diseases from disease sigmoid similarity and disease semantic similarity. Based on the combined similarities of drugs and diseases, we then use SKCNN to learn hidden representations for each drug-disease pair, whose labels are finally predicted by a random forest classifier. RESULTS: A series of experiments was carried out for performance evaluation, and the results show that the proposed SKCNN improves prediction accuracy compared with other state-of-the-art approaches. Case studies of two selected diseases were also conducted, which demonstrate the superior performance of our method in the actual discovery of potential drug indications. CONCLUSION: The aim of this study was to establish an effective predictive model for finding new drug-disease associations. The experimental results show that SKCNN can effectively predict associations between drugs and diseases.
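The "sigmoid similarity" step can be sketched with the standard sigmoid (hyperbolic tangent) kernel, K(x, y) = tanh(gamma * <x, y> + coef0), applied to binary association profiles. The abstract does not state the kernel parameters or the input vectors, so the defaults below (gamma = 1 / number of features, coef0 = 0), the function name, and the toy profiles are assumptions for illustration only.

```python
import math
from typing import List, Optional


def sigmoid_kernel_matrix(
    profiles: List[List[float]],
    gamma: Optional[float] = None,
    coef0: float = 0.0,
) -> List[List[float]]:
    """Pairwise sigmoid kernel: K[i][j] = tanh(gamma * <x_i, x_j> + coef0)."""
    n = len(profiles)
    if gamma is None:
        # Assumed default: scale by the profile length (not specified in the abstract).
        gamma = 1.0 / len(profiles[0])
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.tanh(gamma * dot + coef0)
    return kernel


# Toy drug-by-disease association profiles (hypothetical): rows are drugs.
drug_profiles = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
for row in sigmoid_kernel_matrix(drug_profiles):
    print([round(v, 4) for v in row])
```

Transposing the association matrix and calling the same function would give the corresponding disease-by-disease similarity under the same assumptions.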
Subjects
Algorithms , Disease/genetics , Drug Repositioning , Genetic Association Studies , Area Under Curve , Asthma/genetics , Databases as Topic , Humans , Computer Neural Networks , Obesity/genetics , ROC Curve , Reproducibility of Results , Support Vector Machine
ABSTRACT
Drug-disease associations are an important piece of information at every stage of drug repositioning. Although the number of drug-disease associations identified by high-throughput technologies is increasing, experimental methods are time-consuming and expensive. As a supplement to them, many computational methods have been developed for accurate in silico prediction of new drug-disease associations. In this work, we present a novel computational model that combines a sparse auto-encoder with rotation forest (SAEROF) to predict drug-disease associations. Gaussian interaction profile kernel similarity, drug structure similarity, and disease semantic similarity were extracted to explore the associations between drugs and diseases. On this basis, a rotation forest classifier built on a sparse auto-encoder is proposed to predict the associations between drugs and diseases. To evaluate the proposed model, we ran 10-fold cross-validation on two gold-standard datasets, Fdataset and Cdataset, on which it achieved AUCs (areas under the ROC curve) of 0.9092 and 0.9323, respectively. For further performance evaluation, we compared SAEROF with the state-of-the-art support vector machine (SVM) classifier and several existing computational models. Three human diseases (obesity, stomach neoplasms, and lung neoplasms) were explored in case studies, in which more than half of the top 20 predicted drugs were confirmed by the Comparative Toxicogenomics Database (CTD). The model is a feasible and effective method for predicting drug-disease associations, and its performance is significantly improved compared with existing methods.
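The Gaussian interaction profile (GIP) kernel similarity mentioned above is commonly defined as K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is the binary association profile of entity i and gamma is a bandwidth gamma' divided by the mean squared norm of all profiles. The sketch below follows that common definition; the bandwidth gamma' = 1, the function name, and the toy matrix are assumptions, since the abstract does not give these details.

```python
import math
from typing import List


def gip_kernel_matrix(profiles: List[List[float]], gamma_prime: float = 1.0) -> List[List[float]]:
    """Gaussian interaction profile kernel over the rows of an association matrix."""
    n = len(profiles)
    # Normalise the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * sq_dist)
    return kernel


# Toy drug-by-disease association matrix (hypothetical): rows are drugs.
associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
for row in gip_kernel_matrix(associations):
    print([round(v, 4) for v in row])
```

In this toy case the mean squared norm is 4/3, so gamma is 0.75; drugs 0 and 1 differ in one entry and receive similarity exp(-0.75), while every drug has similarity 1 with itself.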
Subjects
Algorithms , Anti-Obesity Agents/pharmacology , Antineoplastic Agents/pharmacology , Computational Biology/methods , Lung Neoplasms/drug therapy , Computer Neural Networks , Obesity/drug therapy , Stomach Neoplasms/drug therapy , Area Under Curve , Computer Simulation , Factual Databases , Humans , Machine Learning , ROC Curve , Support Vector Machine
ABSTRACT
Computational drug repositioning, which aims to identify new indications for existing drugs, significantly reduces the cost and time involved in drug development. Prediction of drug-disease associations is a promising route to drug repositioning, and recent years have witnessed an increasing number of machine learning-based methods for this task. In this paper, a novel feature learning method based on the Gaussian interaction profile kernel and an autoencoder (GIPAE) is proposed for drug-disease association prediction. To further reduce the computational cost, a batch normalization layer and a fully connected layer are introduced, which lowers training complexity. The results of 10-fold cross-validation indicate that the proposed method achieves superior performance on Fdataset and Cdataset, with AUCs of 93.30% and 96.03%, respectively, which are higher than those of many previous computational models. To further assess the accuracy of GIPAE, we conducted case studies on two complex human diseases. Among the top 20 predicted drugs, 14 for obesity and 11 for Alzheimer's disease were validated in the CTD database. The results of cross-validation and the case studies indicate that GIPAE is a reliable model for predicting drug-disease associations.
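The 10-fold cross-validation protocol referred to above can be sketched as a plain index split: the samples (here, drug-disease pairs) are shuffled once and divided into ten folds, and each fold serves as the test set exactly once. The function name, the fixed seed, and the sample count are hypothetical; how the paper balances positive and negative pairs across folds is not stated in the abstract and is not modelled here.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) splits covering every sample once as test."""
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Example with 25 hypothetical drug-disease pairs.
for fold_number, (train, test) in enumerate(k_fold_indices(25, k=10), start=1):
    print(f"fold {fold_number}: {len(train)} training pairs, {len(test)} test pairs")
```

A per-fold AUC would then be computed on each test fold and averaged to obtain a single cross-validated figure.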