ABSTRACT
Computational prediction of drug-target interactions (DTIs) is particularly important in drug repositioning because it efficiently narrows down potential candidates for DTIs. A variety of computational methods for predicting DTIs have been proposed over the past decade. We ask which methods or techniques are the most advantageous for increasing prediction accuracy. This article provides a comprehensive overview of network-based, machine learning, and integrated DTI prediction methods. Network-based methods represent a DTI network, along with drug and target similarities, in matrix form and apply graph-theoretic algorithms to identify new DTIs. Machine learning methods use known DTIs and the features of drugs and target proteins as training data to build a predictive model. Integrated methods combine these two techniques. We assessed the prediction performance of selected state-of-the-art methods using two different benchmark datasets. Our experimental results demonstrate that the integrated methods generally outperform the others. Some previous methods showed low accuracy in predicting interactions for unknown drugs that do not appear in the training dataset. Combining similarity matrices from multiple features by data fusion did not improve prediction accuracy. Finally, we discuss future directions for further improvements in DTI prediction.
Subjects
Algorithms, Machine Learning, Drug Interactions, Drug Repositioning, Proteins/metabolism
ABSTRACT
Genome-wide association studies (GWAS) can be used to infer genome intervals that are involved in genetic diseases. However, investigating a large number of putative mutations for GWAS is resource- and time-intensive. Network-based computational approaches are therefore being used for efficient disease-gene association prediction. These methods rest on the assumption that genes causing the same disease are located close to each other in a molecular network, such as a protein-protein interaction (PPI) network. In this survey, we provide an overview of network-based disease-gene association prediction methods in three categories: graph-theoretic algorithms, machine learning algorithms, and an integration of the two. We experimented with six selected methods to compare their prediction performance using a heterogeneous network constructed by combining a genome-wide weighted PPI network, an ontology-based disease network, and disease-gene associations. The experiment was conducted in two settings, with and without known disease-associated genes. The results revealed that HerGePred, an integrative method, performed best in the presence of known disease-associated genes, whereas PRINCE, which adopts a network propagation algorithm, was the most competitive in their absence. Overall, the results demonstrated that the integrative methods performed better than the methods using graph theory alone, and that the methods using a heterogeneous network performed better than those using only a homogeneous PPI network.
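The "guilt by proximity" assumption behind network propagation can be illustrated with a minimal sketch. It assumes a symmetric weighted adjacency matrix, a prior vector marking known disease genes, and a damping parameter `alpha`; the function name `propagate` and all parameter values are hypothetical, and the update rule is a generic propagation scheme, not a reproduction of PRINCE or of any method evaluated in the survey.

```python
import math


def propagate(weights, prior, alpha=0.8, iterations=50):
    """Iteratively smooth prior scores over a weighted, undirected network.

    weights: symmetric adjacency matrix as a list of lists (assumed input).
    prior:   initial score per node, e.g. 1.0 for known disease genes.
    alpha:   share of each node's score taken from its neighbours (assumed value).
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Symmetric normalisation: w_ij / sqrt(d_i * d_j); zero-weight entries stay zero.
    norm = [
        [
            weights[i][j] / math.sqrt(degree[i] * degree[j]) if weights[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    scores = list(prior)
    for _ in range(iterations):
        scores = [
            alpha * sum(norm[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * prior[i]
            for i in range(n)
        ]
    return scores


# Toy path network 0 - 1 - 2 - 3 with node 0 as the only known disease gene.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = propagate(path, prior=[1.0, 0.0, 0.0, 0.0])
```

In this toy network the propagated score decreases with distance from the seed node, which is the behaviour the proximity assumption relies on.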
Subjects
Genome-Wide Association Study, Protein Interaction Maps, Algorithms, Computational Biology/methods, Genome-Wide Association Study/methods, Machine Learning, Protein Interaction Maps/genetics
ABSTRACT
Functional modules can be predicted from genome-wide protein-protein interactions (PPIs) from a systems-level perspective. Various graph clustering algorithms have been applied to PPI networks for this task. In particular, the detection of overlapping clusters is necessary because a protein can be involved in multiple functions under different conditions. Graph entropy (GE) is a novel metric for assessing the quality of clusters in a large, complex network. In this study, the unweighted and weighted GE algorithms are evaluated to validate their prediction of functional modules. To measure clustering accuracy, the clustering results are compared to protein complexes and Gene Ontology (GO) annotations as references. We demonstrate that the GE algorithm detects overlapping clusters more accurately than competing methods. Moreover, we confirm the biological plausibility of the proteins that occur most frequently in the set of identified clusters. Finally, novel proteins for the additional annotation of GO terms are revealed.
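The abstract does not define graph entropy, so the following is only a sketch under a stated assumption: each vertex contributes the binary entropy of the fraction of its links that fall inside the candidate cluster, and the graph entropy is the sum over all vertices. The function names and the toy graph are hypothetical, and this is the unweighted case only.

```python
import math


def vertex_entropy(graph, cluster, v):
    """Binary entropy of the inner/outer link split of vertex v (assumed definition)."""
    neighbors = graph[v]
    if not neighbors:
        return 0.0
    p_in = sum(1 for u in neighbors if u in cluster) / len(neighbors)
    p_out = 1.0 - p_in
    # Terms with probability 0 contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in (p_in, p_out) if p > 0)


def graph_entropy(graph, cluster):
    """Sum of vertex entropies over every vertex in the graph."""
    return sum(vertex_entropy(graph, cluster, v) for v in graph)


# Toy network: two triangles (a, b, c) and (d, e, f) joined by the edge c - d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
tight = graph_entropy(toy, {"a", "b", "c"})
loose = graph_entropy(toy, {"a", "b", "c", "d"})
```

Under this definition a lower value indicates a better-separated cluster: the triangle alone scores lower than the triangle extended by `d`, because adding `d` splits the links of `d`, `e`, and `f` across the cluster boundary.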
ABSTRACT
Drug repositioning offers the significant advantage of greatly reducing the cost and time of drug discovery by identifying new therapeutic indications for existing drugs. In particular, network-based computational approaches to drug repositioning have attracted attention because they efficiently infer potential associations between drugs and diseases from network connectivity. In this article, we propose a network-based drug repositioning method that constructs a drug-gene-disease tensor by integrating drug-disease, drug-gene, and disease-gene associations and predicts drug-gene-disease triple associations through tensor decomposition. The proposed method, an ensemble of generalized tensor decomposition (GTD) and a multi-layer perceptron (MLP), models drug-gene-disease associations through GTD and learns the features of drugs, genes, and diseases through the MLP, providing more flexibility and non-linearity than conventional tensor decomposition. We experimented with drug-gene-disease association prediction using two distinct networks built from chemical structures and ATC codes as drug features. Moreover, we used the drug, gene, and disease latent vectors obtained from the predicted triple associations to predict drug-disease, drug-gene, and disease-gene pairwise associations. Our experimental results revealed that the proposed ensemble method was superior for triple association prediction. The ensemble model achieved an AUC of 0.96 in predicting triple associations for new drugs, an improvement of approximately 7% over existing models. It also showed competitive accuracy for pairwise association prediction compared with previous methods. This study demonstrated that incorporating genetic information leads to notable advancements in drug repositioning.
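For orientation, the conventional tensor decomposition that the abstract contrasts with can be sketched as a CP-style score over latent vectors. This is an illustration of the classical baseline only, not of the proposed GTD and MLP ensemble; the function names, the sigmoid link, and the example vectors are assumptions.

```python
import math


def cp_score(drug_vec, gene_vec, disease_vec):
    """Classical CP score: sum over latent dimensions of the three-way product."""
    return sum(a * b * c for a, b, c in zip(drug_vec, gene_vec, disease_vec))


def triple_probability(drug_vec, gene_vec, disease_vec):
    """Map the CP score to (0, 1) with a sigmoid (an assumed link function)."""
    return 1.0 / (1.0 + math.exp(-cp_score(drug_vec, gene_vec, disease_vec)))


# Hypothetical rank-2 latent vectors for one drug, one gene, and one disease.
score = cp_score([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])
neutral = triple_probability([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])
```

Because the score is a fixed multilinear form, it cannot capture non-linear interactions among the three latent vectors, which is the limitation the abstract attributes to conventional tensor decomposition.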
ABSTRACT
Drug repositioning, which involves the identification of new therapeutic indications for approved drugs, considerably reduces the time and cost of developing new drugs. Recent computational drug repositioning methods use heterogeneous networks to identify drug-disease associations. This review surveys existing network-based approaches for predicting drug-disease associations in three major categories: graph mining, matrix factorization or completion, and deep learning. We selected eleven methods from the three categories to compare their predictive performance. The experiment was conducted using two uniform datasets, on the drug side and on the disease side separately. We constructed heterogeneous networks using drug-drug similarities based on chemical structures and ATC codes, ontology-based disease-disease similarities, and drug-disease associations. An improved evaluation metric was used to account for data imbalance, as positive associations are typically sparse. The prediction results demonstrated that methods in the graph mining and matrix factorization or completion categories performed well in the overall assessment. Furthermore, prediction on the drug side was more accurate than on the disease side. Selecting and integrating informative drug features in drug-drug similarity measurement is crucial for improving disease-side prediction.