
A multi-source molecular network representation model for protein-protein interactions prediction.

Zou, Hai-Tao; Ji, Bo-Ya; Xie, Xiao-Lan.

Sci Rep ; 14(1): 6184, 2024 03 14.

Article in English | MEDLINE | ID: mdl-38485942

ABSTRACT

The prediction of potential protein-protein interactions (PPIs) is a critical step in decoding diseases and understanding cellular mechanisms. Traditional biological experiments have identified many potential PPIs in recent years, but the problem is still far from solved. Hence, there is an urgent need to develop accurate and efficient computational models to predict potential PPIs. In this study, we propose a multi-source molecular network representation learning model (called MultiPPIs) to predict potential protein-protein interactions. Specifically, we first extract protein sequence features from the physicochemical properties of amino acids by using the auto covariance method. Second, a multi-source association network is constructed by integrating the known associations among miRNAs, proteins, lncRNAs, drugs, and diseases. The graph representation learning method DeepWalk is adopted to extract the multi-source association information of proteins with other biomolecules. In this way, each known protein-protein interaction pair can be represented as a concatenation of the protein sequence features and the multi-source association features of the proteins. Finally, the Random Forest classifier with its corresponding optimal parameters is used for training and prediction. MultiPPIs obtains an average prediction accuracy of 86.03% and a sensitivity of 82.69%, with an AUC of 93.03%, under five-fold cross-validation. The experimental results indicate that MultiPPIs has good prediction performance and provides valuable insights into the prediction of potential protein-protein interactions. MultiPPIs is freely available at https://github.com/jiboyalab/multiPPIs .

Subject(s)

MicroRNAs , RNA, Long Noncoding , Proteins/metabolism , Amino Acid Sequence , Amino Acids , Computational Biology/methods
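
The auto covariance step mentioned in the MultiPPIs abstract can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the commonly used formulation AC(lag) = 1/(n - lag) * sum over i of (x_i - mean)(x_(i+lag) - mean), and it uses the Kyte-Doolittle hydropathy index as a stand-in for whichever physicochemical properties the paper actually used. The function name `auto_covariance` and the parameter `max_lag` are hypothetical.

```python
from statistics import fmean

# Kyte-Doolittle hydropathy index. It is used here only as one illustrative
# physicochemical property; the abstract does not say which scales were used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(sequence, scale, max_lag):
    """Return one auto covariance value per lag in 1..max_lag.

    `scale` maps each residue letter to a numeric property value.
    """
    n = len(sequence)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be positive and shorter than the sequence")
    values = [scale[residue] for residue in sequence]
    mean = fmean(values)
    centered = [v - mean for v in values]
    features = []
    for lag in range(1, max_lag + 1):
        # Average product of centered values that are `lag` positions apart.
        total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


# A fixed-length descriptor for a toy sequence: one value per lag.
print(auto_covariance("MKTAYIAKQR", KYTE_DOOLITTLE, max_lag=3))
```

With several property scales, the per-scale vectors would be concatenated, so every protein gets a descriptor of the same length regardless of sequence length.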

SPRDA: a link prediction approach based on the structural perturbation to infer disease-associated Piwi-interacting RNAs.

Zheng, Kai; Zhang, Xin-Lu; Wang, Lei; You, Zhu-Hong; Ji, Bo-Ya; Liang, Xiao; Li, Zheng-Wei.

Brief Bioinform ; 24(1)2023 01 19.

Article in English | MEDLINE | ID: mdl-36445194

ABSTRACT

piRNAs and PIWI proteins have been confirmed as novel biomarkers for disease diagnosis and treatment because of their abnormal expression in various cancers. However, current research is not sufficient to clarify the functions of piRNAs in cancer and their underlying mechanisms. Therefore, providing large-scale and reliable piRNA candidates for biological research has become a pressing issue. In this study, a novel computational model based on the structural perturbation method, called SPRDA, is proposed to predict potential disease-associated piRNAs. Notably, SPRDA is a positive-unlabeled learning method, so, unlike previous approaches, it is unaffected by negative examples. In 5-fold cross-validation, SPRDA shows high performance on the benchmark dataset piRDisease, with an AUC of 0.9529. Furthermore, the predictive performance of SPRDA for 10 diseases shows the robustness of the proposed method. Overall, the proposed approach can provide unique insights into the pathogenesis of disease and will advance the field of oncology diagnosis and treatment.

Subject(s)

Neoplasms , Piwi-Interacting RNA , Humans , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Neoplasms/genetics , Neoplasms/metabolism

SMMDA: Predicting miRNA-Disease Associations by Incorporating Multiple Similarity Profiles and a Novel Disease Representation.

Ji, Bo-Ya; Pan, Liang-Rui; Zhou, Ji-Ren; You, Zhu-Hong; Peng, Shao-Liang.

Biology (Basel) ; 11(5)2022 May 20.

Article in English | MEDLINE | ID: mdl-35625505

ABSTRACT

Increasing evidence has suggested that microRNAs (miRNAs) are significant in research on human diseases. Predicting possible associations between miRNAs and diseases would provide new perspectives on disease diagnosis, pathogenesis, and gene therapy. However, because traditional in vitro studies are inherently time-consuming and expensive, there is an urgent need for a computational approach that allows researchers to identify potential associations between miRNAs and diseases for further research. In this paper, we present a novel computational method called SMMDA to predict potential miRNA-disease associations. In particular, SMMDA first uses a new disease representation method (MeSHHeading2vec) based on a network embedding algorithm and then fuses it with Gaussian interaction profile kernel similarity information of miRNAs and diseases, disease semantic similarity, and miRNA functional similarity. Second, SMMDA uses a deep auto-encoder network to further transform the original features into a better feature representation. Finally, the ensemble learning model XGBoost is used as the underlying training and prediction method for SMMDA. SMMDA achieved a mean accuracy of 86.68% with a standard deviation of 0.42% and a mean AUC of 94.07% with a standard deviation of 0.23%, outperforming many previous works. Moreover, we also compared the predictive ability of SMMDA with different classifiers and different feature descriptors. In the case studies of three common human diseases, 47 (esophageal neoplasms), 48 (breast neoplasms), and 48 (colon neoplasms) of the top 50 candidate miRNAs were successfully verified by two other databases. The experimental results show that SMMDA is reliable in predicting potential miRNA-disease associations. Therefore, it is anticipated that SMMDA could be an effective tool for biomedical researchers.
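
The Gaussian interaction profile (GIP) kernel similarity named in the SMMDA abstract can be illustrated with a short sketch. It assumes the widely used definition K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is row i of a binary association matrix and gamma is a bandwidth `gamma_prime` divided by the mean squared norm of the rows. The function name `gip_kernel`, the default `gamma_prime=1.0`, and the toy matrix are assumptions for illustration, not details taken from the paper.

```python
import math


def gip_kernel(adjacency, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `adjacency`.

    Each row is the interaction profile of one entity (for example one miRNA
    against all diseases). Pass the transposed matrix to compare the columns.
    """
    squared_norms = [sum(v * v for v in row) for row in adjacency]
    mean_squared_norm = sum(squared_norms) / len(squared_norms)
    if mean_squared_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_squared_norm

    n = len(adjacency)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            squared_distance = sum(
                (a - b) ** 2 for a, b in zip(adjacency[i], adjacency[j])
            )
            # The kernel is symmetric, so fill both triangles at once.
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * squared_distance)
    return kernel


# Toy association matrix: 3 miRNAs (rows) x 3 diseases (columns).
toy = [[1, 1, 0],
       [1, 0, 0],
       [0, 0, 1]]
for row in gip_kernel(toy):
    print([round(v, 4) for v in row])
```

Entities that share many known associations get a similarity close to 1, while entities with disjoint profiles decay toward 0 at a rate set by `gamma_prime`.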

DANE-MDA: Predicting microRNA-disease associations via deep attributed network embedding.

Ji, Bo-Ya; You, Zhu-Hong; Wang, Yi; Li, Zheng-Wei; Wong, Leon.

iScience ; 24(6): 102455, 2021 Jun 25.

Article in English | MEDLINE | ID: mdl-34041455

ABSTRACT

Predicting microRNA-disease associations with computational methods is conducive to improving the efficiency of costly and laborious traditional bio-experiments. In this study, we propose a computational machine learning-based method (DANE-MDA) that preserves integrated structure and attribute features via deep attributed network embedding to predict potential miRNA-disease associations. Specifically, the integrated features are extracted by applying a deep stacked auto-encoder to the diverse orders of matrices containing structure and attribute information and are then used to train a random forest classifier. Under 5-fold cross-validation, DANE-MDA yielded an average accuracy, sensitivity, and AUC of 85.59%, 84.23%, and 0.9264 on the HMDD v3.0 dataset, and 83.21%, 80.39%, and 0.9113 on the HMDD v2.0 dataset, respectively. Additionally, case studies on breast, colon, and lung neoplasms show that 47, 47, and 46 of the top 50 predicted miRNAs, respectively, can be retrieved in another database.
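
Several abstracts in this listing report accuracy, sensitivity, and AUC under five-fold cross-validation. The sketch below shows one plain way to compute these quantities from labels and predicted scores. It is a hedged illustration: the helper names, the 0.5 decision threshold, and the shuffling seed are assumptions, and the papers may have used library implementations or different fold construction.

```python
import random


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into `n_folds` disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]


def accuracy_and_sensitivity(labels, scores, threshold=0.5):
    """Accuracy = (TP + TN) / N and sensitivity = TP / (TP + FN)."""
    tp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 0 and predicted == 0:
            tn += 1
        elif label == 1 and predicted == 0:
            fn += 1
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive samples")
    return (tp + tn) / len(labels), tp / (tp + fn)


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(accuracy_and_sensitivity(labels, scores))
print(auc(labels, scores))
```

In a cross-validation run, each fold from `five_fold_indices` would serve once as the test set, and the reported figures would be the mean of the per-fold metrics.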

Prediction of lncRNA-disease associations via an embedding learning HOPE in heterogeneous information networks.

Zhou, Ji-Ren; You, Zhu-Hong; Cheng, Li; Ji, Bo-Ya.

Mol Ther Nucleic Acids ; 23: 277-285, 2021 Mar 05.

Article in English | MEDLINE | ID: mdl-33425486

ABSTRACT

Uncovering additional long non-coding RNA (lncRNA)-disease associations has become increasingly important for developing treatments for complex human diseases. Identification of lncRNA biomarkers and lncRNA-disease associations is central to diagnosis and treatment. However, traditional experimental methods are expensive and time-consuming. Enormous amounts of data in public biological databases are available to computational methods for predicting lncRNA-disease associations. In this study, we propose a novel computational method to predict lncRNA-disease associations. More specifically, a heterogeneous network is first constructed by integrating the associations among microRNA (miRNA), lncRNA, protein, drug, and disease. Second, high-order proximity preserved embedding (HOPE) is used to learn embeddings of the nodes in the network. Finally, the rotation forest classifier is adopted to train the prediction model. In the 5-fold cross-validation experiment, our method achieved an area under the curve (AUC) of 0.8328 ± 0.0236. We compared it with four other classifiers, and the proposed method clearly outperformed them. In addition, we conducted case studies on three cancers with high mortality rates. The results show that, for each of lung cancer, gastric cancer, and hepatocellular carcinoma, 9 of the top 15 lncRNAs predicted by our method were confirmed. In conclusion, our method can predict unknown lncRNA-disease associations effectively.

NEMPD: a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information.

Ji, Bo-Ya; You, Zhu-Hong; Chen, Zhan-Heng; Wong, Leon; Yi, Hai-Cheng.

BMC Bioinformatics ; 21(1): 401, 2020 Sep 10.

Article in English | MEDLINE | ID: mdl-32912137

ABSTRACT

BACKGROUND: As an important non-coding RNA, microRNA (miRNA) plays a significant role in a series of life processes and is closely associated with a variety of human diseases. Hence, identification of potential miRNA-disease associations can make great contributions to the research and treatment of human diseases. However, to our knowledge, many existing computational methods use only a single type of known association information between miRNAs and diseases to predict their potential associations, without considering their interactions or associations with other types of molecules. RESULTS: In this paper, we propose a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. First, a heterogeneous network is constructed by integrating known associations among miRNA, protein, and disease, and the network representation method Learning Graph Representations with Global Structural Information (GraRep) is used to learn the behavior information of miRNAs and diseases in the network. Then, the behavior information of miRNAs and diseases is combined with their attribute information to represent miRNA-disease association pairs. Finally, the prediction model is built on the Random Forest algorithm. Under five-fold cross-validation, the proposed NEMPD model obtained an average prediction accuracy of 85.41% and a sensitivity of 80.96%, with an AUC of 91.58%. Furthermore, the performance of NEMPD is also validated by case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms), and 47 (lung neoplasms) were confirmed by two other databases. CONCLUSIONS: The proposed NEMPD model performs well in predicting potential associations between miRNAs and diseases and has great potential in the field of miRNA-disease association prediction.

Subject(s)

Breast Neoplasms/diagnosis , Colonic Neoplasms/diagnosis , Computational Biology/methods , Lung Neoplasms/diagnosis , MicroRNAs/metabolism , Algorithms , Area Under Curve , Breast Neoplasms/genetics , Colonic Neoplasms/genetics , Female , Humans , Lung Neoplasms/genetics , MicroRNAs/genetics , ROC Curve

Prediction of drug-target interactions from multi-molecular network based on LINE network representation method.

Ji, Bo-Ya; You, Zhu-Hong; Jiang, Han-Jing; Guo, Zhen-Hao; Zheng, Kai.

J Transl Med ; 18(1): 347, 2020 09 07.

Article in English | MEDLINE | ID: mdl-32894154

ABSTRACT

BACKGROUND: The prediction of potential drug-target interactions (DTIs) not only provides a better understanding of biological processes but is also critical for identifying new drugs. However, because traditional experiments are expensive and time-consuming, only a small fraction of the interactions between drugs and targets in the database have been verified experimentally. Therefore, it is meaningful and important to develop new computational methods with good performance for DTI prediction. At present, many existing computational methods use only a single type of interaction between drugs and proteins, without paying attention to their associations with and influences from other types of molecules. METHODS: In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential drug-target interactions. First, a heterogeneous multi-molecular information network is built by combining the known associations among protein, drug, lncRNA, disease, and miRNA. Second, the Large-scale Information Network Embedding (LINE) model is used to learn the behavior information (associations with other nodes) of drugs and proteins in the network. Hence, the known drug-protein interaction pairs can be represented as a combination of their attribute information (e.g. protein sequence information and drug molecular fingerprints) and behavior information. Third, the Random Forest classifier is used for training and prediction. RESULTS: Under five-fold cross-validation, our method obtained a prediction accuracy of 85.83% and a sensitivity of 80.47%, with an AUC of 92.33%. Moreover, in the case studies of three common drugs, 8 (Caffeine), 7 (Clozapine), and 6 (Pioglitazone) of the top 10 candidate targets were verified to be associated with the corresponding drugs. CONCLUSIONS: In short, these results indicate that our method can be a powerful tool for predicting potential drug-target interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.

Subject(s)

MicroRNAs , Pharmaceutical Preparations , RNA, Long Noncoding , Algorithms , Amino Acid Sequence , Proteins
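
The pair representation described in the drug-target abstract, where each drug-protein pair combines attribute information with behavior (embedding) information, can be sketched as a simple concatenation. Everything concrete below is hypothetical: the function name `pair_features`, the dictionary layout, the toy vectors, and the order in which the four parts are joined are illustrative assumptions, since the abstract does not specify them.

```python
def pair_features(drug, protein, attributes, embeddings):
    """Build one feature vector for a (drug, protein) pair.

    `attributes` holds per-node attribute vectors (for example a drug
    fingerprint or a protein sequence descriptor) and `embeddings` holds
    per-node network embedding vectors. The concatenation order is an
    assumption made for this sketch.
    """
    return (
        list(attributes[drug])
        + list(embeddings[drug])
        + list(attributes[protein])
        + list(embeddings[protein])
    )


# Toy vectors, made up purely to show the shapes involved.
attributes = {"drug_1": [1, 0, 1], "protein_1": [0.2, 0.7]}
embeddings = {"drug_1": [0.5, -0.1], "protein_1": [0.3, 0.9]}

vector = pair_features("drug_1", "protein_1", attributes, embeddings)
print(len(vector), vector)
```

The resulting fixed-length vectors, one per known or candidate pair, are the kind of input a classifier such as Random Forest would be trained on.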

Predicting miRNA-disease association from heterogeneous information network with GraRep embedding model.

Ji, Bo-Ya; You, Zhu-Hong; Cheng, Li; Zhou, Ji-Ren; Alghazzawi, Daniyal; Li, Li-Ping.

Sci Rep ; 10(1): 6658, 2020 04 20.

Article in English | MEDLINE | ID: mdl-32313121

ABSTRACT

In recent years, accumulating evidence has shown that microRNA (miRNA) plays an important role in the exploration and treatment of diseases, so the detection of associations between miRNAs and diseases has drawn more and more attention. However, traditional experimental methods are costly and time-consuming, whereas a computational method can help predict potential miRNA-disease associations more systematically and effectively. In this work, we propose a novel network embedding-based heterogeneous information integration method to predict miRNA-disease associations. More specifically, a heterogeneous information network is constructed by combining the known associations among lncRNA, drug, protein, disease, and miRNA. After that, the network embedding method Learning Graph Representations with Global Structural Information (GraRep) is used to learn embeddings of the nodes in the heterogeneous information network. In this way, the embedding representations of miRNAs and diseases are integrated with their attribute information (e.g. miRNA sequence information and disease semantic similarity) to represent miRNA-disease association pairs. Finally, the Random Forest (RF) classifier is used to predict potential miRNA-disease associations. Under 5-fold cross-validation, our method obtained a prediction accuracy of 85.11% and a sensitivity of 80.41%, with an AUC of 91.25%. In addition, in case studies of three major human diseases, 45 (Colon Neoplasms), 42 (Breast Neoplasms), and 44 (Esophageal Neoplasms) of the top 50 predicted miRNAs were verified by other miRNA-disease association databases. In conclusion, the experimental results suggest that our method can be a powerful and useful tool for predicting potential miRNA-disease associations.

Subject(s)

Breast Neoplasms/genetics , Colonic Neoplasms/genetics , Esophageal Neoplasms/genetics , MicroRNAs/genetics , RNA, Circular/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , RNA, Neoplasm/genetics , Algorithms , Antineoplastic Agents/metabolism , Antineoplastic Agents/pharmacokinetics , Breast Neoplasms/diagnosis , Breast Neoplasms/drug therapy , Breast Neoplasms/pathology , Colonic Neoplasms/diagnosis , Colonic Neoplasms/drug therapy , Colonic Neoplasms/pathology , Computational Biology/methods , Databases, Genetic , Decision Trees , Esophageal Neoplasms/diagnosis , Esophageal Neoplasms/drug therapy , Esophageal Neoplasms/pathology , Female , Humans , Male , MicroRNAs/classification , MicroRNAs/metabolism , Models, Genetic , RNA, Circular/classification , RNA, Circular/metabolism , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Messenger/classification , RNA, Messenger/metabolism , RNA, Neoplasm/classification , RNA, Neoplasm/metabolism
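
GraRep, the embedding method named in the abstract above, is generally described as building on k-step transition probabilities between nodes. The sketch below covers only that first ingredient, under the assumption that the one-step matrix is the row-normalized adjacency matrix and the k-step matrix is its k-th power. The later stages of GraRep (log-scaled matrix factorization and concatenation of per-step embeddings) are deliberately left out, and all function names are hypothetical.

```python
def transition_matrix(adjacency):
    """Row-normalize an adjacency matrix into one-step transition probabilities."""
    rows = []
    for row in adjacency:
        degree = sum(row)
        # Isolated nodes keep an all-zero row instead of dividing by zero.
        rows.append([v / degree if degree else 0.0 for v in row])
    return rows


def matmul(a, b):
    """Plain dense matrix product for small illustrative matrices."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def k_step_transitions(adjacency, max_step):
    """Return [A^1, A^2, ..., A^max_step] for the transition matrix A."""
    one_step = transition_matrix(adjacency)
    powers = [one_step]
    for _ in range(max_step - 1):
        powers.append(matmul(powers[-1], one_step))
    return powers


# Toy path graph: node 0 - node 1 - node 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
for step, matrix in enumerate(k_step_transitions(path, max_step=2), start=1):
    print(step, matrix)
```

Entry (i, j) of the k-th matrix is the probability that a random walk starting at node i is at node j after exactly k steps, which is how multi-step structure in a heterogeneous network can reach nodes that share no direct edge.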
