Seq-SymRF: a random forest model predicts potential miRNA-disease associations based on information of sequences and clinical symptoms.
Sci Rep; 10(1): 17901, 2020 10 21.
Article in En | MEDLINE | ID: mdl-33087810
Increasing evidence indicates that miRNAs play a vital role in biological processes and are closely related to various human diseases. Research on miRNA-disease associations supports not only disease prevention, diagnosis and treatment, but also new drug identification and lead compound discovery. A novel sequence- and symptom-based random forest model (Seq-SymRF) was developed to identify potential associations between miRNAs and diseases. Features derived from sequence information were used to characterize miRNAs, and features derived from clinical symptoms were used to characterize diseases. In addition, a clustering method based on Euclidean distance was adopted to construct reliable negative samples. In fivefold cross-validation, Seq-SymRF achieved an accuracy of 98.00%, a specificity of 99.43%, a sensitivity of 96.58%, a precision of 99.40% and a Matthews correlation coefficient of 0.9604. The areas under the receiver operating characteristic curve and the precision-recall curve were 0.9967 and 0.9975, respectively. Case studies were also carried out for leukemia, breast neoplasms and hsa-mir-21. Most of the top-25 predicted disease-related miRNAs (19/25 for leukemia; 20/25 for breast neoplasms) and 15 of the top-25 predicted miRNA-related diseases were verified against the literature and the dbDEMC database. It is anticipated that Seq-SymRF could serve as a powerful high-throughput virtual screening tool for drug research and development. All source code can be downloaded from https://github.com/LeeKamlong/Seq-SymRF .
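The five threshold-based figures quoted in the abstract (accuracy, specificity, sensitivity, precision, Matthews correlation coefficient) are all functions of the four confusion-matrix counts. The following is a minimal sketch of those standard definitions, not code from the Seq-SymRF repository. The function name `classification_metrics` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the paper's data.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present and at least one sample is predicted
    positive, so that no ratio below divides by zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = classification_metrics(tp=45, fp=2, tn=48, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When the positive and negative sets are the same size, accuracy equals the mean of sensitivity and specificity. The reported values fit that pattern, since (96.58% + 99.43%) / 2 is about 98.00%, which is consistent with, though not proof of, a roughly balanced evaluation set.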
Collection: 01-internacional
Database: MEDLINE
Main subject: Breast Neoplasms / Leukemia / Computational Biology / Genetic Predisposition to Disease / MicroRNAs / Genetic Association Studies / High-Throughput Screening Assays
Type of study: Clinical_trials / Diagnostic_studies / Prognostic_studies / Risk_factors_studies
Limits: Female / Humans / Male
Language: En
Journal: Sci Rep
Year: 2020
Document type: Article
Country of publication: United Kingdom