ABSTRACT
Identifying human disease-related microRNAs (miRNAs) is important for understanding the pathogenesis of diseases, but experimental identification is costly and time-consuming. Computational prediction of disease-related miRNA candidates is a valuable complement to experimental studies, and an effective prediction method is needed to provide reliable candidates for subsequent biological experiments. In this study, we constructed a miRNA functional similarity network by calculating the functional similarity between each pair of miRNAs. Based on this network, we present a new method (DismiPred) for predicting disease-related miRNA candidates. The method incorporates functional similarity and common association information to achieve efficient prediction performance. DismiPred recovered experimentally validated disease-related miRNAs for 12 common human diseases, with an F-measure ranging from 69.49% to 91.69%. Furthermore, a case study of breast neoplasms showed that DismiPred could uncover novel disease-related miRNAs. DismiPred is useful for further experimental studies on the involvement of miRNAs in the pathogenesis of diseases.
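The abstract reports an F-measure but does not define it. The sketch below assumes the common balanced form (F1, the harmonic mean of precision and recall); the function name `f_measure` and the counts in the usage line are illustrative and are not taken from the study.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from raw counts.

    tp: predicted disease-related miRNAs that are experimentally validated
    fp: predicted miRNAs that are not validated
    fn: validated miRNAs that the method missed
    Assumption: the balanced F1 form; the paper may weight precision and
    recall differently.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
score = f_measure(tp=8, fp=2, fn=2)
```

With equal precision and recall, as in the hypothetical counts above, the F-measure equals that shared value.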
Subject(s)
Computational Biology/methods, MicroRNAs/genetics, Algorithms, Breast Neoplasms/genetics, Breast Neoplasms/metabolism, Female, Gene Expression Regulation, Gene Regulatory Networks, Humans, Male, MicroRNAs/metabolism, Reproducibility of Results, Software
ABSTRACT
To classify real and pseudo human precursor microRNA (pre-miRNA) hairpins with ab initio methods, numerous features are extracted from the primary sequence and secondary structure of pre-miRNAs. However, these include some redundant and useless features. Selecting the most representative feature subset is essential for improving classification accuracy. We propose a novel feature selection method based on a genetic algorithm, according to the characteristics of human pre-miRNAs. The method considers the information gain of a feature, the feature conservation relative to the stem parts of the pre-miRNA, and the redundancy among features. Feature conservation is introduced here for the first time. Experimental results were validated by cross-validation on datasets composed of real and pseudo human pre-miRNAs. Compared with microPred, our classifier, miPredGA, achieved more reliable sensitivity and specificity, and accuracy improved by nearly 12%. The feature selection algorithm is useful for constructing more efficient classifiers for identifying real human pre-miRNAs among pseudo hairpins.
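Of the three criteria named above, information gain has a standard definition that can be sketched briefly. The code below assumes discrete feature values and binary real/pseudo labels; the function names and toy data are hypothetical, and the sketch does not reproduce the genetic algorithm, the conservation score, or the redundancy term used by miPredGA.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature.

    Assumption: the feature is already discrete; a continuous feature
    would need to be binned first.
    """
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (hypothetical): 1 = real pre-miRNA, 0 = pseudo hairpin.
labels = [1, 1, 0, 0]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # carries no class information
gain_hi = information_gain(informative, labels)
gain_lo = information_gain(uninformative, labels)
```

A feature that separates the two classes perfectly recovers the full label entropy, while one that is independent of the labels yields no gain, which is why such a score can help rank candidate features before or during subset selection.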