ABSTRACT
Recent medical research has demonstrated the roles of microRNAs (miRNAs) in a variety of cellular mechanisms, lending credence to the association between miRNA dysregulation and multiple diseases. Understanding how miRNAs act is critical for developing effective diagnostic and therapeutic strategies. Among these mechanisms, miRNA-mRNA interactions are the most important to understand, yet their experimental validation is constrained. Accordingly, several computational models have been developed to predict miRNA-mRNA interactions, but they suffer from limited predictive capability, poor characterisation of the interactions, and low usability. To address these drawbacks, we developed PRIMITI, a PRedictive model for the Identification of novel miRNA-Target mRNA Interactions. PRIMITI is a novel machine learning model that utilises CLIP-seq and expression data to characterise functional target sites in 3'-untranslated regions (3'-UTRs) and to predict miRNA-target mRNA repression activity. The model was trained with a reliable negative sample selection approach and the robust extreme gradient boosting (XGBoost) algorithm, coupled with newly introduced features, including sequence and genetic variation information. PRIMITI achieved an area under the receiver operating characteristic (ROC) curve (AUC) of up to 0.96 for predicting functional miRNA-target site binding and 0.96 for predicting miRNA-target mRNA repression activity, on cross-validation and an independent blind test. Additionally, the model outperformed state-of-the-art methods in recovering miRNA-target repressions in an unseen microarray dataset and in a collection of validated miRNA-mRNA interactions, highlighting its utility for preliminary screening. PRIMITI is available on a reliable, scalable, and user-friendly web server at https://biosig.lab.uq.edu.au/primiti.
ABSTRACT
The emergence of high-throughput sequencing techniques has revealed a primary role of microRNAs (miRNAs) in a wide range of diseases, including cancers and neurodegenerative disorders. Understanding novel relationships between miRNAs and diseases can potentially unveil complex pathogenesis mechanisms, leading to effective diagnosis and treatment. Investigating novel miRNA-disease associations, however, is currently costly and time-consuming. Over the years, several computational models have been proposed to prioritize potential miRNA-disease associations, but with limited usability or predictive capability. To fill this gap, we introduce TSMDA, a novel machine-learning method that leverages target and symptom information and negative sample selection to predict miRNA-disease associations. TSMDA significantly outperforms similar methods, achieving an area under the receiver operating characteristic (ROC) curve (AUC) of 0.989 and 0.982 under 5-fold cross-validation and blind test, respectively. We also demonstrate the capability of the method to uncover potential miRNA-disease associations in case studies of breast, prostate, and lung cancers. We believe TSMDA will be an invaluable tool for the community to explore and prioritize potentially new miRNA-disease associations for further experimental characterization. The method is available as a freely accessible and user-friendly web interface at http://biosig.unimelb.edu.au/tsmda/.
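For readers unfamiliar with the metric reported above, the AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below illustrates that definition only; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from PRIMITI or TSMDA.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive is scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data: 1 = true association, 0 = negative sample.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of examples, so it is suited to illustration rather than to large datasets.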