ABSTRACT
BACKGROUND: Colorectal cancer (CRC) is a heterogeneous disease whose treatment depends on the anatomical site, distinguishing between colon, rectum, and rectosigmoid junction cancer. This study aimed to identify diagnostic and prognostic biomarkers using networks of CRC-associated transcripts built on competing endogenous RNAs (ceRNAs). METHODS: RNA expression and clinical data of patients with colon, rectum, and rectosigmoid junction cancer were obtained from The Cancer Genome Atlas (TCGA). The RNA expression profiles were assessed through bioinformatics analysis, and a ceRNA network was constructed for each CRC site. A functional enrichment analysis was performed to assess the functional roles of the ceRNA networks in the prognosis of colon, rectum, and rectosigmoid junction cancer. Finally, an overall survival analysis was performed to verify the impact of the ceRNAs on prognosis. RESULTS: The study identified several CRC site-specific prognostic biomarkers: hsa-miR-1271-5p, NRG1, hsa-miR-130a-3p, SNHG16, and hsa-miR-495-3p in the colon; E2F8 in the rectum; and DMD and hsa-miR-130b-3p in the rectosigmoid junction. We also identified distinct biological pathways that highlight differences in CRC behavior across anatomical sites, reinforcing the importance of correctly identifying the tumor site. CONCLUSIONS: Several potential prognostic markers for colon, rectum, and rectosigmoid junction cancer were found. ceRNA networks could provide a better understanding of the differences and commonalities in the prognosis of colon, rectum, and rectosigmoid junction cancer.
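The overall survival step mentioned in METHODS can be illustrated with a Kaplan-Meier estimate. The abstract does not say which estimator, software, or grouping rule the authors used, so the following is only a minimal sketch under assumptions: the function name `kaplan_meier`, the median-expression split, and all patient values are hypothetical and chosen for illustration.

```python
from collections import Counter
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (any consistent unit)
    events : 1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # patients leaving the risk set at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1 - d / at_risk  # product-limit update
            curve.append((t, surv))
        at_risk -= exits[t]          # remove deaths and censored patients
    return curve


# Hypothetical data: expression of a candidate marker, follow-up time, event flag.
expression = [2.1, 7.4, 3.3, 9.0, 1.2, 8.1]
times = [30, 12, 45, 8, 60, 15]
events = [0, 1, 1, 1, 0, 1]

# Assumed grouping rule: split patients at the median expression value.
cutoff = median(expression)
high = [i for i, x in enumerate(expression) if x > cutoff]
low = [i for i, x in enumerate(expression) if x <= cutoff]

curve_high = kaplan_meier([times[i] for i in high], [events[i] for i in high])
curve_low = kaplan_meier([times[i] for i in low], [events[i] for i in low])
```

Comparing `curve_high` with `curve_low` gives a first impression of whether a marker is associated with survival; a formal comparison would normally add a log-rank test, which is omitted here.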
ABSTRACT
Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them is a large and hard-to-predict class of long ncRNAs (lncRNAs), the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of lincRNAs, biological knowledge about them is still lacking, and the few existing computational methods are so specific that they cannot be successfully applied to species other than those for which they were originally designed. Prediction of lncRNAs has been performed with machine learning techniques; for lincRNA prediction in particular, supervised learning methods have been explored in recent literature. As far as we know, there are no methods or workflows specifically designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in plants that combines known bioinformatics tools with a machine learning technique, here a support vector machine (SVM). We discuss two case studies that allowed us to identify novel lincRNAs in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to pathogenic and beneficial microorganisms.
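To make the machine learning step more concrete, the sketch below shows one common way to turn a transcript sequence into a fixed-length numeric vector that a supervised classifier such as an SVM could consume. The abstract does not state which features the workflow uses, so k-mer frequencies are an assumption here, and the function name `kmer_frequencies` and the example sequences are hypothetical.

```python
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return the relative frequency of every DNA k-mer in seq.

    The vector has 4**k entries in lexicographic k-mer order (AAA, AAC, ...).
    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


# Hypothetical candidate transcripts (illustration only, not real lincRNAs).
candidates = {
    "candidate_1": "ATGCGTACGTTAGCATGCGTACGATCG",
    "candidate_2": "UUUAGGCAUUUAGGCAUUUAGGCA",
}

# One feature vector per transcript; each vector has 4**3 = 64 entries.
features = {name: kmer_frequencies(seq, k=3) for name, seq in candidates.items()}
```

Vectors of this kind, paired with labels for known lincRNAs and known protein-coding transcripts, are the sort of input an SVM implementation would be trained on; the choice of `k` trades vector size against sequence detail.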