ABSTRACT
BACKGROUND: Esophageal cancer (EC) remains a global health challenge; it is often diagnosed at an advanced stage, which contributes to high mortality. Current diagnostic tools for EC have limited efficacy. This study explores microRNAs (miRNAs) as novel, noninvasive diagnostic biomarkers for EC. Our objective was to determine the diagnostic accuracy of miRNAs, specifically how well EC-associated miRNAs can be distinguished from control miRNAs. METHODS: We applied machine learning (ML) techniques in WEKA (Waikato Environment for Knowledge Analysis) and TensorFlow Keras to a dataset of miRNA sequences and gene targets, assessing the predictive power of several classifiers: naïve Bayes, multilayer perceptron, Hoeffding tree, random forest, and random tree. InfoGain feature selection was applied to identify the most informative miRNA sequence and gene target descriptors. The ability of the ML models to distinguish miRNAs implicated in EC from control miRNAs was then tested. RESULTS: Of the tested WEKA classifiers, the three best performers were random forest, Hoeffding tree, and naïve Bayes. The TensorFlow Keras neural network model was subsequently trained and tested, and its predictive power was further validated on an independent dataset. The TensorFlow Keras model achieved an accuracy of 0.91. The best-performing WEKA model (naïve Bayes) yielded an accuracy of 0.94. CONCLUSIONS: The results demonstrate the potential of ML-based miRNA classifiers for diagnosing EC. However, further studies are needed to validate these findings and to explore the full clinical potential of this approach.
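For readers unfamiliar with information-gain feature selection, the following is a minimal Python sketch of the general idea: each descriptor is scored by how much it reduces the entropy of the class label, and descriptors are then ranked by that score. It is not the study's WEKA pipeline. The descriptor names (`has_motif_a`, `targets_gene_x`) and the four toy rows are invented purely for illustration and do not come from the study's dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def info_gain(feature_values, labels):
    """Information gain of a discrete feature: H(Y) - H(Y | X)."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        # Class labels of the samples that share this feature value.
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature_name, gain) pairs, most informative first."""
    names = rows[0].keys()
    gains = {
        name: info_gain([row[name] for row in rows], labels) for name in names
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy, invented data: each row is one miRNA described by two
# hypothetical binary descriptors (assumptions, not study data).
rows = [
    {"has_motif_a": 1, "targets_gene_x": 1},
    {"has_motif_a": 1, "targets_gene_x": 0},
    {"has_motif_a": 0, "targets_gene_x": 1},
    {"has_motif_a": 0, "targets_gene_x": 0},
]
labels = ["EC", "EC", "control", "control"]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy case the first descriptor separates the two classes perfectly and so receives the maximum gain of one bit, while the second carries no information about the label. In practice, the top-ranked descriptors would then be passed to a classifier such as naïve Bayes.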