ABSTRACT
Dengue fever, prevalent in Southeast Asian countries, currently lacks effective pharmaceutical interventions for controlling viral replication. This study combines machine learning (ML)-based quantitative structure-activity relationship (QSAR) modeling, molecular docking, and molecular dynamics simulations to discover potential inhibitors of the dengue virus NS3 protease. We used nine molecular fingerprints from PaDEL to extract features from the dengue virus type 2 NS3 protease dataset in the ChEMBL database. Features were selected with a low-variance threshold, the F-score, and recursive feature elimination (RFE). Three ML models were used for classifier development: support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost). The SVM model combined with SVM-RFE achieved the best accuracy (0.866) and ROC AUC (0.964) on the test set. We identified potent inhibitors on the basis of the optimal classifier's predicted probabilities and the docking binding affinities. SHAP and LIME analyses highlighted the molecular fingerprint bits (e.g., ExtFP69, ExtFP362, ExtFP576) that contribute most to NS3 protease inhibitory activity. Molecular dynamics simulations indicated that amphotericin B had the strongest binding energy (-212 kJ/mol) and formed a hydrogen bond with the critical residue Ser196. This approach improves NS3 protease inhibitor identification and expedites the discovery of dengue therapeutics.
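The first two feature-selection steps named above (low-variance filtering followed by F-score ranking) can be illustrated with a minimal sketch. It is not the study's implementation: the toy fingerprint matrix, the labels, the 0.05 variance threshold, and the function names `low_variance_filter`, `f_score`, and `select_features` are all assumptions made for illustration, and the RFE step is omitted. In practice, scikit-learn's `VarianceThreshold` and `f_classif` provide the same two operations.

```python
from statistics import mean, pvariance


def low_variance_filter(X, threshold=0.05):
    """Return indices of columns whose population variance exceeds threshold.

    The threshold value is an illustrative assumption, not the study's setting.
    """
    n_cols = len(X[0])
    return [j for j in range(n_cols)
            if pvariance([row[j] for row in X]) > threshold]


def f_score(column, y):
    """One-way ANOVA F statistic of one feature across the class labels."""
    groups = {}
    for value, label in zip(column, y):
        groups.setdefault(label, []).append(value)
    grand = mean(column)
    n, k = len(column), len(groups)
    # Between-class mean square: spread of class means around the grand mean.
    between = sum(len(g) * (mean(g) - grand) ** 2
                  for g in groups.values()) / (k - 1)
    # Within-class mean square: spread of samples around their class mean.
    within = sum((v - mean(g)) ** 2
                 for g in groups.values() for v in g) / (n - k)
    return float("inf") if within == 0 else between / within


def select_features(X, y, threshold=0.05, top_k=2):
    """Drop near-constant bits, then keep the top_k bits ranked by F-score."""
    kept = low_variance_filter(X, threshold)
    ranked = sorted(kept,
                    key=lambda j: f_score([row[j] for row in X], y),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical toy data: 8 molecules x 4 fingerprint bits.
# y = 1 marks an active compound, y = 0 an inactive one.
X = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

print(select_features(X, y, top_k=2))
```

Bit 0 is constant across all molecules and is removed by the variance filter; the remaining bits are ranked by how well they separate actives from inactives. A wrapper method such as RFE would then refine this subset using the classifier itself.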