Predictive modeling for breast cancer classification in the context of Bangladeshi patients by use of machine learning approach with explainable AI.

Islam, Taminul; Sheakh, Md Alif; Tahosin, Mst Sazia; Hena, Most Hasna; Akash, Shopnil; Bin Jardan, Yousef A; FentahunWondmie, Gezahign; Nafidi, Hiba-Allah; Bourhia, Mohammed

Affiliation

Islam T; School of Computing, Southern Illinois University Carbondale, Carbondale, IL, USA.
Sheakh MA; Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh.
Tahosin MS; Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh.
Hena MH; Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh.
Akash S; Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh.
Bin Jardan YA; Department of Pharmaceutics, College of Pharmacy, King Saud University, P.O. Box 11451, Riyadh, Saudi Arabia.
FentahunWondmie G; Department of Biology, Bahir Dar University, P.O. Box 79, Bahir Dar, Ethiopia. resercherfent@gmail.com.
Nafidi HA; Department of Food Science, Faculty of Agricultural and Food Sciences, Laval University, 2325, Quebec City, QC, G1V 0A6, Canada.
Bourhia M; Laboratory of Biotechnology and Natural Resources Valorization, Ibn Zohr University, 80060, Agadir, Morocco.

Sci Rep; 14(1): 8487, 2024 Apr 11.

Article in English | MEDLINE | ID: mdl-38605059

ABSTRACT

Breast cancer has rapidly increased in prevalence in recent years, making it one of the leading causes of mortality worldwide. Among all cancers, it is by far the most common. Diagnosing it manually requires significant time and expertise, so machine-based predictions can help prevent its further spread. Machine learning and explainable AI are valuable for classification because they not only provide accurate predictions but also show how the model arrives at its decisions, which makes the results easier to understand and trust. In this study, we evaluate and compare the accuracy, precision, recall, and F1 scores of five supervised machine learning methods (decision tree, random forest, logistic regression, naive Bayes, and XGBoost) on a primary dataset of 500 patients from Dhaka Medical College Hospital. We also applied SHAP analysis to the XGBoost model to interpret its predictions and understand the impact of each feature on its output. We compared the classification accuracy of the algorithms with one another and with results reported in the literature. In the final evaluation, XGBoost achieved the best accuracy, 97%.
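The four evaluation measures named in the abstract (accuracy, precision, recall, and F1 score) can all be derived from the counts of a binary confusion matrix. The following is a minimal Python sketch of that arithmetic, not the authors' code: the function `binary_metrics`, the `BinaryMetrics` container, and the ten example labels are hypothetical and are not taken from the study's dataset.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts relative to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical labels for illustration only (1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

m = binary_metrics(y_true, y_pred)
print(m)
```

With these made-up labels there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, so accuracy is 0.8 while precision, recall, and F1 are each 0.75. Reporting all four together matters for a screening task, since accuracy alone can hide missed positive cases.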

Subjects

Breast Neoplasms; Humans; Female; Breast Neoplasms/diagnosis; Bayes Theorem; Bangladesh/epidemiology; Breast; Machine Learning; Hydrolases

Keywords

Breast cancer prediction; Cancer prediction; Explainable AI; Hyperparameter tuning; Machine learning

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Breast Neoplasms Limits: Female / Humans Country/Region as subject: Asia Language: En Year of publication: 2024 Document type: Article
