An integrative machine learning framework for classifying SEER breast cancer.

Manikandan, P; Durga, U; Ponnuraja, C

Affiliation

Manikandan P; Department of Data Science, Loyola College, Chennai, 600 034, India. manimkn89@gmail.com.
Durga U; Department of Data Science, Loyola College, Chennai, 600 034, India.
Ponnuraja C; ICMR-National Institute for Research in Tuberculosis, Chennai, 600 031, India. cponnuraja@gmail.com.

Sci Rep; 13(1): 5362, 2023 04 01.

Article in English | MEDLINE | ID: mdl-37005484

ABSTRACT

Breast cancer is the most common cancer in women worldwide and the leading cause of death among them. The aim of this research is to classify the survival status (alive or dead) of breast cancer patients using the Surveillance, Epidemiology, and End Results (SEER) dataset. Because they can handle very large data sets systematically, machine learning and deep learning have been widely employed in biomedical research to solve diverse classification problems. Pre-processing the data enables its visualization and analysis for use in making important decisions. This research presents a feasible machine learning-based approach for classifying the SEER breast cancer dataset. A two-step feature selection method based on Variance Threshold and Principal Component Analysis was employed to select features from the dataset. After feature selection, the dataset is classified using supervised and ensemble learning techniques: AdaBoost, XGBoost, Gradient Boosting, Naive Bayes and Decision Tree. The performance of these algorithms is evaluated using both a train-test split and k-fold cross-validation. The Decision Tree achieved an accuracy of 98% under both the train-test split and cross-validation. In this study, the Decision Tree algorithm outperforms the other supervised and ensemble learning approaches on the SEER breast cancer dataset.
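
The workflow outlined in the abstract (variance-threshold filtering, PCA, several classifiers, hold-out and k-fold evaluation) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code: the data is a synthetic stand-in generated by `make_classification`, and the threshold, number of components, split ratio, and fold count are all assumptions, since the abstract does not state them. XGBoost is left out so the sketch depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the SEER data (assumption: the real dataset is not used here).
X, y = make_classification(
    n_samples=500, n_features=15, n_informative=6, random_state=0
)

# Classifiers named in the abstract, minus XGBoost (omitted to avoid an extra dependency).
classifiers = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "NaiveBayes": GaussianNB(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
}

# Hold-out split; the 80/20 ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

results = {}
for name, clf in classifiers.items():
    # Step 1: drop zero-variance features. Step 2: project onto 5 principal components.
    # Both hyperparameters are illustrative, not taken from the paper.
    model = make_pipeline(VarianceThreshold(threshold=0.0), PCA(n_components=5), clf)

    # Evaluation 1: fit on the training split, score on the held-out split.
    model.fit(X_train, y_train)
    holdout_acc = model.score(X_test, y_test)

    # Evaluation 2: 5-fold cross-validation; the pipeline is refit inside each fold.
    cv_acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

    results[name] = (holdout_acc, cv_acc)
    print(f"{name}: hold-out={holdout_acc:.3f}, 5-fold CV={cv_acc:.3f}")
```

Wrapping the two feature steps and the classifier in one pipeline means the variance filter and PCA are refit on each training fold, so no information from the validation fold leaks into feature selection. The accuracies printed on synthetic data say nothing about the 98% figure reported for SEER.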

Subject(s)

Breast Neoplasms; Humans; Female; Bayes Theorem; Algorithms; Machine Learning; Support Vector Machine

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Breast Neoplasms | Limits: Female / Humans | Language: English | Journal: Sci Rep | Year: 2023 | Document type: Article | Country of affiliation: India | Country of publication: United Kingdom
