ABSTRACT
Background: Breast cancer (BC) is the most common cancer and the fifth leading cause of death among women worldwide. Identifying genes that are unique to particular cancers is of ongoing interest. Patients and Methods: This study aimed to identify genes unique to the five molecular subtypes of BC in women using penalized logistic regression models. For this purpose, microarray data from five independent GEO data sets were combined. The combined data set contains genetic information from 324 women with BC and 12 healthy women. Least absolute shrinkage and selection operator (LASSO) logistic regression and adaptive LASSO logistic regression were used to extract unique genes. The biological processes of the extracted genes were evaluated with the open-source GOnet web application. The models were fitted in R version 3.6.0 with the glmnet package. Results: In total, 119 genes were extracted across 15 pairwise comparisons. Seventeen genes (14%) overlapped between comparative groups. According to GO enrichment analysis, the extracted genes were enriched in biological processes involving negative and positive regulation, and molecular function analysis showed that most genes are involved in kinase and transferase activities. We also identified unique genes for each comparative group and their associated pathways. However, no significant pathway was identified for the genes in the normal-like versus ERBB2 and luminal A, basal versus control, and luminal B versus luminal A comparisons. Conclusion: LASSO logistic regression and adaptive LASSO logistic regression identified unique genes and related pathways for most comparative subgroups of BC. These findings may help clarify the molecular differences between subgroups and could inform further research and future therapeutic approaches.
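The counts reported in the Results can be checked with a short sketch. It assumes that the 15 pairwise comparisons arise from six groups, namely the five molecular subtypes named in the abstract plus the healthy controls. That grouping, the label spellings, and the helper names `pairwise_comparisons` and `overlap_percent` are illustrative assumptions, not part of the study's code.

```python
from itertools import combinations

# Group labels as named in the abstract. Treating the healthy controls as a
# sixth group is an assumption made here to reproduce the reported count.
GROUPS = ["luminal A", "luminal B", "ERBB2", "basal", "normal-like", "control"]


def pairwise_comparisons(groups):
    """Return every unordered pair of groups (one binary model per pair)."""
    return list(combinations(groups, 2))


def overlap_percent(n_shared, n_total):
    """Percentage of extracted genes that overlap between comparisons."""
    return 100.0 * n_shared / n_total


pairs = pairwise_comparisons(GROUPS)
print(len(pairs))                       # 15
print(round(overlap_percent(17, 119)))  # 14
```

Under this assumption, each of the 15 pairs would correspond to one two-class penalized logistic regression fit, which is consistent with the number of comparisons reported.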