HSSG: Identification of Cancer Subtypes Based on Heterogeneity Score of A Single Gene.

Pang, Shanchen; Wu, Wenhao; Zhang, Yuanyuan; Wang, Shudong; Niu, Muyuan; Zhang, Kuijie; Yin, Wenjing

Affiliation

Pang S; College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China.
Wu W; College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China.
Zhang Y; College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China.
Wang S; School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266525, China.
Niu M; College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China.
Zhang K; Normal College, Qingdao University, Qingdao 266071, China.
Yin W; College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China.

Cells; 11(15), 2022 Aug 08.

Article in En | MEDLINE | ID: mdl-35954300

ABSTRACT

Cancer is a highly heterogeneous disease, so even a single cancer type can be further classified into different subtypes according to its pathology. As multi-omics data are now widely used for cancer subtype identification, effective feature selection is essential for identifying subtypes accurately. However, feature selection in existing cancer subtype identification methods has two shortcomings: it cannot select the most helpful features from a biomolecular perspective, and it does not reflect the relationships among the selected features. To solve this problem, we propose HSSG, a feature selection method for identifying cancer subtypes based on the heterogeneity score of a single gene. In the proposed method, a sample-similarity network is constructed for each single gene, and a pseudo-F statistic is used to calculate each gene's heterogeneity score for cancer subtype identification. Finally, we construct gene-gene networks from the genes with higher heterogeneity scores and mine essential genes from these networks. On seven TCGA data sets, HSSG achieves good performance in all three experiments: cancer subtype identification in single-omics data, feature selection in multi-omics data, and the effectiveness and stability of the selected features. This indicates that HSSG can effectively select features for subtype identification.
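
The abstract does not give the formula behind the heterogeneity score, so the following is only an illustrative sketch of how a pseudo-F statistic could score one gene. It assumes the distance-based pseudo-F used in permutational ANOVA, an absolute-difference distance between samples on a single gene, and a provisional grouping of samples supplied by the caller. The function names `single_gene_distance` and `pseudo_f`, the toy expression values, and the labels are all hypothetical and are not taken from the paper.

```python
import numpy as np


def single_gene_distance(expr):
    """Pairwise absolute differences between samples for one gene (assumed distance)."""
    expr = np.asarray(expr, dtype=float)
    return np.abs(expr[:, None] - expr[None, :])


def pseudo_f(dist, labels):
    """Distance-based pseudo-F: among-group over within-group variation."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    n = dist.shape[0]
    groups = np.unique(labels)
    a = len(groups)
    sq = dist ** 2

    # Total sum of squares from all sample pairs (upper triangle only).
    ss_total = np.triu(sq, k=1).sum() / n

    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        idx = np.where(labels == g)[0]
        sub = sq[np.ix_(idx, idx)]
        ss_within += np.triu(sub, k=1).sum() / len(idx)

    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


# Toy data (hypothetical): six samples, two provisional groups.
labels = [0, 0, 0, 1, 1, 1]
gene_a = [1, 2, 3, 11, 12, 13]   # expression separates the two groups
gene_b = [1, 12, 3, 11, 2, 13]   # expression is mixed across the groups

score_a = pseudo_f(single_gene_distance(gene_a), labels)
score_b = pseudo_f(single_gene_distance(gene_b), labels)
print(score_a, score_b)
```

Under these assumptions, a gene whose expression separates the provisional groups receives a higher score than one whose expression is mixed across them, which is the kind of ranking a per-gene heterogeneity score would need in order to select features.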

Subjects

Neoplasms; Gene Regulatory Networks; Humans; Neoplasms/genetics

Keywords

cancer subtypes; heterogeneity; pseudo-F statistic; single gene

Full text: 1 Database: MEDLINE Main subject: Neoplasms Language: En Publication year: 2022 Document type: Article
