ECFS-DEA: an ensemble classifier-based feature selection for differential expression analysis on expression profiles.
Zhao, Xudong; Jiao, Qing; Li, Hangyu; Wu, Yiming; Wang, Hanxu; Huang, Shan; Wang, Guohua.
Affiliation
  • Zhao X; College of Information and Computer Engineering, Northeast Forestry University, No.26 Hexing Road, Harbin, 150040, China.
  • Jiao Q; College of Information and Computer Engineering, Northeast Forestry University, No.26 Hexing Road, Harbin, 150040, China.
  • Li H; College of Information and Computer Engineering, Northeast Forestry University, No.26 Hexing Road, Harbin, 150040, China.
  • Wu Y; College of Information and Computer Engineering, Northeast Forestry University, No.26 Hexing Road, Harbin, 150040, China.
  • Wang H; College of Information and Computer Engineering, Northeast Forestry University, No.26 Hexing Road, Harbin, 150040, China.
  • Huang S; Department of Neurology, The 2nd Affiliated Hospital of Harbin Medical University, No. 246 Xuefu Road, Harbin, 150086, China.
  • Wang G; College of Information and Computer Engineering, Northeast Forestry University, No.26 Hexing Road, Harbin, 150040, China. ghwang@nefu.edu.cn.
BMC Bioinformatics ; 21(1): 43, 2020 Feb 05.
Article in En | MEDLINE | ID: mdl-32024464
ABSTRACT

BACKGROUND:

Methods for differential expression analysis are widely used to identify the features that best distinguish between categories of samples. Multiple hypothesis testing may miss explanatory features, each of which may be composed of individually insignificant variables. Multivariate hypothesis testing remains outside the mainstream because of the large computational overhead of large-scale matrix operations. Random forest provides a classification strategy for calculating variable importance, but it may not suit every sample distribution.

RESULTS:

Building on the idea of an ensemble classifier, we develop a feature selection tool for differential expression analysis on expression profiles (ECFS-DEA for short). To account for differences in sample distribution, a graphical user interface allows the selection of different base classifiers. Inspired by random forest, a common measure applicable to any base classifier is proposed for calculating variable importance. After a feature is interactively selected from the sorted individual variables, a projection heatmap is presented using k-means clustering. An ROC curve is also provided; together, the heatmap and the ROC curve intuitively demonstrate the effectiveness of the selected feature.
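
The abstract does not spell out the importance measure, so the following is only a minimal sketch of one plausible reading: a bagged ensemble over random variable subsets, scored by the out-of-bag accuracy drop when a variable is permuted. The permutation-based measure, the nearest-centroid base classifier, and all names (`fit_centroid`, `ensemble_importance`, `make_data`) are assumptions for illustration and are not taken from ECFS-DEA itself.

```python
import random

def fit_centroid(X, y):
    """Hypothetical base classifier: nearest class centroid.
    Any function returning a predict(row) callable could be swapped in."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(row):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
    return predict

def accuracy(predict, X, y):
    return sum(predict(r) == t for r, t in zip(X, y)) / len(y)

def ensemble_importance(X, y, fit=fit_centroid, n_models=50, n_sub=2, seed=0):
    """Assumed measure: mean out-of-bag accuracy drop after permuting a variable."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    total, used = [0.0] * p, [0] * p
    for _ in range(n_models):
        boot = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]   # out-of-bag samples
        if not oob or len({y[i] for i in boot}) < 2:
            continue
        feats = rng.sample(range(p), n_sub)              # random variable subset
        model = fit([[X[i][j] for j in feats] for i in boot], [y[i] for i in boot])
        X_oob = [[X[i][j] for j in feats] for i in oob]
        y_oob = [y[i] for i in oob]
        base = accuracy(model, X_oob, y_oob)
        for k, j in enumerate(feats):
            col = [r[k] for r in X_oob]
            rng.shuffle(col)                             # break the link to the labels
            X_perm = [r[:k] + [v] + r[k + 1:] for r, v in zip(X_oob, col)]
            total[j] += base - accuracy(model, X_perm, y_oob)
            used[j] += 1
    return [t / u if u else 0.0 for t, u in zip(total, used)]

def make_data(n=60, p=5, seed=1):
    """Synthetic profiles: only variable 0 differs between the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(p)]
        row[0] += 4.0 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_data()
scores = ensemble_importance(X, y)
ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
print(ranking, [round(s, 3) for s in scores])
```

Because the score depends only on a `predict` callable, the base classifier can be replaced without changing the importance calculation, which is the property the abstract attributes to its common measure.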

CONCLUSIONS:

Feature selection through ensemble classifiers helps to select important variables and is therefore applicable to different sample distributions. Experiments on simulated and realistic data demonstrate the effectiveness of ECFS-DEA for differential expression analysis on expression profiles. The software is available at http://bio-nefu.com/resource/ecfs-dea.
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Software / Gene Expression Profiling Study type: Prognostic_studies Language: En Publication year: 2020 Document type: Article