Post-boosting of classification boundary for imbalanced data using geometric mean.
Du, Jie; Vong, Chi-Man; Pun, Chi-Man; Wong, Pak-Kin; Ip, Weng-Fai.
Affiliation
  • Du J; Department of Computer and Information Science, University of Macau, Macau. Electronic address: yb57415@umac.mo.
  • Vong CM; Department of Computer and Information Science, University of Macau, Macau. Electronic address: cmvong@umac.mo.
  • Pun CM; Department of Computer and Information Science, University of Macau, Macau. Electronic address: cmpun@umac.mo.
  • Wong PK; Department of Electromechanical Engineering, University of Macau, Macau. Electronic address: fstpkw@umac.mo.
  • Ip WF; Faculty of Science and Technology, University of Macau, Macau. Electronic address: andyip@umac.mo.
Neural Netw ; 96: 101-114, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28987974
In this paper, a novel imbalance learning method for binary classes is proposed, named Post-Boosting of classification boundary for Imbalanced data (PBI), which can significantly improve the classification boundary of any trained neural network (NN). The PBI procedure consists of two simple steps: an (imbalance-aware) NN learning method is first applied to produce a classification boundary, which PBI then adjusts under the geometric mean (G-mean) criterion. For imbalanced data, the geometric mean of the accuracies on the minority and majority classes is considered, which is statistically more suitable than plain overall accuracy. PBI also has the following advantages over traditional imbalance methods: (i) PBI can significantly improve the classification accuracy on the minority class while improving or maintaining that on the majority class; (ii) PBI is suitable for large datasets, even with imbalance ratios as extreme as 0.001. To evaluate (i), a new metric called the Majority loss/Minority advance Ratio (MMR) is proposed, which measures the accuracy loss on the majority class relative to the accuracy gain on the minority class. Experiments compare PBI against several imbalance learning methods over benchmark datasets of different sizes, imbalance ratios, and dimensionalities. The experimental results show that PBI outperforms the other imbalance learning methods on almost all datasets.
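The two-step procedure described above can be sketched as a threshold sweep over a trained model's scores, keeping the threshold that maximizes G-mean on held-out data. This is an illustrative stand-in under assumed names (`g_mean`, `post_boost_threshold`), not the paper's actual PBI boundary-adjustment algorithm:

```python
import numpy as np

def g_mean(y_true, y_pred):
    """Geometric mean of the per-class accuracies (sensitivity, specificity)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    pos = np.sum(y_true == 1)
    neg = np.sum(y_true == 0)
    sens = tp / pos if pos else 0.0
    spec = tn / neg if neg else 0.0
    return np.sqrt(sens * spec)

def post_boost_threshold(scores, y_true, candidates=None):
    """Sweep a decision threshold over the trained model's scores and keep
    the one maximizing G-mean -- a simplified stand-in for PBI's boundary
    adjustment (the paper's actual procedure differs)."""
    if candidates is None:
        candidates = np.unique(scores)  # every distinct score is a candidate cut
    best_t, best_g = 0.5, -1.0
    for t in candidates:
        g = g_mean(y_true, (scores >= t).astype(int))
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g
```

Because G-mean is zero whenever either class is fully misclassified, the sweep cannot trade the minority class away for majority accuracy, which mirrors the motivation for using G-mean rather than overall accuracy.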
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Statistics as Topic / Neural Networks, Computer / Machine Learning Language: English Year of publication: 2017 Document type: Article