Analysis of Learning Influence of Training Data Selected by Distribution Consistency.
Hwang, Myunggwon; Jeong, Yuna; Sung, Won-Kyung.
Affiliation
  • Hwang M; Intelligent Infrastructure Technology Research Center, Korea Institute of Science and Technology Information (KISTI), Daejeon 34141, Korea.
  • Jeong Y; Department of Data & HPC Science, University of Science and Technology (UST), Daejeon 34113, Korea.
  • Sung WK; Intelligent Infrastructure Technology Research Center, Korea Institute of Science and Technology Information (KISTI), Daejeon 34141, Korea.
Sensors (Basel); 21(4), 2021 Feb 04.
Article in English | MEDLINE | ID: mdl-33557021
ABSTRACT
This study proposes a method for selecting core data that are helpful for machine learning. Specifically, we form a two-dimensional distribution based on the similarity of the training data and divide that distribution into grids of fixed ratios. Within each grid, we select data based on the distribution consistency (DC) of the target class data and examine how the selection affects the classifier. We use CIFAR-10 for the experiments and vary the grid ratio from 0.5 to 0.005. The influence of these variables was analyzed using training sets of different sizes selected by high DC, low DC (the inverse of high DC), and random (no criterion) selection. As a result, average accuracy improved by 0.95 percentage points (±0.65) and 1.54 percentage points (±0.59) for grid ratios of 0.008 and 0.005, respectively. These results indicate improved performance over the existing approach (data distribution search). We confirmed that learning performance improved when the training data were selected with very small grids and high-DC settings.
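The selection procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not define DC, so the code below assumes, purely for illustration, that the DC of a grid cell is the fraction of points in that cell belonging to the target class, and that each target-class sample inherits the score of its cell. The function names `cell_dc_scores` and `select_by_dc`, and the parameter `k` (the number of samples to keep), are hypothetical and not taken from the paper.

```python
import numpy as np

def cell_dc_scores(points, labels, target, grid_ratio):
    """Per-point score: share of target-class points in the point's grid cell.

    ASSUMPTION: this share is only a stand-in for the paper's DC measure.
    points: (n, 2) array of 2-D coordinates; labels: (n,) class labels.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    cells = int(round(1.0 / grid_ratio))        # cells per axis, e.g. 0.5 -> 2
    mins = points.min(axis=0)
    spans = points.max(axis=0) - mins
    spans[spans == 0] = 1.0                     # guard against a degenerate axis
    idx = np.floor((points - mins) / spans * cells).astype(int)
    idx = np.clip(idx, 0, cells - 1)            # the maximum falls in the last cell
    cell_id = idx[:, 0] * cells + idx[:, 1]
    counts_all = np.bincount(cell_id, minlength=cells * cells)
    counts_tgt = np.bincount(cell_id[labels == target], minlength=cells * cells)
    dc = counts_tgt / np.maximum(counts_all, 1)
    return dc[cell_id]

def select_by_dc(points, labels, target, grid_ratio, k, mode="high", seed=0):
    """Return sorted indices of up to k target-class samples.

    mode: "high" (highest score first), "low" (lowest first), or "random".
    """
    labels = np.asarray(labels)
    tgt_idx = np.flatnonzero(labels == target)
    k = min(k, tgt_idx.size)
    if mode == "random":
        rng = np.random.default_rng(seed)
        return np.sort(rng.choice(tgt_idx, size=k, replace=False))
    scores = cell_dc_scores(points, labels, target, grid_ratio)[tgt_idx]
    order = np.argsort(scores, kind="stable")   # ascending by score
    if mode == "high":
        order = order[::-1]
    return np.sort(tgt_idx[order[:k]])
```

For example, `select_by_dc(points, labels, target=0, grid_ratio=0.005, k=1000, mode="high")` would keep up to 1000 samples of class 0 from the cells where that class is most dominant under this stand-in score. Comparing the result against `mode="low"` and `mode="random"` mirrors the three selection conditions named in the abstract.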

Full text: 1 Database: MEDLINE Language: En Publication year: 2021 Document type: Article
