Elastic Knowledge Distillation by Learning From Recollection.
IEEE Trans Neural Netw Learn Syst ; 34(5): 2647-2658, 2023 May.
Article in English | MEDLINE | ID: mdl-34550892
ABSTRACT
Model performance can be further improved with extra guidance beyond the one-hot ground truth. To this end, recently proposed recollection-based methods exploit the valuable information contained in the past training history and derive a "recollection" from it as a data-driven prior to guide training. In this article, we focus on two fundamental aspects of this approach, i.e., recollection construction and recollection utilization. Specifically, to meet the varying demands of models with different capacities and at different training stages, we propose constructing a set of recollections with diverse distributions from the same training history. These recollections then collaborate to provide guidance that adapts to different model capacities, as well as to different training stages, through our similarity-based elastic knowledge distillation (KD) algorithm. Without any external prior to guide training, our method achieves a significant performance gain, outperforms methods of the same category, and even matches KD with a well-trained teacher. Extensive experiments and further analysis demonstrate the effectiveness of our method.
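The abstract does not spell out how the recollections are built or how the similarity-based elastic weighting works, so the following is only a minimal sketch of the general idea in PyTorch. It assumes, for illustration, that each "recollection" is an exponential moving average of past softened predictions kept at a different decay rate, and that the elastic weights come from cosine similarity between the current prediction and each recollection; these specifics are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of recollection-based self-distillation (assumed details, not the
# paper's exact algorithm): several EMA tables of past per-sample predictions act
# as "recollections", and their guidance is mixed with similarity-based weights.
import torch
import torch.nn.functional as F


class RecollectionBank:
    """Keeps several EMA 'recollections' of past per-sample predictions."""

    def __init__(self, num_samples, num_classes, decays=(0.5, 0.9, 0.99)):
        self.decays = decays
        # One probability table per decay rate, initialised to uniform.
        self.banks = [torch.full((num_samples, num_classes), 1.0 / num_classes)
                      for _ in decays]

    @torch.no_grad()
    def update(self, indices, probs):
        """Blend the current softmax outputs into every recollection."""
        for bank, d in zip(self.banks, self.decays):
            bank[indices] = d * bank[indices] + (1.0 - d) * probs.cpu()

    def guidance(self, indices, probs):
        """Combine recollections with similarity-based (elastic) weights."""
        recs = torch.stack([bank[indices].to(probs.device)
                            for bank in self.banks])                  # (R, B, C)
        sims = F.cosine_similarity(recs, probs.unsqueeze(0), dim=-1)  # (R, B)
        weights = F.softmax(sims, dim=0).unsqueeze(-1)                # (R, B, 1)
        return (weights * recs).sum(dim=0)                            # (B, C)


def elastic_kd_loss(logits, targets, bank, indices, temperature=4.0, alpha=0.5):
    """Cross-entropy on the labels plus KL toward the recollection guidance."""
    probs = F.softmax(logits.detach() / temperature, dim=-1)
    guide = bank.guidance(indices, probs)
    bank.update(indices, probs)
    ce = F.cross_entropy(logits, targets)
    kd = F.kl_div(F.log_softmax(logits / temperature, dim=-1), guide,
                  reduction="batchmean") * temperature ** 2
    return (1.0 - alpha) * ce + alpha * kd
```

In this sketch the weighting adapts automatically: early in training, or for a low-capacity model, the current predictions resemble the slowly decaying (smoother) recollections, which therefore receive more weight, while a stronger model later in training draws more guidance from the sharper, fast-decaying ones.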

Full text: 1 Database: MEDLINE Study type: Prognostic_studies Language: En Year of publication: 2023 Document type: Article
