Less confidence, less forgetting: Learning with a humbler teacher in exemplar-free Class-Incremental learning.
Neural Netw; 179: 106513, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39018945
ABSTRACT
Class-Incremental Learning (CIL) is challenging due to catastrophic forgetting (CF), which escalates in exemplar-free scenarios. To mitigate CF, Knowledge Distillation (KD), which leverages old models as teacher models, has been widely employed in CIL. However, our case-study investigation reveals that the teacher model exhibits over-confidence on unseen new samples. In this article, we conduct empirical experiments and provide theoretical analysis to investigate this over-confidence phenomenon and the impact of KD in exemplar-free CIL, where access to old samples is unavailable. Building on our analysis, we propose a novel approach, Learning with Humbler Teacher (LwHT), which systematically selects an appropriate checkpoint model as a humbler teacher to mitigate CF. Furthermore, we explore utilizing the nuclear norm to obtain an appropriate temporal ensemble that enhances model stability. Notably, LwHT outperforms the state-of-the-art approach by significant margins of 10.41%, 6.56%, and 4.31% across various settings while demonstrating superior model plasticity.
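The record does not include the paper's actual objective. As background for the abstract's discussion of teacher over-confidence, here is a minimal sketch of the standard Hinton-style KD loss (softened-softmax KL divergence) with illustrative names and values of my own; the paper's LwHT method builds on this kind of objective but is not reproduced here:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T yields a flatter, 'humbler' distribution."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on T-softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

# An over-confident teacher puts almost all probability mass on one class ...
sharp = softmax([10.0, 0.0, 0.0])
# ... while a higher temperature spreads the mass out, softening the target
# the student is distilled toward.
soft = softmax([10.0, 0.0, 0.0], T=4.0)
```

The `T * T` factor keeps gradient magnitudes comparable across temperatures; the loss is zero when student and teacher logits agree and grows as their softened distributions diverge.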
Full text: 1
Database: MEDLINE
Main subject: Learning
Limit: Humans
Language: En
Publication year: 2024
Document type: Article