Multiscale knowledge distillation with attention based fusion for robust human activity recognition.

Yuan, Zhaohui; Yang, Zhengzhe; Ning, Hao; Tang, Xiangyang

Affiliation

Yuan Z; Department of Software Engineering, School of Software, East China Jiaotong University, No. 808 Shuanggang East Street, Nanchang, 330013, Jiangxi, China. yuanzh@whu.edu.cn.
Yang Z; Department of Software Engineering, School of Software, East China Jiaotong University, No. 808 Shuanggang East Street, Nanchang, 330013, Jiangxi, China. yzzqwq@gmail.com.
Ning H; Department of Software Engineering, School of Software, East China Jiaotong University, No. 808 Shuanggang East Street, Nanchang, 330013, Jiangxi, China.
Tang X; Department of Software Engineering, School of Software, East China Jiaotong University, No. 808 Shuanggang East Street, Nanchang, 330013, Jiangxi, China.

Sci Rep; 14(1): 12411, 2024 05 30.

Article in En | MEDLINE | ID: mdl-38816446

ABSTRACT

Knowledge distillation is an effective approach for training robust multimodal machine learning models when synchronous multimodal data are unavailable. However, traditional knowledge distillation techniques have limitations in comprehensively transferring knowledge across modalities and models. This paper proposes a multiscale knowledge distillation framework to address these limitations. Specifically, we introduce a multiscale semantic graph mapping (SGM) loss function to enable more comprehensive knowledge transfer between teacher and student networks at multiple feature scales. We also design a fusion and tuning (FT) module to fully utilize correlations within and between different data types of the same modality when training teacher networks. Furthermore, we adopt transformer-based backbones to improve feature learning compared to traditional convolutional neural networks. We apply the proposed techniques to multimodal human activity recognition; compared with the baseline method, the proposed approach improves results by 2.31% and 0.29% on the MMAct and UTD-MHAD datasets, respectively. Ablation studies validate the necessity of each component.
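
For readers unfamiliar with the teacher-student setup the abstract builds on, the following is a minimal sketch of a generic temperature-scaled distillation loss. It is not the SGM loss proposed in the paper, whose definition is not given in this record. The function names, the `temperature` and `alpha` parameters, and their default values are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature gives a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    # Generic distillation objective (an assumption, not the paper's SGM loss):
    # a weighted sum of a soft-target term and a hard-label term.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The soft term is scaled by temperature squared so that its gradient
    # magnitude stays comparable across temperatures.
    soft_term = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    # The hard term is the ordinary cross-entropy against the true label.
    hard_term = -math.log(softmax(student_logits)[label] + 1e-12)
    return alpha * soft_term + (1 - alpha) * hard_term
```

In this sketch, `alpha` trades off matching the teacher's softened outputs against fitting the ground-truth label. A multiscale scheme such as the one the abstract describes would presumably add terms computed on intermediate features, which this example does not attempt to reproduce.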

Subjects

Human Activities; Machine Learning; Neural Networks, Computer; Humans; Algorithms; Attention

Keywords

Human activity recognition; Knowledge distillation; Multi-modalities; Self-attention; Transfer learning

Full text: 1 Database: MEDLINE Main subject: Neural Networks, Computer / Machine Learning / Human Activities Limits: Humans Language: En Journal: Sci Rep Publication year: 2024 Document type: Article Country of affiliation: China