A lightweight hybrid vision transformer network for radar-based human activity recognition.
Huan, Sha; Wang, Zhaoyue; Wang, Xiaoqiang; Wu, Limei; Yang, Xiaoxuan; Huang, Hongming; Dai, Gan E.
Affiliation
  • Huan S; School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China.
  • Wang Z; Key Laboratory of On-Chip Communication and Sensor Chip of Guangdong Higher Education Institutes, Guangzhou, 510006, China.
  • Wang X; School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China.
  • Wu L; College of Naval Architecture and Ocean Engineering, Naval University of Engineering, Wuhan, 430033, China. wxq_nue@126.com.
  • Yang X; School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China.
  • Huang H; School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China.
  • Dai GE; School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China.
Sci Rep; 13(1): 17996, 2023 Oct 21.
Article in English | MEDLINE | ID: mdl-37865672
ABSTRACT
Radar-based human activity recognition (HAR) is a non-contact technique that preserves privacy and is robust to lighting conditions, which suits it to many advanced applications. Complex deep neural networks perform well when classifying radar micro-Doppler signals, which correspond uniquely to human behaviors. In embedded applications, however, the demand for lightweight, low-latency models makes radar-based HAR networks difficult to design. This paper proposes an efficient network based on a lightweight hybrid Vision Transformer (LH-ViT) that targets HAR accuracy and a small network footprint at the same time. The network combines efficient convolution operations with the self-attention mechanism of ViT. A Feature Pyramid architecture extracts multi-scale features from the micro-Doppler map. Stacked Radar-ViT modules then enhance these features; fold and unfold operations are added to these modules to lower the computational load of the attention mechanism. The convolution operator in the LH-ViT is replaced by the RES-SE block, an efficient structure that combines the residual learning framework with the Squeeze-and-Excitation network. Experiments on two human activity datasets indicate that the method has advantages over traditional methods in expressiveness and computing efficiency.
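The RES-SE idea mentioned in the abstract (residual learning combined with Squeeze-and-Excitation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the block layout, so the code below assumes the generic Squeeze-and-Excitation recipe (global average pooling, a two-layer bottleneck with ReLU and sigmoid, channel-wise rescaling) wrapped in an identity shortcut, and it leaves out the convolution layers a real block would contain. The function names, the reduction ratio `r`, and the tensor sizes are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def squeeze_excite(x, w1, w2):
    """Generic Squeeze-and-Excitation on one feature map.

    x  : array of shape (C, H, W)
    w1 : bottleneck weights of shape (C // r, C)   (assumed layout)
    w2 : expansion weights of shape (C, C // r)    (assumed layout)
    """
    z = x.mean(axis=(1, 2))          # squeeze: one average per channel, shape (C,)
    s = np.maximum(w1 @ z, 0.0)      # bottleneck with ReLU, shape (C // r,)
    g = sigmoid(w2 @ s)              # per-channel gates in (0, 1), shape (C,)
    return x * g[:, None, None]      # excite: rescale every channel by its gate


def res_se_block(x, w1, w2):
    """Identity shortcut around the SE branch (convolutions omitted here)."""
    return x + squeeze_excite(x, w1, w2)


# Hypothetical sizes: 8 channels, a 16x16 map, reduction ratio 4.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

y = res_se_block(x, w1, w2)
print(y.shape)
```

Because each gate lies strictly between 0 and 1, every output element in this simplified block equals the input scaled by a factor between 1 and 2; the shortcut guarantees the input passes through even when a channel is gated close to zero.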
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Radar / Accidental Injuries Limit: Humans Language: En Journal: Sci Rep Year: 2023 Document type: Article Affiliation country: China
