Self-Supervised Motion Perception for Spatiotemporal Representation Learning.
IEEE Trans Neural Netw Learn Syst ; 34(12): 9832-9846, 2023 Dec.
Article in En | MEDLINE | ID: mdl-35358053
ABSTRACT
In this study, we propose a novel pretext task and a self-supervised motion perception (SMP) method for spatiotemporal representation learning. The pretext task is defined as video playback rate perception, which uses temporal dilated sampling to augment each video clip into multiple copies at different temporal resolutions. The SMP method is built upon discriminative and generative motion perception models, which collaboratively capture representations of motion dynamics and appearance from video clips at multiple temporal resolutions. To enhance this collaboration, we further propose difference and convolution motion attention (MA), which drives the generative model to focus on motion-related appearance, and leverage multiple granularity perception (MG) to extract accurate motion dynamics. Extensive experiments demonstrate SMP's effectiveness for video motion perception and the state-of-the-art performance of the resulting self-supervised representation models on target tasks, including action recognition and video retrieval. Code for SMP is available at github.com/yuanyao366/SMP.
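The temporal dilated sampling behind the playback rate pretext task can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); the function name `dilated_clips`, the clip length of 16 frames, and the rates (1, 2, 4, 8) are assumptions chosen for illustration. Each rate yields one clip sampled with that frame stride, and the index of the rate serves as the classification label a discriminative model would predict.

```python
import numpy as np

def dilated_clips(video, clip_len=16, rates=(1, 2, 4, 8), rng=None):
    """Sample one clip per playback rate from a (T, H, W, C) video array.

    Returns a list of (clip, label) pairs, where label is the index of the
    rate in `rates`. All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames = video.shape[0]
    samples = []
    for label, rate in enumerate(rates):
        # Number of source frames spanned by one clip at this stride.
        span = (clip_len - 1) * rate + 1
        if span > num_frames:
            continue  # video too short for this playback rate
        start = int(rng.integers(0, num_frames - span + 1))
        frame_idx = start + rate * np.arange(clip_len)
        samples.append((video[frame_idx], label))
    return samples
```

For a 200-frame video, this returns four clips of 16 frames each, with consecutive frames 1, 2, 4, and 8 source frames apart; shorter videos simply yield fewer rates.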

Full text: 1 Database: MEDLINE Language: En Journal: IEEE Trans Neural Netw Learn Syst Year: 2023 Document type: Article
