UNIMEMnet: Learning long-term motion and appearance dynamics for video prediction with a unified memory network.
Dai, Kuai; Li, Xutao; Luo, Chuyao; Chen, Wuqiao; Ye, Yunming; Feng, Shanshan.
Affiliation
  • Dai K; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.
  • Li X; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China. Electronic address: lixutao@hit.edu.cn.
  • Luo C; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.
  • Chen W; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.
  • Ye Y; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.
  • Feng S; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, Guangdong, China.
Neural Netw; 168: 256-271, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37774512
ABSTRACT
As a pixel-wise dense forecasting task, video prediction is challenging due to its high computational complexity, large future uncertainty, and extremely complicated spatial-temporal patterns. Many deep learning methods have been proposed for the task and have brought significant improvements. However, they focus on modeling short-term spatial-temporal dynamics and fail to sufficiently exploit long-term ones. As a result, these methods tend to deliver unsatisfactory performance when long-term forecasts are required. In this article, we propose a novel unified memory network (UNIMEMnet) for long-term video prediction, which can effectively exploit long-term motion-appearance dynamics and unify short-term and long-term spatial-temporal dynamics in a single architecture. In the UNIMEMnet, a dual-branch multi-scale memory module is carefully designed to extract and preserve long-term spatial-temporal patterns. In addition, a short-term spatial-temporal dynamics module and an alignment and fusion module are devised to capture short-term motion-appearance dynamics and coordinate them with the long-term ones from the designed memory module. Extensive experiments on five video prediction datasets, covering both synthetic and real-world scenarios, validate the effectiveness and superiority of the proposed UNIMEMnet over state-of-the-art methods.
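The abstract describes reading long-term patterns from a memory module and fusing them with short-term features. As a rough illustration of the general idea (not the authors' actual architecture), the following numpy sketch shows a generic attention-based memory read followed by a gated fusion of short-term features with the memory read-out; all function names, the slot count, and the gating scheme here are hypothetical choices for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def memory_read(query, memory):
    """Attention-based read: weight stored memory slots by similarity to the query.

    query:  (B, D) batch of short-term feature vectors
    memory: (S, D) bank of S learned long-term pattern slots (hypothetical)
    returns (B, D) weighted combination of memory slots
    """
    scores = query @ memory.T / np.sqrt(memory.shape[1])  # scaled dot-product, (B, S)
    weights = softmax(scores, axis=-1)                    # rows sum to 1
    return weights @ memory                               # (B, D)

def gated_fusion(short_term, long_term):
    """Blend short-term features with the long-term read-out via an elementwise gate."""
    gate = 1.0 / (1.0 + np.exp(-(short_term + long_term)))  # sigmoid gate in (0, 1)
    return gate * short_term + (1.0 - gate) * long_term

rng = np.random.default_rng(0)
memory = rng.standard_normal((16, 8))   # 16 slots of 8-dim features (illustrative sizes)
query = rng.standard_normal((2, 8))     # batch of 2 short-term feature vectors
readout = memory_read(query, memory)
fused = gated_fusion(query, readout)
print(fused.shape)  # (2, 8)
```

In a real video prediction network the queries and memory slots would be convolutional feature maps learned end to end; the sketch only conveys the read-then-fuse pattern common to memory-augmented predictors.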

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Motion (Physics) Study type: Prognostic_studies / Risk_factors_studies Language: English Journal: Neural Netw Journal subject: NEUROLOGY Year: 2023 Document type: Article Country of affiliation: China
