Holistic Multi-modal Memory Network for Movie Question Answering.

Wang, Anran; Luu, Anh Tuan; Foo, Chuan-Sheng; Zhu, Hongyuan; Tay, Yi; Chandrasekhar, Vijay

IEEE Trans Image Process; 2019 Aug 02.

Article in English | MEDLINE | ID: mdl-31395548

ABSTRACT

Answering questions using multi-modal context is a challenging problem as it requires a deep integration of diverse data sources. Existing approaches consider only a subset of all possible interactions among data sources during one attention hop. In this paper, we present a Holistic Multi-modal Memory Network (HMMN) framework that fully considers interactions between different input sources (multi-modal context, question) at each hop. In addition, to home in on relevant information, our framework takes answer choices into consideration during the context retrieval stage. Our HMMN framework effectively integrates information from the multi-modal context, question, and answer choices, enabling more informative context to be retrieved for question answering. Experimental results on the MovieQA and TVQA datasets validate the effectiveness of our HMMN framework. Extensive ablation studies show the importance of holistic reasoning and reveal the contributions of different attention strategies to model performance.

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Language: English | Journal: IEEE Trans Image Process | Journal subject: Medical Informatics | Year: 2019 | Document type: Article
