Graph-based social relation inference with multi-level conditional attention.
Yu, Xiaotian; Yi, Hanling; Tang, Qie; Huang, Kun; Hu, Wenze; Zhang, Shiliang; Wang, Xiaoyu.
Affiliation
  • Yu X; Department of AI Technology Center, Shenzhen Intellifusion Ltd., China. Electronic address: xiaotianyu.ac@gmail.com.
  • Yi H; Department of AI Technology Center, Shenzhen Intellifusion Ltd., China.
  • Tang Q; Department of AI Technology Center, Shenzhen Intellifusion Ltd., China.
  • Huang K; Department of AI Technology Center, Shenzhen Intellifusion Ltd., China.
  • Hu W; Department of AI Technology Center, Shenzhen Intellifusion Ltd., China.
  • Zhang S; Department of Computer Science, Peking University, China.
  • Wang X; Department of AI Technology Center, Shenzhen Intellifusion Ltd., China; The Chinese University of Hong Kong (Shenzhen), China.
Neural Netw; 173: 106216, 2024 May.
Article in En | MEDLINE | ID: mdl-38442650
ABSTRACT
Social relation inference intrinsically requires high-level semantic understanding. To accurately infer the relations between persons in images, one needs not only to understand the scenes and objects in the images, but also to adaptively attend to important clues. Unlike prior works that classify social relations using attention on detected objects, we propose a MUlti-level Conditional Attention (MUCA) mechanism for social relation inference, which attends to scenes, objects and human interactions conditioned on each person pair. We then develop a transformer-style network to realize the MUCA mechanism. The novel network, named Graph-based Relation Inference Transformer (GRIT), consists of two modules: a Conditional Query Module (CQM) and a Relation Attention Module (RAM). Specifically, we design a graph-based CQM that generates informative relation queries for all person pairs by fusing local features and global context for each person pair. Moreover, RAM takes full advantage of transformer-style networks to apply multi-level attention when classifying social relations. To the best of our knowledge, GRIT is the first method to infer social relations with multi-level conditional attention. GRIT is end-to-end trainable and significantly outperforms existing methods on two benchmark datasets, with performance improvements of 7.8% on PIPA and 9.6% on PISC.
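The idea of attention conditioned on a person pair can be illustrated with a small sketch. This is not the GRIT implementation: the abstract gives no equations, so the fusion rule in `pair_query`, the function names, the toy 4-dimensional features, and the use of plain scaled dot-product attention are all assumptions made for illustration. The sketch also omits the graph structure of the CQM and any learned projections.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pair_query(person_a, person_b, global_context):
    """Hypothetical fusion: average the two person features and the global context.
    The abstract only states that local features and global context are fused."""
    return [(a + b + g) / 3.0 for a, b, g in zip(person_a, person_b, global_context)]

def conditional_attention(query, tokens):
    """Scaled dot-product attention of one pair query over multi-level tokens.
    Returns (attention weights, attended feature)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * t for q, t in zip(query, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    attended = [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
    return weights, attended

# Toy features with made-up values, for illustration only.
person_a = [1.0, 0.0, 0.0, 0.0]
person_b = [0.0, 1.0, 0.0, 0.0]
global_context = [0.5, 0.5, 0.0, 0.0]
tokens = [
    [1.0, 1.0, 0.0, 0.0],  # scene-level token
    [0.0, 0.0, 1.0, 0.0],  # object-level token
    [0.0, 0.0, 0.0, 1.0],  # interaction-level token
]

query = pair_query(person_a, person_b, global_context)
weights, attended = conditional_attention(query, tokens)
```

In this toy setup the scene-level token aligns with the pair query, so it receives the largest weight. A different person pair would yield a different query and therefore a different weighting over the same tokens, which is the sense in which the attention is conditional.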

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Knowledge / Benchmarking | Limits: Humans | Language: En | Journal: Neural Netw / Neural netw / Neural networks | Journal subject: Neurology | Year: 2024 | Document type: Article | Country of publication: United States