Cascaded Parsing of Human-Object Interaction Recognition.

Zhou, Tianfei; Qi, Siyuan; Wang, Wenguan; Shen, Jianbing; Zhu, Song-Chun

IEEE Trans Pattern Anal Mach Intell; 44(6): 2827-2840, Jun 2022.

Article in English | MEDLINE | ID: mdl-33400648

ABSTRACT

This paper addresses the task of detecting and recognizing human-object interactions (HOI) in images. Considering the intrinsic complexity and structural nature of the task, we introduce a cascaded parsing network (CP-HOI) for multi-stage, structured HOI understanding. At each cascade stage, an instance detection module progressively refines HOI proposals and feeds them into a structured interaction reasoning module. Each of the two modules is also connected to its predecessor in the previous stage, enabling efficient cross-stage information propagation. The structured interaction reasoning module is built upon a graph parsing neural network (GPNN), which efficiently models potential HOI structures as graphs and mines rich context for comprehensive relation understanding. In particular, GPNN infers a parse graph that i) interprets meaningful HOI structures through a learnable adjacency matrix, and ii) predicts action (edge) labels. Within an end-to-end, message-passing framework, GPNN blends learning and inference, iteratively parsing HOI structures and reasoning over HOI representations (i.e., instance and relation features). Beyond relation detection at the bounding-box level, our framework is flexible enough to perform fine-grained, pixel-wise relation segmentation; this provides a new glimpse into better relation modeling. A preliminary version of our CP-HOI model reached 1st place in the ICCV 2019 Person in Context Challenge, on both relation detection and segmentation. In addition, CP-HOI shows promising results on two popular HOI recognition benchmarks, V-COCO and HICO-DET.
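
To make the parse-graph idea in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It illustrates the general pattern of inferring a soft adjacency matrix, passing messages weighted by it, and reading out action (edge) labels. The link function (sigmoid of a dot product), the update rule (tanh of an average), the readout, and all names such as `parse_graph` and `action_weights` are assumptions made for illustration; the weights are random rather than learned.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_adjacency(h):
    # Assumed link function: sigmoid of the dot product of two node states.
    # Self-connections are fixed to zero.
    n = len(h)
    return [[0.0 if i == j else sigmoid(dot(h[i], h[j])) for j in range(n)]
            for i in range(n)]


def parse_graph(node_feats, action_weights, num_iters=3):
    """Toy parse-graph inference over human/object nodes.

    Returns a soft adjacency matrix and, for every ordered node pair,
    a probability distribution over (hypothetical) action labels.
    """
    n, dim = len(node_feats), len(node_feats[0])
    h = [list(f) for f in node_feats]
    for _ in range(num_iters):
        adj = soft_adjacency(h)
        new_h = []
        for i in range(n):
            # Message: adjacency-weighted sum of the other nodes' states.
            msg = [sum(adj[i][j] * h[j][k] for j in range(n)) for k in range(dim)]
            # Assumed update rule: tanh of the mean of state and message.
            new_h.append([math.tanh(0.5 * (h[i][k] + msg[k])) for k in range(dim)])
        h = new_h
    adj = soft_adjacency(h)
    edge_labels = {}
    for i in range(n):
        for j in range(n):
            if i != j:
                # Assumed readout: element-wise product as the relation feature.
                rel = [a * b for a, b in zip(h[i], h[j])]
                edge_labels[(i, j)] = softmax([dot(w, rel) for w in action_weights])
    return adj, edge_labels


random.seed(0)
# One human node and two object nodes, each with a 4-d feature (made-up data).
feats = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
# Five hypothetical action classes, one weight vector each.
weights = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
adj, labels = parse_graph(feats, weights)
```

In this sketch the adjacency matrix is recomputed at every iteration, so graph structure and node representations are refined together, which loosely mirrors the abstract's description of blending structure parsing with representation reasoning.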

Subjects

Algorithms; Neural Networks, Computer; Humans; Learning; Visual Perception

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Neural Networks, Computer | Language: En | Publication year: 2022 | Document type: Article