Results 1 - 2 of 2
1.
IEEE Trans Pattern Anal Mach Intell; 44(6): 3069-3081, 2022 Jun.
Article in English | MEDLINE | ID: mdl-33382648

ABSTRACT

Capturing the 'mutual gaze' of people is essential for understanding and interpreting the social interactions between them. To this end, this paper addresses the problem of detecting people Looking At Each Other (LAEO) in video sequences. For this purpose, we propose LAEO-Net++, a new deep CNN for determining LAEO in videos. In contrast to previous works, LAEO-Net++ takes spatio-temporal tracks as input and reasons about the whole track. It consists of three branches, one for each character's tracked head and one for their relative position. Moreover, we introduce two new LAEO datasets: UCO-LAEO and AVA-LAEO. A thorough experimental evaluation demonstrates the ability of LAEO-Net++ to successfully determine if two people are LAEO and the temporal window where it happens. Our model achieves state-of-the-art results on the existing TVHID-LAEO video dataset, significantly outperforming previous approaches. Finally, we apply LAEO-Net++ to a social network, where we automatically infer the social relationship between pairs of people based on the frequency and duration that they LAEO, and show that LAEO can be a useful tool for guided search of human interactions in videos.
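The three-branch design is described here only at the architectural level. Below is a minimal, hypothetical PyTorch sketch of such a network: two weight-shared branches embed each person's tracked head crops over time, a third branch embeds the relative head-position map, and a fusion classifier outputs a LAEO score. All layer sizes, the 10-frame track length, and the class names are illustrative assumptions, not the authors' implementation.

# Hypothetical three-branch LAEO-style network (illustrative sketch, not the authors' code).
import torch
import torch.nn as nn

class HeadTrackBranch(nn.Module):
    # Embeds a temporal stack of head crops of shape (B, 3, T, H, W) with 3D convolutions.
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(16, out_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class LAEONet(nn.Module):
    # Fuses two head-track embeddings with a relative-position embedding.
    def __init__(self):
        super().__init__()
        self.head_branch = HeadTrackBranch()    # shared weights for both people's heads
        self.pos_branch = nn.Sequential(        # embeds a (B, 1, T, H, W) relative-position map
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(8, 64), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(128 * 2 + 64, 64), nn.ReLU(),
            nn.Linear(64, 1),                   # one LAEO logit per track pair
        )

    def forward(self, head_a, head_b, rel_pos):
        feats = torch.cat([self.head_branch(head_a),
                           self.head_branch(head_b),
                           self.pos_branch(rel_pos)], dim=1)
        return torch.sigmoid(self.classifier(feats))

# Usage: two 10-frame head tracks at 64x64 plus a matching relative-position map.
model = LAEONet()
score = model(torch.randn(2, 3, 10, 64, 64),
              torch.randn(2, 3, 10, 64, 64),
              torch.randn(2, 1, 10, 64, 64))
print(score.shape)  # torch.Size([2, 1])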


Subjects
Algorithms, Humans
2.
IEEE Trans Pattern Anal Mach Intell; 38(11): 2327-2334, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27071159

ABSTRACT

Object detection is one of the most important challenges in computer vision. Object detectors are usually trained on bounding-boxes from still images. Recently, video has been used as an alternative source of data. Yet, for a given test domain (image or video), the performance of the detector depends on the domain it was trained on. In this paper, we examine the reasons behind this performance gap. We define and evaluate different domain shift factors: spatial location accuracy, appearance diversity, image quality and aspect distribution. We examine the impact of these factors by comparing performance before and after factoring them out. The results show that all four factors affect the performance of the detectors and their combined effect explains nearly the whole performance gap.
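The analysis reduces to bookkeeping over detection scores: measure the image-vs-video performance gap, then measure how much of it is recovered once a factor is neutralised in the video training data. A minimal Python sketch of that attribution step follows; the function name and every number are placeholders for illustration, and attributing the gap per factor independently is a simplification of the paper's protocol.

# Hypothetical gap-attribution helper (placeholder values, not figures from the paper).
def gap_attribution(map_image, map_video, map_after_fix):
    # Fraction of the image-vs-video mAP gap recovered by neutralising each factor.
    gap = map_image - map_video
    return {factor: (m - map_video) / gap for factor, m in map_after_fix.items()}

shares = gap_attribution(
    map_image=0.46,                 # detector trained on still images (placeholder)
    map_video=0.38,                 # detector trained on video frames (placeholder)
    map_after_fix={
        "spatial location accuracy": 0.41,
        "appearance diversity": 0.40,
        "image quality": 0.39,
        "aspect distribution": 0.39,
    },
)
for factor, share in shares.items():
    print(f"{factor}: {share:.0%} of the gap recovered")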
