Augmented saliency model using automatic 3D head pose detection and learned gaze following in natural scenes.
Vision Res; 116(Pt B): 113-26, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25448115
Previous studies have shown that gaze direction of actors in a scene influences eye movements of passive observers during free-viewing (Castelhano, Wieth, & Henderson, 2007; Borji, Parks, & Itti, 2014). However, no computational model has been proposed to combine bottom-up saliency with actor's head pose and gaze direction for predicting where observers look. Here, we first learn probability maps that predict fixations leaving head regions (gaze following fixations), as well as fixations on head regions (head fixations), both dependent on the actor's head size and pose angle. We then learn a combination of gaze following, head region, and bottom-up saliency maps with a Markov chain composed of head region and non-head region states. This simple structure allows us to inspect the model and make comments about the nature of eye movements originating from heads as opposed to other regions. Here, we assume perfect knowledge of actor head pose direction (from an oracle). The combined model, which we call the Dynamic Weighting of Cues model (DWOC), explains observers' fixations significantly better than each of the constituent components. Finally, in a fully automatic combined model, we replace the oracle head pose direction data with detections from a computer vision model of head pose. Using these (imperfect) automated detections, we again find that the combined model significantly outperforms its individual components. Our work extends the engineering and scientific applications of saliency models and helps better understand mechanisms of visual attention.
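The abstract describes a dynamic weighting of bottom-up saliency, gaze-following, and head-region maps, conditioned on a two-state (head vs. non-head) Markov chain. The sketch below is only a minimal illustration of such state-dependent cue combination, assuming normalized 2D cue maps; the function name, weight values, and map sizes are hypothetical placeholders and do not reproduce the learned DWOC parameters.

import numpy as np

def combine_cues(saliency, gaze_map, head_map, on_head, weights):
    """Combine cue maps with weights that depend on the current state:
    whether the previous fixation landed on a head region or not."""
    w = weights[on_head]
    combined = w["sal"] * saliency + w["gaze"] * gaze_map + w["head"] * head_map
    total = combined.sum()
    # Normalize so the result can be read as a fixation probability map.
    return combined / total if total > 0 else combined

# Placeholder state-dependent weights (illustrative, not the learned values).
weights = {
    True:  {"sal": 0.2, "gaze": 0.6, "head": 0.2},  # previous fixation on a head
    False: {"sal": 0.6, "gaze": 0.1, "head": 0.3},  # previous fixation elsewhere
}

# Toy cue maps of identical spatial size.
h, w = 60, 80
saliency = np.random.rand(h, w)   # bottom-up saliency
gaze_map = np.random.rand(h, w)   # learned gaze-following map
head_map = np.zeros((h, w))
head_map[10:25, 30:45] = 1.0      # detected head region

prediction = combine_cues(saliency, gaze_map, head_map, on_head=True, weights=weights)

The two-state structure mirrors the idea in the abstract that fixations leaving head regions are driven by a different mix of cues than fixations elsewhere; the actual model learns these weights from fixation data rather than fixing them by hand.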
Keywords:
Full text: 1
Database: MEDLINE
Main subject: Pattern Recognition, Visual / Posture / Eye Movements / Fixation, Ocular / Head
Language: En
Publication year: 2015
Document type: Article