Predicting the Category and Attributes of Visual Search Targets Using Deep Gaze Pooling

Hosnieh Sattar, Andreas Bulling, Mario Fritz
2017 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)  
Predicting the target of visual search from human gaze data is a challenging problem. In contrast to previous work that focused on predicting specific instances of search targets, we propose the first approach to predict a target's category and attributes. However, state-of-the-art models for categorical recognition require large amounts of training data, which is prohibitive for gaze data. We thus propose a novel Gaze Pooling Layer that integrates gaze information and CNN-based features by an attention mechanism, incorporating both spatial and temporal aspects of gaze behaviour. We show that our approach can leverage pre-trained CNN architectures, thus eliminating the need for expensive joint data collection of image and gaze data. We demonstrate the effectiveness of our method on a new 14-participant dataset, and indicate directions for future research in the gaze-based prediction of mental states.
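The idea of using gaze as an attention weight over CNN features can be illustrated with a small sketch. This is not the authors' implementation: it assumes that fixations are given as (row, col, duration) tuples already mapped to feature-map coordinates, that duration stands in for the temporal aspect of gaze, and that pooling is a density-weighted sum per channel. The names `fixation_density_map` and `gaze_pool` are hypothetical.

```python
def fixation_density_map(fixations, height, width):
    """Build a duration-weighted, normalized density grid from fixations.

    fixations: list of (row, col, duration) tuples in feature-map
    coordinates (an assumed input format, not taken from the paper).
    """
    grid = [[0.0] * width for _ in range(height)]
    for row, col, duration in fixations:
        # Ignore fixations that fall outside the feature map.
        if 0 <= row < height and 0 <= col < width:
            grid[row][col] += duration
    total = sum(sum(r) for r in grid)
    if total == 0:
        # No valid fixations: fall back to uniform weights,
        # which reduces to plain average pooling.
        uniform = 1.0 / (height * width)
        return [[uniform] * width for _ in range(height)]
    return [[v / total for v in r] for r in grid]


def gaze_pool(feature_maps, density):
    """Pool each channel as a density-weighted sum over spatial locations."""
    pooled = []
    for channel in feature_maps:
        pooled.append(sum(
            channel[i][j] * density[i][j]
            for i in range(len(density))
            for j in range(len(density[0]))
        ))
    return pooled


# Toy example: 2 channels on a 2x2 grid, two fixations of different duration.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
fdm = fixation_density_map([(0, 0, 100), (1, 1, 300)], height=2, width=2)
print(gaze_pool(features, fdm))  # [3.25, 0.75]
```

In this sketch the feature maps could come from any pre-trained CNN, since the gaze weights are applied only at pooling time and no gaze data is needed to train the feature extractor.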
doi:10.1109/iccvw.2017.322 dblp:conf/iccvw/SattarBF17 fatcat:7opkss22bnhxnmwp7cjamwinb4