Functional gaze prediction in egocentric video

Si-Ahmed Naas, Xiaolan Jiang, Stephan Sigg, Yusheng Ji
2020 Proceedings of the 18th International Conference on Advances in Mobile Computing & Multimedia  
Streaming 360° videos to a head-mounted display (HMD) client is challenging due to their high network resource consumption and computational load. This is due to the use of gaze point prediction or image saliency features from the field of view (FoV) since, in real-time scenarios, FoV extraction is computationally demanding. We propose a functional gaze prediction system that addresses these issues by relying on a tiling scheme for gaze prediction. We condition gaze point prediction on virtual reality (VR) content and long short-term memory (LSTM)-encoded eye movement history. Further, we encode image flow and saliency maps of RGB images via VGG16, using a convolutional neural network (CNN). Future gaze points are then predicted using a novel sinusoidal encoding technique. In experiments, our tile-based approach outperforms state-of-the-art FoV-based schemes in terms of computational load and predicted gaze position.
doi:10.1145/3428690.3429174 fatcat:xvcrbjzcnffotjyb3ywztgvm6e
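
Illustrative note: the abstract names a tiling scheme and a sinusoidal encoding but does not specify either. The Python sketch below is only a generic illustration of the two ideas, not the authors' method. It assumes an equirectangular frame split into a hypothetical 8 x 4 tile grid, and it encodes yaw as a (sin, cos) pair so that 359° and 1° sit close together instead of at opposite ends of the range. The function names `gaze_to_tile`, `encode_angle`, and `decode_angle`, and the grid size, are assumptions made for this example.

```python
import math


def gaze_to_tile(yaw_deg, pitch_deg, cols=8, rows=4):
    """Map a gaze direction to a (row, col) tile of an equirectangular frame.

    The 8 x 4 grid is an assumed default, not a value taken from the paper.
    """
    # Wrap yaw into [0, 360) and clamp pitch into [-90, 90].
    yaw = yaw_deg % 360.0
    pitch = max(-90.0, min(90.0, pitch_deg))
    # Clamp indices so boundary values stay inside the grid.
    col = min(int(yaw / 360.0 * cols), cols - 1)
    # Row 0 is the top of the frame (pitch = +90 degrees).
    row = min(int((90.0 - pitch) / 180.0 * rows), rows - 1)
    return row, col


def encode_angle(yaw_deg):
    """Encode an angle as (sin, cos) to remove the 0/360 discontinuity."""
    r = math.radians(yaw_deg)
    return math.sin(r), math.cos(r)


def decode_angle(s, c):
    """Recover an angle in [0, 360) degrees from its (sin, cos) encoding."""
    return math.degrees(math.atan2(s, c)) % 360.0


if __name__ == "__main__":
    print(gaze_to_tile(0.0, 0.0))
    print(decode_angle(*encode_angle(350.0)))
```

With an encoding of this kind, a model can regress the two components and the angle is recovered with `atan2`, which avoids penalising a prediction of 1° when the target is 359°.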