A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; the original URL can also be visited.
The file type is application/pdf.
DAVE: A Deep Audio-Visual Embedding for Dynamic Saliency Prediction
[article]
2020
arXiv
pre-print
This paper studies audio-visual deep saliency prediction. It introduces a conceptually simple and effective Deep Audio-Visual Embedding for dynamic saliency prediction, dubbed "DAVE", in conjunction with our efforts towards building an Audio-Visual Eye-tracking corpus named "AVE". Despite the strong relation between auditory and visual cues in guiding gaze during perception, video saliency models consider only visual cues and neglect the auditory information that is ubiquitous in dynamic scenes.
arXiv:1905.10693v2
fatcat:5tby44imzrcnvflhdfj4rrasie