Excitation Backprop for RNNs
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Grounding decisions made by deep networks has been studied in spatial visual content, giving more insight into model predictions for images. ...
Deep models are state-of-the-art for many vision tasks including video action recognition and video captioning. ...
Acknowledgments We thank Kate Saenko and Vasili Ramanishka for helpful discussions. ...
doi:10.1109/cvpr.2018.00156
dblp:conf/cvpr/BargalZKZMS18
fatcat:oqv3nyo52fbehd3ic3cuvsn5wy
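The record above extends excitation backprop, which redistributes a model's output probability backwards through the network using only excitatory (positive-weight) connections, to recurrent models. Below is a minimal sketch of the rule for a single linear layer; the function name and shapes are my own for illustration, not taken from the paper's code.

```python
import torch

def excitation_backprop_linear(a_in, weight, p_out, eps=1e-9):
    """Propagate winning probabilities p_out back through one linear layer
    (y = weight @ a_in) using the excitation-backprop (positive-weight) rule.

    a_in  : (n_in,)  non-negative input activations
    weight: (n_out, n_in) layer weights
    p_out : (n_out,) probabilities assigned to the output neurons
    """
    w_pos = weight.clamp(min=0)              # keep excitatory connections only
    z = w_pos @ a_in + eps                   # (n_out,) per-output normaliser
    # each output distributes its probability to inputs proportionally to a_i * w_ji^+
    p_in = a_in * (w_pos.t() @ (p_out / z))  # (n_in,)
    return p_in

# toy check: total probability is conserved through the layer
a = torch.rand(8)
W = torch.randn(4, 8)
p = torch.softmax(torch.randn(4), dim=0)
print(excitation_backprop_linear(a, W, p).sum())  # ~1.0
```

Conservation of the total probability across layers is what lets the back-propagated quantities be read as a normalized attention map at the input.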
Bidirectional LSTM with saliency-aware 3D-CNN features for human action recognition
2021
Journal of Engineering Research (Maǧallaẗ al-abḥāṯ al-handasiyyaẗ)
The introduced system can learn long-term temporal dependencies and can predict complex human actions. ...
Existing recurrent-based pipelines fail to capture long-term motion dynamics in videos with various motion scales and complex actions performed by multiple actors. ...
In this way, we obtain the subject saliency information, and the resulting video is known as the saliency-aware video Vs. ...
doi:10.36909/jer.v9i3a.8383
fatcat:55whmd65lfh2zp4tob5gjpspay
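The pipeline described above masks the input video with a saliency map, extracts 3D-CNN features, and classifies the sequence with a bidirectional LSTM. The sketch below shows that overall shape under assumed sizes; the tiny Conv3d backbone stands in for a real 3D-CNN (e.g. C3D/I3D) and is not the paper's architecture.

```python
import torch
import torch.nn as nn

class SaliencyBiLSTM(nn.Module):
    """Illustrative pipeline: saliency-weighted clip -> 3D-CNN features -> BiLSTM."""

    def __init__(self, num_classes, feat_dim=64, hidden=128):
        super().__init__()
        # tiny stand-in for a real 3D-CNN backbone
        self.cnn3d = nn.Sequential(
            nn.Conv3d(3, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),   # keep time, pool space
        )
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, video, saliency):
        # video: (B, 3, T, H, W); saliency: (B, 1, T, H, W) in [0, 1]
        x = video * saliency                       # saliency-aware input
        f = self.cnn3d(x).squeeze(-1).squeeze(-1)  # (B, feat_dim, T)
        f = f.transpose(1, 2)                      # (B, T, feat_dim)
        out, _ = self.bilstm(f)
        return self.head(out[:, -1])               # logits from the last timestep

model = SaliencyBiLSTM(num_classes=10)
v = torch.rand(2, 3, 16, 32, 32)
s = torch.rand(2, 1, 16, 32, 32)
print(model(v, s).shape)  # torch.Size([2, 10])
```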
Excitation Backprop for RNNs
[article]
2018
arXiv
pre-print
Grounding decisions made by deep networks has been studied in spatial visual content, giving more insight into model predictions for images. ...
Deep models are state-of-the-art for many vision tasks including video action recognition and video captioning. ...
Acknowledgments We thank Kate Saenko and Vasili Ramanishka for helpful discussions. ...
arXiv:1711.06778v3
fatcat:io6onint6zbcvc6b5asfazrnee
Human Action Recognition: Pose-Based Attention Draws Focus to Hands
2017
2017 IEEE International Conference on Computer Vision Workshops (ICCVW)
We propose a new spatio-temporal attention based mechanism for human action recognition able to automatically attend to the most important human hands and detect the most discriminative moments in an action ...
Attention is handled in a recurrent manner employing a Recurrent Neural Network (RNN) and is fully differentiable. ...
Yeung et al. report a temporal recurrent attention model for dense labeling of videos [40] . ...
doi:10.1109/iccvw.2017.77
dblp:conf/iccvw/Baradel0M17
fatcat:4or2v2e3njemvfzyztqljuag7a
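The mechanism above computes attention over hand regions recurrently, so focus can shift between hands over time while remaining fully differentiable. The sketch below shows one way such recurrent soft attention over per-hand crop features can be wired; all module names and dimensions are illustrative assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class HandAttentionRNN(nn.Module):
    """Sketch of recurrent soft attention over per-hand features.

    At each step, attention weights over the K hand crops are computed from
    the previous hidden state, so the model can shift focus between hands.
    """

    def __init__(self, feat_dim=128, hidden=256, num_classes=20):
        super().__init__()
        self.score = nn.Linear(hidden + feat_dim, 1)  # attention scorer
        self.rnn = nn.GRUCell(feat_dim, hidden)
        self.head = nn.Linear(hidden, num_classes)
        self.hidden = hidden

    def forward(self, hand_feats):
        # hand_feats: (B, T, K, feat_dim) -- features of K hand crops per frame
        B, T, K, D = hand_feats.shape
        h = hand_feats.new_zeros(B, self.hidden)
        for t in range(T):
            f_t = hand_feats[:, t]                        # (B, K, D)
            q = h.unsqueeze(1).expand(-1, K, -1)          # (B, K, hidden)
            logits = self.score(torch.cat([q, f_t], -1))  # (B, K, 1)
            alpha = torch.softmax(logits, dim=1)          # soft, differentiable
            ctx = (alpha * f_t).sum(dim=1)                # attended feature (B, D)
            h = self.rnn(ctx, h)
        return self.head(h)

model = HandAttentionRNN()
print(model(torch.rand(2, 8, 2, 128)).shape)  # torch.Size([2, 20])
```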
Human Action Recognition: Pose-based Attention draws focus to Hands
[article]
2017
arXiv
pre-print
We propose a new spatio-temporal attention based mechanism for human action recognition able to automatically attend to the hands most involved in the studied action and detect the most discriminative ...
Attention is handled in a recurrent manner employing a Recurrent Neural Network (RNN) and is fully differentiable. ...
Yeung et al. report a temporal recurrent attention model for dense labeling of videos [40] . ...
arXiv:1712.08002v1
fatcat:tua5ck5rhrgofmp6r33iesdrp4
SalSum: Saliency-based Video Summarization using Generative Adversarial Networks
[article]
2020
arXiv
pre-print
The huge amount of video data produced daily by camera-based systems, such as surveillance, medical and telecommunication systems, creates the need for effective video summarization (VS) methods. ...
Several fusion approaches are considered for robustness under uncertainty and for personalization. ...
Fig. 4: Static, Temporal and Final Saliency scores for video #25 of the VSUMM dataset.
Fig. 5: Representative qualitative results. ...
arXiv:2011.10432v1
fatcat:645u5rfa3fcfhpit7d2g4w7xzu
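The entry above fuses static (spatial) and temporal saliency into a final per-frame score used to select summary keyframes (cf. Fig. 4). A toy sketch of one such fusion, a convex combination followed by top-k frame selection, is below; the paper's actual GAN-based pipeline is considerably more involved, so treat this only as the shape of the scoring step.

```python
import numpy as np

def normalize(x, eps=1e-9):
    """Min-max normalise a 1-D score array to [0, 1]."""
    return (x - x.min()) / (x.max() - x.min() + eps)

def fuse_saliency(static, temporal, w=0.5):
    """Convex combination of per-frame static and temporal saliency scores."""
    return w * normalize(static) + (1 - w) * normalize(temporal)

def select_keyframes(final_score, k):
    """Return the indices of the k highest-scoring frames, in temporal order."""
    return np.sort(np.argsort(final_score)[-k:])

scores = fuse_saliency(np.random.rand(100), np.random.rand(100))
print(select_keyframes(scores, k=5))
```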
ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction
[article]
2021
arXiv
pre-print
We propose the ViNet architecture for audio-visual saliency prediction. ViNet is a fully convolutional encoder-decoder architecture. ...
Interestingly, we also observe similar behaviour in the previous state-of-the-art models for audio-visual saliency prediction. ...
UNISAL [13] is a unified image and video saliency prediction model that uses MobileNet to extract spatial features and LSTMs for encoding temporal information. ...
arXiv:2012.06170v3
fatcat:tumxsk7ofrbqrl56msgrhuap7y
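ViNet is described above as a fully convolutional encoder-decoder for video saliency prediction. The sketch below shows the general shape of such a model, a 3D-convolutional encoder and a transposed-convolution decoder producing a per-pixel map; it is a stand-in under assumed layer sizes, not ViNet's actual architecture.

```python
import torch
import torch.nn as nn

class TinyVideoSaliencyNet(nn.Module):
    """Minimal fully convolutional encoder-decoder for video saliency."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=(2, 2, 2), padding=1),
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(64, 32, kernel_size=(1, 4, 4),
                               stride=(1, 2, 2), padding=(0, 1, 1)),
            nn.ReLU(),
            nn.ConvTranspose3d(32, 1, kernel_size=(1, 4, 4),
                               stride=(1, 2, 2), padding=(0, 1, 1)),
        )

    def forward(self, clip):
        # clip: (B, 3, T, H, W)
        z = self.encoder(clip)   # downsampled spatio-temporal features
        s = self.decoder(z)      # upsample back to input resolution
        s = s.mean(dim=2)        # collapse time -> (B, 1, H, W)
        return torch.sigmoid(s)

net = TinyVideoSaliencyNet()
print(net(torch.rand(1, 3, 8, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])
```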
Human activity prediction using saliency-aware motion enhancement and weighted LSTM network
2021
EURASIP Journal on Image and Video Processing
In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-aware motion enhancement (SME) is proposed for video activity prediction. ...
However, predicting human activity earlier in a video is still a challenging task. ...
Acknowledgements Thanks to all those who offered suggestions and guidance for this article.
Authors' contributions All authors contributed to this article. ...
doi:10.1186/s13640-020-00544-0
fatcat:kuhulehob5e7zcsiwyb4gkwk3e
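The framework above weights the recurrent model's input using saliency-aware motion enhancement so that informative frames dominate early activity prediction. A minimal sketch of that idea, per-frame features scaled by a saliency weight before an LSTM, follows; names and dimensions are assumptions, and the paper's SME module is richer than a scalar weight.

```python
import torch
import torch.nn as nn

class WeightedLSTMPredictor(nn.Module):
    """Sketch: per-frame features are scaled by a saliency/motion weight
    before an LSTM predicts the activity from a partially observed video."""

    def __init__(self, feat_dim=256, hidden=256, num_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, feats, weights):
        # feats: (B, T, feat_dim); weights: (B, T) per-frame saliency in [0, 1]
        x = feats * weights.unsqueeze(-1)  # emphasise salient, motion-rich frames
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # predict from the frames seen so far

model = WeightedLSTMPredictor()
feats, w = torch.rand(2, 12, 256), torch.rand(2, 12)
print(model(feats, w).shape)  # torch.Size([2, 10])
```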
Unified Image and Video Saliency Modeling
[article]
2020
arXiv
pre-print
Visual saliency modeling for images and videos is treated as two independent tasks in recent computer vision literature. ...
We identify different sources of domain shift between image and video saliency data and between different video saliency datasets as a key challenge for effective joint modelling. ...
Code The full code for evaluating and training the UNISAL model is available at https://github.com/rdroste/unisal. ...
arXiv:2003.05477v2
fatcat:mghlfa4vbfe6dmc7tf2did4wqi
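The abstract above identifies domain shift between image and video saliency datasets as the key obstacle to joint modelling. One common remedy is to give each data source its own normalisation statistics while sharing all other weights; the sketch below shows that idea with per-domain BatchNorm. Whether this matches UNISAL's exact scheme is not stated in the snippet; the linked repository is authoritative.

```python
import torch
import torch.nn as nn

class DomainAdaptiveBN(nn.Module):
    """One BatchNorm per data source, so each dataset keeps its own feature
    statistics while all other network weights are shared across domains."""

    def __init__(self, channels, num_domains):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(num_domains))

    def forward(self, x, domain):
        # x: (B, C, H, W) from a single domain; domain: int index of the source
        return self.bns[domain](x)

bn = DomainAdaptiveBN(channels=16, num_domains=3)
print(bn(torch.rand(4, 16, 8, 8), domain=1).shape)  # torch.Size([4, 16, 8, 8])
```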
Audio-Visual Temporal Saliency Modeling Validated by fMRI Data
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
In this work we propose an audio-visual model for predicting temporal saliency in videos, that we validate and evaluate in an alternative way by employing fMRI data. ...
its effectiveness and appropriateness in predicting audio-visual saliency for dynamic stimuli. ...
(recurrent vs convolutional). ...
doi:10.1109/cvprw.2018.00269
dblp:conf/cvpr/KoutrasPTM18
fatcat:6hqbdm7dnrbfrjtwmhwbyicvjy
Supersaliency: A Novel Pipeline for Predicting Smooth Pursuit-Based Attention Improves Generalizability of Video Saliency
[article]
2019
arXiv
pre-print
To this end, we (i) use algorithmic and manual annotations of SP and fixations for two well-established video saliency data sets, (ii) train Slicing Convolutional Neural Networks for saliency prediction ...
However, even though most of the available video saliency data sets and models claim to target human observers' fixations, they fail to differentiate them from smooth pursuit (SP), a major eye movement ...
Acknowledgements Supported by the Elite Network Bavaria, funded by the Bavarian State Ministry for Research and Education. ...
arXiv:1801.08925v3
fatcat:dxh2rbrdizgf3kffvepa4qoc5a
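The work above hinges on separating fixations from smooth pursuit (SP) in eye-tracking data. As a rough illustration of why the distinction is tractable at all, the sketch below labels gaze samples by speed bands; the thresholds are illustrative assumptions, and the paper relies on far more careful algorithmic and manual annotation.

```python
import numpy as np

def label_eye_samples(x, y, fs, fix_max=2.0, sacc_min=30.0):
    """Crude speed-band labelling of gaze samples (deg) sampled at fs Hz:
    below fix_max deg/s -> fixation, above sacc_min deg/s -> saccade, and
    the band in between -> smooth-pursuit candidate."""
    vx, vy = np.gradient(x) * fs, np.gradient(y) * fs
    speed = np.hypot(vx, vy)                 # gaze speed in deg/s
    labels = np.full(speed.shape, "pursuit", dtype=object)
    labels[speed < fix_max] = "fixation"
    labels[speed > sacc_min] = "saccade"
    return labels

t = np.linspace(0, 1, 250)
print(label_eye_samples(np.sin(t) * 5, t * 3, fs=250)[:5])
```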
Learning brain dynamics for decoding and predicting individual differences
[article]
2021
bioRxiv
pre-print
To decode brain dynamics, we propose an architecture based on recurrent neural networks to uncover distributed spatiotemporal signatures. ...
We believe our approach provides a powerful framework for visualizing, analyzing, and discovering dynamic spatially distributed brain representations during naturalistic conditions. ...
The evolution of saliency for the Home Alone clip was somewhat similar (see also the saliency map video for the Brokovich clip in S1 Video). ...
Predicting behavior
Recent studies have employed ...
doi:10.1101/2021.03.27.437315
fatcat:iapz4oqy75awpl3tgvfcxb4zfq
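The architecture above applies recurrent networks to distributed brain signals. A minimal sketch of a GRU decoder over per-region fMRI time series is below; the region count, hidden size, and classification target are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class BrainDynamicsDecoder(nn.Module):
    """Sketch: a GRU reads a sequence of ROI activations and decodes a label
    (e.g. which clip is being watched); its hidden states carry the
    distributed spatiotemporal signature that saliency methods can inspect."""

    def __init__(self, num_rois=300, hidden=64, num_classes=4):
        super().__init__()
        self.gru = nn.GRU(num_rois, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, roi_ts):
        # roi_ts: (B, T, num_rois) -- one fMRI time series per brain region
        out, _ = self.gru(roi_ts)
        return self.head(out[:, -1])

model = BrainDynamicsDecoder()
print(model(torch.rand(2, 90, 300)).shape)  # torch.Size([2, 4])
```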
Learning brain dynamics for decoding and predicting individual differences
2021
PLoS Computational Biology
To decode brain dynamics, we propose an architecture based on recurrent neural networks to uncover distributed spatiotemporal signatures. ...
Our approach provides a framework for visualizing, analyzing, and discovering dynamic spatially distributed brain representations during naturalistic conditions. ...
The evolution of saliency for the Home Alone clip was somewhat similar (see also the saliency map video for the Brokovich clip in S1 Video). ...
doi:10.1371/journal.pcbi.1008943
pmid:34478442
pmcid:PMC8445454
fatcat:nxt75mv6xzbwfnjaq72wsmxlpa
Supersaliency: A Novel Pipeline for Predicting Smooth Pursuit-Based Attention Improves Generalisability of Video Saliency
2019
IEEE Access
In this work, we strive for a more meaningful prediction and conceptual understanding of saliency in general. ...
for saliency prediction on either fixation- or SP-salient locations, and (iii) evaluate our and 26 publicly available dynamic saliency models on three data sets against traditional saliency and supersaliency ...
Saliency prediction for videos, however, lacks an established benchmark. ...
doi:10.1109/access.2019.2961835
fatcat:fejio3v3s5a5deynmdmo6cwiny
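Evaluating 26 models against saliency and supersaliency ground truth, as above, requires standard saliency metrics. One widely used metric is Normalized Scanpath Saliency (NSS), the mean z-scored predicted saliency at fixated locations; a small implementation follows, with random inputs as a hypothetical usage example.

```python
import numpy as np

def nss(saliency_map, fixation_mask, eps=1e-9):
    """Normalized Scanpath Saliency: mean of the z-scored saliency map at
    fixated locations. Higher is better; 0 means chance level."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + eps)
    return s[fixation_mask.astype(bool)].mean()

pred = np.random.rand(48, 64)
fix = np.zeros((48, 64))
fix[24, 32] = 1          # one fixated location
print(nss(pred, fix))
```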
Review of Visual Saliency Detection with Comprehensive Information
[article]
2018
arXiv
pre-print
RGBD saliency detection, co-saliency detection, or video saliency detection. ...
The goal of a video saliency detection model is to locate the motion-related salient objects in video sequences, considering the motion cue and spatiotemporal constraint jointly. ...
for co-saliency detection, and temporal constraint for video saliency detection. ...
arXiv:1803.03391v2
fatcat:htcmhlo32jhczehvvq6nmgzwam
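The survey above characterises video saliency detection as jointly using a motion cue and a spatiotemporal (temporal-consistency) constraint. The toy sketch below makes that concrete: blend per-frame spatial saliency with a frame-difference motion cue, then smooth over time. The blend weights and the smoothing scheme are illustrative assumptions, not any surveyed method.

```python
import numpy as np

def video_saliency(frames, spatial_sal, alpha=0.6, smooth=0.7):
    """Toy spatio-temporal saliency: blend a per-frame spatial saliency map
    with a frame-difference motion cue, then smooth over time (a simple
    temporal-consistency constraint).

    frames: (T, H, W) grayscale video; spatial_sal: (T, H, W) in [0, 1].
    """
    T = frames.shape[0]
    out = np.zeros_like(spatial_sal)
    prev = np.zeros(frames.shape[1:])
    for t in range(T):
        motion = np.abs(frames[t] - frames[t - 1]) if t > 0 else np.zeros_like(frames[0])
        motion = motion / (motion.max() + 1e-9)          # normalised motion cue
        fused = alpha * spatial_sal[t] + (1 - alpha) * motion
        prev = smooth * prev + (1 - smooth) * fused      # temporal smoothing
        out[t] = prev
    return out

f = np.random.rand(5, 32, 32)
print(video_saliency(f, np.random.rand(5, 32, 32)).shape)  # (5, 32, 32)
```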
Showing results 1–15 of 1,490 results