1,490 Hits in 4.8 sec

Excitation Backprop for RNNs

Sarah Adel Bargal, Andrea Zunino, Donghyun Kim, Jianming Zhang, Vittorio Murino, Stan Sclaroff
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Grounding decisions made by deep networks has been studied in spatial visual content, giving more insight into model predictions for images. ... Deep models are state-of-the-art for many vision tasks, including video action recognition and video captioning. ... Acknowledgments: We thank Kate Saenko and Vasili Ramanishka for helpful discussions. ...
doi:10.1109/cvpr.2018.00156 dblp:conf/cvpr/BargalZKZMS18 fatcat:oqv3nyo52fbehd3ic3cuvsn5wy
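
As a quick illustration of the excitation backprop idea this entry extends to RNNs: a top-down "winning probability" is redistributed through each layer along positive weights only. The NumPy sketch below is a minimal single-layer version under assumed shapes (the function name and arguments are ours); the paper's actual contribution, propagating this signal through recurrent connections over time, is not shown here.

```python
import numpy as np

def excitation_backprop_linear(a_in, W, p_out):
    """One step of the excitation backprop redistribution rule
    for a single linear layer (illustrative sketch).

    a_in : (n_in,) non-negative input activations
    W    : (n_out, n_in) weight matrix
    p_out: (n_out,) winning probabilities arriving from the layer above
    """
    W_pos = np.maximum(W, 0.0)                 # only excitatory connections
    contrib = W_pos * a_in[None, :]            # (n_out, n_in) contributions
    norm = contrib.sum(axis=1, keepdims=True)  # total excitation per output
    norm[norm == 0.0] = 1.0                    # avoid division by zero
    cond = contrib / norm                      # conditional probabilities
    return cond.T @ p_out                      # (n_in,) winning probabilities
```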

Bidirectional LSTM with saliency-aware 3D-CNN features for human action recognition

Sheeraz Arif, Jing Wang (Beijing Institute of Technology, Beijing, China), Adnan Ahmed Siddiqui, Rashid Hussain (Hamdard University, Karachi, Pakistan), Fida Hussain (Jiangsu University, Nanjing, China)
2021 Maǧallaẗ al-abḥāṯ al-handasiyyaẗ (Journal of Engineering Research)
The introduced system can learn long-term temporal dependencies and can predict complex human actions. ... Existing recurrent pipelines fail to capture long-term motion dynamics in videos with various motion scales and complex actions performed by multiple actors. ... In this way, we obtain the subject saliency information, and the resulting video is known as the saliency-aware video V_s. ...
doi:10.36909/jer.v9i3a.8383 fatcat:55whmd65lfh2zp4tob5gjpspay
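
A minimal sketch of the pipeline this abstract describes: per-clip 3D-CNN features fed to a bidirectional LSTM for action classification. The feature and class dimensions below are hypothetical placeholders, not the paper's values, and the saliency-aware feature extraction is assumed to happen upstream.

```python
import torch
import torch.nn as nn

class BiLSTMActionClassifier(nn.Module):
    """Bidirectional LSTM over per-clip 3D-CNN features (hypothetical dims)."""

    def __init__(self, feat_dim=4096, hidden=512, num_classes=101):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, clip_feats):           # (batch, time, feat_dim)
        out, _ = self.lstm(clip_feats)       # (batch, time, 2*hidden)
        return self.fc(out[:, -1])           # one simple pooling choice:
                                             # classify from the final step

# usage: features from a 3D-CNN over saliency-aware clips
x = torch.randn(8, 16, 4096)                 # 8 videos, 16 clips each
logits = BiLSTMActionClassifier()(x)         # (8, 101)
```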

Excitation Backprop for RNNs [article]

Sarah Adel Bargal, Andrea Zunino, Donghyun Kim, Jianming Zhang, Vittorio Murino, Stan Sclaroff
2018 arXiv   pre-print
Grounding decisions made by deep networks has been studied in spatial visual content, giving more insight into model predictions for images. ... Deep models are state-of-the-art for many vision tasks, including video action recognition and video captioning. ... Acknowledgments: We thank Kate Saenko and Vasili Ramanishka for helpful discussions. ...
arXiv:1711.06778v3 fatcat:io6onint6zbcvc6b5asfazrnee

Human Action Recognition: Pose-Based Attention Draws Focus to Hands

Fabien Baradel, Christian Wolf, Julien Mille
2017 IEEE International Conference on Computer Vision Workshops (ICCVW)
We propose a new spatio-temporal attention-based mechanism for human action recognition, able to automatically attend to the most important human hands and detect the most discriminative moments in an action ... Attention is handled in a recurrent manner employing a Recurrent Neural Network (RNN) and is fully differentiable. ... Yeung et al. report a temporal recurrent attention model for dense labeling of videos [40]. ...
doi:10.1109/iccvw.2017.77 dblp:conf/iccvw/Baradel0M17 fatcat:4or2v2e3njemvfzyztqljuag7a
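
The recurrent, fully differentiable attention the abstract describes could look roughly like this: at each time step, the RNN state scores per-hand features, a softmax turns the scores into attention weights, and the attended feature updates the state. An illustrative sketch with invented dimensions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class HandAttentionRNN(nn.Module):
    """Recurrent soft attention over per-hand features (illustrative only)."""

    def __init__(self, feat_dim=256, hidden=256, num_classes=60):
        super().__init__()
        self.gru = nn.GRUCell(feat_dim, hidden)
        self.att = nn.Linear(hidden + feat_dim, 1)   # score each hand
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, hands):                 # (batch, time, n_hands, feat)
        b, t, n, d = hands.shape
        h = hands.new_zeros(b, self.gru.hidden_size)
        for step in range(t):
            x = hands[:, step]                           # (b, n, d)
            scores = self.att(torch.cat(
                [h.unsqueeze(1).expand(-1, n, -1), x], dim=-1))
            alpha = torch.softmax(scores, dim=1)         # (b, n, 1)
            h = self.gru((alpha * x).sum(dim=1), h)      # attended input
        return self.fc(h)
```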

Human Action Recognition: Pose-based Attention draws focus to Hands [article]

Fabien Baradel, Christian Wolf, Julien Mille
2017 arXiv   pre-print
We propose a new spatio-temporal attention-based mechanism for human action recognition, able to automatically attend to the hands most involved in the studied action and detect the most discriminative ... Attention is handled in a recurrent manner employing a Recurrent Neural Network (RNN) and is fully differentiable. ... Yeung et al. report a temporal recurrent attention model for dense labeling of videos [40]. ...
arXiv:1712.08002v1 fatcat:tua5ck5rhrgofmp6r33iesdrp4

SalSum: Saliency-based Video Summarization using Generative Adversarial Networks [article]

George Pantazis, George Dimas, Dimitris K. Iakovidis
2020 arXiv   pre-print
The huge amount of video data produced daily by camera-based systems, such as surveillance, medical, and telecommunication systems, creates the need for effective video summarization (VS) methods. ... Several fusion approaches are considered for robustness under uncertainty, and for personalization. ... Fig. 4: Static, Temporal, and Final Saliency scores for video #25 of the VSUMM dataset. Fig. 5: Representative qualitative results. ...
arXiv:2011.10432v1 fatcat:645u5rfa3fcfhpit7d2g4w7xzu
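
The snippet mentions fusing static and temporal saliency into a final per-frame score. Setting the GAN machinery aside, a convex-combination fusion with top-k keyframe selection (weights and sizes hypothetical, purely a baseline sketch) might look like:

```python
import numpy as np

def fuse_saliency(static_s, temporal_s, w=0.5):
    """Convex fusion of per-frame static and temporal saliency scores."""
    return w * static_s + (1.0 - w) * temporal_s

def select_keyframes(scores, k=5):
    """Pick the k frames with the highest fused saliency."""
    return np.argsort(scores)[-k:][::-1]

static_s = np.random.rand(300)      # hypothetical per-frame scores
temporal_s = np.random.rand(300)
final_s = fuse_saliency(static_s, temporal_s, w=0.6)
print(select_keyframes(final_s))    # indices of candidate keyframes
```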

ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction [article]

Samyak Jain, Pradeep Yarlagadda, Shreyank Jyoti, Shyamgopal Karthik, Ramanathan Subramanian, Vineet Gandhi
2021 arXiv   pre-print
We propose the ViNet architecture for audio-visual saliency prediction. ViNet is a fully convolutional encoder-decoder architecture.  ...  Interestingly, we also observe similar behaviour in the previous state-of-the-art models for audio-visual saliency prediction.  ...  UNISAL [13] is a unified image and video saliency prediction model that uses MobileNet to extract spatial features and LSTMs for encoding temporal information.  ... 
arXiv:2012.06170v3 fatcat:tumxsk7ofrbqrl56msgrhuap7y
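
The UNISAL pattern quoted in the snippet, a MobileNet spatial encoder followed by an LSTM over time, can be sketched as below. The head here is a deliberately crude linear layer producing a coarse 7x7 grid; the real models use proper decoders (and, in UNISAL's case, domain-specific modules), so treat this as shape-level illustration only.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class SpatialTemporalSaliency(nn.Module):
    """MobileNet spatial features + LSTM temporal encoding (rough sketch)."""

    def __init__(self, hidden=256):
        super().__init__()
        self.backbone = mobilenet_v2().features      # (N, 1280, h, w)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.lstm = nn.LSTM(1280, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 7 * 7)         # coarse saliency grid

    def forward(self, frames):                # (b, t, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.pool(self.backbone(frames.flatten(0, 1))).flatten(1)
        out, _ = self.lstm(feats.view(b, t, -1))
        maps = self.head(out).view(b, t, 1, 7, 7)
        return torch.sigmoid(maps)            # per-frame saliency grids
```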

Human activity prediction using saliency-aware motion enhancement and weighted LSTM network

Zhengkui Weng, Wuzhao Li, Zhipeng Jin
2021 EURASIP Journal on Image and Video Processing  
In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-aware motion enhancement (SME) is proposed for video activity prediction. ... However, predicting human activity early in a video is still a challenging task. ...
doi:10.1186/s13640-020-00544-0 fatcat:kuhulehob5e7zcsiwyb4gkwk3e
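
One plausible reading of "weighted LSTM with saliency-aware motion enhancement" is to re-weight per-frame features by a saliency score before the recurrence and emit a prediction at every prefix, so the activity can be predicted early. The sketch below is our guess at that shape, with invented dimensions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class WeightedLSTMPredictor(nn.Module):
    """LSTM over frame features re-weighted by per-frame saliency (sketch)."""

    def __init__(self, feat_dim=512, hidden=256, num_classes=20):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, feats, sal):             # feats: (b,t,d), sal: (b,t)
        weighted = feats * sal.unsqueeze(-1)   # emphasize salient motion
        out, _ = self.lstm(weighted)
        # predict from every prefix, not just the full video, so the
        # model can commit to an activity label early in the clip
        return self.fc(out)                    # (b, t, num_classes)
```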

Unified Image and Video Saliency Modeling [article]

Richard Droste, Jianbo Jiao, J. Alison Noble
2020 arXiv   pre-print
Visual saliency modeling for images and videos is treated as two independent tasks in recent computer vision literature. ... We identify different sources of domain shift between image and video saliency data, and between different video saliency datasets, as a key challenge for effective joint modeling. ... Code: the full code for evaluating and training the UNISAL model is available at https://github.com/rdroste/unisal. ...
arXiv:2003.05477v2 fatcat:mghlfa4vbfe6dmc7tf2did4wqi

Audio-Visual Temporal Saliency Modeling Validated by fMRI Data

Petros Koutras, Georgia Panagiotaropoulou, Antigoni Tsiami, Petros Maragos
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
In this work we propose an audio-visual model for predicting temporal saliency in videos, which we validate and evaluate in an alternative way by employing fMRI data. ... its effectiveness and appropriateness in predicting audio-visual saliency for dynamic stimuli. ... (recurrent vs. convolutional). ...
doi:10.1109/cvprw.2018.00269 dblp:conf/cvpr/KoutrasPTM18 fatcat:6hqbdm7dnrbfrjtwmhwbyicvjy

Supersaliency: A Novel Pipeline for Predicting Smooth Pursuit-Based Attention Improves Generalizability of Video Saliency [article]

Mikhail Startsev, Michael Dorr
2019 arXiv   pre-print
To this end, we (i) use algorithmic and manual annotations of SP and fixations for two well-established video saliency data sets, (ii) train Slicing Convolutional Neural Networks for saliency prediction ... However, even though most of the available video saliency data sets and models claim to target human observers' fixations, they fail to differentiate them from smooth pursuit (SP), a major eye movement ... Acknowledgements: Supported by the Elite Network Bavaria, funded by the Bavarian State Ministry for Research and Education. ...
arXiv:1801.08925v3 fatcat:dxh2rbrdizgf3kffvepa4qoc5a

Learning brain dynamics for decoding and predicting individual differences [article]

Luiz Pessoa, Chirag Limbachia, Joyneel Misra, Srinivas Govinda Surampudi, Manasij Venkatesh, Joseph Jaja
2021 bioRxiv   pre-print
To decode brain dynamics, we propose an architecture based on recurrent neural networks to uncover distributed spatiotemporal signatures. ... We believe our approach provides a powerful framework for visualizing, analyzing, and discovering dynamic spatially distributed brain representations during naturalistic conditions. ... The evolution of saliency for the Home Alone clip was somewhat similar (see also the saliency map video for the Brokovich clip in S1 Video). ... Predicting behavior: recent studies have employed ...
doi:10.1101/2021.03.27.437315 fatcat:iapz4oqy75awpl3tgvfcxb4zfq
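
A bare-bones version of the decoding setup this entry describes, a recurrent network over regional fMRI time series that predicts a condition label from the final hidden state, could look like the following (ROI count, hidden size, and label count are placeholders, not values from the paper):

```python
import torch
import torch.nn as nn

class BrainStateDecoder(nn.Module):
    """GRU over regional fMRI time series -> condition label (illustrative)."""

    def __init__(self, n_rois=300, hidden=64, n_conditions=4):
        super().__init__()
        self.gru = nn.GRU(n_rois, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_conditions)

    def forward(self, ts):            # (batch, time, n_rois) BOLD signals
        out, _ = self.gru(ts)
        return self.fc(out[:, -1])    # decode from the final hidden state
```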

Learning brain dynamics for decoding and predicting individual differences

Joyneel Misra, Srinivas Govinda Surampudi, Manasij Venkatesh, Chirag Limbachia, Joseph Jaja, Luiz Pessoa, Daniele Marinazzo
2021 PLoS Computational Biology  
To decode brain dynamics, we propose an architecture based on recurrent neural networks to uncover distributed spatiotemporal signatures.  ...  Our approach provides a framework for visualizing, analyzing, and discovering dynamic spatially distributed brain representations during naturalistic conditions.  ...  The evolution of saliency for the Home Alone clip was somewhat similar (see also the saliency map video for the Brokovich clip in S1 Video).  ... 
doi:10.1371/journal.pcbi.1008943 pmid:34478442 pmcid:PMC8445454 fatcat:nxt75mv6xzbwfnjaq72wsmxlpa

Supersaliency: A Novel Pipeline for Predicting Smooth Pursuit-Based Attention Improves Generalisability of Video Saliency

Mikhail Startsev, Michael Dorr
2019 IEEE Access  
In this work, we strive for a more meaningful prediction and conceptual understanding of saliency in general. ... for saliency prediction on either fixation- or SP-salient locations, and (iii) evaluate ours and 26 publicly available dynamic saliency models on three data sets against traditional saliency and supersaliency ... Saliency prediction for videos, however, lacks an established benchmark. ...
doi:10.1109/access.2019.2961835 fatcat:fejio3v3s5a5deynmdmo6cwiny

Review of Visual Saliency Detection with Comprehensive Information [article]

Runmin Cong, Jianjun Lei, Huazhu Fu, Ming-Ming Cheng, Weisi Lin, and Qingming Huang
2018 arXiv   pre-print
RGBD saliency detection, co-saliency detection, or video saliency detection. ... The goal of a video saliency detection model is to locate the motion-related salient objects in video sequences, considering motion cues and spatiotemporal constraints jointly. ... for co-saliency detection, and temporal constraint for video saliency detection. ...
arXiv:1803.03391v2 fatcat:htcmhlo32jhczehvvq6nmgzwam
Showing results 1 – 15 out of 1,490