A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf
.
Action Classification and Highlighting in Videos
[article]
2017
arXiv
pre-print
Inspired by recent advances in neural machine translation, that jointly align and translate using encoder-decoder networks equipped with attention, we propose an attentionbased LSTM model for human activity recognition. Our model jointly learns to classify actions and highlight frames associated with the action, by attending to salient visual information through a jointly learned soft-attention networks. We explore attention informed by various forms of visual semantic features, including those
arXiv:1708.09522v1
fatcat:4kmeonpinnbdtkula2pt4e6ufu