A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf.
Learning Transferable Self-attentive Representations for Action Recognition in Untrimmed Videos with Weak Supervision
[article] · 2019 · arXiv pre-print
Action recognition in videos has attracted significant attention over the past decade. In order to learn robust models, previous methods usually assume that videos are trimmed into short sequences and require ground-truth annotations for each video frame/sequence, which is costly and time-consuming. In this paper, given only video-level annotations, we propose a novel weakly supervised framework that simultaneously locates action frames and recognizes actions in untrimmed videos. Our proposed
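The abstract is truncated before it describes the architecture, so the following is only a minimal sketch of the general idea the abstract names: training with video-level labels while attention weights over frames provide localization. The module name, hidden size, and loss setup below are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Weakly supervised video classifier (hypothetical sketch):
    per-frame features are weighted by learned attention scores and
    pooled into a single video-level prediction, so only video-level
    labels are needed. The attention weights double as a frame-level
    action localization signal."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        # Scores each frame's relevance to the action (sizes assumed).
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.Tanh(),
            nn.Linear(256, 1),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, frames: torch.Tensor):
        # frames: (T, feat_dim) -- precomputed per-frame features
        scores = self.attention(frames)              # (T, 1)
        weights = torch.softmax(scores, dim=0)       # attention over time
        video_feat = (weights * frames).sum(dim=0)   # (feat_dim,)
        logits = self.classifier(video_feat)         # video-level prediction
        return logits, weights.squeeze(-1)           # weights locate action frames

# Training uses only the video-level label; no frame annotations.
model = AttentionPooling(feat_dim=1024, num_classes=20)
frames = torch.randn(300, 1024)                      # e.g. 300 frames of CNN features
logits, frame_weights = model(frames)
loss = nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([3]))
loss.backward()
```

At test time, thresholding `frame_weights` yields candidate action frames, which is one common way such weakly supervised methods produce localizations from video-level supervision alone.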
arXiv:1902.07370v1
fatcat:ajhytmeyuran7kl4rnl35vzejy