A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf.
An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos
[article]
2017
arXiv
pre-print
In this paper, we propose an end-to-end 3D CNN for action detection and segmentation in videos. The proposed architecture is a unified deep network that recognizes and localizes actions based on 3D convolution features. A video is first divided into equal-length clips; then, for each clip, a set of tube proposals is generated based on 3D CNN features. Finally, the tube proposals of different clips are linked together, and spatio-temporal action detection is performed using these linked
arXiv:1712.01111v1
fatcat:nmaal54fgna7bkxl3vcgyvyh7a
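
The clip-then-link pipeline outlined in the abstract can be illustrated with a rough sketch. This is not the paper's implementation: the 3D CNN and the tube proposal step are left out entirely, and the linking rule used here (greedily picking, in each clip, the proposal whose first box best overlaps the last box of the path so far) is an assumption made only for illustration. The names `split_into_clips`, `iou`, and `link_tubes` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Tube = List[Box]                         # one box per frame of a clip


def split_into_clips(num_frames: int, clip_len: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges of equal length.

    Assumption: a trailing remainder shorter than clip_len is dropped.
    """
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, clip_len)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_tubes(clip_tubes: List[List[Tube]]) -> Tube:
    """Greedily link one tube per clip into a single video-level tube.

    Assumption: the path starts from the first proposal of the first clip,
    and overlap alone decides the link; both choices are illustrative only.
    """
    if not clip_tubes or not clip_tubes[0]:
        return []
    path: Tube = list(clip_tubes[0][0])
    for tubes in clip_tubes[1:]:
        if not tubes:
            break  # no proposals in this clip, stop extending the path
        best = max(tubes, key=lambda t: iou(path[-1], t[0]))
        path.extend(best)
    return path
```

For example, `split_into_clips(20, 8)` yields two clips covering frames 0 to 15, and `link_tubes` then stitches one proposal from each clip into a single sequence of boxes.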