A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020.
The file type is application/pdf.
Action2Vec: A Crossmodal Embedding Approach to Action Learning
[article]
2019
arXiv
pre-print
Our approach uses a hierarchical recurrent network to capture the temporal structure of video features. ...
We describe a novel cross-modal embedding space for actions, named Action2Vec, which combines linguistic cues from class labels with spatio-temporal features derived from video clips. ...
In order to get a single embedding for an action class, we average the Action2Vec vectors for all videos of that action class. ...
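The class-level averaging described in the last excerpt can be illustrated with a minimal sketch. It assumes each video has already been mapped to a fixed-length vector; the function name `class_embeddings`, the toy vectors, and the labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

import numpy as np


def class_embeddings(video_vectors, labels):
    """Average per-video vectors into one vector per action class.

    video_vectors: iterable of equal-length numeric sequences (one per video).
    labels: iterable of class labels, aligned with video_vectors.
    Both names are illustrative assumptions, not the authors' API.
    """
    grouped = defaultdict(list)
    for vec, label in zip(video_vectors, labels):
        grouped[label].append(np.asarray(vec, dtype=float))
    # Element-wise mean over all videos that share a label.
    return {label: np.mean(vecs, axis=0) for label, vecs in grouped.items()}


# Toy data: two "run" videos and one "jump" video, 2-dimensional vectors.
vectors = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = ["run", "run", "jump"]
embeddings = class_embeddings(vectors, labels)
```

With the toy data, the two "run" vectors are averaged element-wise into one class vector, and the single "jump" vector is returned unchanged.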
arXiv:1901.00484v1
fatcat:umxyer4iurgjte2wkvl5egglw4