Human action recognition from a single clip per action

Weilong Yang, Yang Wang, Greg Mori
2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), 2009
Learning-based approaches for human action recognition often rely on large training sets, and most do not perform well when only a few training samples are available. In this paper, we consider the problem of human action recognition from a single clip per action, where each clip contains at most 25 frames. Using a patch-based motion descriptor and matching scheme, we achieve promising results on three different action datasets with a single clip as the template. Our results are comparable to previously published results obtained with much larger training sets. We also present a method for learning a transferable distance function for these patches. The transferable distance function learning extracts generic knowledge of patch weighting from previous training sets, and can be applied to videos of new actions without further learning. Our experimental results show that the transferable distance function not only improves the recognition accuracy of single-clip action recognition, but also significantly enhances the efficiency of the matching scheme.
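To make the single-template matching idea concrete, the following is a minimal sketch of weighted patch matching: each template patch finds its nearest patch in the query clip, and learned per-patch weights (as produced by a transferable distance function) scale each patch's contribution to the overall score. The descriptor representation, weight values, and function names here are illustrative assumptions, not the paper's exact formulation.

```python
import math

def match_score(template_patches, template_weights, query_patches):
    """Weighted matching score between a template clip and a query clip.

    Each template patch (a descriptor vector) contributes the distance to
    its best-matching query patch, scaled by its learned weight. Lower
    scores indicate a better match.
    """
    score = 0.0
    for patch, weight in zip(template_patches, template_weights):
        best = min(math.dist(patch, q) for q in query_patches)
        score += weight * best
    return score

def classify(query_patches, templates):
    """Assign the query clip the action label of the single template clip
    with the lowest weighted matching score.

    `templates` maps action label -> (patch descriptors, patch weights).
    """
    return min(templates, key=lambda a: match_score(*templates[a], query_patches))
```

With uniform weights this reduces to plain nearest-patch matching; non-uniform weights let generic knowledge from earlier training sets down-weight uninformative patches, which is the role the transferable distance function plays in the paper.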
doi:10.1109/iccvw.2009.5457663 dblp:conf/iccvw/Yang0M09