Learning View-Invariant Sparse Representations for Cross-View Action Recognition

Jingjing Zheng, Zhuolin Jiang
2013 2013 IEEE International Conference on Computer Vision  
We present an approach to jointly learn a set of viewspecific dictionaries and a common dictionary for crossview action recognition. The set of view-specific dictionaries is learned for specific views while the common dictionary is shared across different views. Our approach represents videos in each view using both the corresponding view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from different views of the same action to have similar
more » ... ion to have similar sparse representations. In this way, we can align view-specific features in the sparse feature spaces spanned by the viewspecific dictionary set and transfer the view-shared features in the sparse feature space spanned by the common dictionary. Meanwhile, the incoherence between the common dictionary and the view-specific dictionary set enables us to exploit the discrimination information encoded in viewspecific features and view-shared features separately. In addition, the learned common dictionary not only has the capability to represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labels exist in the target view. Extensive experiments using the multi-view IXMAS dataset demonstrate that our approach outperforms many recent approaches for cross-view action recognition.
doi:10.1109/iccv.2013.394 dblp:conf/iccv/ZhengJ13 fatcat:v74m6wjzl5da7ncabsk3icepsa