Cholesky Decomposition Based Metric Learning for Video-based Human Action Recognition

Si Chen, Yuanyuan Shen, Yan Yan, Dahan Wang, Shunzhi Zhu
2020 IEEE Access  
Video-based human action recognition can understand human actions and behaviours in the video sequences, and has wide applications for health care, human-machine interaction and so on. Metric learning, which learns a similarity metric, plays an important role in human action recognition. However, learning a full-rank matrix is usually inefficient and easily leads to overfitting. In order to overcome the above issues, a common way is to impose the low-rank constraint on the learned matrix. This
more » ... aper proposes a novel Cholesky decomposition based metric learning (CDML) method for effective video-based human action recognition. Firstly, the improved dense trajectories technique and the vector of locally aggregated descriptor (VLAD) are respectively used for feature detection and feature encoding. Then, considering the high dimensionality of VLAD features, we propose to learn a similarity matrix by taking advantage of Cholesky decomposition, which decomposes the matrix into the product between a lower triangular matrix and its symmetric matrix. Different from the traditional low-rank metric learning methods that explicitly adopt the low-rank constraint to learn the matrix, the proposed algorithm achieves such a constraint by controlling the rank of the lower triangular matrix, thus leading to high computational efficiency. Experimental results on the public video dataset show that the proposed method achieves the superior performance compared with several state-of-the-art methods. INDEX TERMS Human action recognition, metric learning, Cholesky decomposition. VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ SI CHEN received the Ph.D. degree from Xiamen
doi:10.1109/access.2020.2966329 fatcat:vkekokypynghhfohktkeihs5y4