3D CNNs on Distance Matrices for Human Action Recognition

Alejandro Hernandez Ruiz, Lorenzo Porzi, Samuel Rota Bulò, Francesc Moreno-Noguer
2017 Proceedings of the 2017 ACM on Multimedia Conference - MM '17  
In this paper we are interested in recognizing human actions from sequences of 3D skeleton data. For this purpose we combine a 3D Convolutional Neural Network with body representations based on Euclidean Distance Matrices (EDMs), which have recently been shown to be very effective at capturing the geometric structure of the human pose. One inherent limitation of the EDMs, however, is that they are defined up to a permutation of the skeleton joints, i.e., randomly shuffling the ordering of the joints yields many different representations. In order to address this issue we introduce a novel architecture that simultaneously, and in an end-to-end manner, learns an optimal transformation of the joints, while optimizing the rest of the parameters of the convolutional network. The proposed approach achieves state-of-the-art results on 3 benchmarks, including the recent NTU RGB-D dataset, for which we improve on previous LSTM-based methods by more than 10 percentage points, also surpassing other CNN-based methods while using almost 1000 times fewer parameters.
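The permutation issue mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `edm`, the 5-joint random skeleton, and the stacking of per-frame matrices into a `(T, J, J)` volume are assumptions made purely for illustration. The sketch only shows that reordering the joints reorders the rows and columns of the distance matrix, so the same pose can produce many different matrices.

```python
import numpy as np

def edm(joints):
    """Pairwise Euclidean distances for joints of shape (J, 3); returns (J, J)."""
    diff = joints[:, None, :] - joints[None, :, :]  # (J, J, 3) via broadcasting
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
pose = rng.normal(size=(5, 3))   # hypothetical 5-joint skeleton, not real data
perm = rng.permutation(5)        # an arbitrary reordering of the joints

D = edm(pose)
D_perm = edm(pose[perm])

# Shuffling the joints applies the same permutation to rows and columns,
# so the two matrices describe the same pose but generally differ entry-wise.
same_up_to_perm = np.allclose(D_perm, D[np.ix_(perm, perm)])

# Assumed input layout for a 3D CNN: one EDM per frame, stacked over time.
sequence = rng.normal(size=(4, 5, 3))              # hypothetical 4-frame clip
volume = np.stack([edm(frame) for frame in sequence])  # shape (4, 5, 5)
```

A fixed joint ordering is therefore an arbitrary choice from the network's point of view, which is the motivation the abstract gives for learning a transformation of the joints jointly with the rest of the network.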
doi:10.1145/3123266.3123299 dblp:conf/mm/RuizPBM17 fatcat:niyqtwr2fngtpp2p3amsaquvm4