Local descriptions for human action recognition from 3D reconstruction data

Georgios Th. Papadopoulos, Petros Daras
2014 IEEE International Conference on Image Processing (ICIP)
In this paper, a view-invariant approach to human action recognition using 3D reconstruction data is proposed. Initially, a set of calibrated Kinect sensors is employed to produce a 3D reconstruction of the performing subjects. Subsequently, a 3D flow field is estimated for every captured frame. For performing action recognition, the 'Bag-of-Words' methodology is followed, where Spatio-Temporal Interest Points (STIPs) are detected in the 4D space (xyz coordinates plus time). A novel 3D flow descriptor is introduced, which, among other features, incorporates spatial and surface information in the flow representation and efficiently handles the problem of defining 3D orientation at every STIP location. Additionally, typical 3D shape descriptors from the literature are used to produce a more complete representation. Experimental results, as well as a comparative evaluation using datasets from the Huawei/3DLife 3D human reconstruction and action recognition Grand Challenge, demonstrate the efficiency of the proposed approach.
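The abstract describes a 'Bag-of-Words' pipeline over local STIP descriptors: descriptors are quantized against a learned codebook and each action clip is summarized as a word histogram fed to a classifier. The sketch below is a minimal illustration of that general idea, not the authors' implementation; it assumes per-STIP feature vectors (e.g. the 3D flow and shape descriptors) are already extracted, and the descriptor dimensionality, vocabulary size, and use of scikit-learn's k-means are placeholder choices.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(train_descriptors, n_words=256, seed=0):
    """Cluster local descriptors (one row per STIP) into a visual vocabulary."""
    kmeans = KMeans(n_clusters=n_words, n_init=10, random_state=seed)
    kmeans.fit(train_descriptors)
    return kmeans

def bow_histogram(codebook, descriptors):
    """Quantize one sequence's STIP descriptors and return an L1-normalized histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Illustrative usage with random stand-in descriptors (hypothetical 64-D features per STIP).
rng = np.random.default_rng(0)
train = rng.normal(size=(5000, 64))      # descriptors pooled from training sequences
sequence = rng.normal(size=(120, 64))    # descriptors from a single action clip
codebook = build_codebook(train, n_words=64)
h = bow_histogram(codebook, sequence)    # fixed-length representation for a downstream classifier
```

The resulting fixed-length histogram is what makes variable-length STIP sets comparable across clips; any standard classifier (e.g. an SVM) can then be trained on these vectors.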
doi:10.1109/icip.2014.7025569 · dblp:conf/icip/PapadopoulosD14