Performance Boosting of Scale and Rotation Invariant Human Activity Recognition (HAR) with LSTM Networks Using Low Dimensional 3D Posture Data in Egocentric Coordinates

Ibrahim Furkan Ince
2020 Applied Sciences  
Human activity recognition (HAR) has been an active area in computer vision with a broad range of applications, such as education, security surveillance, and healthcare. HAR is a general time series classification problem. LSTMs are widely used for time series classification tasks. However, they work well with high-dimensional feature vectors, which reduce the processing speed of LSTM in real-time applications. Therefore, dimension reduction is required to create low-dimensional feature space.
more » ... s it is experimented in previous study, LSTM with dimension reduction yielded the worst performance among other classifiers, which are not deep learning methods. Therefore, in this paper, a novel scale and rotation invariant human activity recognition system, which can also work in low dimensional feature space is presented. For this purpose, Kinect depth sensor is employed to obtain skeleton joints. Since angles are used, proposed system is already scale invariant. In order to provide rotation invariance, body relative direction in egocentric coordinates is calculated. The 3D vector between right hip and left hip is used to get the horizontal axis and its cross product with the vertical axis of global coordinate system assumed to be the depth axis of the proposed local coordinate system. Instead of using 3D joint angles, 8 number of limbs and their corresponding 3D angles with X, Y, and Z axes of the proposed coordinate system are compressed with several dimension reduction methods such as averaging filter, Haar wavelet transform (HWT), and discrete cosine transform (DCT) and employed as the feature vector. Finally, extracted features are trained and tested with LSTM (long short-term memory) network, which is an artificial recurrent neural network (RNN) architecture. Experimental and benchmarking results indicate that proposed framework boosts the performance of LSTM by approximately 30% accuracy in low-dimensional feature space.
doi:10.3390/app10238474 fatcat:gbm66ezmgrg6nai5qzung5fmum