Spatiotemporal representation learning for video anomaly detection

Zhaoyan Li, Yaoshun Li, Zhisheng Gao
2020 IEEE Access  
Video-based detection of anomalous human behavior is widely studied in fields such as security, medical care, education, and energy. However, open problems remain: large, complicated models are difficult to train, detection accuracy is not high enough, and detection speed is not fast enough. This paper proposes a spatiotemporal representation learning model. First, the spatiotemporal features of the video are extracted by a multi-scale 3D convolutional neural network. Then, the scene background is modeled by a high-dimensional mixed Gaussian model, which is used for anomaly detection. Finally, the anomalous behavior is accurately localized in the video by calculating the position of the last output feature, that is, the position of its receptive field. The proposed model does not require specific training. Moreover, the method offers high versatility, fast computation, and high detection accuracy. We validated the proposed algorithm on two representative surveillance datasets, Subway and UCSD Ped2. The results show that the algorithm reaches a processing speed of 18 FPS on common computing resources, which meets real-time requirements. Moreover, compared with similar methods, the proposed method achieves competitive results in both frame-level and pixel-level accuracy.
INDEX TERMS Spatiotemporal representation learning, anomaly detection, 3D convolutional neural network, mixed Gaussian model.
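The localization step in the abstract maps an output feature back to the input region it covers, that is, its receptive field. The sketch below illustrates that mapping along a single axis using standard receptive-field arithmetic for stacked convolution and pooling layers. The `Layer` class, the function names, and the three-layer configuration are hypothetical illustrations, not the architecture from the paper; for a 3D network the same calculation would be repeated for the height, width, and time axes.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One convolution or pooling layer along a single axis (hypothetical)."""
    kernel: int
    stride: int
    padding: int


def receptive_field(layers):
    """Return (size, jump, start) for units of the last layer.

    size:  number of input positions seen by one output unit
    jump:  distance in input positions between adjacent output units
    start: input coordinate of the center of output unit 0
    """
    size, jump, start = 1, 1, 0.0
    for layer in layers:
        start += ((layer.kernel - 1) / 2 - layer.padding) * jump
        size += (layer.kernel - 1) * jump
        jump *= layer.stride
    return size, jump, start


def input_span(index, layers):
    """Return the (first, last) input coordinates covered by output unit `index`.

    Coordinates may fall outside the input when padding is used.
    """
    size, jump, start = receptive_field(layers)
    center = start + index * jump
    half = (size - 1) / 2
    return center - half, center + half


# Assumed toy stack: 3-wide conv, 2-wide pooling with stride 2, 3-wide conv.
toy_layers = [Layer(3, 1, 0), Layer(2, 2, 0), Layer(3, 1, 0)]
```

With this toy stack, `input_span(0, toy_layers)` covers input positions 0 through 7, so an anomaly flagged at output unit 0 would be reported over that input range.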
doi:10.1109/access.2020.2970497 fatcat:tlaxd6pxxngczkvzm3sz3aqcvu