Integration of statistical detector and Gaussian noise injection detector for adversarial example detection in deep neural networks

Weiqi Fan, Guangling Sun, Yuying Su, Zhi Liu, Xiaofeng Lu
2019 Multimedia Tools and Applications
An adversarial attack is a technique that causes classification models to malfunction by adding noise that humans cannot perceive, which poses a threat to deep learning models. In this paper, we propose an efficient method to detect adversarial images using Gaussian process regression. Existing deep learning-based adversarial detection methods require numerous adversarial images for training. The proposed method overcomes this problem by performing classification based on the statistical features of adversarial images and clean images, which are extracted by Gaussian process regression from a small number of images. The technique determines whether an input image is adversarial by applying Gaussian process regression to the intermediate output values of the classification model. Experimental results show that the proposed method achieves higher detection performance than other deep learning-based adversarial detection methods against powerful attacks. In particular, the Gaussian process regression-based detector outperforms the baseline models for most attacks when fewer adversarial examples are available.
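The detection idea in the abstract above (fit a Gaussian process regressor on intermediate outputs from a small labeled set, then score new inputs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the features are synthetic Gaussian vectors standing in for a classifier's intermediate outputs, the kernel is a plain RBF with a hand-picked length scale, labels are encoded as -1 (clean) and +1 (adversarial), and the helper names `rbf_kernel`, `gp_fit`, and `gp_predict` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_fit(x_train, y_train, length_scale, noise):
    """Solve (K + noise * I) alpha = y, the weights of the GP posterior mean."""
    k = rbf_kernel(x_train, x_train, length_scale)
    return np.linalg.solve(k + noise * np.eye(len(x_train)), y_train)


def gp_predict(x_train, alpha, x_new, length_scale):
    """GP posterior mean at x_new."""
    return rbf_kernel(x_new, x_train, length_scale) @ alpha


rng = np.random.default_rng(0)
dim = 8  # assumed size of the intermediate feature vector

# Synthetic stand-ins for intermediate outputs (NOT real network activations).
clean_train = rng.normal(0.0, 1.0, size=(20, dim))
adv_train = rng.normal(3.0, 1.0, size=(20, dim))
x_train = np.vstack([clean_train, adv_train])
# Assumed label encoding: -1 for clean, +1 for adversarial.
y_train = np.concatenate([-np.ones(20), np.ones(20)])

length_scale, noise = 4.0, 0.1  # hand-picked for this toy data
alpha = gp_fit(x_train, y_train, length_scale, noise)

# Held-out samples drawn from the same two synthetic distributions.
x_test = np.vstack([
    rng.normal(0.0, 1.0, size=(10, dim)),
    rng.normal(3.0, 1.0, size=(10, dim)),
])
y_test = np.concatenate([-np.ones(10), np.ones(10)])

scores = gp_predict(x_train, alpha, x_test, length_scale)
is_adversarial = scores > 0.0  # threshold at the prior mean
accuracy = float(np.mean(is_adversarial == (y_test > 0)))
print(f"toy detection accuracy: {accuracy:.2f}")
```

With only 40 labeled vectors the regressor is cheap to fit, which matches the abstract's emphasis on working from a small number of images. In a real setting the feature vectors would come from a chosen layer of the classifier, and the kernel hyperparameters would need to be tuned.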
doi:10.1007/s11042-019-7353-6 fatcat:3jlzbd37pzcvneuyis7ai3vkki