Adaptive Median Filtering Algorithm Based on Divide and Conquer and Its Application in CAPTCHA Recognition

Wentao Ma, Jiaohua Qin, Xuyu Xiang, Yun Tan, Yuanjing Luo and Neal N. Xiong
2019 Computers Materials & Continua  
As the first barrier protecting cyberspace, the CAPTCHA has made significant contributions to maintaining Internet security and preventing malicious attacks. By studying the CAPTCHA, we can find its vulnerabilities and improve its security. Recently, many studies have shown that improving the image preprocessing of CAPTCHAs allows state-of-the-art machine learning algorithms to achieve a better recognition rate. There are many kinds of noise and distortion in the CAPTCHA images of this experiment. We propose an adaptive median filtering algorithm based on divide and conquer in this paper. Firstly, the filtering window data are quickly sorted by exploiting data correlation, which can greatly improve the filtering efficiency. Secondly, the size of the filtering window is adaptively adjusted according to the noise density. As demonstrated in the experimental results, the proposed scheme achieves superior performance compared with the conventional median filter. The algorithm not only detects and removes noise effectively, but also preserves details well. Therefore, this algorithm can be one of the strongest tools for various CAPTCHA image recognition and related applications.
... while removing noise, so they were quickly replaced by nonlinear denoising. The nonlinear filtering denoising algorithm [Varade, Dhotre and Pahurkar (2013)] can effectively suppress interference pulses and random noise while preserving the edge information of the image. In recent years, with the development of computer vision technologies such as autonomous driving, the real-time requirements for image preprocessing have risen steadily, but conventional median filtering sorts slowly and cannot meet them, so improvements to median filtering remain a focus of research. Conventional median filtering uses bubble sort to order pixel values, which requires sorting all pixels in the neighborhood to obtain the median value [Tukey (1974); Pitas and Venetsanopoulos (1992)]. For an $N \times N$ filter window, $N^2(N^2-1)/2$ comparison operations are required. Taking the $3 \times 3$ filter window as an example, finding the median takes 36 comparisons, which is a time-consuming process. Brodland et al. [Brodland and Veldhuis (1998)] proposed a weighted median filter in which the weight of the central pixel is defined by the degree of noise pollution. 
This algorithm can effectively suppress noise and greatly reduce the complexity, meeting the edge- and detail-preservation requirements of computer vision detection systems. Huang et al. [Huang, Yang and Tang (2003)] make full use of data correlation: by considering the relationship between the move-in values, the move-out values and the median, they greatly improve the efficiency of the filtering process. After comparing various fast median filters, Dai et al. [Dai, Xu and Piao et al. (2017)] proposed an improved fast median filtering algorithm that combines the sorting algorithm with the hardware system to effectively improve processing speed. However, these algorithms concentrate on improving filtering efficiency and do not optimize the adaptiveness of the filtering window. The window size of conventional median filtering is fixed, so it cannot simultaneously denoise and preserve image details [Zhang, Xu and Dong (2006)]. Therefore, it is necessary to dynamically alter the size of the window during the filtering process. Zhang et al. [Zhang, Tang and Shi (2014)] proposed a Recursive Least Squares (RLS) adaptive filter, which has good denoising performance and high precision. Bhadouria et al. [Bhadouria, Ghoshal and Siddiqi (2014); Roy, Singha and Manam et al. (2017)] did much to optimize the adaptiveness of the filtering window and achieved good results, but they neglected the complexity. Ding et al. [Ding, Niu and Lu et al. (2018); Fan, Han and Gou et al. (2018); Roy, Singha and Devi (2016)] used convolutional neural networks (CNN) and support vector machines (SVM) in the field of image recognition. 
Although these methods have achieved good results, they do not preprocess the original image fed to the network, and combining an outstanding image preprocessing algorithm with a state-of-the-art machine learning algorithm will certainly improve the CAPTCHA recognition accuracy. In this paper, we propose an adaptive median filtering algorithm based on divide and conquer: (1) Conventional median filtering is mostly based on bubble sorting. This paper presents another idea: on the one hand, median filtering based on the quicksort algorithm is developed using divide and conquer, which effectively
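The contrast drawn above between bubble-sort-based median selection and a divide-and-conquer alternative can be illustrated with a short Python sketch. It is not the authors' implementation. The function names (`bubble_sort_median`, `quickselect_median`, `median_filter_3x3`) are hypothetical, quickselect stands in as one common divide-and-conquer selection method, the window is fixed at 3 × 3 rather than adaptive, and border pixels are simply copied unchanged.

```python
def bubble_sort_median(values):
    """Median via a full bubble sort; returns (median, number of comparisons)."""
    data = list(values)
    n = len(data)
    comparisons = 0
    for i in range(n - 1):
        for j in range(n - 1 - i):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data[n // 2], comparisons


def quickselect_median(values):
    """Median via quickselect (divide and conquer) for an odd-length window.

    Assumption: the middle element is used as the pivot; other pivot rules work too.
    """
    data = list(values)
    k = len(data) // 2  # rank of the median in sorted order
    while True:
        pivot = data[len(data) // 2]
        lows = [v for v in data if v < pivot]
        equal_count = sum(1 for v in data if v == pivot)
        if k < len(lows):
            data = lows  # median lies among the smaller values
        elif k < len(lows) + equal_count:
            return pivot  # median equals the pivot
        else:
            k -= len(lows) + equal_count
            data = [v for v in data if v > pivot]  # median lies among the larger values


def median_filter_3x3(image, median_fn):
    """Apply a fixed 3x3 median filter to a 2D list; border pixels are left as-is."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median_fn(window)
    return out


if __name__ == "__main__":
    window = [10, 200, 30, 40, 255, 60, 70, 0, 90]
    print(bubble_sort_median(window))
    print(quickselect_median(window))
```

For a 9-element window the bubble-sort version performs 36 comparisons, matching the count quoted above for the 3 × 3 case, while quickselect only keeps partitioning the part of the window that can still contain the median.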
doi:10.32604/cmc.2019.05683 fatcat:ue5sf34ph5hfdjj5vo2hshbf6i