A Synthetic Fusion Rule for Salient Region Detection under the Framework of DS-Evidence Theory

Naeem Ayoub, Zhenguo Gao, Bingcai Chen, Muwei Jian
Symmetry, 2018
Saliency detection is one of the most valuable research topics in computer vision. It aims to detect the most significant objects/regions in an image and thereby reduces the computational cost of extracting the desired information from salient regions. Researchers have actively used local saliency detection and common-pattern-discovery schemes to address the saliency detection problem. In this paper, we propose a bottom-up saliency fusion method by taking into consideration the
importance of the DS-Evidence (Dempster-Shafer (DS)) theory. Firstly, we calculate saliency maps from different algorithms based on pixel-level, patch-level and region-level methods. Secondly, we fuse the pixels based on foreground and background information under the framework of DS-Evidence theory (evidence theory allows one to combine evidence from different sources and arrive at a degree of belief that takes all the available evidence into account). Formulating image saliency detection through DS-Evidence theory gives us better results for saliency prediction. Experiments are conducted on four publicly available datasets (MSRA, ECSSD, DUT-OMRON and PASCAL-S). Our saliency detection method performs well and shows prominent results compared to state-of-the-art algorithms.

… saliency information extraction. These methods can be categorized into pixel-based, patch-based and region-based methods. A seminal pixel-based saliency detection method was introduced by Itti et al. [1], in which saliency information was obtained from pixel-level features and center-surround differences. Achanta et al. [10] adopted a saliency detection approach based on the frequency features of the image, exploiting the pixel-wise difference with mean-shift segmentation and calculating saliency while disregarding the high frequencies arising from texture and noise at each pixel. These methods have shortcomings such as boundary blurring and poor segmentation of the salient object due to interior suppression. Because of these shortcomings of pixel-based methods, researchers introduced patch-level saliency detection methods. Margolin et al. [11] proposed a method that uses Principal Component Analysis to represent the set of distinctive patches while ignoring all other patches in the image. Recently, Wang et al.
[12] proposed a method based on scene-level analysis and patch-level inference that uses nearest semantics to obtain saliency information. They used region-based segmentation of image patches and made use of other cues to refine the saliency detection, combining scene-level analysis with patch-level inference to overcome the inefficiency of purely patch-level methods; in other words, pure patch-based methods cannot achieve satisfying results. To overcome the limitations of patch- and pixel-based saliency detection, region-based saliency detection methods have been introduced, in which images are segmented at the region level. Because some images contain irregular regions, these methods can be sub-categorized into those handling regions with irregular sizes and shapes [13-16] and those handling regions with regular sizes and shapes [17-21]. Wei et al. [13] introduced a region-based saliency detection method that focuses on the background more than on the salient region, exploiting two common background priors: boundary and connectivity. Perazzi et al. [14] proposed obtaining saliency information by decomposing the given image into a group of homogeneous elements and generating a pixel-wise saliency map. Cheng et al. [15] used pixels' appearance information based on spatial distribution and similarity for salient region detection. Cheng et al. [16] evaluated saliency with spatially weighted coherence scores and global contrast differences, which gives prominent results, but the method cannot perform well in all cases. Achanta et al. [17] introduced the simple linear iterative clustering (SLIC) super-pixel method for segmenting the salient region based on mid-level visual features. The absorbing Markov chain was used by Jiang et al.
[18], in which the boundary nodes of the image are treated as virtual absorbing nodes and the absorbed time of the transient nodes is computed to estimate the salient regions in terms of background and foreground. Yang et al. [19] estimated saliency by treating super-pixels as nodes and ranking them by their similarity to background and foreground queries; saliency is thus computed from two non-overlapping regions, the background and the salient region. Xie et al. [20] also used low- and mid-level visual features of the image to define the background and foreground regions. In this method, the salient region is first estimated via color features and the foreground region is defined using a convex hull; secondly, super-pixels are used to define salient regions based on mid-level visual features. Following Xie's saliency detection framework, Ayoub et al. [21] also employed a Bayesian framework for saliency detection. They calculated color-frequency features of the image using a Log-Gabor filter and computed the salient region by splitting it into foreground and background with a convex hull. This method shows prominent results compared to the other methods but, because it relies on color features, it cannot operate on gray-scale images. Almost all of these methods have their merits for image saliency, but, because each relies on pixel-, patch- or region-based information alone, none of them performs well in all cases. Moreover, different saliency detection methods complement each other [22]. Therefore, fusing the saliency maps of predefined methods based on pixel-, patch- and region-level information gives impressive results.
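The fusion idea can be illustrated with Dempster's rule of combination over the frame {foreground F, background B}. The mass assignment below (a saliency value s becomes m(F) = (1-u)s, m(B) = (1-u)(1-s), with u kept as ignorance on {F, B}) is a hypothetical choice for illustration, not the paper's exact model:

```python
import numpy as np

def dempster_combine(m1, m2):
    """Dempster's rule over the frame {F, B}; each m maps 'F', 'B', 'FB'
    (the ignorance set {F, B}) to masses summing to 1. Works elementwise
    on numpy arrays. Conflict K = m1(F)m2(B) + m1(B)m2(F) is renormalized."""
    K = m1['F'] * m2['B'] + m1['B'] * m2['F']
    norm = 1.0 - K
    return {
        'F': (m1['F'] * m2['F'] + m1['F'] * m2['FB'] + m1['FB'] * m2['F']) / norm,
        'B': (m1['B'] * m2['B'] + m1['B'] * m2['FB'] + m1['FB'] * m2['B']) / norm,
        'FB': (m1['FB'] * m2['FB']) / norm,
    }

def fuse_saliency_maps(maps, uncertainty=0.2):
    """Per-pixel DS fusion of saliency maps in [0, 1].

    Each map's value s is converted to masses m(F) = (1-u)s,
    m(B) = (1-u)(1-s), m(FB) = u (a hypothetical assignment), the maps
    are combined pairwise, and the fused saliency is the belief in F
    plus half the residual ignorance (a pignistic-style point estimate)."""
    fused = None
    for s in maps:
        s = np.asarray(s, dtype=float)
        m = {'F': (1.0 - uncertainty) * s,
             'B': (1.0 - uncertainty) * (1.0 - s),
             'FB': np.full_like(s, uncertainty)}
        fused = m if fused is None else dempster_combine(fused, m)
    return fused['F'] + 0.5 * fused['FB']

# Two maps that agree at the first pixel and conflict at the second:
maps = [np.array([0.8, 0.8]), np.array([0.8, 0.2])]
fused = fuse_saliency_maps(maps)  # agreement is reinforced, conflict -> 0.5
```

Note how the combination behaves like a soft consensus: where the cue maps agree, the fused belief exceeds either input, and where they fully conflict, it falls back to 0.5.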
For better saliency estimation, in this paper we introduce a new method that fuses different saliency maps, obtained by different predefined algorithms based on color, motion, depth, patch-, pixel- and region-level information, under Dempster-Shafer (DS)-Evidence theory. The remaining sections of this paper are organized as follows. The related work based on pixel-level,
doi:10.3390/sym10060183
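As a concrete illustration of how pixel-level and region-level cues yield complementary saliency maps (the kind of inputs such a fusion scheme consumes), here is a minimal sketch with two toy cues; both functions and the synthetic test image are hypothetical, not taken from the paper:

```python
import numpy as np

def rarity_cue(gray, bins=16):
    """Pixel-level cue in the spirit of global contrast: intensity values
    that are rare in the image histogram are considered salient."""
    q = np.clip((gray * bins).astype(int), 0, bins - 1)
    hist = np.bincount(q.ravel(), minlength=bins) / q.size
    sal = 1.0 - hist[q]
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)

def boundary_cue(gray, border=2):
    """Region-level cue in the spirit of the background (boundary) prior:
    pixels unlike the image border are considered salient."""
    b = np.concatenate([gray[:border].ravel(), gray[-border:].ravel(),
                        gray[:, :border].ravel(), gray[:, -border:].ravel()])
    sal = np.abs(gray - b.mean())
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)

# Synthetic gray-scale image: a small bright patch on a dark background.
img = np.full((32, 32), 0.2)
img[10:22, 10:22] = 0.9
cues = [rarity_cue(img), boundary_cue(img)]
```

On this toy image both cues highlight the patch, but on real images they fail in different situations (e.g. large foregrounds for the rarity cue, objects touching the frame for the boundary cue), which is exactly why fusing them is attractive.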