A Multimodal Data Mining Framework for Revealing Common Sources of Spam Images

Chengcui Zhang, Wei-Bang Chen, Xin Chen, Richa Tiwari, Lin Yang, Gary Warner
2009 Journal of Multimedia  
This paper proposes a multimodal framework that clusters spam images so that ones from the same spam source/cluster are grouped together. By identifying the common sources of spam images, we can provide evidence in tracking spam gangs. For this purpose, text recognition and visual feature extraction are performed. Subsequently, a two-level clustering method is applied where images with visually similar illustrations are first grouped together. Then the clustering result from the first level is further refined using the textual clues (if applicable) contained in spam images. Our experimental results show the effectiveness of the proposed framework.
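The two-level idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the visual features, the text similarity measure, or the clustering algorithm. The sketch assumes each image has already been reduced to a numeric visual feature vector and an OCR text string (empty when no text was recognized), and it substitutes simple single-link grouping with Euclidean distance at the first level and Jaccard token overlap at the second. The names `two_level_cluster`, `visual_eps`, and `text_min` are hypothetical.

```python
from itertools import combinations
from math import dist


def _group(items, similar):
    """Single-link grouping: items connected by a chain of similar pairs share a group."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if similar(items[i], items[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())


def jaccard(a, b):
    """Token-set overlap between two text strings (assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def two_level_cluster(images, visual_eps=0.2, text_min=0.5):
    """images: dicts with 'id', 'visual' (tuple of floats), 'text' (str, may be empty)."""
    # Level 1: group images whose visual feature vectors are close.
    level1 = _group(images, lambda a, b: dist(a["visual"], b["visual"]) <= visual_eps)

    result = []
    for cluster in level1:
        # Level 2: refine by text only where text is available (assumption:
        # images without recognized text stay together as one visual group).
        with_text = [im for im in cluster if im["text"].strip()]
        without_text = [im for im in cluster if not im["text"].strip()]
        if with_text:
            result.extend(
                _group(with_text, lambda a, b: jaccard(a["text"], b["text"]) >= text_min)
            )
        if without_text:
            result.append(without_text)
    return [sorted(im["id"] for im in c) for c in result]


if __name__ == "__main__":
    sample = [
        {"id": "a", "visual": (0.10, 0.10), "text": "cheap watches online"},
        {"id": "b", "visual": (0.12, 0.10), "text": "cheap watches online now"},
        {"id": "c", "visual": (0.10, 0.15), "text": "buy stock tip today"},
        {"id": "d", "visual": (0.90, 0.90), "text": ""},
    ]
    print(two_level_cluster(sample))
```

In this toy input, images a, b, and c look alike and land in one first-level group; the text step then separates c from a and b because their recognized text shares no tokens. The thresholds are arbitrary illustration values, not values from the paper.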
doi:10.4304/jmm.4.5.313-320 fatcat:szbuvmrbqfeqjdgzqjpqwbktra