Splog Detection using Content, Time and Link Structures

Yu-Ru Lin, Hari Sundaram, Yun Chi, Jun Tatemura, Belle Tseng
2007 IEEE International Conference on Multimedia and Expo
This paper focuses on spam blog (splog) detection. Blogs are a highly popular new-media mechanism for social communication, and splogs corrupt blog search results as well as waste network resources. In our approach, we exploit unique blog temporal dynamics to detect splogs. The key idea is that splogs exhibit high temporal regularity in content and post time, as well as consistent linking patterns. Temporal content regularity is detected using a novel autocorrelation of post content. Temporal ... al regularity is determined using the entropy of the post time difference distribution, while the link regularity is computed using a HITS-based hub score measure. Experiments based on annotated ground truth from a real-world dataset show excellent results on splog detection tasks, with 90% accuracy.
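To make the post-time idea concrete, here is a minimal sketch of an entropy measure over the differences between consecutive post times. It is an illustration only, not the paper's formulation: the function name `post_interval_entropy`, the one-hour `bin_width`, and the sample timestamps are all assumptions introduced here. A blog that posts on a fixed schedule concentrates its gaps in one bin and scores near zero, while irregular posting spreads the gaps over many bins and scores higher.

```python
import math
from collections import Counter


def post_interval_entropy(post_times, bin_width=3600.0):
    """Shannon entropy (in bits) of the binned gaps between consecutive posts.

    post_times: timestamps in seconds, in any order.
    bin_width:  histogram bin size in seconds (hypothetical default of one hour).
    """
    times = sorted(post_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if not gaps:
        return 0.0  # fewer than two posts: no gap distribution to measure
    # Histogram of gaps: each gap falls into the bin floor(gap / bin_width).
    counts = Counter(int(gap // bin_width) for gap in gaps)
    total = len(gaps)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample data: one post every 24 hours versus scattered posting.
regular = [i * 86400.0 for i in range(10)]
irregular = [0.0, 1800.0, 9000.0, 30000.0, 100000.0, 104000.0, 250000.0]

regular_score = post_interval_entropy(regular)
irregular_score = post_interval_entropy(irregular)
```

Under these assumptions, a low score would be one signal of machine-like posting; the bin width controls how much timing jitter is still treated as "regular" and would need tuning on real data.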
doi:10.1109/icme.2007.4285079 dblp:conf/icmcs/LinSCTT07 fatcat:oul5q7j6ovdydiatkodx42kqmq