Web Communities Identification from Random Walks [chapter]

Jiayuan Huang, Tingshao Zhu, Dale Schuurmans
2006 Lecture Notes in Computer Science  
We propose a technique for identifying latent Web communities based solely on the hyperlink structure of the WWW, via random walks. Although the topology of the Directed Web Graph encodes important information about the content of individual Web pages, it also reveals useful meta-level information about user communities. Random walk models are capable of propagating local link information throughout the Web Graph, which can be used to reveal hidden global relationships between different regions of the graph. Variations of these random walk models are shown to be effective at identifying latent Web communities and revealing link topology. To efficiently extract these communities from the stationary distribution defined by a random walk, we exploit a computationally efficient form of directed spectral clustering. The performance of our approach is evaluated in real Web applications, where the method is shown to effectively identify latent Web communities based on link topology only.
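The pipeline the abstract outlines (random walk, stationary distribution, directed spectral clustering) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: a teleporting random walk on the link graph, a symmetrized transition operator weighted by the stationary distribution, and a two-way split on the sign of its second eigenvector. The function names, the `teleport` parameter, and the dense NumPy representation are hypothetical choices for illustration, not the chapter's implementation.

```python
import numpy as np

def stationary_distribution(A, teleport=0.01, iters=1000, tol=1e-12):
    """Transition matrix and stationary distribution of a teleporting walk.

    A[i, j] > 0 means page i links to page j (assumed dense, float-valued).
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    out = A.sum(axis=1, keepdims=True)
    # Pages without out-links jump uniformly (an assumption of this sketch).
    P = np.where(out > 0, A / np.where(out > 0, out, 1.0), 1.0 / n)
    # Small uniform teleport makes the chain irreducible and aperiodic.
    P = (1.0 - teleport) * P + teleport / n
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):  # power iteration on the left eigenvector
        nxt = pi @ P
        done = np.abs(nxt - pi).sum() < tol
        pi = nxt
        if done:
            break
    return P, pi / pi.sum()

def directed_bipartition(A, teleport=0.01):
    """Split nodes into two groups using a directed spectral relaxation."""
    P, pi = stationary_distribution(A, teleport)
    s = np.sqrt(pi)
    # M = Pi^{1/2} P Pi^{-1/2}; theta is its symmetric part.
    M = (s[:, None] * P) / s[None, :]
    theta = (M + M.T) / 2.0
    _, vecs = np.linalg.eigh(theta)  # eigenvalues in ascending order
    # The top eigenvector is sqrt(pi); the next one carries the cut.
    v = vecs[:, -2] / s
    return v >= 0
```

Calling `directed_bipartition` on a small adjacency matrix returns a boolean array with one entry per page; which group is labeled `True` is arbitrary because the eigenvector's sign is. For graphs of Web scale, a sparse matrix and an iterative eigensolver would replace the dense `eigh` call.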
doi:10.1007/11871637_21 fatcat:g2bjphankbbfjaiapfhtmbtcdi