Web Image Semantic Clustering [chapter]

Zhiguo Gong, Leong Hou U, Chan Wa Cheang
2005 Lecture Notes in Computer Science  
This paper presents a novel Web image clustering methodology based on the text associated with each image. In our approach, the semantics of a Web image are first represented as a vector of term-weight pairs. To correlate terms with a Web image correctly, the image's associated text is partitioned into semantic blocks according to the semantic structure of the text with respect to the image. The weight of a term in the vector of an embedded Web image is calculated with respect to both its local occurrence in the semantic blocks and the distances of those blocks to the image. With this method, 'Web image clustering' is transformed into 'term vector clustering', for which we employ a feature-based solution. To this end, we define association relations between two terms based on their co-occurrence in the associated text of the Web images. A term semantic network (TSN) is then constructed with terms as the nodes and association relations as the edges. The CHAMELEON algorithm is used to cluster the terms in the TSN, and the HITS algorithm is applied to determine the significance of the terms in each cluster. Finally, Web images are assigned to clusters based on the similarity between each image's term vector and the term vectors of the clusters.
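The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 1 / (1 + distance) decay, the cosine similarity measure, and the function names `term_vector`, `cooccurrence_edges`, `cosine`, and `assign` are all assumptions chosen for illustration, and the CHAMELEON and HITS steps are omitted, so the cluster term vectors are simply taken as given.

```python
from collections import Counter
from itertools import combinations
import math


def term_vector(blocks):
    """Build a term-weight vector for one image.

    blocks: list of (distance, terms) pairs, where distance is a
    non-negative block-to-image distance (hypothetical scale) and
    terms is the list of tokens in that semantic block.
    """
    weights = Counter()
    for distance, terms in blocks:
        # Assumed decay: nearer blocks count more. Not the paper's formula.
        decay = 1.0 / (1.0 + distance)
        for term, count in Counter(terms).items():
            weights[term] += count * decay
    return dict(weights)


def cooccurrence_edges(vectors):
    """Count, for each term pair, how many images' texts contain both terms."""
    edges = Counter()
    for vec in vectors:
        for a, b in combinations(sorted(vec), 2):
            edges[(a, b)] += 1
    return dict(edges)


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def assign(image_vec, cluster_vecs):
    """Return the name of the cluster whose term vector is most similar."""
    return max(cluster_vecs, key=lambda name: cosine(image_vec, cluster_vecs[name]))
```

For example, `term_vector([(0, ["tiger", "cat"]), (1, ["tiger", "zoo"])])` weights "tiger" highest because it occurs in both blocks, and `assign` then compares that vector with each cluster's term vector. In the approach the abstract describes, those cluster vectors would come from clustering the TSN and ranking the terms within each cluster.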
doi:10.1007/11575801_30 fatcat:pdl2334v6zeufjgpcs34teg76q