Search-based automatic image annotation via Flickr photos using tag expansion

Liang-Chi Hsieh, Winston H. Hsu
2010 IEEE International Conference on Acoustics, Speech and Signal Processing
Exponentially growing photo collections motivate the need for automatic image annotation to support effective manipulation (e.g., search, browsing). Most prior work relies on supervised learning approaches, which are impractical because of poor performance, the out-of-vocabulary problem, and the time required to acquire training data and to learn models. In this work, we argue for automatic image annotation by search over user-contributed photo sites (e.g., Flickr), which have accumulated rich human knowledge and billions of photos. The intuition is to leverage the surrounding tags of visually similar Flickr photos to annotate the unlabeled image. However, these tags are generally few and noisy. To tackle these challenges, we propose a novel three-fold solution: (1) a tag expansion method to address the sparsity of user-contributed tags; (2) improved tag relevance estimation using the visual consistency between candidate annotations and the unlabeled image; and (3) the semantic consistency among candidate tags. In experiments on Flickr photo benchmarks, and without requiring additional keywords, we show that the proposed method significantly outperforms prior work and even provides more diverse annotations.
Index Terms: Search-based automatic image annotation, tag expansion
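The search-then-vote idea outlined in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`cosine`, `cooccurrence`, `expand`, `annotate`), the use of cosine similarity over small feature vectors as the visual search, and the use of simple tag co-occurrence counts as a stand-in for tag expansion. The visual and semantic consistency steps are only approximated by similarity-weighted voting.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cooccurrence(photos):
    """Count how often each pair of tags appears on the same photo."""
    co = defaultdict(Counter)
    for _, tags in photos:
        for t, u in combinations(sorted(set(tags)), 2):
            co[t][u] += 1
            co[u][t] += 1
    return co


def expand(tags, co, per_tag=1):
    """Hypothetical expansion: add each tag's most frequent co-occurring tags."""
    expanded = set(tags)
    for t in tags:
        expanded.update(u for u, _ in co[t].most_common(per_tag))
    return expanded


def annotate(query, photos, k=2, top=3):
    """Vote for expanded tags of the k most similar photos, weighted by similarity."""
    co = cooccurrence(photos)
    neighbors = sorted(photos, key=lambda p: cosine(query, p[0]), reverse=True)[:k]
    scores = Counter()
    for feat, tags in neighbors:
        sim = cosine(query, feat)
        for t in expand(tags, co):
            scores[t] += sim
    return [t for t, _ in scores.most_common(top)]


# Toy data (invented): each photo is (feature vector, user tags).
photos = [
    ([1.0, 0.0], ["beach", "sea"]),
    ([0.9, 0.1], ["beach"]),
    ([0.0, 1.0], ["city", "night"]),
]
suggested = annotate([1.0, 0.05], photos)
```

In this toy setup the sparsely tagged second photo still contributes a vote for "sea" through expansion, which is the kind of sparsity problem the abstract describes; a real system would replace the two-dimensional vectors with actual visual features and a large-scale index.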
doi:10.1109/icassp.2010.5496215 fatcat:6ysc67uoejgx3i4bdosuuoztlu