Flickr-tag prediction using multi-modal fusion and meta information

Yu-Chuan Su, Tzu-Hsuan Chiu, Guan-Long Wu, Chun-Yen Yeh, Felix Wu, Winston Hsu
2013 Proceedings of the 21st ACM international conference on Multimedia - MM '13  
We present our evaluation and analysis on the Yahoo! Large-scale Flickr-tag Image Classification dataset. Our evaluations show that by combining multiple features and different classification models, the MAP of tag prediction can be significantly improved over ordinary linear classification. Further analysis shows that some tags are assigned not because of the visual content but because of the meta information of the images. Our experiments show that we can make more accurate predictions on certain tags using meta ... without any training process, compared with visual-content-based classifiers. Combining the meta information, multiple features, and multi-model fusion, we achieve significantly better performance than simple linear classification. We also evaluate the performance of various mid-level features, and the results suggest that the "Concept Bank" feature may be a promising direction for the task.
doi:10.1145/2502081.2508117 dblp:conf/mm/SuCWYWH13 fatcat:5lsngzlwnjeknnv5ujkao3vtdy
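
The abstract reports results as MAP (mean average precision) after fusing several classification models. As a rough illustration of what that involves, the sketch below averages per-image scores from two models for one tag and computes average precision on the fused ranking. It is not the authors' method: the weighted-average fusion rule, the function names, and the toy scores are all assumptions made for this example, since the abstract does not specify how the fusion is performed.

```python
from typing import Dict, List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one tag: mean of precision@k taken at the rank k of each positive image."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    tag_scores: Dict[str, Sequence[float]], tag_labels: Dict[str, Sequence[int]]
) -> float:
    """MAP: average of the per-tag AP values."""
    aps = [average_precision(tag_scores[tag], tag_labels[tag]) for tag in tag_labels]
    return sum(aps) / len(aps)


def fuse_scores(
    per_model_scores: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Assumed fusion rule: weighted average of each image's score across models."""
    total = sum(weights)
    n_images = len(per_model_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_model_scores)) / total
        for i in range(n_images)
    ]


# Toy data (hypothetical): three images, one tag, two models with different errors.
labels = [1, 0, 1]
model_a = [0.9, 0.8, 0.1]  # ranks the second positive image last
model_b = [0.2, 0.1, 0.9]  # ranks the first positive image second
fused = fuse_scores([model_a, model_b], weights=[1.0, 1.0])

ap_a = average_precision(model_a, labels)
ap_fused = average_precision(fused, labels)
map_fused = mean_average_precision({"tag": fused}, {"tag": labels})
```

On this toy data each model alone misranks one positive image, while the averaged scores place both positives ahead of the negative, so the fused AP is higher than that of model A alone. Real gains depend on how complementary the models are and on how the weights are chosen.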