Information-theoretic semantic multimedia indexing

João Magalhães, Stefan Rüger
2007 Proceedings of the 6th ACM international conference on Image and video retrieval - CIVR '07  
To index collections that mix text documents, image documents, and documents containing both text and images, one needs a model that supports heterogeneous document types. In this paper, we show how information theory supplies the tools necessary to develop a unique model for text, image, and text/image retrieval. In our approach, for each possible query keyword we estimate a maximum entropy model based exclusively on continuous features that were ... rocessed. The unique continuous feature space of text and visual data is constructed by using a minimum description length criterion to find the optimal feature-space representation (optimal from an information-theoretic point of view). We evaluate our approach in three experiments: text-only retrieval, image-only retrieval, and combined text and image retrieval.
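As an illustration of the per-keyword modelling idea in the abstract, here is a minimal sketch, not the authors' implementation. It assumes the binary maximum entropy model takes its standard logistic-regression form and is fitted by plain gradient ascent with a small L2 penalty. The function names, the toy two-dimensional features, and the hyperparameters are all hypothetical. The minimum description length step that selects the feature space is not covered here.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_keyword_model(features, labels, lr=0.5, epochs=500, l2=0.01):
    """Fit one binary maximum entropy (logistic regression) model for a keyword.

    features: list of continuous feature vectors, one per document (assumed input)
    labels:   1 if the document is annotated with the keyword, else 0
    """
    dim = len(features[0])
    n = len(features)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Gradient of the log-likelihood: (observed - predicted) * feature.
            err = y - sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        for j in range(dim):
            weights[j] += lr * (grad_w[j] / n - l2 * weights[j])
        bias += lr * grad_b / n
    return weights, bias


def keyword_probability(model, x):
    """Probability that a document with features x is relevant to the keyword."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def rank_documents(model, docs):
    """Return document indices sorted by decreasing keyword probability."""
    scores = [keyword_probability(model, x) for x in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy data (hypothetical): two continuous features per document.
features = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
labels = [1, 1, 0, 0]
model = train_keyword_model(features, labels)
ranking = rank_documents(model, [[0.15, 0.8], [0.85, 0.2]])
```

In this sketch one such model would be trained per query keyword, and a single-keyword query would be answered by ranking documents with `rank_documents`. Because text and image features are assumed to live in the same continuous vector, the same code path would serve text, image, and mixed documents.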
doi:10.1145/1282280.1282368 dblp:conf/civr/MagalhaesR07 fatcat:buqjdgf4dfdf5l76i7xyivjwou