Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization

Juan C. Caicedo, Jaafar BenAbdallah, Fabio A. González, Olfa Nasraoui
2012 Neurocomputing  
Massive image collections are increasingly available on the Web. These collections often incorporate complementary nonvisual data such as text descriptions, comments, user ratings and tags. These additional data modalities may provide a semantic complement to the visual content of the images, which could improve the performance of different image content analysis tasks. This paper presents a novel method based on non-negative matrix factorization to generate multimodal image representations that [...]te visual features and text information. The proposed approach discovers a set of latent factors that correlate multimodal data in the same representation space. We evaluated the potential of this multimodal image representation in various tasks associated with image indexing and search. Experimental results show that, in both tasks, the proposed method substantially outperforms multimodal latent semantic spaces generated by a singular value decomposition.
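The general idea of factorizing visual and text data into shared non-negative latent factors can be sketched as follows. This is a minimal illustration only, not the authors' algorithm: it assumes the two modalities are simply stacked row-wise into one matrix (rows are features, columns are images) and factorized with the standard Lee-Seung multiplicative updates. The toy matrices and all names (`visual`, `text`, `nmf`, `frobenius_error`) are hypothetical.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    cols_b = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_b] for row in A]

def frobenius_error(X, W, H):
    """Reconstruction error ||X - WH||_F."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5

def nmf(X, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate X (non-negative) as W @ H using Lee-Seung multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(X), len(X[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

# Hypothetical toy data: 4 images, 3 visual features, 2 text terms.
visual = [
    [0.9, 0.8, 0.1, 0.0],
    [0.1, 0.2, 0.9, 0.8],
    [0.5, 0.4, 0.5, 0.6],
]
text = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Assumption: the multimodal matrix is the row-wise stack of both modalities.
X = visual + text

# W holds the latent factors (features x rank); each column of H is the
# latent representation of one image (rank x images).
W, H = nmf(X, rank=2)
error = frobenius_error(X, W, H)
```

In this sketch each latent factor (a column of `W`) spans both visual features and text terms, so images that share visual patterns and tags end up with similar columns in `H`. A singular value decomposition of the same stacked matrix would give a comparable latent space, but with factors that can take negative values.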
doi:10.1016/j.neucom.2011.04.037 fatcat:6joubook3jd5zljqxj34thjbna