Learning a Multi-concept Video Retrieval Model with Multiple Latent Variables

Amir Mazaheri, Boqing Gong, Mubarak Shah
2016 IEEE International Symposium on Multimedia (ISM)
Effective and efficient video retrieval has become a pressing need in the "big video" era, and handling multi-concept queries is a central component of it. The objective of this work is to provide a principled model for calculating the ranking scores of videos in response to queries containing multiple concepts. This problem has long been overlooked, and the ranking score is usually computed simply as a weighted average of the corresponding concept detectors' scores. Our approach, which can be considered a latent ranking SVM, integrates the advantages of various recent works on text and image retrieval, such as choosing ranking over structured prediction and modeling the inter-dependencies between the query concepts and the other concepts. Videos consist of shots, and we use latent variables to account for the mutually complementary cues within and across shots. We introduce a simple and effective way to make our model robust to outliers and scarce data. Our approach yields superior performance not only on the queries seen during training but also on novel queries, some of which consist of more concepts than the queries used for training.
doi:10.1109/ism.2016.0132 dblp:conf/ism/MazaheriGS16 fatcat:mjuacchiv5h3lauentz6mpbfuy