Labelset anchored subspace ensemble (LASE) for multi-label annotation
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval - ICMR '12
In multimedia retrieval, multi-label annotation of images, text and video is challenging and has attracted rapidly growing interest over the past decades. The crux of multi-label annotation lies in 1) how to reduce model complexity when the label space expands exponentially with the number of labels; and 2) how to leverage label correlations, which are broadly believed to be useful for boosting annotation performance. In this paper, we propose "labelsets anchored subspace ensemble (LASE)" to solve both problems in an efficient scheme, whose training is a regularized matrix decomposition and whose prediction is an inference of group sparse representations. To shrink the label space, we first introduce "label distilling", which extracts the frequent labelsets to replace the original labels. In the training stage, the data matrix is decomposed into the sum of several low-rank matrices and a sparse residual via a randomized optimization, where each low-rank part defines a feature subspace mapped to a labelset. A manifold regularization is applied to map the labelset geometry to the geometry of the obtained subspaces. In the prediction stage, the group sparse representation of a new sample on the subspace ensemble is estimated by group lasso. The selected subspaces indicate the labelsets with which the sample should be annotated. Experiments on several benchmark datasets of texts, images, web data and videos validate the appealing performance of LASE in multi-label annotation.
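The prediction stage described above can be illustrated with a minimal sketch. It assumes the subspace bases have already been learned and are given as one matrix per labelset, and it solves the group lasso problem with a plain proximal gradient loop (block soft-thresholding), which is a stand-in solver chosen for brevity rather than the procedure used in the paper. The names `group_lasso`, `predict_labelsets`, `lam`, `n_iter` and `tol` are hypothetical.

```python
import numpy as np

def group_lasso(x, bases, lam=0.1, n_iter=500):
    """Approximately minimize 0.5*||x - sum_i B_i c_i||^2 + lam * sum_i ||c_i||_2.

    x     : feature vector of the new sample, shape (d,)
    bases : list of (d, r_i) matrices, one assumed basis per labelset
    Returns one coefficient vector c_i per subspace.
    """
    D = np.hstack(bases)                                  # stacked dictionary
    bounds = np.cumsum([0] + [B.shape[1] for B in bases]) # group boundaries
    step = 1.0 / np.linalg.norm(D, 2) ** 2                # 1 / Lipschitz constant
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # Gradient step on the least-squares term.
        z = c - step * (D.T @ (D @ c - x))
        # Block soft-thresholding: shrink each group's norm, zeroing weak groups.
        for a, b in zip(bounds[:-1], bounds[1:]):
            norm = np.linalg.norm(z[a:b])
            scale = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
            c[a:b] = scale * z[a:b]
    return [c[a:b] for a, b in zip(bounds[:-1], bounds[1:])]

def predict_labelsets(x, bases, labelsets, lam=0.1, tol=1e-6):
    """Return the labelsets whose subspaces receive a nonzero coefficient group."""
    coeffs = group_lasso(x, bases, lam)
    return [ls for ls, ci in zip(labelsets, coeffs) if np.linalg.norm(ci) > tol]
```

In this sketch, a sample that lies in the span of a single basis, with the bases mutually orthonormal, activates only that coefficient group, so only the corresponding labelset is returned. Larger values of `lam` select fewer subspaces.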