Learning Robust Graph Regularisation for Subspace Clustering

Elyor Kodirov, Tao Xiang, Zhenyong Fu, Shaogang Gong
Proceedings of the British Machine Vision Conference (BMVC), 2016
Various subspace clustering methods have benefited from introducing a graph regularisation term into their objective functions [2]. In this work, we identify two critical limitations of the graph regularisation term employed in existing subspace clustering models and provide solutions for both. First, the squared l2-norm used in the existing term is replaced by an l1-norm, making the regularisation term more robust against outlying data samples and noise. Solving l1 optimisation problems is notoriously expensive, so a new formulation and an efficient algorithm are provided to make our model tractable. Second, instead of assuming that the graph topology and weights are known a priori and fixed during learning, we propose to learn the graph [1] and integrate the graph learning into the proposed l1-norm graph regularised optimisation problem. Extensive experiments were conducted on five benchmark datasets.

Methodology. To address the aforementioned problems, we propose the following objective function:

min_{D,Y,W}  ||X − DY||_F^2 + λ1 ||Y||_1 + λ2 ||Y A_W||_1 + λ3 ||W||_F^2,
s.t. ||d_i||_2 ≤ 1,  W^T 1 = 1,  W ≥ 0,    (1)

where X ∈ R^{r×N} is a data matrix with N r-dimensional feature vectors as columns, D ∈ R^{r×d} is a dictionary with d atoms, W is an affinity matrix that captures the topology of the data, A_W is a matrix obtained by applying eigendecomposition to W, and Y ∈ R^{d×N} is a sparse code matrix. We explain each term in the following:
(1) ||X − DY||_F^2 is the reconstruction error term, evaluating how well a linear combination of the atoms (columns) of the dictionary D can approximate the data matrix X.
(2) λ1 ||Y||_1 is a sparsity regularisation term on Y, with a weighting factor λ1, favouring a small number of atoms being used for the reconstruction.
(3) λ2 ||Y A_W||_1 is our proposed robust graph regularisation term, weighted by λ2. Note that we use the l1-norm instead of the l2-norm.
(4) λ2 ||Y A_W||_1 + λ3 ||W||_F^2, weighted by λ2 and λ3 and subject to the constraints W^T 1 = 1 and W ≥ 0, is the graph learning term. The constraints W^T 1 = 1 and W ≥ 0 ensure the validity of the learned graph, while the constraint ||d_i||_2 ≤ 1 (d_i being a column of D, with i = 1, ..., d) enforces the learned dictionary atoms to be compact.
Remark. Terms (3) and (4) are the robust graph regularisation and graph learning terms, while the first two terms, (1) and (2), constitute the conventional objective function of dictionary learning.
Optimisation. To solve the objective in Eq. (1), we develop an algorithm based on ADMM (the alternating direction method of multipliers).
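As a minimal sketch of the quantities involved, the objective in Eq. (1) can be evaluated directly with NumPy, together with two building blocks that typically appear in ADMM-style solvers for such problems: elementwise soft-thresholding (the proximal operator of the l1-norm) and Euclidean projection onto the probability simplex (which enforces the per-column graph constraints W^T 1 = 1, W ≥ 0). This is an illustrative sketch, not the authors' implementation; in particular, the choice of A_W as the eigenvector matrix of the symmetrised W is an assumption, since the text only states that A_W is obtained by eigendecomposition of W.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1; the standard update for the
    # l1 terms ||Y||_1 and ||Y A_W||_1 inside an ADMM iteration.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def project_simplex(w):
    # Euclidean projection of a vector onto {w : w >= 0, 1^T w = 1},
    # i.e. the constraint set for each column of the affinity W.
    u = np.sort(w)[::-1]                      # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, len(w) + 1)
    rho = np.nonzero(u - (css - 1.0) / idx > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(w - theta, 0.0)

def objective(X, D, Y, W, lam1, lam2, lam3):
    # Evaluate Eq. (1). A_W is taken here as the eigenvector matrix of
    # the symmetrised affinity (an assumption, see lead-in above).
    _, A_W = np.linalg.eigh((W + W.T) / 2.0)
    recon = np.linalg.norm(X - D @ Y, 'fro') ** 2   # term (1)
    sparsity = lam1 * np.abs(Y).sum()               # term (2)
    graph_reg = lam2 * np.abs(Y @ A_W).sum()        # term (3)
    graph_pen = lam3 * np.linalg.norm(W, 'fro') ** 2  # part of term (4)
    return recon + sparsity + graph_reg + graph_pen
```

In a full ADMM solver, auxiliary splitting variables would be introduced for the two l1 terms so that each subproblem reduces to either a least-squares solve, a soft-thresholding step, or a simplex projection per column of W.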
doi:10.5244/c.30.138