Large-Scale Spectral Clustering Based on Representative Points
Mathematical Problems in Engineering
Spectral clustering (SC) has attracted more and more attention due to its effectiveness in machine learning. However, most traditional spectral clustering methods still face challenges in the successful application of large-scale spectral clustering problems mainly due to their high computational complexity οn3, where n is the number of samples. In order to achieve fast spectral clustering, we propose a novel approach, called representative point-based spectral clustering (RPSC), to efficiently
... deal with the large-scale spectral clustering problem. The proposed method first generates two-layer representative points successively by BKHK (balanced k-means-based hierarchical k-means). Then it constructs the hierarchical bipartite graph and performs spectral analysis on the graph. Specifically, we construct the similarity matrix using the parameter-free neighbor assignment method, which avoids the need to tune the extra parameters. Furthermore, we perform the coclustering on the final similarity matrix. The coclustering mechanism takes advantage of the cooccurring cluster structure among the representative points and the original data to strengthen the clustering performance. As a result, the computational complexity can be significantly reduced and the clustering accuracy can be improved. Extensive experiments on several large-scale data sets show the effectiveness, efficiency, and stability of the proposed method.