Privacy-preserving Constrained Spectral Clustering Algorithm for Large-scale Data Sets

Wenfen Liu, Ji Li, Jianghong Wei, Mao Ye, Xuexian Hu
2019 IET Information Security  
With increasing concern about the preservation of personal privacy, privacy-preserving data mining has become a hot topic in recent years. Spectral clustering is one of the most widely used clustering algorithms for exploratory data analysis and usually has to deal with sensitive data sets. How to conduct privacy-preserving spectral clustering is an urgent problem to be solved. In this study, the authors focus on introducing the notion of differential privacy, which is considered the de facto standard of privacy-preserving data analysis, into spectral clustering. Specifically, by combining the well-studied constrained spectral clustering with the Wishart mechanism in a novel way, the authors propose a differentially private constrained spectral clustering (DP-CSC) algorithm. The DP-CSC algorithm is proved to capture an asymptotic property and to achieve ϵ-differential privacy. To illustrate the effectiveness and efficiency of DP-CSC, the authors conduct experiments on five real-world data sets. The results indicate that the DP-CSC algorithm can provide acceptable clustering accuracy with short running time while preserving individual privacy.
doi:10.1049/iet-ifs.2019.0255 fatcat:qpk72f4shzb45k6aomtbrxcbwm