Mining association rules with non-uniform privacy concerns

Yi Xia, Yirong Yang, Yun Chi
2004 Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery - DMKD '04  
Privacy concerns have become an important issue in data mining. A popular way to preserve privacy is to randomize the dataset to be mined in a systematic way and mine the randomized dataset instead. On the other hand, people usually have different privacy concerns for different attributes in data. E.g., in survey data, the sensitivity of questions varies. Appropriate use of this information can lead to more accurate data mining results. However, this information has not been fully utilized by many privacy preserving association rule mining algorithms. In this paper, we generalize the privacy preserving association rule mining problem by allowing different attributes to have different levels of privacy, that is, using different randomization factors for values of different attributes in the randomization process. We also propose an efficient algorithm RE (Recursive Estimation) to estimate the support of itemsets under this framework. Both theoretical and empirical results show that the use of non-uniform randomization factors improves the accuracy of the support estimates, compared to the use of one conservative randomization factor.
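To make the idea of per-attribute randomization factors concrete, here is a minimal sketch. It is not the RE algorithm from the paper, which the abstract does not specify. It assumes a simple bit-flipping scheme in which item j keeps its true presence bit with probability keep_prob[j], and it only recovers the support of a single item by inverting that flip. The function names, the `keep_prob` parameter, and the flipping scheme itself are all illustrative assumptions.

```python
import random


def randomize(transactions, keep_prob, rng):
    """Randomize 0/1 transactions column by column (hypothetical scheme).

    Item j keeps its true bit with probability keep_prob[j] and is flipped
    otherwise, so more sensitive items can be given a lower keep_prob.
    """
    return [
        [bit if rng.random() < keep_prob[j] else 1 - bit
         for j, bit in enumerate(row)]
        for row in transactions
    ]


def estimate_support(randomized, j, p):
    """Estimate the true support of item j from randomized data.

    If s is the true support, the expected observed support is
    s * p + (1 - s) * (1 - p), which is inverted here. Requires p != 0.5.
    """
    observed = sum(row[j] for row in randomized) / len(randomized)
    return (observed - (1 - p)) / (2 * p - 1)
```

Under this assumed scheme, the variance of the estimate grows as p approaches 0.5, so an item randomized with a milder factor yields a tighter estimate than it would under a single conservative factor applied to every item. This is consistent with the accuracy claim in the abstract, but it does not reproduce the paper's analysis.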
doi:10.1145/1008694.1008699 dblp:conf/dmkd/XiaYC04 fatcat:shyb3dvvnbhcholakkkws7ictu