Distortion Based Algorithms for Privacy Preserving Frequent Item Set Mining

K Srinivasa Rao, V Chiranjeevi
2011 International Journal of Data Mining & Knowledge Management Process  
Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. In order to preserve the privacy of the client in data mining process, a variety of techniques based on random perturbation of data records have been proposed recently. We focus on an improved distortion process that tries to enhance the accuracy by selectively modifying the list of items. The normal distortion procedure does not provide
more » ... he flexibility of tuning the probability parameters for balancing privacy and accuracy parameters, and each item's presence/absence is modified with an equal probability. In improved distortion technique, frequent one item-sets, and nonfrequent one item-sets are modified with a different probabilities controlled by two probability parameters fp, nfp respectively. The owner of the data has a flexibility to tune these two probability parameters (fp and nfp) based on his/her requirement for privacy and accuracy. The experiments conducted on real time datasets confirmed that there is a significant increase in the accuracy at a very marginal cost in privacy.
doi:10.5121/ijdkp.2011.1402 fatcat:nquyw2ihmnhezptg6fqshvti6q