Relative privacy threats and learning from anonymized data

Michele Boreale, Fabio Corradi, Cecilia Viscardi
2019 IEEE Transactions on Information Forensics and Security  
We consider group-based anonymization schemes, a popular approach to data publishing. This approach aims at protecting privacy of the individuals involved in a dataset, by releasing an obfuscated version of the original data, where the exact correspondence between individuals and attribute values is hidden. When publishing data about individuals, one must typically balance the learner's utility against the risk posed by an attacker, potentially targeting individuals in the ... t. Accordingly, we propose a unified Bayesian model of group-based schemes and a related MCMC methodology to learn the population parameters from an anonymized table. This allows one to analyze the risk for any individual in the dataset to be linked to a specific sensitive value, when the attacker knows the individual's nonsensitive attributes, beyond what is implied for the general population. We call this relative threat analysis. Finally, we illustrate the results obtained with the proposed methodology on a real-world dataset.
doi:10.1109/tifs.2019.2937640 fatcat:oketvtntrvde7hfj2lroufoyma