Quantifying Web-Search Privacy
Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security - CCS '14
Web search queries reveal extensive information about users' personal lives to the search engines and Internet eavesdroppers. Obfuscating search queries through adding dummy queries is a practical and user-centric protection mechanism to hide users' search intentions and interests. Despite few such obfuscation methods and tools, there is no generic quantitative methodology for evaluating users' web-search privacy. In this paper, we provide such a methodology. We formalize adversary's background
... knowledge and attacks, the users' privacy objectives, and the algorithms to evaluate effectiveness of query obfuscation mechanisms. We build upon machine-learning algorithms to learn the linkability between user queries. This encompasses the adversary's knowledge about the obfuscation mechanism and the users' web-search behavior. Then, we quantify privacy of users with respect to linkage attacks. Our generic attack can run against users for which the adversary does not have any background knowledge, as well as for the cases where some prior queries from the target users are already observed. We quantify privacy at the query level (the link between user's queries) and the semantic level (user's topics of interest). We design a generic tool that can be used for evaluating generic obfuscation mechanisms, and users with different web search behavior. To illustrate our approach in practice, we analyze and compare privacy of users for two example obfuscation mechanisms on a set of real web-search logs.