A Survey on Web Tracking: Mechanisms, Implications, and Defenses

Tomasz Bujlow, Valentin Carela-Espanol, Beom-Ryeol Lee, Pere Barlet-Ros
2017 Proceedings of the IEEE  
Privacy seems to be the Achilles' heel of today's web. Most web services make continuous efforts to track their users and to obtain as much personal information as they can from the things they search, the sites they visit, the people they contact, and the products they buy. This information is mostly used for commercial purposes, which go far beyond targeted advertising. Although many users are already aware of the privacy risks involved in the use of Internet services, the particular methods and technologies used for tracking them are much less known. In this survey, we review the existing literature on the methods used by web services to track users online, as well as their purposes, implications, and possible user defenses. We present five main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, and other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as these usually draw on a wide variety of creative techniques. We also show how users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is used and what its possible implications are for users. For each of the tracking methods, we present possible defenses. Some of them are specific to a particular tracking approach, while others are more universal (they block more than one threat). Finally, we present future trends in user tracking and show that they can potentially pose significant threats to users' privacy.

Timeline of tracking mechanisms:
(year not shown) HTTP sessions
1994: HTTP cookies [13]
2000: Embedding identifiers in cached documents [14]; Loading performance tests [14]; DNS cache [14]
2007: HTTP authentication cache [15]
2008: window.name DOM property [16]; Clickjacking [17]
2009: Flash cookies [18]; Evercookies [19]
2010: Java JNLP PersistenceService [20]; Silverlight Isolated Storage [20]; IE userData storage [20]; HTML5 Global, Local, and Session Storage [20]
2011: Flash LocalConnection object [21]; ETags and Last-Modified HTTP headers [22]; HTTP 301 redirect cache [23]; TLS Session Resumption cache and TLS Session IDs [24]
2012: Cookie leaks/syncing [25]; Advertising networks [25]; HTTP Strict Transport Security cache [26]; Browser instance fingerprinting using canvas [27]
2014: Web SQL DB and HTML5 IndexedDB [28]; Headers attached to outgoing HTTP requests [29]
2016: ?
doi:10.1109/jproc.2016.2637878 fatcat:vpykj2neezgebjhh5imvulenpq