Hierarchical substring caching for efficient content distribution to low-bandwidth clients

Utku Irmak, Torsten Suel
2005 Proceedings of the 14th international conference on World Wide Web - WWW '05  
While overall bandwidth in the internet has grown rapidly over the last few years, and an increasing number of clients enjoy broadband connectivity, many others still access the internet over much slower dialup or wireless links. To address this issue, a number of techniques for optimized delivery of web and multimedia content over slow links have been proposed, including protocol optimizations, caching, compression, and multimedia transcoding, and several large ISPs have recently begun to
more » ... y promote dialup acceleration services based on such techniques. A recent paper by Rhea, Liang, and Brewer proposed an elegant technique called value-based caching that caches substrings of files, rather than entire files, and thus avoids repeated transmission of substrings common to several pages or page versions. We propose and study a hierarchical substring caching technique that provides significant savings over this basic approach. We describe several additional techniques for minimizing overheads and perform an evaluation on a large set of real web access traces that we collected. In the second part of our work, we compare our approach to a widely studied alternative approach based on delta compression, and show how to integrate the two for best overall performance. The studied techniques are typically employed in a clientproxy environment, with each proxy serving a large number of clients, and an important aspect is how to conserve resources on the proxy while exploiting the significant memory and CPU power available on current clients.
doi:10.1145/1060745.1060757 dblp:conf/www/IrmakS05 fatcat:lk5uopncerey5pfrw3imxwg5fq