The content and access dynamics of a busy Web site

Venkata N. Padmanabhan, Lili Qiu
2000 Computer communication review  
In this paper, we study the dynamics of the MSNBC news site, one of the busiest Web sites in the Internet today. Unlike many other efforts that have analyzed client accesses as seen by proxies, we focus on the server end. We analyze the dynamics of both the server content and client accesses made to the server. The former considers the content creation and modification process while the latter considers page popularity and locality in client accesses. Some of our key results are: (a) files
tend to change little when they are modified, (b) a small set of files tends to get modified repeatedly, (c) file popularity follows a Zipf-like distribution with a parameter that is much larger than reported in previous, proxy-based studies, and (d) there is significant temporal stability in file popularity but not much stability in the domains from which clients access the popular content. We discuss the implications of these findings for techniques such as Web caching (including cache consistency algorithms), and prefetching or server-based "push" of Web content.
doi:10.1145/347057.347413 fatcat:d7x5wx3marb53atww2t5uidceu