A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2015.
Previous studies have highlighted the rapidity with which new content arrives on the web. We study the extent to which this new content can be efficiently discovered in the crawling model. Our study has two parts. First, we employ a maximum cover formulation to study the inherent difficulty of the problem in a setting in which we have perfect estimates of likely sources of links to new content. Second, we relax the requirement of perfect estimates into a more realistic setting in which …
doi:10.1145/1242572.1242630 dblp:conf/www/DasguptaGKOPT07 fatcat:crjgmpw2njerviyer7agf7k77a
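To make the maximum cover formulation mentioned in the abstract concrete, here is a minimal sketch. It assumes the perfect-estimate setting: for each known page we already know exactly which new pages it links to. It uses the standard greedy heuristic for maximum coverage, which may or may not match the procedure the authors evaluate. The function name `greedy_max_cover`, the `links` mapping, and the `budget` parameter are hypothetical names introduced for illustration.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_cover(links: Dict[Hashable, Set[Hashable]], budget: int) -> List[Hashable]:
    """Choose up to `budget` known pages to recrawl.

    `links` maps each known page to the set of new pages it links to
    (assumed to be known exactly, i.e. perfect estimates). At each step
    the page that exposes the most not-yet-covered new pages is chosen.
    """
    covered: Set[Hashable] = set()
    chosen: List[Hashable] = []
    for _ in range(budget):
        # Candidate with the largest marginal gain; ties go to the
        # earliest page in the mapping's iteration order.
        best = max(
            (page for page in links if page not in chosen),
            key=lambda page: len(links[page] - covered),
            default=None,
        )
        # Stop early when no remaining page would reveal anything new.
        if best is None or not (links[best] - covered):
            break
        chosen.append(best)
        covered |= links[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical toy graph: three known pages, four new pages.
    example = {
        "hub_a": {"n1", "n2", "n3"},
        "hub_b": {"n3", "n4"},
        "hub_c": {"n1"},
    }
    print(greedy_max_cover(example, budget=2))
```

In this toy graph, recrawling `hub_a` and then `hub_b` reaches all four new pages, so `hub_c` is never needed. The greedy choice is a well-known approximation for maximum coverage; it stands in here for whatever exact or approximate method a real crawler would use.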