An adaptive crawler for locating hidden-Web entry points
2007
Proceedings of the 16th international conference on World Wide Web - WWW '07
In this paper we describe new adaptive crawling strategies to efficiently locate the entry points to hidden-Web sources. The fact that hidden-Web sources are very sparsely distributed makes the problem of locating them especially challenging. We deal with this problem by using the contents of pages to focus the crawl on a topic; by prioritizing promising links within the topic; and by also following links that may not lead to immediate benefit. We propose a new framework whereby crawlers
doi:10.1145/1242572.1242632
dblp:conf/www/BarbosaF07a
fatcat:jjwhm5ppojbnbfetxuuvmqae6m