A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf.
Comparing Topic Coverage in Breadth-First and Depth-First Crawls Using Anchor Texts
[chapter]
2016
Lecture Notes in Computer Science
Web archives preserve the fast-changing Web by repeatedly crawling its content. The crawling strategy influences which data is archived. We use the link anchor text of two Web crawls created with different crawling strategies to compare their coverage of past popular topics. One of our crawls was collected by the National Library of the Netherlands (KB) using a depth-first strategy on manually selected websites from the .nl domain, with the goal to crawl websites as completes
doi:10.1007/978-3-319-43997-6_11
fatcat:odndkvcgzbdpjcq7pzznn6yvda