A focused crawler combinatory link and content model based on T-Graph principles
2016
Computer Standards & Interfaces
The two significant tasks of a focused Web crawler are finding relevant topic-specific documents on the Web and analytically prioritizing them for later effective and reliable download. For the first task, we propose a sophisticated custom algorithm to fetch and analyze the most effective HTML structural elements of the page as well as the topical boundary and anchor text of each unvisited link, based on which the topical focus of an unvisited page can be predicted and elicited with a high
doi:10.1016/j.csi.2015.07.001
fatcat:plfwvmmtnfaktowcqtslnv4l4e