Towards complete coverage in focused web harvesting
2015
Proceedings of the 17th International Conference on Information Integration and Web-based Applications & Services - iiWAS '15
With the goal of harvesting all information about a given entity, in this paper we try to harvest all documents matching a given query submitted to a search engine. The objective is to retrieve all information about, for instance, "Michael Jackson", "Islamic State", or "FC Barcelona" from data indexed by search engines, or from hidden data behind web forms, using a minimum number of queries. Policies of web search engines usually do not allow accessing all of the matching query search results for
doi:10.1145/2837185.2837208
dblp:conf/iiwas/KhelghatiHK15
fatcat:35omkmlxzzazfm7ksqmnukrhei