NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT]
SMART CRAWLER: A TWO-STAGE CRAWLER FOR EFFICIENTLY HARVESTING DEEP-WEB INTERFACES

Ms Asmita, D Rathod
2016 unpublished
The deep web is growing at a very fast pace, and many techniques have been proposed to help locate deep-web interfaces efficiently. However, due to the large volume of web resources and the dynamic nature of the deep web, achieving wide coverage and high efficiency is a challenging issue. In this paper the author proposes a two-stage framework, namely Smart Crawler, for efficiently harvesting deep-web interfaces. Smart Crawler performs site-based searching for center pages by ... g search engines, avoiding visits to a large number of pages. To achieve more accurate results for a focused crawl, Smart Crawler ranks websites so that highly relevant ones for a given topic are prioritized. Smart Crawler achieves fast in-site searching by finding the most relevant links with adaptive link-ranking. To eliminate bias toward visiting some highly relevant links in hidden web directories, the author has designed a link tree data structure to achieve wider coverage of a website.
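
The abstract does not say how the link tree is built or traversed, so the following is only a minimal sketch of the general idea, under stated assumptions: links are grouped by the first directory of their URL path, and the crawl order takes one link from each directory in turn so that a directory with many links does not crowd out the others. The function names `build_link_tree` and `balanced_order`, the one-level grouping, and the round-robin selection are all hypothetical illustrations, not the author's implementation.

```python
from collections import defaultdict, deque
from urllib.parse import urlparse


def build_link_tree(urls):
    """Group URLs by first path directory (assumed one-level 'link tree')."""
    tree = defaultdict(deque)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Links sitting directly under the site root share the "" bucket.
        key = segments[0] if len(segments) > 1 else ""
        tree[key].append(url)
    return tree


def balanced_order(tree):
    """Return links round-robin across directories (assumed selection rule)."""
    # Copy the queues so the tree itself is left unchanged.
    queues = [deque(tree[key]) for key in sorted(tree)]
    order = []
    while queues:
        remaining = []
        for queue in queues:
            order.append(queue.popleft())
            if queue:
                remaining.append(queue)
        queues = remaining
    return order


if __name__ == "__main__":
    links = [
        "http://example.com/books/a",
        "http://example.com/books/b",
        "http://example.com/books/c",
        "http://example.com/search",
        "http://example.com/music/x",
    ]
    for link in balanced_order(build_link_tree(links)):
        print(link)
```

With this ordering, the single links under the root and under `music/` are visited before the second and third `books/` links, which is one simple way to widen coverage of a site.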
fatcat:kypvo4tnznawnd6hln7mv7bqbm