A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2015.
The file type is application/pdf.
An Improvised Algorithm for Relevant Content Extraction from Web Pages
2014
Journal of Emerging Technologies in Web Intelligence
The World Wide Web (WWW) is now a popular medium through which people all around the world can spread and gather information of all kinds. However, web pages also contain a large amount of irrelevant and redundant information. Such information complicates various web mining tasks, such as web page crawling, web page classification, link-based ranking, and topic distillation. Previously, relevant content was extracted only from the textual part of web pages. But nowadays the content on a web page is not only in
doi:10.4304/jetwi.6.2.226-230
fatcat:alwwwgjqhfhcvmxvgabxoseo5m