Template Extraction from Heterogeneous Web Pages with Cosine Similarity

Kulkarni A.H., Patil B. M.
2014 International Journal of Computer Applications  
Nowadays, the detection of templates from large numbers of web pages has received a lot of attention. Template detection improves the performance of clustering, classification, and search engines. In this work we propose a novel algorithm for template extraction based on cosine similarity. We use cosine similarity to cluster the web documents and then, using the underlying structure of the documents, find the template for each cluster. Our experimental evaluation shows that our approach is effective in terms of computing time and clustering cost.
doi:10.5120/15186-3546 fatcat:l5xkmqhynvcuriq4jm4sn3agli
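
The abstract gives only an outline of the approach: cluster pages by cosine similarity, then derive a template per cluster. The following is a minimal sketch of that general idea, not the authors' algorithm. The bag-of-tokens page representation, the greedy single-pass clustering, the 0.8 threshold, and all function names (`tokenize`, `cluster_pages`, `extract_template`) are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(html):
    # Assumption: a page is represented as a bag of tag names and words.
    return Counter(re.findall(r"</?\w+|\w+", html.lower()))


def cosine_similarity(a, b):
    # Cosine of the angle between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_pages(pages, threshold=0.8):
    # Assumption: greedy single-pass clustering. Each page joins the first
    # cluster whose first member is at least `threshold` similar to it.
    vectors = [tokenize(p) for p in pages]
    clusters = []  # each cluster is a list of page indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def extract_template(pages, cluster):
    # Assumption: the template is the set of tokens shared by every page
    # in the cluster; page-specific content falls outside it.
    common = None
    for i in cluster:
        tokens = set(tokenize(pages[i]))
        common = tokens if common is None else common & tokens
    return common or set()
```

Calling `cluster_pages(pages)` on a list of HTML strings returns groups of page indices, and `extract_template(pages, group)` returns the tokens common to one group. A structure-aware variant would compare DOM paths rather than flat tokens, which is closer to what the abstract calls the underlying structure of the documents.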