Vision Based Page Segmentation Algorithm: Extended and Perceived Success [chapter]

M. Elgin Akpınar, Yeliz Yeşilada
2013 Lecture Notes in Computer Science  
Web pages consist of different visual segments that serve different purposes. Typical structural segments are the header, right or left columns, and main content. Segments can also have a nested structure, meaning that some segments may contain other segments. Understanding these segments is important for properly displaying web pages on small-screen devices and in alternative forms such as audio for screen reader users. Various techniques exist for identifying visual segments in a web page. One successful approach is the Vision Based Segmentation Algorithm (VIPS Algorithm), which uses both the underlying source code and the visual rendering of a web page. However, this approach has some limitations, and this paper explains how we have extended and improved VIPS and implemented it in Java. We have also conducted online user evaluations to investigate how people perceive the success of the segmentation approach and at which granularity they prefer to see a web page segmented. This paper presents the preliminary results, which show that people perceive segmentation with higher granularity as better segmentation regardless of web page complexity.
doi:10.1007/978-3-319-04244-2_22 fatcat:xkrfifegynbivg7cxeckrqfybe