Analysis of book documents' table of content based on clustering
2009
2009 10th International Conference on Document Analysis and Recognition
Table of contents (TOC) recognition has attracted a great deal of attention in recent years. After reviewing the merits and drawbacks of the existing TOC recognition methods, we have observed that book documents are multi-page documents with intrinsic local format consistency. Based on this finding we introduce an automatic TOC analysis method through clustering. This method first detects the decorative elements in TOC pages. Then it learns a layout model used in the TOC pages through
doi:10.1109/icdar.2009.143
dblp:conf/icdar/GaoTLTC09
fatcat:s36r7ryr7fcf5dtwsvrjexcunq