Learning Image Anchor Templates for Document Classification and Data Extraction

Prateek Sarkar
2010 20th International Conference on Pattern Recognition
Image anchor templates are used in document image analysis for document classification, data localization, and other tasks. Current tools allow human operators to mark out small sub-images from documents to act as anchor templates. However, this requires time and expertise, because operators have to make informed decisions based on the behavior of the template matching algorithms and the expected degradation patterns in documents. We propose learning templates for a task automatically and quickly from a few training examples. Document classification or data localization can be done more robustly by combining evidence from many more discriminating templates (e.g., hundreds) than would be practicable for operators to specify.
doi:10.1109/icpr.2010.837 dblp:conf/icpr/Sarkar10 fatcat:zw5hwuekljh37kcj3ohvjvb2aa
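
The idea of classifying a document by combining evidence from several anchor templates can be illustrated with a minimal sketch. This is not the paper's learning method: it assumes the templates are already given, uses plain normalized cross-correlation as the matcher, and combines evidence by a simple fraction-of-templates-matched vote. The function names `ncc_best_match` and `classify` and the `threshold` parameter are hypothetical, chosen only for this illustration.

```python
import numpy as np


def ncc_best_match(image, template):
    """Slide `template` over `image` and return (best_score, (row, col)).

    The score is the normalized cross-correlation, in [-1, 1].
    Constant windows or templates (zero variance) are skipped.
    """
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -1.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            if denom == 0:
                continue  # correlation is undefined here
            score = float((w * t).sum() / denom)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos


def classify(image, templates_by_class, threshold=0.9):
    """Assumed voting rule: score each class by the fraction of its
    templates whose best match reaches `threshold`, then pick the
    class with the highest score."""
    scores = {}
    for label, templates in templates_by_class.items():
        hits = sum(ncc_best_match(image, t)[0] >= threshold for t in templates)
        scores[label] = hits / len(templates)
    return max(scores, key=scores.get), scores
```

With many templates per class, a single missed or spurious match changes the class score only slightly, which is the intuition behind combining evidence from a large template set. A practical system would replace the double loop with an FFT-based or library matcher and would also use the matched positions for data localization.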