Layout Analysis for Arabic Historical Document Images Using Machine Learning
2012 International Conference on Frontiers in Handwriting Recognition
Page layout analysis is a fundamental step of any document image understanding system. We introduce an approach that segments text appearing in page margins (a.k.a. side-notes text) from manuscripts with complex layout formats. Simple and discriminative features are extracted at the connected-component level, and robust feature vectors are subsequently generated. A multilayer perceptron classifier is used to assign each connected component to the relevant class of text. A voting scheme is then ...
doi:10.1109/icfhr.2012.227 dblp:conf/icfhr/BukhariBAE12 fatcat:npw7po3mjrad7cg62bti736uzq
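To make the connected-component feature step from the abstract more concrete, here is a minimal sketch in plain Python. The abstract does not list the paper's actual features, so the four used here (normalized height, normalized width, foreground density, normalized horizontal centre) are illustrative assumptions, as are the names `connected_components` and `component_features` and the toy `page` array. The classifier and the voting scheme are left out; the resulting vectors could be passed to any classifier, such as a multilayer perceptron.

```python
from collections import deque


def connected_components(img):
    """Group 8-connected foreground pixels (value 1) into components.

    Returns a list of components, each a list of (row, col) pixels,
    in raster-scan order of their first pixel.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps


def component_features(pixels, page_h, page_w):
    """Build an illustrative feature vector for one component.

    These features are assumptions for this sketch, not the paper's set.
    """
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    top, bottom = min(ys), max(ys)
    left, right = min(xs), max(xs)
    height = bottom - top + 1
    width = right - left + 1
    return [
        height / page_h,                 # normalized height
        width / page_w,                  # normalized width
        len(pixels) / (height * width),  # foreground density in bounding box
        (left + right) / 2 / page_w,     # normalized horizontal centre
    ]


# Toy binary page: 1 = ink, 0 = background (hypothetical data).
page = [
    [1, 1, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
]

comps = connected_components(page)
features = [component_features(p, len(page), len(page[0])) for p in comps]
```

The horizontal-centre feature is included because margin text tends to sit near the page edges, which is one plausible cue for separating side notes from main-body text; whether the paper uses such a cue is not stated in the abstract.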