Layout Analysis for Arabic Historical Document Images Using Machine Learning
2012
2012 International Conference on Frontiers in Handwriting Recognition
Page layout analysis is a fundamental step of any document image understanding system. We introduce an approach that segments text appearing in page margins (a.k.a. side-note text) from manuscripts with complex layout formats. Simple and discriminative features are extracted at the connected-component level, and robust feature vectors are subsequently generated. A multilayer perceptron classifier is used to classify connected components into the relevant class of text. A voting scheme is then
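As a rough illustration of what connected-component-level feature extraction can look like, the sketch below labels the foreground components of a tiny binary image and computes a few shape and position descriptors for each. The abstract does not list the paper's actual features, so the names `rel_height`, `rel_width`, `density`, and `rel_center_x` are assumptions chosen for illustration only, as are the helper functions and the toy `page` array. The classifier and voting stages are not shown.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground (value 1) components as lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def component_features(pixels, rows, cols):
    """Hypothetical per-component features (not taken from the paper)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "rel_height": height / rows,                      # box height relative to the page
        "rel_width": width / cols,                        # box width relative to the page
        "density": len(pixels) / (height * width),        # foreground fill of the bounding box
        "rel_center_x": (min(xs) + max(xs)) / 2 / cols,   # horizontal position on the page
    }


# Toy binary page: one blob near the left edge, one in the right margin.
page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
components = connected_components(page)
features = [component_features(p, len(page), len(page[0])) for p in components]
```

In a pipeline of the kind the abstract describes, vectors like these would be passed to a trained classifier; a horizontal-position feature is one plausible cue for separating margin text from main-body text, though that is a guess rather than something the abstract states.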
doi:10.1109/icfhr.2012.227
dblp:conf/icfhr/BukhariBAE12
fatcat:npw7po3mjrad7cg62bti736uzq