Segmentation Accuracy for Offline Arabic Handwritten Recognition Based on Bounding Box Algorithm

Ismail A.
2016 Communications on Applied Electronics  
Character segmentation plays an important role in the Arabic optical character recognition (OCR) system, because the letters incorrectly segmented perform to unrecognized character. Accuracy of character recognition depends mainly on the segmentation algorithm used. The domain of off-line handwriting in the Arabic script presents unique technical challenges and has been addressed more recently than other domains. Many different segmentation algorithms for off-line Arabic handwriting recognition
more » ... have been proposed and applied to various types of word images. This paper provides modify segmentation algorithm based on bounding box to improve segmentation accuracy using two main stages: preprocessing stage and segmentation stage. In preprocessing stage, used a set of methods such as noise removal, binarization, skew correction, thinning and slant correction, which retains shape of the character. In segmentation stage, the modify bounding box algorithm is done. In this algorithm a distance analysis use on bounding boxes of two connected components (CCs): main (CCs), auxiliary (CCs). The modified algorithm is presented and taking place according to three cases. Cut points also determined using structural features for segmentation character. The modified bounding box algorithm has been successfully tested on 450 word images of Arabic handwritten words. The results were very promising, indicating the efficiency of the suggested approach.
doi:10.5120/cae2016652364 fatcat:g7ohziifvrfr3peknbbrbd6dwm