A Survey on Methods for Basic Unit Segmentation in Off-line Handwritten Text Recognition

Aysadet Abliz, Wujiahemaiti Simayi, Kamil Moydin, Askar Hamdulla
2016 International Journal of Future Generation Communication and Networking  
Studies on recognizing different kinds of handwritten text have been conducted and have achieved great success for some scripts. This paper reviews segmentation techniques for English handwriting recognition, which is among the most successful to date. Given the close relationship between Arabic and Uyghur, and because Uyghur is the script whose handwriting recognition we aim to advance, we also draw on work in Arabic handwriting recognition. Characteristics of Uyghur handwritten text and some of the difficulties encountered are described. Then, drawing on the successful work on English and Arabic basic unit segmentation, this paper offers some suggestions for research on Uyghur basic unit segmentation.

An English word is made of one or more letters. In normal English writing, there are gaps between letters and between words, and the gap between words is usually larger than the gap between letters within a word. However, irregular spacing, touching words, and overlaps are common in handwritten documents. Many researchers therefore start from these writing characteristics when addressing English word segmentation. In the accompanying figure, (a) shows words that overlap horizontally; (b) shows an inter-character gap (between the digits 2 and 7) that is larger than an inter-word gap (between the character A and the digit 5); (c) shows a text line in which many inter-character and inter-word gaps are of similar size.
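The gap-based idea described above (word gaps tend to be wider than letter gaps) can be sketched as follows. This is a minimal illustration, not a method taken from the surveyed papers: the function name `segment_words`, the representation of each connected component as a horizontal `(left, right)` interval, and the fixed `gap_threshold` are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: horizontal extent (left, right) of one
# connected component (for example, a letter) on a single text line.
Interval = Tuple[int, int]


def segment_words(components: List[Interval],
                  gap_threshold: float) -> List[List[Interval]]:
    """Group components into words by thresholding horizontal gaps.

    A new word starts whenever the gap between a component and the
    rightmost edge seen so far exceeds gap_threshold (an assumed,
    hand-chosen value in pixels).
    """
    if not components:
        return []

    ordered = sorted(components)          # left-to-right order
    words = [[ordered[0]]]
    right_edge = ordered[0][1]            # rightmost pixel covered so far

    for left, right in ordered[1:]:
        gap = left - right_edge           # negative if components overlap
        if gap > gap_threshold:
            words.append([(left, right)])     # wide gap: start a new word
        else:
            words[-1].append((left, right))   # narrow gap: same word
        right_edge = max(right_edge, right)

    return words


if __name__ == "__main__":
    # Three letters, a wide gap, then two more letters.
    line = [(0, 10), (12, 20), (22, 30), (45, 55), (57, 66)]
    for word in segment_words(line, gap_threshold=8):
        print(word)
```

A single fixed threshold handles only well-spaced writing. In cases like (b) and (c) above, where an inter-character gap can be as large as or larger than an inter-word gap, no single value of `gap_threshold` separates the two kinds of gap, which is why gap size alone is not sufficient for handwritten text.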
doi:10.14257/ijfgcn.2016.9.11.13 fatcat:auxz4zldijendkvqscv34rqtvu