Wavelet domain textual coding of Ottoman script images

Oemer N. Gerek, Enis A. Cetin, Ahmed H. Tewfik, Rashid Ansari, Mark J. T. Smith
1996 Visual Communications and Image Processing '96  
Image coding using the wavelet transform, the DCT, and similar transform techniques is well established. However, these coding methods neither take into account the special characteristics of the images in a database, nor are they suitable for fast database search. In this paper, the digital archiving of Ottoman printings is considered. Ottoman documents are printed in Arabic letters. In [1], Witten et al. describe a scheme based on finding the characters in binary document images and ... the positions of the repeated characters. This method efficiently compresses document images and is suitable for database search, but it cannot be applied to Ottoman or Arabic documents, because the concept of a character is different in Ottoman and Arabic. Typically, one has to deal with compound structures consisting of a group of letters. Therefore, the matching criterion must be based on those compound structures. Furthermore, for reasons described in the paper, the text images of Ottoman scripts are gray-tone or color images. In our method, the compound structure matching is carried out in the wavelet domain, which reduces the search space and increases the compression ratio. In addition to the wavelet transformation, which corresponds to linear subband decomposition, we also used nonlinear subband decomposition. The filters in the nonlinear subband decomposition have the property of preserving edges in the low-resolution subband image.
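The abstract does not specify the filters used, so the following is only a minimal sketch of the two ideas it names: a low-resolution subband that preserves edges, and structure matching carried out in that subband. It works on 1-D scan lines for brevity. The Haar average stands in for the linear decomposition, and a 3-tap median followed by downsampling stands in for the nonlinear one; both choices, and the function names `lowband_haar`, `lowband_median`, and `find_best_match`, are assumptions for illustration and not the authors' method.

```python
from statistics import median


def lowband_haar(x):
    """Linear low band (assumed Haar): mean of each non-overlapping pair."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def lowband_median(x):
    """Nonlinear low band (assumed): 3-tap median at every second sample.

    Neighbours outside the signal are clamped to the nearest valid sample.
    """
    last = len(x) - 1
    return [
        median([x[max(i - 1, 0)], x[i], x[min(i + 1, last)]])
        for i in range(0, len(x), 2)
    ]


def find_best_match(signal, template):
    """Return the offset where the template has the smallest sum of
    absolute differences (SAD) against the signal."""
    best_offset, best_cost = 0, float("inf")
    for offset in range(len(signal) - len(template) + 1):
        cost = sum(
            abs(signal[offset + k] - t) for k, t in enumerate(template)
        )
        if cost < best_cost:
            best_offset, best_cost = offset, cost
    return best_offset


# A gray-tone scan line with a step edge that falls inside a Haar pair.
line = [0, 0, 0, 255, 255, 255, 255, 255]
print(lowband_haar(line))    # the edge is smeared into an intermediate value
print(lowband_median(line))  # the edge stays a sharp 0 -> 255 transition

# Matching in the half-length low band scans half as many positions as
# matching at full resolution.
print(find_best_match([0, 0, 255, 255, 0, 0, 0], [255, 255, 0]))
```

In this toy case the median low band keeps the step sharp while the Haar average blurs it, which illustrates why an edge-preserving low band can be the better domain for matching stroke shapes.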
doi:10.1117/12.233272 fatcat:p3w22vlkpbdwncbcwk6cp7f2aq