Improving Keyword Spotting on Degraded Historical Mongolian Document Images Using Markov Random Field for Restoration

Hongxi Wei, Guanglai Gao
2016 Innovative Computing Information and Control Express Letters, Part B: Applications  
Due to aging, scanned images of historical Mongolian documents are degraded. To perform keyword spotting, word images are first segmented from the degraded document images. However, broken and missing strokes in these word images reduce keyword spotting performance. In this paper, an approach based on Markov Random Field is applied to improve the quality of the degraded word images. Each degraded word image is modeled by a Markov Random Field, in which the prior probability of the hidden layer is obtained from a codebook. The codebook is built from a training set of high-quality binary word images. The probability density of the observation layer is estimated from the global threshold of the input gray-level word images. In this way, the degraded gray-level word images are converted into binary word images of better quality. The experimental results show that the Markov Random Field model reduces the degradation of word images and thereby improves keyword spotting performance.
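
The general idea of MRF-based binarization can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no implementation details, so the sketch replaces the codebook-derived prior with a simple Ising-style smoothness prior over 4-connected neighbours, assumes an iterative mean-split global threshold, assumes a Gaussian observation model with a pooled variance, and uses iterated conditional modes (ICM) for inference. The function names and the parameters `beta` and `n_iter` are hypothetical.

```python
def global_threshold(gray, tol=1e-3, max_iter=100):
    """Iterative mean-split global threshold (an assumed stand-in)."""
    pixels = [v for row in gray for v in row]
    t = sum(pixels) / len(pixels)
    for _ in range(max_iter):
        lo = [v for v in pixels if v <= t]
        hi = [v for v in pixels if v > t]
        if not lo or not hi:
            break
        new_t = 0.5 * (sum(lo) / len(lo) + sum(hi) / len(hi))
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


def mrf_binarize(gray, beta=5.0, n_iter=5):
    """Binarize a gray-level image (list of rows); 1 = ink, 0 = background.

    Observation layer: Gaussian per class, fitted from the global-threshold split.
    Hidden layer: Ising smoothness prior (assumption; the paper uses a codebook).
    """
    h, w = len(gray), len(gray[0])
    t = global_threshold(gray)
    labels = [[1 if v <= t else 0 for v in row] for row in gray]

    # Class means and a pooled variance estimated from the initial split.
    sums, counts = [0.0, 0.0], [0, 0]
    for y in range(h):
        for x in range(w):
            sums[labels[y][x]] += gray[y][x]
            counts[labels[y][x]] += 1
    means = [sums[k] / counts[k] if counts[k] else t for k in (0, 1)]
    sq = sum((gray[y][x] - means[labels[y][x]]) ** 2
             for y in range(h) for x in range(w))
    var = max(sq / (h * w), 1.0)  # floor avoids division by a near-zero variance

    # ICM: repeatedly give each pixel the label with the lowest local energy.
    for _ in range(n_iter):
        changed = False
        for y in range(h):
            for x in range(w):
                best_k, best_e = labels[y][x], float("inf")
                for k in (0, 1):
                    # Data term: negative Gaussian log-likelihood (up to a constant).
                    e = (gray[y][x] - means[k]) ** 2 / (2.0 * var)
                    # Prior term: pay beta for each disagreeing neighbour.
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != k:
                            e += beta
                    if e < best_e:
                        best_k, best_e = k, e
                if best_k != labels[y][x]:
                    labels[y][x] = best_k
                    changed = True
        if not changed:
            break
    return labels
```

In this sketch a faded pixel inside a thick stroke, which a global threshold alone would assign to the background, can be pulled back to ink when enough of its neighbours are ink. A codebook-based prior such as the one the abstract describes would presumably encode stroke shapes more specifically than this generic smoothness term.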
doi:10.24507/icicelb.07.08.1769 fatcat:wuwjq3obvfebhcallk2h7akk24