A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2012.
The file type is application/pdf.
A word spotting framework for historical machine-printed documents
2010
International Journal on Document Analysis and Recognition
In this paper, we propose a word spotting framework for accessing the content of historical machine-printed documents without the use of an optical character recognition engine. A preprocessing step is performed in order to improve the quality of the document images, while word segmentation is accomplished with the use of two complementary segmentation methodologies. In the proposed methodology, synthetic word images are created from keywords, and these images are compared to all the words in
doi:10.1007/s10032-010-0134-4
fatcat:2vqu3k6qjzbclagqmebyszmt4y