Information Retrieval System for Handwritten Documents [chapter]

Sargur Srihari, Anantharaman Ganesh, Catalin Tomai, Yong-Chul Shin, Chen Huang
2004 Lecture Notes in Computer Science  
The design and performance of a content-based information retrieval system for handwritten documents are described. System indexing and retrieval are based on writer characteristics, textual content, and document metadata such as writer profile. Documents are indexed using global image features, e.g., stroke width, slant, and word gaps, as well as local features that describe the shapes of characters and words. Image indexing is done automatically using page analysis, page segmentation, line ... n, word segmentation and recognition of characters and words. Several types of queries are permitted: (i) entire document image; (ii) a region of interest (ROI) of a document; (iii) a word image; and (iv) textual. Retrieval is based on a probabilistic model of information retrieval. The system has been implemented using Microsoft Visual C++ and a relational database system. This paper reports on the performance of the system for retrieving documents based on same and different content.
doi:10.1007/978-3-540-28640-0_28 fatcat:dih37sul6nf5tm5irpkqehlnlm