Reverse annotation based retrieval from large document image collections
2010
Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10
A number of projects are dedicated to creating digital libraries from scanned books, such as Google Books, UDL, Digital Library of India (DLI), etc. The ability to search in the content of document images is essential for the usability and popularity of these DLs. In this work, we aim toward building a retrieval system over 120K document images coming from 1000 scanned books of Telugu literature. This is a very hard problem because: i) OCRs are not robust enough for Indian languages, especially
doi:10.1145/1835449.1835694
dblp:conf/sigir/Sankar10
fatcat:6xqmyuetpbfclps6ced75fbsbm