Improving Accessibility of Archived Raster Dictionaries of Complex Script Languages

Sawood Alam, Fateh ud din B. Mehmood, Michael L. Nelson
2015 Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries - JCDL '15
We propose an approach to indexing raster images of dictionary pages that requires very little manual effort and enables direct access to the appropriate pages of the dictionary for lookup. Accessibility is further improved by feedback and crowdsourcing, which enable highlighting of the specific location on the page where the lookup word is found, as well as annotation, digitization, and fielded searching. This approach is equally applicable to simple scripts and complex writing systems.
Using our proposed approach, we have built a Web application called "Dictionary Explorer" that supports word indexes in various languages; every language can have multiple dictionaries associated with it. Word lookup gives direct access to the appropriate pages of all the dictionaries of that language simultaneously. The application has exploration features such as searching, pagination, and navigating the word index through a tree-like interface. It also supports feedback, annotation, and digitization features. Apart from the scanned images, "Dictionary Explorer" aggregates results from various sources and user contributions in Unicode. We have evaluated the time required for indexing dictionaries of different sizes and complexities in the Urdu language and examined various tradeoffs in our implementation. Using our approach, a single person can make a dictionary of 1,000 pages searchable in less than an hour.
doi:10.1145/2756406.2756926 dblp:conf/jcdl/AlamMN15 fatcat:njfwersbhrhlbfjmv6ml45xofq