Software tools and test data for research and testing of page-reading OCR systems

Thomas A. Nartker, Stephen V. Rice, Steven E. Lumos, Elisa H. Barney Smith, Kazem Taghva
2005 Document Recognition and Retrieval XII  
We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation, together with a large and diverse collection of scanned document images and the associated ground-truth text. This combination of tools and test data allows anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. In addition, the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that are available and gives the URL where they can be located.
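As an illustration of the character-accuracy testing mentioned in the abstract, the following minimal sketch compares OCR output against ground-truth text. It assumes that accuracy is defined as (n - errors) / n, where n is the number of ground-truth characters and errors is the minimum number of character insertions, deletions, and substitutions needed to correct the OCR output. The function names `edit_distance` and `character_accuracy` are hypothetical and are not taken from the UNLV/ISRI tools; this is not their implementation.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions that turn `ocr` into `truth` (Levenshtein distance)."""
    # prev[j] holds the distance between the first i-1 characters of
    # truth and the first j characters of ocr.
    prev = list(range(len(ocr) + 1))
    for i, t in enumerate(truth, start=1):
        curr = [i]
        for j, o in enumerate(ocr, start=1):
            cost = 0 if t == o else 1
            curr.append(min(prev[j] + 1,          # character missing from ocr
                            curr[j - 1] + 1,      # extra character in ocr
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Assumed definition: (n - errors) / n, with n = len(truth).
    The value can be negative if the OCR output is very noisy."""
    n = len(truth)
    if n == 0:
        raise ValueError("ground-truth text must not be empty")
    return (n - edit_distance(truth, ocr)) / n


if __name__ == "__main__":
    # Python strings are sequences of Unicode code points, so the same
    # code applies to any script covered by the Unicode standard.
    print(character_accuracy("recognition", "recogmtion"))
```

Because the comparison works on code points, both texts would need the same Unicode normalization form before scoring; otherwise a precomposed and a decomposed accented letter would be counted as errors.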
doi:10.1117/12.587293 dblp:conf/drr/NartkerRL05 fatcat:il4ehpf7sjdlvnebzttekknf6e