Optical character recognition: an illustrated guide to the frontier

George Nagy, Thomas A. Nartker, Stephen V. Rice, Daniel P. Lopresti, Jiangying Zhou
1999 Document Recognition and Retrieval VII  
We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer
doi:10.1117/12.373511 dblp:conf/drr/NagyNR00 fatcat:yzyw6gx5zzasvcjokaas4gkkli