A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2006.
The file type is application/pdf.
POST-EDITING THROUGH APPROXIMATION AND GLOBAL CORRECTION
1995
International Journal of Pattern Recognition and Artificial Intelligence
This paper describes a new automatic spelling correction program to deal with OCR-generated errors. The method used here is based on three principles:
1. Approximate string matching between the misspellings and the terms occurring in the database, as opposed to the entire dictionary
2. Local information obtained from the individual documents
3. The use of a confusion matrix, which contains information inherently specific to the nature of errors caused by the particular OCR device
This system is
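As a rough illustration of how the first and third principles might combine, the sketch below scores candidate terms with an edit distance whose substitution costs come from a confusion table. It is not the paper's implementation: the names `CONFUSION_COST`, `weighted_edit_distance`, and `best_correction`, the cost values, and the `max_cost` threshold are all assumptions made for this example, and a real confusion matrix would be estimated from the output of the particular OCR device.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical confusion costs keyed by (observed OCR char, intended char).
# The values are invented for illustration; cheaper means a more likely OCR error.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("1", "l"): 0.2,
    ("l", "1"): 0.2,
    ("0", "O"): 0.2,
    ("O", "0"): 0.2,
    ("c", "e"): 0.4,
    ("e", "c"): 0.4,
}


def weighted_edit_distance(
    observed: str,
    candidate: str,
    confusion: Dict[Tuple[str, str], float] = CONFUSION_COST,
    default_cost: float = 1.0,
) -> float:
    """Edit distance where known OCR confusions are cheaper substitutions."""
    rows, cols = len(observed) + 1, len(candidate) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * default_cost  # delete every observed character
    for j in range(1, cols):
        dist[0][j] = j * default_cost  # insert every candidate character
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = observed[i - 1], candidate[j - 1]
            substitution = 0.0 if a == b else confusion.get((a, b), default_cost)
            dist[i][j] = min(
                dist[i - 1][j] + default_cost,       # deletion
                dist[i][j - 1] + default_cost,       # insertion
                dist[i - 1][j - 1] + substitution,   # match or substitution
            )
    return dist[-1][-1]


def best_correction(
    word: str, terms: Iterable[str], max_cost: float = 1.0
) -> Optional[str]:
    """Return the cheapest matching term, or None if nothing is close enough."""
    scored = [(weighted_edit_distance(word, term), term) for term in terms]
    if not scored:
        return None
    cost, term = min(scored)
    return term if cost <= max_cost else None


# The candidate list stands in for the terms occurring in the database.
database_terms = ["claim", "chain", "clean"]
print(best_correction("c1aim", database_terms))
```

Matching against the database terms rather than a full dictionary keeps the candidate set small. The second principle, local information from the individual documents, is not modeled here.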
doi:10.1142/s0218001495000377
fatcat:jyczv7w7ynhd7amxu77uqhdoti