Memory-based text correction for preposition and determiner errors

Antal van den Bosch, Peter Berck
2012 Workshop on Innovative Use of NLP for Building Educational Applications  
We describe the Valkuil.net team entry for the HOO 2012 Shared Task. Our system consists of four memory-based classifiers that generate correction suggestions for middle positions in small text windows of two words to the left and two words to the right. Trained on the Google 1TB 5-gram corpus, the first two classifiers determine the presence of a determiner or a preposition between all words in a text in which the actual determiners and prepositions are masked. The second pair of classifiers determines which is the most likely correction given a masked determiner or preposition. The hyperparameters that govern the classifiers are optimized on the shared task training data. We point out a number of obvious improvements to boost the medium-level scores attained by the system.
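
The windowing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name presence_instances, the three-word determiner set, and the "_" padding symbol are assumptions made for illustration only. The sketch masks the determiners in a token list and emits, for every gap between the remaining words, the two words to the left, the two words to the right, and the masked word (or None if the gap was empty).

```python
# Illustrative sketch only; names and the determiner set are assumptions,
# not taken from the paper.
DETERMINERS = {"a", "an", "the"}  # hypothetical, deliberately tiny target set
PAD = "_"                         # hypothetical padding symbol for text edges


def presence_instances(tokens, targets=DETERMINERS, width=2):
    """Return (left_context, right_context, masked_word_or_None) per gap."""
    words = []    # the text with all target words masked (removed)
    removed = {}  # gap index -> target word that originally sat in that gap
    for tok in tokens:
        if tok.lower() in targets:
            # Gap g lies directly before words[g]; if several targets are
            # adjacent, this simple sketch keeps only the last one.
            removed[len(words)] = tok.lower()
        else:
            words.append(tok)

    padded = [PAD] * width + words + [PAD] * width
    instances = []
    for gap in range(len(words) + 1):
        left = tuple(padded[gap:gap + width])
        right = tuple(padded[gap + width:gap + 2 * width])
        instances.append((left, right, removed.get(gap)))
    return instances


if __name__ == "__main__":
    for instance in presence_instances("I saw the cat on a mat".split()):
        print(instance)
```

Under these assumptions, a presence classifier would be trained on whether the third field is None, and a second classifier would be trained only on the instances where it is not, to predict which word belongs in the gap.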
dblp:conf/bea/BoschB12 fatcat:4lcjzzsmezgjjpdvhoy2sxcb3e