A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2004; you can also visit the original URL.
The file type is application/pdf.
Single n-gram stemming
2003
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '03
Stemming can improve retrieval accuracy, but stemmers are language-specific. Character n-gram tokenization achieves many of the benefits of stemming in a language-independent way, but its use incurs a performance penalty. We demonstrate that selecting a single n-gram as a pseudo-stem for a word can be an effective and efficient language-neutral approach for some languages.
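The idea of indexing one n-gram per word can be illustrated with a minimal sketch. The abstract does not say how the single n-gram is chosen, so the criterion used here (the n-gram that occurs in the fewest documents, with ties going to the leftmost) is an assumption made for illustration only. The function names `char_ngrams`, `ngram_document_frequencies`, and `pseudo_stem` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return all contiguous character n-grams of a word.

    Words no longer than n are returned whole, as a single token.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_document_frequencies(documents, n):
    """Count, for each n-gram, the number of documents that contain it."""
    df = Counter()
    for doc in documents:
        seen = set()
        for word in doc.lower().split():
            seen.update(char_ngrams(word, n))
        df.update(seen)  # each n-gram counted once per document
    return df


def pseudo_stem(word, df, n):
    """Pick one n-gram to stand in for the word.

    ASSUMPTION: the rarest n-gram (lowest document frequency) is chosen;
    the abstract does not state the actual selection rule.
    min() returns the first minimal item, so ties go to the leftmost n-gram.
    """
    grams = char_ngrams(word.lower(), n)
    return min(grams, key=lambda g: df.get(g, 0))


if __name__ == "__main__":
    docs = [
        "juggling jugglers juggle",
        "walking talking",
        "the juggler walks",
    ]
    df = ngram_document_frequencies(docs, 4)
    for w in ["juggling", "juggler", "walks"]:
        print(w, "->", pseudo_stem(w, df, 4))
```

Under this assumed rule each word contributes one index term instead of every n-gram it contains, which is where the efficiency gain over full n-gram tokenization would come from. Whether the chosen n-gram behaves like a stem depends on the collection statistics and the language.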
doi:10.1145/860435.860528
dblp:conf/sigir/MayfieldM03
fatcat:kimj6rjgwbajhl5gk4aovy54aq