Language independent 'bag-of-words' representations are surprisingly effective for text classification. The representation is high dimensional, though, and contains many non-consistent words for text categorization. These non-consistent words reduce the generalization performance of subsequent classifiers, e.g., through ill-posed principal component transformations. In this communication, our aim is to study the effect of removing the least relevant words from the bag-of-words representation. We
doi:10.1109/icpr.2004.1334270 dblp:conf/icpr/MadsenSHL04 fatcat:lto3ltt6src43hcv7buqkamnbi
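As a rough illustration of the idea of pruning a bag-of-words vocabulary before further processing, the sketch below builds a count matrix and drops words that appear in too few documents. The abstract does not say how word relevance is measured, so the document-frequency threshold used here is purely an assumption, and the names `bag_of_words`, `prune_vocabulary`, and `min_df` are hypothetical helpers chosen for this example, not the authors' method.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for whitespace-tokenized documents."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[w] for w in vocab] for c in counts]
    return vocab, matrix


def prune_vocabulary(vocab, matrix, min_df=2):
    """Keep words found in at least min_df documents.

    Document frequency is an assumed stand-in for relevance;
    the source abstract does not specify the criterion.
    """
    keep = [
        j for j in range(len(vocab))
        if sum(1 for row in matrix if row[j] > 0) >= min_df
    ]
    pruned_vocab = [vocab[j] for j in keep]
    pruned_matrix = [[row[j] for j in keep] for row in matrix]
    return pruned_vocab, pruned_matrix


# Toy corpus, invented for illustration only.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "monday meeting moved",
]
vocab, X = bag_of_words(docs)
small_vocab, X_small = prune_vocabulary(vocab, X, min_df=2)
print(len(vocab), "->", len(small_vocab), "words:", small_vocab)
```

On this toy corpus the pruning halves the vocabulary, which shrinks the dimensionality that a later projection step (such as a principal component transformation) would have to handle.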