Linear and Non-Linear Dimensional Reduction via Class Representatives for Text Classification

Dimitrios Zeimpekis, Efstratios Gallopoulos
2006 IEEE International Conference on Data Mining. Proceedings  
We address the problem of building fast and effective text classification tools. We describe a "representatives methodology" related to feature extraction and illustrate its performance using as vehicles a centroid-based method and a method based on clustered LSI, both recently proposed as useful tools for low-rank matrix approximation and as cost-effective alternatives to LSI. The methodology is very flexible, providing the means for accelerating existing algorithms. It is also combined with kernel techniques to enable the analysis of data for which linear techniques are insufficient. Numerous classification examples indicate that the proposed technique is effective and efficient, with overall performance superior to that of existing linear and nonlinear LSI-based approaches.
doi:10.1109/icdm.2006.98 dblp:conf/icdm/ZeimpekisG06 fatcat:hpmaa3jgxrf7tbkmc6gy3lkp6e