A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2003; you can also visit the original URL.
The file type is application/pdf.
Document classification using a finite mixture model
1997
Proceedings of the eighth conference on European chapter of the Association for Computational Linguistics
unpublished
We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of classifying documents as that of conducting statistical hypothesis testing over finite mixture models, and employ the EM algorithm to efficiently estimate parameters in a finite mixture model. Experimental results indicate that our method outperforms existing methods.
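To make the abstract's pipeline concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that the soft word clusters are already given as fixed word distributions P(w|k), that each category is modeled by mixing weights P(k|c) estimated with EM, and that the hypothesis test reduces to picking the category with the highest document log-likelihood. The toy clusters, the training words, the function names, and the probability floor are all hypothetical and chosen only for illustration.

```python
import math


def em_mixing_weights(words, cluster_word_probs, iterations=50):
    """Estimate mixing weights P(k|c) with EM, holding P(w|k) fixed (an assumption)."""
    k = len(cluster_word_probs)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        expected = [0.0] * k
        for w in words:
            # E-step: responsibility of each cluster for this word occurrence.
            joint = [weights[j] * cluster_word_probs[j].get(w, 0.0) for j in range(k)]
            total = sum(joint)
            if total == 0.0:
                continue  # word has zero probability in every cluster; skip it
            for j in range(k):
                expected[j] += joint[j] / total
        # M-step: renormalize the expected counts into new mixing weights.
        norm = sum(expected)
        if norm == 0.0:
            break
        weights = [e / norm for e in expected]
    return weights


def log_likelihood(words, weights, cluster_word_probs, floor=1e-9):
    """Log-likelihood of a document under one category's mixture model."""
    ll = 0.0
    for w in words:
        p = sum(weights[j] * cluster_word_probs[j].get(w, 0.0)
                for j in range(len(weights)))
        ll += math.log(max(p, floor))  # hypothetical floor avoids log(0)
    return ll


def classify(words, category_weights, cluster_word_probs):
    """Pick the category whose mixture model gives the highest log-likelihood."""
    return max(
        category_weights,
        key=lambda c: log_likelihood(words, category_weights[c], cluster_word_probs),
    )


# Hypothetical soft clusters: "win" belongs to both with probability 0.1.
clusters = [
    {"ball": 0.5, "team": 0.4, "win": 0.1},
    {"stock": 0.5, "market": 0.4, "win": 0.1},
]

# Hypothetical training words for each category.
training = {
    "sports": ["ball", "team", "win", "ball", "team"],
    "finance": ["stock", "market", "win", "stock"],
}

category_weights = {c: em_mixing_weights(ws, clusters) for c, ws in training.items()}

print(classify(["team", "ball", "win"], category_weights, clusters))
print(classify(["market", "stock"], category_weights, clusters))
```

Comparing log-likelihoods across categories is the simplest form of a likelihood-ratio decision. A full treatment would also estimate the word clusters themselves and handle unseen words with proper smoothing rather than a fixed floor.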
doi:10.3115/979617.979623
fatcat:crtiexqsvbh27gopxtdyjjk6ou