A Level-wise Hierarchical Document Clustering method for Categorization

Kil Hong Joo, Nam Hun Park
2015 unpublished
For document categorization, the numerous words appearing in similar documents are divided into stopwords and keywords; to describe document characteristics precisely, documents are represented by their keywords, with stopwords removed. To improve clustering precision, this paper proposes the SHODC algorithm, a seed cluster-based hierarchical document clustering method, and the DHODC method, which categorizes documents through domain stopword removal and tree structure expansion. Several experiments showed that the deeper the domain level, the more precise the results of the proposed method were compared with other algorithms.
doi:10.14257/astl.2015.99.22 fatcat:ygh5kvgavzcwrivm5pakkablqi