A Framework for Clustering Massive Text and Categorical Data Streams [chapter]

Charu C. Aggarwal, Philip S. Yu
2006 Proceedings of the 2006 SIAM International Conference on Data Mining  
Many applications, such as newsgroup filtering, text crawling, and document organization, require real-time clustering and segmentation of text data records. The categorical data stream clustering problem also has a number of applications in customer segmentation and real-time trend analysis. We present an online approach for clustering massive text and categorical data streams using a statistical summarization methodology. We also present results illustrating the effectiveness of the technique.
doi:10.1137/1.9781611972764.44 dblp:conf/sdm/AggarwalY06 fatcat:aahqzvbqijda5awlnfwownpvry