A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2008.
In this paper we explore the connection between clustering categorical data and entropy: clusters of similar points have lower entropy than those of dissimilar ones. We use this connection to design an incremental heuristic algorithm, COOLCAT, which is capable of efficiently clustering large data sets of records with categorical attributes, as well as data streams. In contrast with other categorical clustering algorithms published in the past, COOLCAT's clustering results are very stable for different sample
doi:10.1145/584792.584888 dblp:conf/cikm/BarbaraLC02 fatcat:bttsjzl4tna7hpfw64iyv23w6y
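The entropy criterion described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that attributes are independent, so a cluster's entropy is the sum of its per-attribute Shannon entropies, and that each incoming record joins the cluster that yields the lowest size-weighted (expected) entropy. The function names `cluster_entropy`, `expected_entropy`, and `assign_incrementally` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from math import log2


def cluster_entropy(records):
    """Sum of per-attribute Shannon entropies of a cluster, in bits.

    Assumption (not stated in the abstract): attributes are treated as
    independent, so the joint entropy is approximated by this sum.
    """
    if not records:
        return 0.0
    n = len(records)
    total = 0.0
    # zip(*records) iterates over attributes (columns) instead of records.
    for column in zip(*records):
        for count in Counter(column).values():
            p = count / n
            total -= p * log2(p)
    return total


def expected_entropy(clusters):
    """Entropy of each non-empty cluster, weighted by its share of records."""
    n = sum(len(c) for c in clusters)
    return sum(len(c) / n * cluster_entropy(c) for c in clusters if c)


def assign_incrementally(clusters, record):
    """Add `record` to the cluster that minimizes the expected entropy.

    Ties go to the lowest cluster index. Returns the chosen index.
    """
    def cost(i):
        trial = clusters[:i] + [clusters[i] + [record]] + clusters[i + 1:]
        return expected_entropy(trial)

    best = min(range(len(clusters)), key=cost)
    clusters[best].append(record)
    return best


# Two identical records give zero entropy; two fully different ones do not.
similar = [("red", "small"), ("red", "small")]
mixed = [("red", "small"), ("blue", "large")]
print(cluster_entropy(similar), cluster_entropy(mixed))

# A new record is placed with the cluster it resembles.
clusters = [[("red", "small")], [("blue", "large")]]
print(assign_incrementally(clusters, ("red", "small")))
```

Processing records one at a time in this way is what makes the approach incremental: each assignment needs only the current clusters, which is why the same idea can be applied to a data stream.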