A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017.
The file type is application/pdf.
An XML clustering algorithm should process both structural and content information of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. This paper introduces a novel approach that first determines structural similarity in the form of frequent subtrees and then uses
doi:10.1145/1645953.1646216
dblp:conf/cikm/KuttyNL09
fatcat:abj64ezia5ayhbrfd4mgteqmci