XML Document Probabilistic Clustering Based on Structure and Content
2016
International Journal of Information Technology Control and Automation
A large volume of information is stored in XML format on the Web, and clustering is one method for managing these documents. Most current methods for clustering XML documents consider only one of two aspects: structure or content. In this paper, we propose SCEM (Expectation Maximization Structure and Content), which clusters XML documents effectively by combining content and structural features. The other contribution of this paper is that we used probabilistic distributions in such
doi:10.5121/ijitca.2016.6101
fatcat:jzj37tafrbfmnjtatlhny57rwq