Lecture Notes in Computer Science
Most existing methods we know of for handling datasets of XML documents work directly on the trees representing those documents. In this paper we investigate a different kind of representation for manipulating XML documents. Our idea is to transform the trees into sets of attribute-values, so that various existing classification and clustering methods can be applied to such data and their strengths exploited. We apply this strategy both for the classification
doi:10.1007/11766278_36 fatcat:5rlorgbosneljk4ixhq5ev67ym
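To make the idea of turning a tree into a set of attribute-values concrete, here is a minimal sketch in Python. The abstract does not say which attributes the authors extract, so the three feature families used here (tag counts, parent-child edges, and root-to-node paths) and the function name `tree_to_attribute_values` are assumptions chosen for illustration, not the paper's actual encoding.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tree_to_attribute_values(xml_text):
    """Flatten an XML tree into a dict mapping attribute names to counts.

    Hypothetical feature families (assumed for illustration only):
      tag=<name>           how many nodes carry this tag
      edge=<parent>><child> how many times this parent-child pair occurs
      path=<root path>     how many nodes are reached by this tag path
    """
    root = ET.fromstring(xml_text)
    features = Counter()

    def visit(node, path):
        current = path + "/" + node.tag
        features["tag=" + node.tag] += 1
        features["path=" + current] += 1
        for child in node:
            features["edge=" + node.tag + ">" + child.tag] += 1
            visit(child, current)

    visit(root, "")
    return dict(features)


# Example document, invented for this sketch.
doc = (
    "<article><title>T</title>"
    "<section><title>S</title><p>x</p></section></article>"
)
features = tree_to_attribute_values(doc)
for name in sorted(features):
    print(name, features[name])
```

Once each document is a flat dictionary like this, it can be handed to any learner that expects attribute-value data, for example after vectorizing with scikit-learn's `DictVectorizer`. The trade-off is that whatever structure the chosen attributes do not capture is lost.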