FASST Mining: Discovering Frequently Changing Semantic Structure from Versions of Unordered XML Documents [chapter]

Qiankun Zhao, Sourav S. Bhowmick
2005 Lecture Notes in Computer Science  
In this paper, we present a FASST mining approach to extract frequently changing semantic structures (FASSTs), a subset of semantic substructures that change frequently, from versions of unordered XML documents. We propose a data structure, H-DOM+, and a FASST mining algorithm that incorporates semantics and takes advantage of related domain knowledge. The distinctive feature of this approach is that the FASST mining process is guided by a user-defined concept hierarchy. Rather than mining all frequently changing structures, only those frequently changing structures that are semantically meaningful are extracted. Our experimental results show that the H-DOM+ structure is compact and that the FASST algorithm is efficient and scales well. We also design a declarative FASST query language, FASSTQUEL, to make the FASST mining process interactive and flexible.
doi:10.1007/11408079_66 fatcat:7gtk33xbmzb4hknvbwowhis434