XML schema clustering with semantic and hierarchical similarity measures

Richi Nayak, Wina Iryadi
2007 Knowledge-Based Systems  
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information from them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual similarity of the elements but also the similarity of their hierarchical structure. We support our findings with experiments and analysis.
doi:10.1016/j.knosys.2006.08.006 fatcat:ufwnk45e7jezldm3qai5q4fnpi