Clustering XML Documents by Structure [chapter]

Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, Timos Sellis
2004 Lecture Notes in Computer Science  
This paper presents a framework for clustering XML documents by structure.  ...  Grouping together structurally similar XML documents refers to the application of clustering methods using distances that estimate the similarity between tree structures in terms of the hierarchical relationships  ...  Methods were discussed to cluster a set of existing XML documents by structure at once.  ... 
doi:10.1007/978-3-540-24674-9_13 fatcat:6teg5mmjajcjbizz3llq6li2ky

Clustering XML Documents by Structure [chapter]

Anna Lesniewska
2010 Lecture Notes in Computer Science  
The issue of clustering XML documents by structure is being considered in this paper. Two different and independent methods of clustering XML documents by structure are being proposed.  ...  In this paper, it is suggested that the proposed methods may improve the accuracy of XML clustering by structure.  ...  The clustering of XML documents by structure has many applications.  ... 
doi:10.1007/978-3-642-12082-4_30 fatcat:l5z2s3savba2zklfm66mnsjztu

A methodology for clustering XML documents by structure

Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, Timos Sellis
2006 Information Systems  
This paper presents a framework for clustering XML documents by structure.  ...  Grouping together structurally similar XML documents refers to the application of clustering methods using distances that estimate the similarity between tree structures in terms of the hierarchical relationships  ...  Methods were discussed to cluster a set of existing XML documents by structure at once.  ... 
doi:10.1016/j.is.2004.11.009 fatcat:pxvevu7vafevtm4f5oich2cvim

A Tree-Based Approach to Clustering XML Documents by Structure [chapter]

Gianni Costa, Giuseppe Manco, Riccardo Ortale, Andrea Tagarelli
2004 Lecture Notes in Computer Science  
The idea is to equip each cluster with an XML cluster representative, i.e. an XML document subsuming the most typical structural specifics of a set of XML documents.  ...  We propose a novel methodology for clustering XML documents on the basis of their structural similarities.  ...  In this paper we propose a novel methodology for clustering XML documents by structure, which is based on the notion of XML cluster representative.  ... 
doi:10.1007/978-3-540-30116-5_15 fatcat:qln5t6e2yfc2blxcepupy5jpdu

An efficient and scalable algorithm for clustering XML documents by structure

Wang Lian, D.W.-l. Cheung, N. Mamoulis, Siu-Ming Yiu
2004 IEEE Transactions on Knowledge and Data Engineering  
We propose a hierarchical algorithm (S-GRACE) for clustering XML documents based on structural information in the data.  ...  Experiments on real data show that our algorithm can discover clusters not easily identified by manual inspection.  ...  We develop an algorithm S-GRACE which clusters XML documents by structure.  ... 
doi:10.1109/tkde.2004.1264824 fatcat:kwmcq2tmffdepacb7wdhkyt4ja

Evaluating the Performance of XML Document Clustering by Structure Only [chapter]

Tien Tran, Richi Nayak
Lecture Notes in Computer Science  
The PCXSS method is a progressive clustering method that computes the similarity between a new XML document and existing clusters by considering the structures within documents.  ...  This paper reports the results and experiments performed on the INEX 2006 Document Mining Challenge Corpus with the PCXSS clustering method.  ...  The PCXSS, originally developed for the purpose of clustering of heterogeneous XML schemas, has been modified and applied to cluster the INEX 2006 XML documents by considering only the structure of XML  ... 
doi:10.1007/978-3-540-73888-6_44 fatcat:wseawkzxorda3lsf52yvn45ppy

XML Document Probabilistic Clustering Based on Structure and Content

Hassan Naderi, Mojtaba Rashidi
2016 International Journal of Information Technology Control and Automation  
In this paper, we propose SCEM (Expectation Maximization Structure and Content) for XML documents, which is used to effectively cluster XML documents by combining content and structural features.  ...  Most current methods for clustering XML documents consider only one of these two aspects.  ...  PROBABILISTIC CLUSTERING To cluster XML documents by SCEM, we need some preprocessing.  ... 
doi:10.5121/ijitca.2016.6101 fatcat:jzj37tafrbfmnjtatlhny57rwq

XEdge

Panagiotis Antonellis, Christos Makris, Nikos Tsirakis
2008 Proceedings of the 2008 ACM symposium on Applied computing - SAC '08  
In this paper we propose a unified clustering algorithm for both homogeneous and heterogeneous XML documents.  ...  Depending on the type of the XML documents, the proposed algorithm modifies its distance metric in order to properly adapt to the special structural characteristics of homogeneous and heterogeneous XML  ...  partitional clustering algorithm. • Every cluster is represented by a compact cluster representative structure that summarizes the properties and characteristics of the XML documents included in the appropriate  ... 
doi:10.1145/1363686.1363940 dblp:conf/sac/AntonellisMT08 fatcat:lf5v6tjbobhz5j5wws5nyz5rra

An Optimistic Approach for Clustering Multi-version XML Documents Using Compressed Delta

Vijay R Sonawane, D. Rajeswara Rao
2015 International Journal of Electrical and Computer Engineering (IJECE)  
Evolving size of XML document is reduced by applying homomorphic compression before clustering them which retains its original structure.  ...  This paper proposes optimistic approach to Re-cluster multi-version XML documents which change in time by reassessing distance between them by using knowledge from initial clustering solution and changes  ...  Author in [34] proposed a framework for clustering XML documents by structure, they model the XML documents as rooted ordered labelled trees, then studied the usage of structural distance metrics in  ... 
doi:10.11591/ijece.v5i6.pp1472-1479 fatcat:o5av7hoeyfekxpyathrf6us6na

An Efficient Association Rule Based Clustering of XML Documents

A. Muralidhar, V. Pattabiraman
2015 Procedia Computer Science  
Therefore, this paper proposes a hybrid approach which discovers the frequent XML documents by association rule mining and then finds the clustering of XML documents by the classical k-means algorithm.  ...  Hence, the key contribution of the work is to find the meaningful cluster-based associations by association rule based clustering.  ...  Step 2 retrieves the frequent XML documents. The content and structure of every frequent XML document are extracted using XPath.  ... 
doi:10.1016/j.procs.2015.04.024 fatcat:grcy6suwybghtevnqx2ysk5mzm

Tag Name Structure-based Clustering of XML Documents

Mohamad Alishahi, Mahmoud Naghibzadeh, Baharak Shakeri Aski
2010 International Journal of Computer and Electrical Engineering  
Many algorithms have been developed for the clustering of XML documents.  ...  The concern of this paper is to extract knowledge from XML documents.  ...  XCLS INCREMENTAL ALGORITHM XCLS (XML Clustering by Level Structure) algorithm tries to cluster XML documents by considering their structures.  ... 
doi:10.7763/ijcee.2010.v2.124 fatcat:aopptsx3srdudpcy3k7abk23qy

XML data clustering

Alsayed Algergawy, Marco Mesiti, Richi Nayak, Gunter Saake
2011 ACM Computing Surveys  
In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content.  ...  We aim at introducing an integrated view that is useful when comparing XML data clustering approaches, when developing a new clustering algorithm, and when implementing an XML clustering component.  ...  Acknowledgements: Alsayed Algergawy has been supported by the Egyptian Ministry of Higher Education and Tanta University, Egypt.  ... 
doi:10.1145/1978802.1978804 fatcat:zgparleb6nbkdnoxlcxn3vyrhm

Report on the XML mining track at INEX 2005 and INEX 2006

Ludovic Denoyer, Patrick Gallinari
2007 SIGIR Forum  
This article is a report concerning the two years of the XML Mining track at INEX (2005 and 2006). We focus here on the classification and clustering of XML documents.  ...  We detail these two tasks and the corpus used for this challenge and then present a summary of the different methods proposed by the participants.  ...  -Clustering -2005 and 2006 The papers by Nayak et al. ([12] and [13]) define a similarity measure between an XML document and a cluster of XML documents.  ... 
doi:10.1145/1273221.1273230 fatcat:pv56jdndhjfpngmqubiqh57ufu

Similarity Measure and Clustering Technique for XML Documents by a Parent-Child Matrix
부모-자식 행렬을 사용한 XML 문서 유사도 측정과 군집 기법

Yun-Gu Lee, Woosaeng Kim
2015 The Journal of the Korean Institute of Information and Communication Engineering  
In this paper, we propose a parent-child matrix to cluster XML documents efficiently. A parent-child matrix analyzes both the content and structural features of an XML document.  ...  Then, the similarity between two XML documents can be measured by the similarity between two corresponding parent-child matrices. The experiment shows that our proposed method has good performance.  ...  Combination of parent-child matrices CLUSTERING XML DOCUMENTS BY PARENT-CHILD MATRIX We use a hierarchical clustering algorithm to cluster XML documents.  ... 
doi:10.6109/jkiice.2015.19.7.1599 fatcat:w7lcaughnnblnivqklpci4f34u

Managing Multiversion Xml Documents with Compressed Delta

2020 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
The content and structure of dynamic XML documents change frequently based on user behavior, producing multiple versions of them.  ...  Multiversion XML documents have wide applicability, which demands their effective organization. Clustering is a better solution for retaining these documents.  ...  Various techniques for clustering sequences of heterogeneous XML documents are proposed by the authors of [32].  ... 
doi:10.35940/ijitee.b7534.019320 fatcat:rgau2zef7zdqvoxhxwollbw7ky