XML Document Probabilistic Clustering Based on Structure and Content

Hassan Naderi, Mojtaba Rashidi
2016 International Journal of Information Technology Control and Automation  
In this paper, we propose SCEM (Expectation Maximization Structure and Content) for XML documents which is used to effectively cluster XML documents by combining content and structural features.  ...  Large volume of information is stored in XML format in the Web, and clustering is a management method for these documents.  ...  XML Document Similarity: Content And Structure Similarity Based on content and structure similarity definitions, we could evaluate document similarity by putting together these two definitions with special  ... 
doi:10.5121/ijitca.2016.6101 fatcat:jzj37tafrbfmnjtatlhny57rwq

XML document-grammar comparison: related problems and applications

Joe Tekli, Richard Chbeir, Agma Traina, Caetano Traina
2011 Open Computer Science  
Nonetheless, the process of comparing XML documents with XML grammars, i.e., XML document and grammar similarity evaluation, has not yet received the attention it deserves.  ...  In this paper, we provide an overview on existing research related to XML document/grammar comparison, presenting the background and discussing the various techniques related to the problem.  ...  Acknowledgement This work was supported in part by the Research Support Foundation of the State of Sao Paulo, FAPESP Post-doctoral Fellowship n# 2010/00330-2.  ... 
doi:10.2478/s13537-011-0005-1 fatcat:d3xfbs7yz5bwtfyceautkzvnsa

Tag Name Structure-based Clustering of XML Documents

Mohamad Alishahi, Mahmoud Naghibzadeh, Baharak Shakeri Aski
2010 International Journal of Computer and Electrical Engineering  
Both algorithms are implemented and evaluated. It is shown that the performance of the new algorithm is enhanced in comparison to the previous one.  ...  The concern of this paper is to extract knowledge from XML documents.  ...  Similarity in structure level measures and describes three sets of data: 1) Structure and level similarity between documents 2) Structural similarity between documents and schemas 3) Structure and level  ... 
doi:10.7763/ijcee.2010.v2.124 fatcat:aopptsx3srdudpcy3k7abk23qy

A Novel XML Document Structure Comparison Framework Based-On Subtree Commonalities and Label Semantics

Joe M. Tekli, Richard Chbeir
2012 Social Science Research Network  
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval  ...  In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally  ...  This work is funded in part by the Research Support Foundation of the State of Sao Paulo, Brazil, FAPESP Postdoctoral Fellowship n# 2010/00330-2.  ... 
doi:10.2139/ssrn.3198935 fatcat:sd3qw5omeffv3p5mcstesjh5hq

A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics

Joe Tekli, Richard Chbeir
2012 Journal of Web Semantics  
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval  ...  In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally  ...  This work is funded in part by the Research Support Foundation of the State of Sao Paulo, Brazil, FAPESP Postdoctoral Fellowship n# 2010/00330-2.  ... 
doi:10.1016/j.websem.2011.10.002 fatcat:my7zr64bgzdebawdwoxqjwnt64

XS3

Joe Tekli, Richard Chbeir, Kokou Yetongnon
2008 Proceeding of the 16th ACM international conference on Multimedia - MM '08  
in evaluating XML element/attribute label similarity [10].  ...  In this demonstration, we aim to present XS3, a system for XML Structural and Semantic Similarity assessment.  ... 
doi:10.1145/1459359.1459559 dblp:conf/mm/TekliCY08 fatcat:txt3twjyhnbz3gpgtrg26tzlkq

XML data clustering

Alsayed Algergawy, Marco Mesiti, Richi Nayak, Gunter Saake
2011 ACM Computing Surveys  
In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content.  ...  These applications need data in the form of similar contents, tags, paths, structures and semantics.  ...  Evaluating similarity between XML documents and their schemas can be exploited in various domains, such as for classifying XML documents against a set of DTDs/schemas declared in an XML data, XML document  ... 
doi:10.1145/1978802.1978804 fatcat:zgparleb6nbkdnoxlcxn3vyrhm

A cluster-based approach to XML similarity joins

Leonardo A. Ribeiro, Theo Härder, Fernanda S. Pimenta
2009 Proceedings of the 2009 International Database Engineering & Applications Symposium on - IDEAS '09  
Therefore, correlating XML documents, which are similar in content and structure, is a fundamental operation.  ...  We present a thorough experimental evaluation to validate our techniques in the context of a native XML DBMS.  ...  Consider the illustration in Specific XML-related Requirements To perform similarity joins on XML datasets, the similarity predicate should address the hierarchical structure of the documents.  ... 
doi:10.1145/1620432.1620451 dblp:conf/ideas/RibeiroHP09 fatcat:mtdwccb2mje6ndtokepnkxkwr4

Synopsis Data Structures for XML Databases: Models, Issues, and Research Perspectives

Angela Bonifati, Alfredo Cuzzocrea
2007 18th International Conference on Database and Expert Systems Applications (DEXA 2007)  
To overcome the above limitations, a possible solution consists in computing synopsis data structures from XML databases, i.e. compressed representations providing a "succinct" description of the original  ...  Due to the lack of efficient native XML database management systems, XML data manipulation and query evaluation may be resource-consuming, and represent a bottleneck for several computationally intensive  ...  As an example, looking at the structure of documents (e.g., [19]) is useful in the context of algorithms for clustering XML documents, and algorithms for detecting similarities among XML documents.  ... 
doi:10.1109/dexa.2007.100 dblp:conf/dexaw/BonifatiC07 fatcat:squr6j3spfepflniynyxprvnoq

Synopsis Data Structures for XML Databases: Models, Issues, and Research Perspectives

Angela Bonifati, Alfredo Cuzzocrea
2007 Database and Expert Systems Applications  
To overcome the above limitations, a possible solution consists in computing synopsis data structures from XML databases, i.e. compressed representations providing a "succinct" description of the original  ...  Due to the lack of efficient native XML database management systems, XML data manipulation and query evaluation may be resource-consuming, and represent a bottleneck for several computationally intensive  ...  As an example, looking at the structure of documents (e.g., [19]) is useful in the context of algorithms for clustering XML documents, and algorithms for detecting similarities among XML documents.  ... 
doi:10.1109/dexa.2007.4312849 fatcat:m67tg7g5zvarbn6ye3iolaz37a

A Hybrid Approach for XML Similarity [chapter]

Joe Tekli, Richard Chbeir, Kokou Yetongnon
2007 Lecture Notes in Computer Science  
In this paper, we integrate IR semantic similarity assessment in an edit distance algorithm, seeking to amend similarity judgments when comparing XML-based documents.  ...  Various algorithms for comparing hierarchically structured data, e.g. XML documents, have been proposed in the literature.  ...  In this study, we integrate semantic similarity assessment in a structured XML similarity approach, in order to provide an improved XML similarity measure for comparing heterogeneous XML documents 1 .  ... 
doi:10.1007/978-3-540-69507-3_68 fatcat:vw2qymqshjdfvlaomgju6rtv3a

An overview on XML similarity: Background, current trends and future directions

Joe Tekli, Richard Chbeir, Kokou Yetongnon
2009 Computer Science Review  
In this paper, we provide an overview of XML similarity/comparison by presenting existing research related to XML similarity.  ...  Owing to an unparalleled increasing use of the XML standard, developing efficient techniques for comparing XML-based documents becomes essential in the database and information retrieval communities.  ...  Document/Grammar (DTD) Structure-only − Evaluating structural similarity between XML documents and DTD grammars.  ... 
doi:10.1016/j.cosrev.2009.03.001 fatcat:c3mvd7her5ae3ohbip25c753b4

A matching algorithm for measuring the structural similarity between an XML document and a DTD and its applications

Elisa Bertino, Giovanna Guerrini, Marco Mesiti
2004 Information Systems  
In this paper we propose a matching algorithm for measuring the structural similarity between an XML document and a DTD.  ...  The evaluation of commonalities and differences gives rise to a numerical rank of the structural similarity. Moreover, in the paper, some applications of the matching algorithm are discussed.  ...  In [22] Nierman and Jagadish measure the structural similarity among XML documents.  ... 
doi:10.1016/s0306-4379(03)00031-0 fatcat:rpcb5v5aovep3pf54p3zcnm46u

Search of Information Based Content in Semi-Structured Documents Using Interference Wave

Larbi GUEZOULI, Hassane ESSAFI
2016 International Journal of Computational Science Information Technology and Control Engineering  
We have developed CASISS (Calculation of Similarity of Semi-Structured documents) method to quantify how two given texts are similar.  ...  This paper proposes a semi-structured information retrieval model based on a new method for calculation of similarity.  ...  In order to measure the capacity of our system to distinguish the XML documents similar to the required XML document, we use the Recall/Precision measure.  ... 
doi:10.5121/ijcsitce.2016.3303 fatcat:74vlgjpyurdorik65aeqmay7zm

Measuring similarity of semi-structured documents with context weights

Christopher C. Yang, Nan Liu
2006 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06  
In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure.  ...  Experimental results based on a benchmark showed superior performance of the proposed measure over the baseline which ignores structural knowledge of XML documents.  ...  Evaluation: The dataset used for evaluation consists of 1894 XML documents describing items in the Museum of Qin Terracotta Warriors and Horses with a total size of 7.8 MB.  ... 
doi:10.1145/1148170.1148334 dblp:conf/sigir/YangL06 fatcat:wq5q6ybhybf7vl5tkdv4564u7i