XML Document Probabilistic Clustering Based on Structure and Content
2016
International Journal of Information Technology Control and Automation
In this paper, we propose SCEM (Expectation Maximization Structure and Content) for XML documents which is used to effectively cluster XML documents by combining content and structural features. ...
A large volume of information is stored in XML format on the Web, and clustering is a management method for these documents. ...
XML Document Similarity: Content And Structure Similarity. Based on content and structure similarity definitions, we could evaluate document similarity by putting together these two definitions with special ...
doi:10.5121/ijitca.2016.6101
fatcat:jzj37tafrbfmnjtatlhny57rwq
XML document-grammar comparison: related problems and applications
2011
Open Computer Science
Nonetheless, the process of comparing XML documents with XML grammars, i.e., XML document and grammar similarity evaluation, has not yet received the attention it deserves. ...
In this paper, we provide an overview on existing research related to XML document/grammar comparison, presenting the background and discussing the various techniques related to the problem. ...
Acknowledgement: This work was supported in part by the Research Support Foundation of the State of Sao Paulo, FAPESP Post-doctoral Fellowship n# 2010/00330-2. ...
doi:10.2478/s13537-011-0005-1
fatcat:d3xfbs7yz5bwtfyceautkzvnsa
Tag Name Structure-based Clustering of XML Documents
2010
International Journal of Computer and Electrical Engineering
Both algorithms are implemented and evaluated. It is shown that the performance of the new algorithm is enhanced in comparison to the previous one. ...
The concern of this paper is to extract knowledge from XML documents. ...
Similarity in structure level measures and describes three sets of data: 1) Structure and level similarity between documents 2) Structural similarity between documents and schemas 3) Structure and level ...
doi:10.7763/ijcee.2010.v2.124
fatcat:aopptsx3srdudpcy3k7abk23qy
A Novel XML Document Structure Comparison Framework Based-On Subtree Commonalities and Label Semantics
2012
Social Science Research Network
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval ...
In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally ...
This work is funded in part by the Research Support Foundation of the State of Sao Paulo, Brazil, FAPESP Postdoctoral Fellowship n# 2010/00330-2. ...
doi:10.2139/ssrn.3198935
fatcat:sd3qw5omeffv3p5mcstesjh5hq
A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics
2012
Journal of Web Semantics
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval ...
In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally ...
This work is funded in part by the Research Support Foundation of the State of Sao Paulo, Brazil, FAPESP Postdoctoral Fellowship n# 2010/00330-2. ...
doi:10.1016/j.websem.2011.10.002
fatcat:my7zr64bgzdebawdwoxqjwnt64
in evaluating XML element/attribute label similarity [10]. ...
In this demonstration, we aim to present XS3, a system for XML Structural and Semantic Similarity assessment. ...
doi:10.1145/1459359.1459559
dblp:conf/mm/TekliCY08
fatcat:txt3twjyhnbz3gpgtrg26tzlkq
XML data clustering
2011
ACM Computing Surveys
In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content. ...
These applications need data in the form of similar contents, tags, paths, structures and semantics. ...
Evaluating similarity between XML documents and their schemas can be exploited in various domains, such as for classifying XML documents against a set of DTDs/schemas declared in an XML data, XML document ...
doi:10.1145/1978802.1978804
fatcat:zgparleb6nbkdnoxlcxn3vyrhm
A cluster-based approach to XML similarity joins
2009
Proceedings of the 2009 International Database Engineering & Applications Symposium on - IDEAS '09
Therefore, correlating XML documents, which are similar in content and structure, is a fundamental operation. ...
We present a thorough experimental evaluation to validate our techniques in the context of a native XML DBMS. ...
Consider the illustration in
Specific XML-related Requirements To perform similarity joins on XML datasets, the similarity predicate should address the hierarchical structure of the documents. ...
doi:10.1145/1620432.1620451
dblp:conf/ideas/RibeiroHP09
fatcat:mtdwccb2mje6ndtokepnkxkwr4
Synopsis Data Structures for XML Databases: Models, Issues, and Research Perspectives
2007
18th International Conference on Database and Expert Systems Applications (DEXA 2007)
To overcome the above limitations, a possible solution consists in computing synopsis data structures from XML databases, i.e. compressed representations providing a "succinct" description of the original ...
Due to the lack of efficient native XML database management systems, XML data manipulation and query evaluation may be resource-consuming, and represent a bottleneck for several computationally intensive ...
As an example, looking at the structure of documents (e.g., [19]) is useful in the context of algorithms for clustering XML documents, and algorithms for detecting similarities among XML documents. ...
doi:10.1109/dexa.2007.100
dblp:conf/dexaw/BonifatiC07
fatcat:squr6j3spfepflniynyxprvnoq
Synopsis Data Structures for XML Databases: Models, Issues, and Research Perspectives
2007
Database and Expert Systems Applications
To overcome the above limitations, a possible solution consists in computing synopsis data structures from XML databases, i.e. compressed representations providing a "succinct" description of the original ...
Due to the lack of efficient native XML database management systems, XML data manipulation and query evaluation may be resource-consuming, and represent a bottleneck for several computationally intensive ...
As an example, looking at the structure of documents (e.g., [19]) is useful in the context of algorithms for clustering XML documents, and algorithms for detecting similarities among XML documents. ...
doi:10.1109/dexa.2007.4312849
fatcat:m67tg7g5zvarbn6ye3iolaz37a
A Hybrid Approach for XML Similarity
[chapter]
2007
Lecture Notes in Computer Science
In this paper, we integrate IR semantic similarity assessment in an edit distance algorithm, seeking to amend similarity judgments when comparing XML-based documents. ...
Various algorithms for comparing hierarchically structured data, e.g. XML documents, have been proposed in the literature. ...
In this study, we integrate semantic similarity assessment in a structured XML similarity approach, in order to provide an improved XML similarity measure for comparing heterogeneous XML documents. ...
doi:10.1007/978-3-540-69507-3_68
fatcat:vw2qymqshjdfvlaomgju6rtv3a
An overview on XML similarity: Background, current trends and future directions
2009
Computer Science Review
In this paper, we provide an overview of XML similarity/comparison by presenting existing research related to XML similarity. ...
Owing to an unparalleled increasing use of the XML standard, developing efficient techniques for comparing XML-based documents becomes essential in the database and information retrieval communities. ...
Document/Grammar (DTD) Structure-only − Evaluating structural similarity between XML documents and DTD grammars. ...
doi:10.1016/j.cosrev.2009.03.001
fatcat:c3mvd7her5ae3ohbip25c753b4
A matching algorithm for measuring the structural similarity between an XML document and a DTD and its applications
2004
Information Systems
In this paper we propose a matching algorithm for measuring the structural similarity between an XML document and a DTD. ...
The evaluation of commonalities and differences gives rise to a numerical rank of the structural similarity. Moreover, in the paper, some applications of the matching algorithm are discussed. ...
In [22] Nierman and Jagadish measure the structural similarity among XML documents. ...
doi:10.1016/s0306-4379(03)00031-0
fatcat:rpcb5v5aovep3pf54p3zcnm46u
Search of Information Based Content in Semi-Structured Documents Using Interference Wave
2016
International Journal of Computational Science Information Technology and Control Engineering
We have developed CASISS (Calculation of Similarity of Semi-Structured documents) method to quantify how two given texts are similar. ...
This paper proposes a semi-structured information retrieval model based on a new method for calculation of similarity. ...
In order to measure the capacity of our system to distinguish the XML documents similar to the required XML document, we use the Recall/Precision measure. ...
doi:10.5121/ijcsitce.2016.3303
fatcat:74vlgjpyurdorik65aeqmay7zm
Measuring similarity of semi-structured documents with context weights
2006
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06
In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure. ...
Experimental results based on a benchmark showed superior performance of the proposed measure over the baseline which ignores structural knowledge of XML documents. ...
Evaluation: The dataset used for evaluation consists of 1894 XML documents describing items in the Museum of Qin Terracotta Warriors and Horses with a total size of 7.8 MB. ...
doi:10.1145/1148170.1148334
dblp:conf/sigir/YangL06
fatcat:wq5q6ybhybf7vl5tkdv4564u7i