A survey on XML streaming evaluation techniques

Xiaoying Wu, Dimitri Theodoratos
2012 The VLDB journal  
XML is currently the most popular format for exchanging and representing data on the web. It is used in various applications and for different types of data including structured, semistructured, and unstructured heterogeneous data types. During the period, XML was establishing itself, data streaming applications have gained increased attention and importance. Because of these developments, the querying and efficient processing of XML streams has became a central issue. In this study, we survey
more » ... he state of the art in XML streaming evaluation techniques. We focus on both the streaming evaluation of XPath expressions and of XQuery queries. We classify the XPath streaming evaluation approaches according to the main data structure used for the evaluation into three categories: automaton-based approach, array-based approach, and stack-based approach. We review, analyze, and compare the major techniques proposed for each approach. We also review multiple query streaming evaluation techniques. For the XQuery streaming evaluation problem, we identify and discuss four processing paradigms adopted by the existing XQuery stream query engines: the transducer-based paradigm, the algebra-based paradigm, the automata-algebra paradigm, and the pull-based paradigm. In addition, we review optimization techniques for XQuery streaming evaluation. We address the problem of optimizing Electronic supplementary material The online version of this article (XQuery streaming evaluation as a buffer optimization problem. For all techniques discussed, we describe the research issues and the proposed algorithms and we compare them with other relevant suggested techniques.
doi:10.1007/s00778-012-0281-y fatcat:snwe5pqhy5eptnfcjmhqfz7am4