From searching text to querying XML streams

Dan Suciu
2004 Journal of Discrete Algorithms  
XML data is queried with a limited form of regular expressions, in a language called XPath. New XML stream processing applications, such as content-based routing or selective dissemination of information, require thousands or millions of XPath expressions to be evaluated simultaneously on the incoming XML stream at a high, sustained rate. In its simplest approximation, the XPath evaluation problem is analogous to the text search problem, in which one or several regular expressions need to be
more » ... ched to a given text. At a finer level, it is related to the tree pattern matching problem. However, unlike the traditional setting, the number of regular expressions here is much larger, while the "text" is much shorter, since it corresponds to the depth of the XML stream. In this paper we examine techniques that have been proposed for XML stream processing and describe a few open problems.
doi:10.1016/s1570-8667(03)00063-7 fatcat:jdf7cnduxbb5zlc32zztfgcqzi