Indexing and Querying XML Data for Regular Path Expressions

Quanzhong Li, Bongki Moon
2001 Very Large Data Bases Conference  
With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data has become increasingly important. Several XML query languages have been proposed, and a common feature of these languages is the use of regular path expressions to query XML data. This poses a new challenge for indexing and searching XML data, because conventional approaches based on tree traversals may not meet the processing requirements under heavy access requests.
In this paper, we propose a new system for indexing and storing XML data based on a numbering scheme for elements. This numbering scheme quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data. We also propose several algorithms for processing regular path expressions, namely, (1) EE-Join for searching paths from an element to another, (2) EA-Join for scanning sorted elements and attributes to find element-attribute pairs, and (3) KC-Join for finding Kleene-Closure on repeated paths or elements. The EE-Join algorithm is highly effective particularly for searching paths that are very long or whose lengths are unknown. Experimental results from our prototype system implementation show that the proposed algorithms can process XML queries with regular path expressions by up to an order of magnitude ...

This work was sponsored in part by National Science Foundation CAREER Award (IIS-9876037) and Research Infrastructure program EIA-0080123. The authors assume all responsibility for the contents of the paper.
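The abstract does not spell out the numbering scheme, so the following is only a minimal sketch of how a numbering-based ancestor-descendant test might work. It assumes each element is labeled with a hypothetical (order, size) pair, where order is the preorder rank and size is the number of descendants. The function names are illustrative, and the pairing routine is a naive nested loop, not the paper's EE-Join.

```python
import xml.etree.ElementTree as ET


def label_elements(root):
    """Assign an assumed (order, size) label to every element.

    order: 1-based preorder rank; size: number of descendants.
    """
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        order = counter
        for child in node:
            visit(child)
        # Every element numbered after `order` so far lies in this subtree.
        labels[node] = (order, counter - order)

    visit(root)
    return labels


def is_ancestor(anc, desc):
    """True if the element labeled `anc` is a proper ancestor of `desc`.

    Uses only the two labels; no tree traversal is needed.
    """
    return anc[0] < desc[0] <= anc[0] + anc[1]


def ancestor_descendant_pairs(root, anc_tag, desc_tag):
    """Naively pair elements by tag using label comparisons only."""
    labels = label_elements(root)
    ancestors = [labels[e] for e in root.iter(anc_tag)]
    descendants = [labels[e] for e in root.iter(desc_tag)]
    return [
        (a[0], d[0])
        for a in ancestors
        for d in descendants
        if is_ancestor(a, d)
    ]


if __name__ == "__main__":
    doc = "<lib><book><title/><ch><title/></ch></book><title/></lib>"
    tree = ET.fromstring(doc)
    # Pairs of (book order, title order) for titles nested under a book.
    print(ancestor_descendant_pairs(tree, "book", "title"))
```

Under this assumed labeling, a path such as book followed by any number of steps to title reduces to comparing two integer pairs per candidate, which is why the depth of the path between the two elements does not matter.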
dblp:conf/vldb/LiM01 fatcat:gvghwgy4irgf3bwbvxxddl33ia