PowerDB-XML: A Platform for Data–Centric and Document–Centric XML Processing [chapter]

Torsten Grabs, Hans-Jörg Schek
2003 Lecture Notes in Computer Science  
Relational database systems are well suited as a platform for data-centric XML processing. Data-centric applications process regularly structured XML documents using precise predicates. However, these approaches fall short when XML applications also require document-centric processing, i.e., processing of less rigidly structured documents using vague predicates in the sense of information retrieval. The PowerDB-XML project at ETH Zurich aims to address this drawback and to cover both these types of XML applications on a single platform. In this paper, we investigate the requirements of document-centric XML processing and propose to refine state-of-the-art retrieval models for unstructured flat documents such that they meet the flexibility of the XML format. To do so, we rely on so-called query-specific statistics computed dynamically at query runtime to reflect the query scope. Moreover, we show that document-centric XML processing is efficiently feasible using relational database systems for storage management and standard SQL. This allows us to combine document-centric processing with data-centric XML-to-database mappings. Our XML engine named PowerDB-XML therefore supports the full range of XML applications on the same integrated platform.
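
The idea of query-specific statistics can be illustrated with a minimal sketch. It is not the PowerDB-XML implementation: the `postings` table, the path-prefix notion of query scope, the functions `build_index` and `search`, and the weighting `tf * log(1 + N/df)` are all assumptions chosen for illustration. The sketch stores term occurrences per XML element in a relational table and computes the collection size N and the document frequency df with standard SQL at query time, restricted to the elements inside the query scope.

```python
import math
import sqlite3


def build_index(conn, elements):
    """Store one row per (term, element). `elements` yields (element_id, path, text).

    The single-table schema is a hypothetical simplification.
    """
    conn.execute(
        "CREATE TABLE postings (term TEXT, element_id INTEGER, path TEXT, tf INTEGER)"
    )
    for element_id, path, text in elements:
        counts = {}
        for term in text.lower().split():
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT INTO postings VALUES (?, ?, ?, ?)",
            [(term, element_id, path, tf) for term, tf in counts.items()],
        )


def search(conn, terms, scope_prefix):
    """Rank elements whose path starts with `scope_prefix`.

    N and df are computed at query time over the scope only, so the same
    term can receive a different weight under a different scope.
    """
    like = scope_prefix + "%"
    # N: number of indexed elements inside the query scope.
    n = conn.execute(
        "SELECT COUNT(DISTINCT element_id) FROM postings WHERE path LIKE ?",
        (like,),
    ).fetchone()[0]
    scores = {}
    for term in terms:
        rows = conn.execute(
            "SELECT element_id, tf FROM postings WHERE term = ? AND path LIKE ?",
            (term, like),
        ).fetchall()
        df = len(rows)  # document frequency within the scope
        if df == 0:
            continue
        idf = math.log(1 + n / df)  # assumed idf variant, for illustration only
        for element_id, tf in rows:
            scores[element_id] = scores.get(element_id, 0.0) + tf * idf
    # Highest score first; ties broken by element id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, [
        (1, "/article/title", "xml retrieval"),
        (2, "/article/body", "xml xml databases"),
        (3, "/book/title", "relational databases"),
    ])
    print(search(conn, ["databases"], "/"))      # statistics over all elements
    print(search(conn, ["databases"], "/book"))  # statistics over /book only
```

In this toy collection the term "databases" is weighted differently under the scope `/` than under `/book`, because N and df are recomputed for each scope rather than taken from precomputed collection-wide statistics.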
doi:10.1007/978-3-540-39429-7_7 fatcat:3yvno6vrd5endi3pgkd6w6vvxm