Scalable processing of flexible graph pattern queries on the cloud

Padmashree Ravindra, Kemafor Anyanwu
2013 Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion  
Flexible exploration of large RDF datasets with unknown relationships can be enabled using 'unbound-property' graph pattern queries. Relational-style processing of such queries using normalized relations results in redundant information in intermediate results, because the adjoining bound (fixed) properties are repeated. Such redundancy negatively impacts disk I/O, network transfer costs, and required disk space while processing RDF query workloads on MapReduce-based systems. This work proposes packing and lazy unpacking strategies to minimize the redundancy in intermediate results while processing unbound-property queries. In addition to keeping the results compact, this work evaluates RDF queries using the Nested TripleGroup Data Model and Algebra (NTGA), which enables shorter MapReduce execution workflows. Experimental results demonstrate the benefit of this work over RDF query processing using relational-style systems such as Apache Pig and Hive.
doi:10.1145/2487788.2487872 dblp:conf/www/RavindraA13 fatcat:35i2ttqk5rdslgapqzxiw5fmai
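
The redundancy described in the abstract can be illustrated with a small sketch. This is not the authors' NTGA implementation and does not use MapReduce; it is a toy, in-memory Python example under stated assumptions. The triples, the function names (relational_rows, packed_groups, unpack), and the query shape { ?s <bound_prop> ?v . ?s ?p ?o } are all hypothetical and chosen only for illustration. The flat version repeats the bound value once per matching triple, while the packed version stores it once per subject and expands it only on demand.

```python
from collections import defaultdict

# Toy triples (subject, property, object); the data is invented for illustration.
TRIPLES = [
    ("gene1", "label", "BRCA1"),
    ("gene1", "xref", "db:1"),
    ("gene1", "xref", "db:2"),
    ("gene1", "synonym", "RNF53"),
    ("gene2", "label", "TP53"),
    ("gene2", "xref", "db:3"),
]


def relational_rows(triples, bound_prop):
    """Flat join for { ?s <bound_prop> ?v . ?s ?p ?o }.

    Every output row carries the bound value v again, which is the
    redundancy the abstract attributes to relational-style processing.
    """
    rows = []
    for s, p, v in triples:
        if p != bound_prop:
            continue
        for s2, p2, o2 in triples:
            if s2 == s:
                rows.append((s, v, p2, o2))
    return rows


def packed_groups(triples, bound_prop):
    """Packed form (hypothetical): one group per subject.

    Each group keeps the bound values once, next to the list of
    (property, object) pairs that match the unbound pattern.
    """
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))
    groups = {}
    for s, pairs in by_subject.items():
        bound = [o for p, o in pairs if p == bound_prop]
        if bound:  # keep only subjects that satisfy the bound pattern
            groups[s] = (bound, pairs)
    return groups


def unpack(groups):
    """Lazy unpacking: generate flat rows only when a consumer asks for them."""
    for s, (bound, pairs) in groups.items():
        for v in bound:
            for p, o in pairs:
                yield (s, v, p, o)


rows = relational_rows(TRIPLES, "label")
groups = packed_groups(TRIPLES, "label")

# Number of times a bound value is materialized in each representation.
flat_copies = len(rows)
packed_copies = sum(len(bound) for bound, _ in groups.values())
```

In this toy data the flat join materializes a bound value in every row, whereas the packed groups hold one copy per subject, and unpack(groups) yields the same rows as the flat join when they are actually needed.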