An Intermediate Algebra for Optimizing RDF Graph Pattern Matching on MapReduce [chapter]

Padmashree Ravindra, HyeongSik Kim, Kemafor Anyanwu
2011 Lecture Notes in Computer Science  
In this paper, we propose an approach for optimizing graph pattern matching by reinterpreting certain join tree structures as grouping operations.  ...  This cost is prohibitive for RDF graph pattern matching queries which typically involve several join operations.  ...  Conclusion In this paper, we presented an intermediate algebra (NTGA) that enables more natural and efficient processing for graph pattern queries on RDF data.  ... 
doi:10.1007/978-3-642-21064-8_4 fatcat:znr6unrnezdp3hnb22mugka5fq

Efficient processing of RDF graph pattern matching on MapReduce platforms

Padmashree Ravindra, Seokyong Hong, HyeongSik Kim, Kemafor Anyanwu
2011 Proceedings of the second international workshop on Data intensive computing in the clouds - DataCloud-SC '11  
Thus, answering queries (typically graph pattern matching queries) over RDF data requires several join operations to reassemble related data.  ...  In addition, most of the existing techniques for optimizing RDF data processing do not transfer well to the MapReduce model and often require significant lead time for pre-processing.  ...  two complementary strategies that can be used to optimize graph pattern matching queries without the need for time-consuming preprocessing, (i) algebraic optimization based on a new algebra called the  ... 
doi:10.1145/2087522.2087527 fatcat:fjacm3udwrgrvckodltcszzfsi

Optimizing RDF(S) queries on cloud platforms

HyeongSik Kim, Padmashree Ravindra, Kemafor Anyanwu
2013 Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion  
In this demonstration, we will present RAPID+, an extended Apache Pig system that uses an algebraic approach for optimizing queries on RDF data models including queries involving inferencing.  ...  The MapReduce paradigm is emerging as a platform of choice for large scale data processing and analytics due to its ease of use, cost effectiveness, and potential for unlimited scaling.  ...  In this demonstration, we will present an extended Apache Pig system called RAPID+ [2], that uses an alternative algebraic framework for optimizing queries on RDF data models including inference-based  ... 
doi:10.1145/2487788.2487917 dblp:conf/www/KimRA13 fatcat:4v2uckujt5grvjwvklibbqqili


Alexander Schätzle, Martin Przyjaciel-Zablocki, Georg Lausen
2011 Proceedings of the International Workshop on Semantic Web Information Management - SWIM '11  
As underlying platform we use Apache Hadoop, an open source implementation of Google's MapReduce for massively parallelized computations on a computer cluster.  ...  In this paper we investigate the scalable processing of complex SPARQL queries on very large RDF datasets.  ...  A SPARQL query defines a graph pattern P that is matched against an RDF graph G.  ... 
doi:10.1145/1999299.1999303 dblp:conf/sigmod/SchatzlePL11 fatcat:w7wu57nqrbbvhjozkjsjhpyfq4

Towards scalable RDF graph analytics on MapReduce

Padmashree Ravindra, Vikas V. Deshpande, Kemafor Anyanwu
2010 Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud - MDAC '10  
There have been MapReduce-based approaches for pattern matching [13], [14] by decomposing graphs into RDF molecules.  ...  However, these systems offer only relational algebra style operators which would require an iterative n-tuple reassembly process in which intermediate results need to be materialized.  ...  In this paper, we continue our work on analytical queries but focus on optimizing the pattern matching phase.  ... 
doi:10.1145/1779599.1779604 fatcat:tvk3s4hhhrazbo5gxn4i4pus44

Scalable Ontological Query Processing over Semantically Integrated Life Science Datasets using MapReduce [article]

HyeongSik Kim, Kemafor Anyanwu
2016 arXiv   pre-print
In this paper, we present an approach for dealing with such complex queries on big data using MapReduce, along with an evaluation on existing real-world datasets and benchmark queries.  ...  The reason is that besides the traditional challenges of processing graph-structured data, complete query answering requires inferencing to explicate implicitly represented facts.  ...  Conclusion and Future Work In this paper, we presented a comparative discussion on evaluating union graph pattern queries on MapReduce using relational-style algebra optimizations and optimizations based  ... 
arXiv:1602.01040v1 fatcat:ouw42v4amfcqddqjuqvwjqgxme

An Effective and Efficient MapReduce Algorithm for Computing BFS-Based Traversals of Large-Scale RDF Graphs

Alfredo Cuzzocrea, Mirel Cosulschi, Roberto de Virgilio
2016 Algorithms  
In line with this trend, in this paper, we present an approach for efficiently implementing traversals of large-scale RDF graphs over MapReduce that is based on the Breadth First Search (BFS) strategy  ...  for visiting (RDF) graphs to be decomposed and processed according to the MapReduce framework.  ...  This allows for reducing the number of map jobs significantly. Contribution [71] considers the research challenge of optimizing RDF graph pattern matching on MapReduce.  ... 
doi:10.3390/a9010007 fatcat:wxorzsnnovbjvpnoc4prrucnty

RDF in the clouds: a survey

Zoi Kaoudi, Ioana Manolescu
2014 The VLDB Journal  
Cloud computing is an emerging paradigm massively adopted in many applications for the scalability, fault-tolerance and elasticity features it provides, enabling the easy deployment of distributed and  ...  The Resource Description Framework (RDF) pioneered by the W3C is increasingly being adopted to model data in a variety of scenarios, in particular data to be published or exchanged on the Web.  ...  In RAPID+ [51, 70] an intermediate nested algebra is proposed for increasing the degree of parallelism when evaluating joins and thus reducing the number of MapReduce jobs.  ... 
doi:10.1007/s00778-014-0364-z fatcat:qyp6euinnvexxlliqe2wf45ona

A MapReduce Approach to NoSQL RDF Databases [article]

Albert Haque
2016 arXiv   pre-print
Distributed graph databases must carefully optimize queries before generating MapReduce query plans as network traffic for large datasets can become prohibitive if the query is executed naively.  ...  The evaluation spans several benchmarks, including the two most commonly used in triplestore evaluation, the Berlin SPARQL Benchmark, and the DBpedia benchmark, a query workload that operates an RDF representation  ...  The SPARQL query defines a set of triple patterns in the WHERE clause for which we are performing a graph search over our database and identifying matching subgraphs.  ... 
arXiv:1601.01770v1 fatcat:sru7e2hrfne45i7duxfth6ntbu

Querying Semantic Knowledge Bases with SQL-on-Hadoop

Martin Przyjaciel-Zablocki, Alexander Schätzle, Georg Lausen
2017 Proceedings of the 4th Algorithms and Systems on MapReduce and Beyond - BeyondMR'17  
In this paper, we continue our work on TriAL-QL, an expressive (SQL-like) RDF query language based on the Triple Algebra with Recursion [31].  ...  The constant growth of semantically-annotated data and an increasing interest in cross-domain knowledge bases raises the need for expressive query languages for RDF and novel approaches that enable their  ...  Sempala is built on top of Impala. Its RDF data layout is highly optimized for star-shaped queries and enables interactive querying times on large RDF graphs.  ... 
doi:10.1145/3070607.3070610 dblp:conf/sigmod/Przyjaciel-Zablocki17 fatcat:wt42dwt6infpzn2ywq6n564vlu


Alexander Schätzle, Martin Przyjaciel-Zablocki, Simon Skilevic, Georg Lausen
2016 Proceedings of the VLDB Endowment  
RDF has become very popular for semantic data publishing due to its flexible and universal graph-like data model.  ...  Existing approaches often favor certain query pattern shapes while performance drops significantly for other shapes.  ...  SPARQL is the W3C recommended query language for RDF. A SPARQL query Q defines a graph pattern P that is matched against an RDF graph G.  ... 
doi:10.14778/2977797.2977806 fatcat:kehcu2c43rhczorh4nl7vkxlwu

SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink

Oscar Ceballos, Carlos Alberto Ramírez Restrepo, María Constanza Pabón, Andres M. Castillo, Oscar Corcho
2021 Applied Sciences  
Several approaches have been developed in this context proposing the storage and querying of RDF data in a distributed fashion, mainly using the MapReduce Programming Model and Hadoop-based ecosystems.  ...  Acknowledgments: The scalability test results on local cluster presented in this paper were obtained thanks to ViveLab Nariño, an initiative of Ministerio de Tecnologías de la Información y las Comunicaciones-MinTIC  ...  URIs represent common global identifiers for resources across the Web. Definition 2 (RDF Graph). An RDF graph is a set of RDF triples.  ... 
doi:10.3390/app11157033 fatcat:kqtyvqp645bctbpriwhwb5qgxu

S2RDF: RDF Querying with SPARQL on Spark [article]

Alexander Schätzle, Martin Przyjaciel-Zablocki, Simon Skilevic, Georg Lausen
2016 arXiv   pre-print
S2RDF achieves sub-second runtimes for majority of queries on a billion triples RDF graph.  ...  RDF has become very popular for semantic data publishing due to its flexible and universal graph-like data model.  ...  A SPARQL query Q defines a graph pattern P that is matched against an RDF graph G.  ... 
arXiv:1512.07021v3 fatcat:b3inj3oy7nbetlppjndv7hl4s4

A Survey of RDF Stores & SPARQL Engines for Querying Knowledge Graphs [article]

Waqas Ali, Muhammad Saleem, Bin Yao, Aidan Hogan, Axel-Cyrille Ngonga Ngomo
2021 arXiv   pre-print
This survey paper provides a comprehensive review of techniques and systems for querying RDF knowledge graphs.  ...  RDF has seen increased adoption in recent years, prompting the standardization of the SPARQL query language for RDF, and the development of local and distributed engines for processing SPARQL queries.  ...  We define the core of SPARQL in terms of basic graph patterns that express the core pattern matched against an RDF graph; navigational graph patterns that match arbitrary-length paths; complex graph patterns  ... 
arXiv:2102.13027v4 fatcat:phontczhbfcvdjt5y75n3hfcge

Storage, Indexing, Query Processing, and Benchmarking in Centralized and Distributed RDF Engines: A Survey [article]

Waqas Ali, Muhammad Saleem, Bin Yao, Aidan Hogan, Axel-Cyrille Ngonga Ngomo
2020 arXiv   pre-print
., SPARQL or SQL) used for query execution is a crucial optimization component of the RDF storage solutions.  ...  The type of indexing approach used in RDF engines is critical for fast data lookup.  ...  To achieve this goal, an intermediate algebra called the Nested TripleGroup Algebra (NTGA) was introduced in combination with a predicate-based indexing structure.  ... 
arXiv:2009.10331v2 fatcat:ou4nctjyj5c6jbh4osclewj62e