PigSPARQL

Alexander Schätzle, Martin Przyjaciel-Zablocki, Georg Lausen
2011 Proceedings of the International Workshop on Semantic Web Information Management - SWIM '11  
In this paper we investigate the scalable processing of complex SPARQL queries on very large RDF datasets.  ...  We introduce PigSPARQL, a system which gives us the opportunity to process complex SPARQL queries on a MapReduce cluster.  ...  Graph. A SPARQL query dataset is a collection of RDF graphs with one default graph and zero or more additional named graphs. In general, a graph pattern is applied to the default graph.  ... 
doi:10.1145/1999299.1999303 dblp:conf/sigmod/SchatzlePL11 fatcat:w7wu57nqrbbvhjozkjsjhpyfq4

Optimizing RDF(S) queries on cloud platforms

HyeongSik Kim, Padmashree Ravindra, Kemafor Anyanwu
2013 Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion  
In this demonstration, we will present RAPID+, an extended Apache Pig system that uses an algebraic approach for optimizing queries on RDF data models including queries involving inferencing.  ...  The demo will show a comparative evaluation of NTGA query plans vs. relational algebra-like query plans used by Apache Pig and Hive.  ...  PROCESSING RDF GRAPH PATTERN QUERIES ON MAPREDUCE RDF and SPARQL.  ... 
doi:10.1145/2487788.2487917 dblp:conf/www/KimRA13 fatcat:4v2uckujt5grvjwvklibbqqili

Efficient processing of RDF graph pattern matching on MapReduce platforms

Padmashree Ravindra, Seokyong Hong, HyeongSik Kim, Kemafor Anyanwu
2011 Proceedings of the second international workshop on Data intensive computing in the clouds - DataCloud-SC '11  
Thus, answering queries (typically graph pattern matching queries) over RDF data requires several join operations to reassemble related data.  ...  In this position paper, we argue that some of these challenges can be overcome by rethinking the operators for graph pattern processing, as well as adopting dynamic optimization techniques that exploit  ...  two complementary strategies that can be used to optimize graph pattern matching queries without the need for time consuming preprocessing, (i) algebraic optimization based on a new algebra called the  ... 
doi:10.1145/2087522.2087527 fatcat:fjacm3udwrgrvckodltcszzfsi

An Intermediate Algebra for Optimizing RDF Graph Pattern Matching on MapReduce [chapter]

Padmashree Ravindra, HyeongSik Kim, Kemafor Anyanwu
2011 Lecture Notes in Computer Science  
This cost is prohibitive for RDF graph pattern matching queries which typically involve several join operations.  ...  A comparative performance evaluation of the traditional Pig approach and RAPID+ (Pig extended with NTGA) for graph pattern matching queries on the BSBM benchmark dataset is presented.  ...  Another observation [8] is that graph pattern matching queries on RDF data often consist of multiple star-structured graph sub patterns.  ... 
doi:10.1007/978-3-642-21064-8_4 fatcat:znr6unrnezdp3hnb22mugka5fq

Sempala: Interactive SPARQL Query Processing on Hadoop [chapter]

Alexander Schätzle, Martin Przyjaciel-Zablocki, Antony Neu, Georg Lausen
2014 Lecture Notes in Computer Science  
Indeed, existing SPARQL-on-Hadoop (MapReduce) approaches have already demonstrated very good scalability, however, query runtimes are rather slow due to the underlying batch processing framework.  ...  Our evaluation shows performance improvements by an order of magnitude compared to existing approaches, paving the way for interactive-time SPARQL query processing on Hadoop.  ...  [Figure residue: architecture diagram labels: RDF Loader, Query Compiler, SPARQL Parser, Algebra Compiler, Algebra Optimizer, Impala SQL Compiler]  ... 
doi:10.1007/978-3-319-11964-9_11 fatcat:nbt5xlop2jahlapxvfikcqkn5y

Scalable Ontological Query Processing over Semantically Integrated Life Science Datasets using MapReduce [article]

HyeongSik Kim, Kemafor Anyanwu
2016 arXiv   pre-print
In this paper, we present an approach for dealing with such complex queries on big data using MapReduce, along with an evaluation on existing real-world datasets and benchmark queries.  ...  However, due to the richness of most biomedical ontologies relative to other domain ontologies, the queries resulting from the query rewriting technique are often more complex than existing query optimization  ...  Conclusion and Future Work In this paper, we presented a comparative discussion on evaluating union graph pattern queries on MapReduce using relational-style algebra optimizations and optimizations based  ... 
arXiv:1602.01040v1 fatcat:ouw42v4amfcqddqjuqvwjqgxme

Towards scalable RDF graph analytics on MapReduce

Padmashree Ravindra, Vikas V. Deshpande, Kemafor Anyanwu
2010 Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud - MDAC '10  
There have been MapReduce-based approaches for pattern matching [13], [14] by decomposing graphs into RDF molecules.  ...  In this paper, we propose UDFs that (i) re-factor analytical processing on RDF graphs in a way that enables more parallelized processing (ii) perform a look-ahead processing to reduce the cost of subsequent  ...  queries on RDF graph models.  ... 
doi:10.1145/1779599.1779604 fatcat:tvk3s4hhhrazbo5gxn4i4pus44

An Effective and Efficient MapReduce Algorithm for Computing BFS-Based Traversals of Large-Scale RDF Graphs

Alfredo Cuzzocrea, Mirel Cosulschi, Roberto de Virgilio
2016 Algorithms  
When RDF graphs are defined on top of big (Web) data, they lead to the so-called large-scale RDF graphs, which reasonably populate the next-generation Semantic Web.  ...  In line with this trend, in this paper, we present an approach for efficiently implementing traversals of large-scale RDF graphs over MapReduce that is based on the Breadth First Search (BFS) strategy  ...  Contribution [71] considers the research challenge of optimizing RDF graph pattern matching on MapReduce.  ... 
doi:10.3390/a9010007 fatcat:wxorzsnnovbjvpnoc4prrucnty

RDF in the clouds: a survey

Zoi Kaoudi, Ioana Manolescu
2014 The VLDB Journal  
The Resource Description Framework (RDF) pioneered by the W3C is increasingly being adopted to model data in a variety of scenarios, in particular data to be published or exchanged on the Web.  ...  Managing large volumes of RDF data is challenging, due to the sheer size, the heterogeneity, and the further complexity brought by RDF reasoning.  ...  The results of each triple pattern guide the exploration of the graph for the next one.  ... 
doi:10.1007/s00778-014-0364-z fatcat:qyp6euinnvexxlliqe2wf45ona

Querying Semantic Knowledge Bases with SQL-on-Hadoop

Martin Przyjaciel-Zablocki, Alexander Schätzle, Georg Lausen
2017 Proceedings of the 4th Algorithms and Systems on MapReduce and Beyond - BeyondMR'17  
In this paper, we continue our work on TriAL-QL, an expressive (SQL-like) RDF query language based on the Triple Algebra with Recursion [31].  ...  We use our system to study the application of multiple evaluation algorithms, storage strategies and optimizations on Impala and SPARK while highlighting their properties.  ...  Sempala is built on top of Impala. Its RDF data layout is highly optimized for star-shaped queries and enables interactive querying times on large RDF graphs.  ... 
doi:10.1145/3070607.3070610 dblp:conf/sigmod/Przyjaciel-Zablocki17 fatcat:wt42dwt6infpzn2ywq6n564vlu

S2RDF

Alexander Schätzle, Martin Przyjaciel-Zablocki, Simon Skilevic, Georg Lausen
2016 Proceedings of the VLDB Endowment  
minimize query input size regardless of its pattern shape and diameter.  ...  Our prototype system S2RDF is built on top of Spark and uses SQL to execute SPARQL queries over ExtVP.  ...  SPARQL is the W3C recommended query language for RDF. A SPARQL query Q defines a graph pattern P that is matched against an RDF graph G.  ... 
doi:10.14778/2977797.2977806 fatcat:kehcu2c43rhczorh4nl7vkxlwu

Optimizing queries over semantically integrated datasets on MapReduce platforms

HyeongSik Kim, Kemafor Anyanwu
2013 2013 IEEE International Conference on Big Data  
In this poster, we focus on optimizing UNION queries (e.g., unions of conjunctives for inference) and present an algebraic interpretation of the query rewritings which are more amenable to efficient processing  ...  Querying such databases typically involves complex graph patterns, and evaluating such patterns poses challenges when MapReduce-based platforms are used to scale up processing, translating to long execution  ...  The efficiency of the evaluation of union queries on MapReduce-based platforms depends on the algebraic interpretation of the query.  ... 
doi:10.1109/bigdata.2013.6691788 dblp:conf/bigdataconf/KimA13 fatcat:3ie6dd44jjdqjk4h6hy2xcj4fe

A new system for massive RDF data management using Big Data query languages Pig, Hive, and Spark

Banane Mouad et al.
2020 International Journal of Computing and Digital Systems  
We conducted an experiment on three datasets containing a large volume of distributed RDF data on a powerful server cluster to validate our approach.  ...  For this kind of data, a specialized tool called SPARQL is dedicated to query semantic data represented by the Resource Description Framework or RDF.  ...  Note that a SPARQL query is addressed to the algebra part and that the expression of the SPARQL algebra is interpreted as a tree, this expression will be evaluated upwards through an optimizer.  ... 
doi:10.12785/ijcds/090211 fatcat:en4ndsz26vf65iyikvc47un4nm

SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink

Oscar Ceballos, Carlos Alberto Ramírez Restrepo, María Constanza Pabón, Andres M. Castillo, Oscar Corcho
2021 Applied Sciences  
Several approaches have been developed in this context proposing the storage and querying of RDF data in a distributed fashion, mainly using the MapReduce Programming Model and Hadoop-based ecosystems.  ...  Existing SPARQL query engines and triple stores are continuously improved to handle more massive datasets.  ...  Acknowledgments: The scalability test results on local cluster presented in this paper were obtained thanks to ViveLab Nariño, an initiative of Ministerio de Tecnologías de la Información y las Comunicaciones-MinTIC  ... 
doi:10.3390/app11157033 fatcat:kqtyvqp645bctbpriwhwb5qgxu

A MapReduce Approach to NoSQL RDF Databases [article]

Albert Haque
2016 arXiv   pre-print
Distributed graph databases must carefully optimize queries before generating MapReduce query plans as network traffic for large datasets can become prohibitive if the query is executed naively.  ...  The growth of machine-readable RDF triples has prompted both industry and academia to develop new database systems, called NoSQL, with characteristics that differ from classical databases.  ...  The optimized tree is then converted into a physical plan as a directed acyclic graph of MapReduce jobs [35].  ... 
arXiv:1601.01770v1 fatcat:sru7e2hrfne45i7duxfth6ntbu