Towards scalable RDF graph analytics on MapReduce

Padmashree Ravindra, Vikas V. Deshpande, Kemafor Anyanwu
2010 Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud - MDAC '10  
MapReduce-based parallel processing systems like Pig have gained success in processing scalable analytical workloads.  ...  In this paper, we propose UDFs that (i) re-factor analytical processing on RDF graphs in a way that enables more parallelized processing (ii) perform a look-ahead processing to reduce the cost of subsequent  ...  queries on RDF graph models.  ... 
doi:10.1145/1779599.1779604 fatcat:tvk3s4hhhrazbo5gxn4i4pus44

An Effective and Efficient MapReduce Algorithm for Computing BFS-Based Traversals of Large-Scale RDF Graphs

Alfredo Cuzzocrea, Mirel Cosulschi, Roberto de Virgilio
2016 Algorithms  
When RDF graphs are defined on top of big (Web) data, they lead to the so-called large-scale RDF graphs, which reasonably populate the next-generation Semantic Web.  ...  In line with this trend, in this paper, we present an approach for efficiently implementing traversals of large-scale RDF graphs over MapReduce that is based on the Breadth First Search (BFS) strategy  ...  MapReduce Algorithms for RDF Graphs Paper [70] focuses the attention on the problem of effectively and efficiently supporting scalable storage and retrieval of large volumes of in-memory-representations  ... 
doi:10.3390/a9010007 fatcat:wxorzsnnovbjvpnoc4prrucnty


Alexander Schätzle, Martin Przyjaciel-Zablocki, Georg Lausen
2011 Proceedings of the International Workshop on Semantic Web Information Management - SWIM '11  
In this paper we investigate the scalable processing of complex SPARQL queries on very large RDF datasets.  ...  Pig Latin programs are executed by a series of MapReduce jobs on a Hadoop cluster.  ...  [25] use UDFs to reduce I/O costs in analytical queries over RDF graphs with Pig Latin.  ... 
doi:10.1145/1999299.1999303 dblp:conf/sigmod/SchatzlePL11 fatcat:w7wu57nqrbbvhjozkjsjhpyfq4

A Data-flow Language for Big RDF Data Processing

Fadi Maali
2014 International Semantic Web Conference  
On the other hand, a graph-based data model and support for pattern matching as in SPARQL are to be adopted.  ...  Given the focus on large-scale data, scalability and efficiency are critical requirements. In this paper, I report on my research plan and describe some preliminary results.  ...  Finally, the graph is topologically sorted and the MapReduce jobs are scheduled to execute on the cluster.  ... 
dblp:conf/semweb/Maali14a fatcat:mju6owljxjenfi7vr7kiqxnaue

ICDE conference 2015 detailed author index

2015 2015 IEEE 31st International Conference on Data Engineering  
in Event-Based Social Networks Li, Yanhua 1376 Growing the Charging Station Network for Electric Vehicles with Trajectory Data Analytics Li, Youhuan 1508 A Graph-Based RDF Triple Store Li,  ...  Plans for Massively Parallel RDF Queries 1432 CliqueSquare in Action: Flat Plans for Massively Parallel RDF Queries 1541 Reasoning on Web Data: Algorithms and Performance G continues on next page  ... 
doi:10.1109/icde.2015.7113260 fatcat:ep7pomkm55f45j33tkpoc5asim

Detailed author index

2015 2015 31st IEEE International Conference on Data Engineering Workshops  
Towards Web-Scale How-Provenance Goasdoue, François 71 Efficient OLAP Operations for RDF Analytics Gruenwald, Le 34  ...  Madsen, Kasper Grud Skat 10 Dynamic Resource Management in a MapReduce-Style Platform for Fast Data Processing Manolescu, Ioana 71 Efficient OLAP Operations for RDF Analytics Meng, Rui 216 On Bottleneck-Aware  ... 
doi:10.1109/icdew.2015.7129529 fatcat:4vkbzbkin5fvhmiibrbqxegjaq

Clause-iteration with MapReduce to scalably query datagraphs in the SHARD graph-store

Kurt Rohloff, Richard E. Schantz
2011 Proceedings of the fourth international workshop on Data-intensive distributed computing - DIDC '11  
The Clause-Iteration algorithms form the basis of our scalable, SHARD graph-store built on the Hadoop implementation of MapReduce.  ...  We present a scalable cloud-based approach to process queries on graph data utilizing the MapReduce model. We call this approach the Clause-Iteration approach.  ...  We introduce a scalable cloud-based approach to process queries on graph data based on iterating over clauses in graph queries to construct query responses using the MapReduce paradigm [5].  ... 
doi:10.1145/1996014.1996021 dblp:conf/hpdc/RohloffS11 fatcat:kdxufwpnbrbtnnrce7suz6q4bu

Sempala: Interactive SPARQL Query Processing on Hadoop [chapter]

Alexander Schätzle, Martin Przyjaciel-Zablocki, Antony Neu, Georg Lausen
2014 Lecture Notes in Computer Science  
Indeed, existing SPARQL-on-Hadoop (MapReduce) approaches have already demonstrated very good scalability, however, query runtimes are rather slow due to the underlying batch processing framework.  ...  Driven by initiatives like, the amount of semantically annotated data is expected to grow steadily towards massive scale, requiring cluster-based solutions to query it.  ...  Due to space limitations, we focus on the most relevant key points in the following. Every SPARQL query defines a graph pattern to be matched against an RDF graph.  ... 
doi:10.1007/978-3-319-11964-9_11 fatcat:nbt5xlop2jahlapxvfikcqkn5y

Towards an RDF Analytics Language: Learning from Successful Experiences

Fadi Maali, Stefan Decker
2013 International Semantic Web Conference  
These languages are carefully designed to run on top of distributed computation platforms.  ...  In particular, design decisions related to the data model, schema restrictions, data transformation and the programming paradigm are examined and a number of related challenges for defining an RDF analytics  ...  The ideas in this paper benefited from valuable discussions with Aidan Hogan and Marcel Karnstedt and from the material of the "Introduction to Data Science" course on Coursera by Bill Howe.  ... 
dblp:conf/semweb/MaaliD13 fatcat:ept4br3gorejxd3hulwg26evz4

A distributed graph engine for web scale RDF data

Kai Zeng, Jiacheng Yang, Haixun Wang, Bin Shao, Zhongyuan Wang
2013 Proceedings of the VLDB Endowment  
Furthermore, since the data is stored in its native graph form, the system can support other operations (e.g., random walks, reachability) on RDF graphs as well.  ...  Furthermore, many useful and general purpose graph-based operations (e.g., random walk, reachability, community discovery) on RDF data are not supported, as most existing systems store and index data in  ...  We also note that since Trinity.RDF models data as a native graph, we enable a large range of advanced graph analytics on RDF data.  ... 
doi:10.14778/2535570.2488333 fatcat:onw7ewf765b3vho4jyhloeybcu

Towards Making Distributed RDF Processing FLINKer

Amr Azzam, Sabrina Kirrane, Axel Polleres
2018 2018 4th International Conference on Big Data Innovations and Applications (Innovate-Data)  
In this position paper, based on an in-depth analysis of the state of the art, we propose to manage large RDF datasets in Flink, a well-known scalable distributed Big Data processing framework.  ...  Our approach, which we refer to as FLINKer, extends the native graph abstraction of Flink, called Gelly, with RDF graph and SPARQL query processing capabilities.  ...  H2RDF+ [28] is a distributed RDF data store based on MapReduce processing and HBase indexes.  ... 
doi:10.1109/innovate-data.2018.00009 dblp:conf/obd/AzzamKP18 fatcat:s7la6h4c7fgvrf6e4qz37xz34e

Storing, Indexing and Querying Large Provenance Data Sets as RDF Graphs in Apache HBase

Artem Chebotko, John Abraham, Pearl Brazier, Anthony Piazza, Andrey Kashlev, Shiyong Lu
2013 2013 IEEE Ninth World Congress on Services  
In this work, we explore and address the challenge of efficient and scalable storage and querying of large collections of provenance graphs serialized as RDF graphs in an Apache HBase database.  ...  Specifically, we propose: (i) novel storage and indexing techniques for RDF data in HBase that are better suited for provenance datasets rather than generic RDF graphs and (ii) novel SPARQL query evaluation  ...  Efficient approaches to analytical query processing and distributed reasoning on RDF graphs in MapReduce-based systems are proposed in [20] and [21].  ... 
doi:10.1109/services.2013.32 dblp:conf/services/ChebotkoABPKL13 fatcat:iodufllupzerzittn5prlweir4

WGB: Towards a Universal Graph Benchmark [chapter]

Khaled Ammar, M. Tamer Özsu
2014 Lecture Notes in Computer Science  
Unfortunately, with the exception of RDF stores, every system uses different datasets and queries to assess its scalability and efficiency.  ...  to real-life ones.  ...  Despite the scalability and simplicity of MapReduce systems, the model has a few shortcomings in processing graph analysis algorithms.  ... 
doi:10.1007/978-3-319-10596-3_6 fatcat:meji5npkx5g4rpcfsrppgatzb4

Distributed Semantic Web Data Management in HBase and MySQL Cluster

Craig Franke, Samuel Morin, Artem Chebotko, John Abraham, Pearl Brazier
2011 2011 IEEE 4th International Conference on Cloud Computing  
In this work, we study and compare two approaches to distributed RDF data management based on emerging cloud computing technologies and traditional relational database clustering technologies.  ...  In particular, we design distributed RDF data storage and querying schemes for HBase and MySQL Cluster and conduct an empirical comparison of these approaches on a cluster of commodity machines using datasets  ...  Efficient approaches to analytical query processing and distributed reasoning on RDF graphs in MapReduce-based systems are proposed in [7] and [8], respectively.  ... 
doi:10.1109/cloud.2011.19 dblp:conf/IEEEcloud/FrankeMCAB11 fatcat:c73umymenvfhri5dqadjx2y25e

RDF in the clouds: a survey

Zoi Kaoudi, Ioana Manolescu
2014 The VLDB Journal  
The Resource Description Framework (RDF) pioneered by the W3C is increasingly being adopted to model data in a variety of scenarios, in particular data to be published or exchanged on the Web.  ...  Cloud computing is an emerging paradigm massively adopted in many applications for the scalability, fault-tolerance and elasticity features it provides, enabling the easy deployment of distributed and  ...  other than MapReduce, and finally (iv) based on graph partitioning and graph traversals.  ... 
doi:10.1007/s00778-014-0364-z fatcat:qyp6euinnvexxlliqe2wf45ona