Load Balanced Semantic Aware Distributed RDF Graph [article]

Ami Pandat, Nidhi Gupta, Minal Bhise
2021 pre-print
Modern semantic applications store data as Resource Description Framework (RDF) data. Due to the proliferation of RDF data, efficient management of huge RDF datasets has become essential. A number of approaches, both relational and graph-based, have been devised to handle this data. As the relational approach suffers from query joins, we propose a semantic-aware, graph-based partitioning method. The partitioned fragments are then allocated in a load-balanced way. For ... ent query processing, partial replication is implemented. It reduces inter-node communication, thereby accelerating queries on the distributed RDF graph. This approach is demonstrated in two phases, partitioning and distribution of Linked Observation Data (LOD). The time complexity for partitioning and distribution of the Load Balanced Semantic Aware RDF Graph (LBSD) is O(n), where n is the number of triples; this is demonstrated by a linear increase in algorithm execution time (AET) for LOD data scaled from 1x to 5x. LBSD has been found to behave well up to 4x. LBSD is compared with state-of-the-art relational and graph-based partitioning techniques. LBSD records a 71% query execution time (QET) gain when averaged over all four query types. For the most frequent query types, Linear and Star, an average QET gain of 65% over the original configuration is recorded in the scaling experiments. The optimal replication level has been found to be 12% of the original data.
doi:10.1145/3472163.347216 arXiv:2107.10831v1 fatcat:4xlwpm6u25f3fmyiamuksjrl5i
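
The two phases named in the abstract, partitioning followed by load-balanced allocation, can be illustrated with a small sketch. This is not the authors' LBSD algorithm, which the abstract does not spell out. It assumes, purely for illustration, that triples are grouped by predicate as a stand-in for semantic awareness and that fragments are placed greedily on the least-loaded node. The function names, the predicate names, and the sample triples are all hypothetical.

```python
from collections import defaultdict
import heapq


def partition_by_key(triples, key=lambda t: t[1]):
    """Group triples into fragments in a single pass over the data.

    Assumption: the grouping key is the predicate (t[1]); the paper's
    actual semantic criterion may differ.
    """
    fragments = defaultdict(list)
    for triple in triples:
        fragments[key(triple)].append(triple)
    return dict(fragments)


def allocate_balanced(fragments, num_nodes):
    """Greedy allocation: largest fragment first, onto the least-loaded node."""
    heap = [(0, node) for node in range(num_nodes)]  # (current load, node id)
    heapq.heapify(heap)
    placement = {node: [] for node in range(num_nodes)}
    # Sort by descending size, then by name so ties are deterministic.
    ordered = sorted(fragments.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, fragment in ordered:
        load, node = heapq.heappop(heap)
        placement[node].append(name)
        heapq.heappush(heap, (load + len(fragment), node))
    return placement


# Hypothetical observation-style triples: (subject, predicate, object).
triples = [
    ("obs1", "hasResult", "r1"),
    ("obs2", "hasResult", "r2"),
    ("obs3", "hasResult", "r3"),
    ("obs1", "observedBy", "s1"),
    ("obs2", "observedBy", "s1"),
    ("s1", "locatedAt", "p1"),
]

fragments = partition_by_key(triples)
placement = allocate_balanced(fragments, num_nodes=2)
loads = {
    node: sum(len(fragments[name]) for name in names)
    for node, names in placement.items()
}
print(placement)
print(loads)
```

The grouping step touches each triple once, which matches the linear behaviour the abstract reports for partitioning. The greedy placement additionally sorts the fragments, a cost that depends on the number of fragments rather than the number of triples. Partial replication is not modelled here.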