A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
The file type is application/pdf.
Distributed data provenance for large-scale data-intensive computing
2013
2013 IEEE International Conference on Cluster Computing (CLUSTER)
It has become increasingly important to capture and understand the origins and derivation of data (its provenance). A key issue in evaluating the feasibility of data provenance is its performance, overheads, and scalability. In this paper, we explore the feasibility of a general metadata storage and management layer for parallel file systems, in which metadata includes both file operations and provenance metadata. We experimentally investi
doi:10.1109/cluster.2013.6702685
dblp:conf/cluster/ZhaoSMR13
fatcat:yxktbz7zu5gf7m2umseien36de