Provenance Verification [chapter]

Rupak Majumdar, Roland Meyer, Zilong Wang
2013 Lecture Notes in Computer Science  
The provenance of an object is the history of its origin and derivation. Provenance tracking records the provenance of an object as it evolves. In computer science, provenance tracking has been studied in many different settings, such as databases [7, 3, 2], scientific workflows [13, 5], and program analysis [4, 12, 9], often under different names (lineage, dependence analysis, taint analysis) and with varying degrees of (in)formality. Provenance information can be used in many ways, for example, to identify which sources of data led to a result, to ensure reproducibility of a scientific workflow, or to check security properties such as information flow.
We study provenance tracking in the context of distributed message-passing programs. These programs consist of principals, who communicate with each other and associate additional information with messages: the provenance. In a simple setting, the provenance records the sequence of principals that accessed the message in the past (with principals potentially appearing multiple times). We study the provenance verification problem: the problem of statically checking whether the provenances of all messages belong to a specified regular set of provenances along all possible executions of the program.
We give a unifying view of provenance tracking for distributed message-passing programs. Following Souilah, Francalanza, and Sassone [14], we model distributed systems in the π-calculus and give a provenance-carrying semantics. This semantics is relative to a domain of provenance annotations. Besides the regular word-languages mentioned above, we use the domains of provenance sets and regular tree-languages.
We focus on the algorithmic verification of provenances. Since the provenance-verification problem is undecidable for the full π-calculus, we consider restricted classes of programs. Our main result shows that provenance verification is decidable for the class of depth-bounded π-calculus processes [11], an expressive class that subsumes most known decidable subclasses of the π-calculus. Intuitively, depth-boundedness is a restriction on the communication topologies which limits the length of acyclic paths. Depth-bounded systems strictly generalize Petri nets, and are expressive enough to capture common programming models, such as asynchronous programs [6], actor-like programs [15], and some further generalizations.
We show a reduction from provenance verification to coverability of depth-bounded processes, a problem shown to be decidable [15]. Our proof uses well-structuredness arguments [1] with symbolic representations of automata. Interestingly, the general method is strong enough to recover the decidability of provenance verification for asynchronous message-passing programs with finite data domains [10].
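The simple setting described in the abstract, where a provenance is the sequence of principals that accessed a message and the specification is a regular set, can be illustrated with a small sketch. This is not the authors' construction: it checks the provenance of one concrete message at run time, whereas the chapter addresses the static problem over all executions. The names `Message`, `forward`, and `conforms`, the single-character principal names, the oldest-first ordering of the sequence, and the example specification are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A payload plus its provenance (hypothetical representation)."""
    payload: str
    # Sequence of principals that accessed the message, oldest first (assumed order).
    provenance: tuple = ()


def forward(msg: Message, principal: str) -> Message:
    """Return a copy of msg whose provenance records one more access."""
    # Assumption: principals are single alphanumeric characters, so a
    # provenance can be read as a word over a finite alphabet.
    if len(principal) != 1 or not principal.isalnum():
        raise ValueError("principal must be a single alphanumeric character")
    return Message(msg.payload, msg.provenance + (principal,))


def conforms(msg: Message, spec: str) -> bool:
    """Check whether the provenance word lies in the regular set given by spec."""
    word = "".join(msg.provenance)
    return re.fullmatch(spec, word) is not None


# Hypothetical policy: created by a, relayed by b or c any number of times,
# and finally received by d.
SPEC = r"a[bc]*d"

msg = Message("order-17")
for p in "abcbd":
    msg = forward(msg, p)

print(msg.provenance, conforms(msg, SPEC))
```

A regular expression stands in for the regular set here because Python's `re.fullmatch` gives a direct membership test; an explicit finite automaton would serve equally well.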
doi:10.1007/978-3-642-41036-9_3 fatcat:5qrlyikhizgsbmfjl6yehhc3oe