An introduction to snapshot algorithms in distributed computing

A D Kshemkalyani, M Raynal, M Singhal
1995 Distributed Systems Engineering  
Recording global states of distributed executions on the fly is an important paradigm for analysing, testing, or verifying properties of these executions. Since Chandy and Lamport's seminal paper on this topic, this problem has been called the snapshot problem. Unfortunately, the lack of both a globally shared memory and a global clock in a distributed system, together with the fact that transfer delays in these systems are finite but unpredictable, makes this problem non-trivial. This paper first discusses the issues that have to be addressed to compute distributed snapshots in a consistent way. It then presents several algorithms that determine such snapshots on the fly for several types of networks, classified by the properties of their communication channels: FIFO, non-FIFO, and causal delivery.
Columbus, OH 43210, USA
Example: Let S1 and S2 be two distinct sites of a distributed system which maintain bank accounts A and B, respectively. A site refers to a process in this example. Let the communication channels from site S1 to site S2 and from site S2 to site S1 be denoted by C12 and C21, ...
© 1995 The British Computer Society, The Institution of Electrical Engineers and IOP Publishing Ltd
doi:10.1088/0967-1846/2/4/005 fatcat:jea5erm5xvg45ht7qe36ot4beq