Agreement in synchronous networks with ubiquitous faults
Theoretical Computer Science
In this paper we are interested in synchronous distributed systems subject to transient and ubiquitous failures. This includes systems where failures will occur on any communication link, systems where every processor will experience a send or receive failure at one time or another, and so on, with normal functioning resuming a finite time after each failure. Notice that these cases cannot be handled by the traditional component failure models. The model we use is the communication failure model, also called the transmission failure, dynamic faults, or mobile faults model. Using this model, we study the fundamental problem of agreement in synchronous networks of arbitrary topology with ubiquitous faults. We establish bounds on the number of dynamic faults that make any non-trivial form of agreement (even strong majority) impossible; in turn, these bounds express connectivity requirements that must be met to achieve any meaningful form of agreement. We also provide, constructively, bounds on the number of dynamic faults in spite of which any non-trivial form of agreement (even unanimity) is possible. These bounds are shown to be tight for a large class of networks, which includes hypercubes, toruses, rings, and complete graphs; incidentally, we close the existing gap between possibility and impossibility of non-trivial agreement in complete graphs in the presence of dynamic Byzantine faults. None of these results is derivable in the component failure models; in particular, all our possibility results hold in situations for which those models indicate impossibility.