Implementing Trustworthy Services Using Replicated State Machines [chapter]

Fred B. Schneider, Lidong Zhou
2005 Lecture Notes in Computer Science  
"Divide and conquer" can be a powerful tool for disentangling complexity when designing a computing system. However, some aspects of system design are inseparable. Treating these as though they were independent leads to one interfering with the other, and "divide and be conquered" perhaps better characterizes the consequences. For some years, we have been investigating how to construct systems that continue functioning despite component failures and attacks. A question we have pondered is to what extent does divide and conquer apply? Somewhat less than you might hope is, unfortunately, the answer. One could argue that attacks can be seen as just another cause for component failure. The Byzantine fault model asserts that a faulty component can exhibit arbitrarily malicious (so-called "Byzantine") behavior; a system that tolerates Byzantine faults should then be able to handle anything. Moreover, because any component can be viewed abstractly in terms of its state and a set of possible next-state transitions (in short, a state machine), fault-tolerant services can be built by assembling enough state-machine copies so that outputs from the ones exhibiting Byzantine behavior are outvoted by the correctly functioning ones. The fault-tolerance of the ensemble thus exceeds the fault-tolerance of any individual state machine, and a distributed fault-tolerance is the result. A closer look at such replicated state machines, however, reveals problems when attacks are possible. Specific difficulties with the approach and how we can overcome these are described later in this article, but the overall vision remains compelling: place more trust in an ensemble than in any of its individual components. In analogy with distributed fault-tolerance, then, we are seeking ways to implement distributed trust.
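The outvoting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that every replica receives the same commands in the same order (the agreement step is taken for granted here), that at most f replicas are faulty, and that 2f + 1 copies are used so that f + 1 matching outputs always include one from a correct replica. The names `CounterMachine`, `ByzantineMachine`, `vote`, and `run_ensemble` are hypothetical and chosen only for illustration.

```python
from collections import Counter


class CounterMachine:
    """A deterministic state machine: the state is an integer, commands add to it."""

    def __init__(self):
        self.state = 0

    def apply(self, amount):
        self.state += amount
        return self.state


class ByzantineMachine(CounterMachine):
    """A faulty replica (hypothetical) that always reports a wrong output."""

    def apply(self, amount):
        super().apply(amount)
        return -1  # stand-in for arbitrary, possibly malicious, output


def vote(outputs, f):
    """Return the output reported by at least f + 1 replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= f + 1 else None


def run_ensemble(replicas, commands, f):
    """Apply each command to every replica in the same order, voting on each output."""
    results = []
    for command in commands:
        outputs = [replica.apply(command) for replica in replicas]
        results.append(vote(outputs, f))
    return results


f = 1  # assumed bound on the number of faulty replicas
replicas = [CounterMachine(), CounterMachine(), ByzantineMachine()]  # 2f + 1 copies
results = run_ensemble(replicas, [5, 3, -2], f)
print(results)
```

The single faulty copy is outvoted on every command, so the ensemble's answers match those of a correct machine. The sketch covers only output voting; it does not address the attack-related difficulties that the abstract says arise for replicated state machines.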
doi:10.1007/978-3-642-11294-2_8 fatcat:pjsapx7n4fgv5ms7v2fhvg4yjq