The failure and recovery problem for replicated databases

Philip A. Bernstein, Nathan Goodman
1983 Proceedings of the second annual ACM symposium on Principles of distributed computing - PODC '83  
A replicated database is a distributed database in which some data items are stored redundantly at multiple sites. The main goal is to improve system reliability: by storing critical data at multiple sites, the system can continue to operate even though some sites have failed. However, few distributed database systems support replicated data, because it is difficult to manage as sites fail and recover. A replicated data algorithm has two parts. One is a discipline for reading and writing data item copies.
The other is a concurrency control algorithm for synchronizing those operations. The read-write discipline ensures that if one transaction writes logical data item x, and another transaction reads or writes x, there is some physical manifestation of that logical conflict. The concurrency control algorithm synchronizes physical conflicts; it knows nothing about logical conflicts. In a correct replicated data algorithm, the physical manifestation of conflicts must be strong enough that synchronizing physical conflicts is sufficient for correctness. This paper presents a theory for proving the correctness of algorithms that manage replicated data. The theory is an extension of serializability theory. We apply it to three replicated
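To make the abstract's central requirement concrete, the sketch below shows one well-known read-write discipline in which logical conflicts always manifest physically: quorum intersection, where any read quorum overlaps every write quorum. This is an illustrative assumption, not necessarily one of the three algorithms the paper analyzes; the class and method names are hypothetical.

```python
# A minimal sketch of a quorum-style read-write discipline for a single
# replicated logical data item x. Assumption for illustration: quorum
# intersection (rq + wq > n) is the mechanism that turns every logical
# read-write or write-write conflict into a conflict on some shared copy.
class ReplicatedItem:
    def __init__(self, n_copies, read_quorum, write_quorum):
        # Quorums must intersect, so a read always sees at least one
        # copy touched by the most recent write.
        assert read_quorum + write_quorum > n_copies
        self.copies = [(0, None)] * n_copies   # (version, value) per site
        self.rq, self.wq = read_quorum, write_quorum

    def write(self, value, sites):
        assert len(sites) >= self.wq, "need a full write quorum"
        new_version = max(v for v, _ in self.copies) + 1
        for s in sites:
            self.copies[s] = (new_version, value)

    def read(self, sites):
        assert len(sites) >= self.rq, "need a full read quorum"
        # Because quorums intersect, the highest-versioned copy among
        # the contacted sites reflects the latest committed write.
        return max(self.copies[s] for s in sites)[1]
```

For example, with five copies and quorums of three, a write to sites {0, 1, 2} is visible to a subsequent read from sites {2, 3, 4}, since the two quorums share site 2. A concurrency control algorithm then only needs to synchronize the physical operations on that shared copy.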
doi:10.1145/800221.806714 dblp:conf/podc/BernsteinG83 fatcat:hgkfy6ug2zgd3k5jixxq2twrsq