Critical issues in the design of a fault-tolerant multiprocessor database server

S.O. Hvasshovd, T. Saeter, O. Torbjornsen
1991 [1991] Proceedings Pacific Rim International Symposium on Fault Tolerant Systems  
Critical issues in designing a fault-tolerant multiprocessor database server are identified as: 1) System design must be based on components with well-defined states and state transitions. 2) Component redundancy is the foundation for a fault-tolerant design. 3) The system should be a shared-nothing architecture using homogeneous coarse-grained nodes. 4) Basic fault detection must be done in hardware. Fault masking and repair should be done in software to achieve flexibility and dynamic reconfigurability. 5) A database system is well-adapted to fault-tolerance because of its well-defined transaction concept. This concept is available externally and should be used internally. 6) The interconnection network should be multipath, have large communication capacity, and must be able to handle crashed nodes or channels. The communication protocol should be optimized for the actual application (a DBMS in this case). 7) The communication system should be central in internode error detection and reconfiguration of a fault-tolerant multiprocessor. 8) Data distribution and replication should be based on software mechanisms (as opposed to disc controller mechanisms) and data should be dynamically reconfigured in
doi:10.1109/rfts.1991.212941 fatcat:chspq2z6mvarpoqunipf5gggzu