Using program analysis to identify and compensate for nondeterminism in fault-tolerant, replicated systems

J.G. Slember, P. Narasimhan
Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004.
Fault-tolerant replicated applications are typically assumed to be deterministic, in order to ensure reproducible, consistent behavior and state across a distributed system. Real applications often contain nondeterministic features that cannot be eliminated. Through the novel application of program analysis to distributed CORBA applications, we decompose an application into its constituent structures, and discover the kinds of nondeterminism present within the application. We target the instances of nondeterminism that can be compensated for automatically, and highlight to the application programmer those instances of nondeterminism that need to be manually rectified. We demonstrate our approach by compensating for specific forms of nondeterminism and by quantifying the associated performance overheads. The resulting code growth is typically limited to one extra line for every instance of nondeterminism, and the runtime overhead is minimal, compared to a fault-tolerant application with no compensation for nondeterminism.
doi:10.1109/reldis.2004.1353026 dblp:conf/srds/SlemberN04 fatcat:ac3yhkjpwrhwbcwb5d2yu36o3u
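
The abstract describes using program analysis to discover instances of nondeterminism and to point the programmer at them. As a rough illustration of that idea only, the sketch below walks a Python syntax tree and reports calls that appear in a small catalogue of nondeterministic functions. It is not the authors' tool: the paper targets CORBA applications, and the catalogue `NONDETERMINISTIC_CALLS`, the function `find_nondeterminism`, and the `SAMPLE` program are all hypothetical names introduced here.

```python
import ast

# Hypothetical catalogue of nondeterministic calls (an assumption for this
# sketch, not a list taken from the paper).
NONDETERMINISTIC_CALLS = {
    ("time", "time"),
    ("random", "random"),
    ("random", "randint"),
    ("os", "getpid"),
}


def find_nondeterminism(source):
    """Return sorted (line, 'module.func') pairs for catalogued calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only match the simple form `module.function(...)`.
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            key = (func.value.id, func.attr)
            if key in NONDETERMINISTIC_CALLS:
                findings.append((node.lineno, f"{key[0]}.{key[1]}"))
    return sorted(findings)


# Hypothetical replicated request handler used as input to the analysis.
SAMPLE = """\
import time, random

def handle_request(balance):
    stamp = time.time()
    jitter = random.random()
    return balance + 1, stamp, jitter
"""

if __name__ == "__main__":
    for line, call in find_nondeterminism(SAMPLE):
        print(f"line {line}: {call}")
```

Each reported line is a candidate for either automatic compensation or manual review. This sketch only performs the detection step and only recognizes direct `module.function(...)` calls, so aliased imports and other sources of nondeterminism would go unnoticed.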