Hardware-assisted replay of multiprocessor programs

David F. Bacon, Seth Copen Goldstein
1991 SIGPLAN Notices
Shared-memory parallel programs can be highly nondeterministic due to the unpredictable order in which shared references are satisfied. However, deterministic execution is extremely important for debugging and can also be used for fault tolerance and other replay-based algorithms. We present a hardware/software design that allows the order of memory references in a parallel program to be logged efficiently by recording a subset of the cache traffic between memory and the CPUs. This log can then be used along with hardware and software control to replay execution. Simulation of several parallel programs shows that our device records no more than 1.17 MB/second for an application exhibiting fine-grained sharing behavior on a 16-way multiprocessor consisting of 12 MIP CPUs. In addition, no probe effect or performance degradation is introduced. This represents an improvement of several orders of magnitude in both performance and log size over purely software-based methods proposed previously.

2 Previous Work

Debugging by replaying program execution has been of interest to the research community for over 20 years [2]. However, the cost of recording the necessary trace data
doi:10.1145/127695.122777 fatcat:7vavzj2jbrbynp3wgfxskpmdxe