A "flight data recorder" for enabling full-system multiprocessor deterministic replay

Min Xu, Rastislav Bodik, Mark D. Hill
2003 Proceedings of the 30th annual international symposium on Computer architecture - ISCA '03  
Debuggers have been proven indispensable in improving software reliability. Unfortunately, on most real-life software, debuggers fail to deliver their most essential feature -a faithful replay of the execution. The reason is non-determinism caused by multithreading and non-repeatable inputs. A common solution to faithful replay has been to record the non-deterministic execution. Existing recorders, however, either work only for datarace-free programs or have prohibitive overhead. As a step
more » ... ds powerful debugging, we develop a practical low-overhead hardware recorder for cachecoherent multiprocessors, called Flight Data Recorder (FDR). Like an aircraft flight data recorder, FDR continuously records the execution, even on deployed systems, logging the execution for post-mortem analysis. FDR is practical because it piggybacks on the cache coherence hardware and logs nearly the minimal threadordering information necessary to faithfully replay the multiprocessor execution. Our studies, based on simulating a four-processor server with commercial workloads, show that when allocated less than 7% of system's physical memory, our FDR design can capture the last one second of the execution at modest (less than 2%) slowdown.
doi:10.1145/859618.859633 fatcat:vyyaokj7ljel5fc64u5oduaj2q