DwarfCode: A Performance Prediction Tool for Parallel Applications

Weizhe Zhang, Albert M. K. Cheng, Jaspal Subhlok
2016 IEEE Transactions on Computers
We present DwarfCode, a performance prediction tool for MPI applications on diverse computing platforms. The goal is to accurately predict the running time of applications for task scheduling and job migration. First, DwarfCode collects the execution traces to record the computing and communication events. Then, it merges the traces from different processes into a single trace. After that, DwarfCode identifies and compresses the repeating patterns in the final trace to shrink the size of the
... nts. Finally, a dwarf code is generated to mimic the behavior of the original program. This smaller benchmark is replayed on the target platform to predict the performance of the original application. Generating such a benchmark poses two major challenges: reducing the time complexity of the trace merging algorithm and of the repeat compression algorithm. We propose an O(mpn) trace merging algorithm to combine the traces generated by separate MPI processes, where m denotes the upper bound of the tracing distance, p the number of processes, and n the maximum number of events in any trace. More importantly, we put forward a novel repeat compression algorithm whose time complexity is O(n log n). Experimental results show that DwarfCode can accurately predict the running time of MPI applications: the error rate is below 10 percent for compute-intensive and communication-intensive applications. The toolkit has been released for free download under the GNU General Public License v3.

NPB application                                BT      CG      EP  FT  IS  LU       MG      SP
The largest number of communication events     17,111  41,954  5   47  38  324,355  10,043  26,891
The smallest number of communication events    17,111  41,954  5   47  36  162,189  9,329   26,891
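
To make the idea of compressing repeating patterns in a merged trace concrete, here is a minimal Python sketch. It is an illustrative assumption, not the paper's method: it uses a naive greedy search for consecutive repeats and does not attempt to reach the O(n log n) bound claimed above. The event labels, the function names `compress_repeats` and `expand`, and the `max_period` parameter are all hypothetical.

```python
from typing import List, Tuple

Event = str  # hypothetical event label, e.g. "compute", "send", "recv"
Block = Tuple[Tuple[Event, ...], int]  # (pattern, repetition count)


def compress_repeats(trace: List[Event], max_period: int = 8) -> List[Block]:
    """Greedily fold consecutive repeats into (pattern, count) blocks.

    Naive illustration only; this is NOT the O(n log n) algorithm
    described in the abstract.
    """
    out: List[Block] = []
    i, n = 0, len(trace)
    while i < n:
        best_period, best_count = 1, 1
        for period in range(1, max_period + 1):
            if i + period > n:
                break
            pattern = trace[i:i + period]
            count = 1
            # Count how many times the pattern repeats back to back.
            while trace[i + count * period:i + (count + 1) * period] == pattern:
                count += 1
            # Keep the candidate covering the most events; ties keep the
            # shorter period found earlier.
            if count >= 2 and count * period > best_count * best_period:
                best_period, best_count = period, count
        out.append((tuple(trace[i:i + best_period]), best_count))
        i += best_period * best_count
    return out


def expand(blocks: List[Block]) -> List[Event]:
    """Inverse of compress_repeats: rebuild the full event sequence."""
    trace: List[Event] = []
    for pattern, count in blocks:
        trace.extend(list(pattern) * count)
    return trace


if __name__ == "__main__":
    demo = ["compute", "send", "recv"] * 4 + ["barrier"]
    print(compress_repeats(demo))
```

In this sketch a loop body that repeats four times collapses into a single block with a count of 4, which is the kind of size reduction that lets a short generated benchmark stand in for a long trace. Expanding the blocks reproduces the input trace exactly, so the compression is lossless.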
doi:10.1109/tc.2015.2417526 fatcat:jx5zt2ozw5dnviw62ihxuue6zy