Dynamic slicing on Java bytecode traces

Tao Wang, Abhik Roychoudhury
2008 ACM Transactions on Programming Languages and Systems  
Dynamic slicing is a well-known technique for program analysis, debugging and understanding. Given a program P and input I, it finds all program statements which directly/indirectly affect the values of some variables' occurrences when P is executed with I. In this paper, we develop a dynamic slicing method for sequential Java programs. Our technique proceeds by backwards traversal of the bytecode trace produced by an input I in a given program P. Since such traces can be huge, we use results
from data compression to compactly represent bytecode traces. The major space savings in our method come from the optimized representation of (a) data addresses used as operands by memory reference bytecodes, and (b) instruction addresses used as operands by control transfer bytecodes. We show how dynamic slicing algorithms can directly traverse our compact bytecode traces without resorting to costly decompression. We also extend our dynamic slicing algorithm to perform "relevant slicing"; the resultant slices can be used to explain omission errors, that is, why some events did not happen during program execution. Detailed experimental results on space/time overheads of tracing and slicing are reported in the paper. The slices computed at the bytecode level are translated back by our tool to the source code level with the help of information available in Java class files. Our JSlice dynamic slicing tool has been integrated with the Eclipse platform and is available for use in research and development.

[...] proceeds by collecting the execution trace corresponding to I. The data and control dependencies between the statement occurrences in the execution trace can be pre-computed or computed on demand during slicing [Zhang et al. 2005]. Due to the presence of objects and pointers in programs, static data dependence computations are often conservative, leading to very large static slices. Dynamic slices, on the other hand, capture the closure of dynamic data and control dependencies; hence they are much more precise and more helpful for narrowing the attention of the programmer. Furthermore, since dynamic slices denote the program fragment affecting the slicing criterion for a particular input, they naturally support debugging by running selected test inputs.
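As an illustration of "the closure of dynamic data and control dependencies", the following is a minimal sketch of backward dynamic slicing over a statement-level trace. It is not the paper's algorithm or JSlice code: the `Event` record, the `dynamic_slice` function, and the toy program in the comments are hypothetical names chosen for this example, and the sketch assumes each trace entry already records which variables it reads and writes and which branch occurrence controls it.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """One statement occurrence in an execution trace (hypothetical format)."""
    stmt: int               # static statement number in the source program
    defs: frozenset         # variables written by this occurrence
    uses: frozenset         # variables read by this occurrence
    ctrl: Optional[int]     # trace index of the controlling branch occurrence


def ev(stmt, defs=(), uses=(), ctrl=None):
    return Event(stmt, frozenset(defs), frozenset(uses), ctrl)


def dynamic_slice(trace, criterion):
    """Return the statements that affect the occurrence at trace[criterion].

    A single backward pass over the trace follows dynamic data
    dependencies (last writer of each needed variable) and dynamic
    control dependencies (the branch occurrence guarding each sliced
    statement occurrence).
    """
    start = trace[criterion]
    in_slice = {start.stmt}
    live = set(start.uses)          # variables whose last writer is still sought
    needed_branches = set()         # trace indices of branches the slice depends on
    if start.ctrl is not None:
        needed_branches.add(start.ctrl)

    for i in range(criterion - 1, -1, -1):
        e = trace[i]
        if not (e.defs & live) and i not in needed_branches:
            continue                # this occurrence does not affect the criterion
        in_slice.add(e.stmt)
        live -= e.defs              # these definitions are now explained
        live |= e.uses              # their inputs must be explained in turn
        if e.ctrl is not None:
            needed_branches.add(e.ctrl)
    return in_slice


# Toy program, executed with input x = 5:
#   1: x = input()     2: y = 0        3: z = 1
#   4: if x > 0:       5:     y = x + 1
#   6: z = z * 2       7: print(y)
EXAMPLE_TRACE = [
    ev(1, defs={"x"}),
    ev(2, defs={"y"}),
    ev(3, defs={"z"}),
    ev(4, uses={"x"}),
    ev(5, defs={"y"}, uses={"x"}, ctrl=3),   # guarded by the branch at trace index 3
    ev(6, defs={"z"}, uses={"z"}),
    ev(7, uses={"y"}),
]

print(sorted(dynamic_slice(EXAMPLE_TRACE, criterion=6)))
```

For this input, statement 2 (`y = 0`) is left out of the slice because its value is overwritten by statement 5 before `print(y)` runs; a static slice would have to keep it, since statement 5 might not execute.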
Though dynamic slicing was originally proposed for debugging [Korel and Laski 1988; Agrawal and Horgan 1990], dynamic slices have subsequently also been used for program comprehension in many other innovative ways. In particular, dynamic slices (or their variants which also involve computing the closure of dependencies by trace traversal) have been used for studying causes of program performance degradation [Zilles and Sohi 2000], identifying isomorphic instructions in terms of their run-time behaviors [Sazeides 2003], and analyzing spurious counter-example traces produced by software model checking [Majumdar and Jhala 2005]. Even in the context of debugging, dynamic slices have been used in unconventional ways; for example, [Akgul et al. 2004] study reverse execution along a dynamic slice. Thus, dynamic slicing forms the core of many tasks in program development, and it is useful to develop efficient methods for computing dynamic slices.

In this paper, we present an infrastructure for dynamic slicing of Java programs. Our method operates on bytecode traces; we work at the bytecode level since slice computation may involve looking inside library methods, and the source code of libraries may not always be available. First, the bytecode stream corresponding to an execution trace of a Java program for a given input is collected. The trace collection is done by modifying a virtual machine; we have used the Kaffe Virtual Machine in our experiments. We then perform a backward traversal of the bytecode trace to compute dynamic data and control dependencies on-the-fly. The slice is updated as these dependencies are encountered during trace traversal. Computing the dynamic data dependencies on bytecode traces is complicated by Java's stack-based architecture. The main problem is that partial results of a computation are often stored in the Java Virtual Machine's operand stack. This results in implicit data dependencies between bytecodes (involving data transfer via the operand stack).
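To make the implicit operand-stack dependencies concrete, here is a minimal sketch of how a backward pass over a bytecode trace can recover them by simulating the stack in reverse. It is an assumption-laden illustration, not the paper's implementation: the `Bytecode` record and `slice_bytecode_trace` function are hypothetical, local variables are referred to by name rather than by slot index, and only straight-line code over local variables is handled (no heap accesses, method calls, or control dependencies). It also assumes the operand stack is empty at the end of the trace.

```python
from typing import NamedTuple, Optional


class Bytecode(NamedTuple):
    """One executed bytecode in a trace (hypothetical, simplified format)."""
    pc: int                       # position of this occurrence in the trace
    op: str                       # opcode mnemonic, for readability only
    pops: int                     # number of operand-stack values consumed
    pushes: int                   # number of operand-stack values produced
    defs: Optional[str] = None    # local variable written, if any
    uses: Optional[str] = None    # local variable read, if any


def slice_bytecode_trace(trace, wanted_vars):
    """Slice with respect to the final values of the locals in wanted_vars."""
    live = set(wanted_vars)   # locals whose defining bytecode is still sought
    stack = []                # reverse-simulated operand stack; each entry says
                              # whether the consumer of that value is in the slice
    in_slice = set()
    for bc in reversed(trace):
        # Going backwards, the values this bytecode pushed are on top of the
        # simulated stack, already marked by the later bytecodes that popped them.
        produced = [stack.pop() for _ in range(bc.pushes)]
        needed = any(produced) or bc.defs in live
        if needed:
            in_slice.add(bc.pc)
            live.discard(bc.defs)
            if bc.uses is not None:
                live.add(bc.uses)
        # The values this bytecode popped were pushed by earlier bytecodes;
        # mark them as needed exactly when this bytecode is in the slice.
        stack.extend([needed] * bc.pops)
    return in_slice


# Bytecode for:  a = 2; b = 3; c = a + b; d = 7
EXAMPLE_BYTECODES = [
    Bytecode(0, "iconst_2", 0, 1),
    Bytecode(1, "istore", 1, 0, defs="a"),
    Bytecode(2, "iconst_3", 0, 1),
    Bytecode(3, "istore", 1, 0, defs="b"),
    Bytecode(4, "iload", 0, 1, uses="a"),
    Bytecode(5, "iload", 0, 1, uses="b"),
    Bytecode(6, "iadd", 2, 1),
    Bytecode(7, "istore", 1, 0, defs="c"),
    Bytecode(8, "iconst_7", 0, 1),
    Bytecode(9, "istore", 1, 0, defs="d"),
]

print(sorted(slice_bytecode_trace(EXAMPLE_BYTECODES, {"c"})))
```

Here `iadd` names no variable at all: it enters the slice only because the `istore` that pops its result is in the slice, and the two `iload` bytecodes enter only because `iadd` pops their results. The bytecodes computing `d` are excluded.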
For this reason, our backwards dynamic slicing performs a "reverse" stack simulation while traversing the bytecode trace from the end.

Dynamic slicing methods typically involve traversal of the execution trace. This traversal may be used to pre-compute a dynamic dependence graph, or the dynamic dependencies can be computed on demand during trace traversal. Thus, the representation of execution traces is important for dynamic slicing. This is particularly the case for backwards dynamic slicing, where the trace is traversed from the end (and hence needs to be stored). In practice, traces tend to be huge; [Zhang et al. 2005] report experience with dynamic slicing of programs like gcc and perl, where the execution trace runs into several hundred million instructions. Post-mortem analysis over such huge traces can be inefficient. Consequently, it is useful to develop a compact representation for execution traces that captures both control flow and memory reference information. This compact trace should be generated on-the-fly during program execution. Our method proceeds by on-the-fly construction of a compact bytecode trace during pro-
doi:10.1145/1330017.1330021 fatcat:omzoc3dicnc3he5bn3yugu72uu