Encoding the Java Virtual Machine's Instruction Set
Electronic Notes in Theoretical Computer Science
New toolkits that parse, analyze, and transform Java Bytecode are frequently developed from scratch to obtain a representation suitable for a particular purpose. Although the functionality these toolkits implement to read class files and perform basic control- and data-flow analyses is comparable, it is reimplemented over and over again; the differences lie mainly in minor technical details. To avoid the repetitive development of similar functionality, we have developed an
... language for specifying bytecode-based instruction sets. Using this language, we have encoded the instruction set of the Java Virtual Machine so that it can be used directly, e.g., to generate the skeleton of bytecode-based tools. The XML format specifies both the format of the instructions and their effect on the stack and the local registers upon execution. This enables developers of static analyses to generate generic control- and data-flow analyses, e.g., an analysis that transforms Java Bytecode into static single assignment form. To assess the usefulness of our approach, we have used the encoding of the Java Virtual Machine's instruction set to develop a framework for the analysis and transformation of Java class files. The evaluation shows that using the specification significantly reduces the development effort compared to manual development.
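
To make the idea of a declarative instruction specification concrete, the following minimal Java sketch records, for a handful of JVM instructions, the two kinds of information described above: the instruction format (opcode and number of operand bytes) and the effect on the operand stack (slots popped and pushed). It is not the XML format of this work; the class `InstructionSpecSketch`, the type `Spec`, and the method `maxStackDepth` are hypothetical names chosen for illustration, and the table covers only straight-line code with seven instructions. The opcode values and stack effects are those given in the Java Virtual Machine Specification.

```java
import java.util.HashMap;
import java.util.Map;

class InstructionSpecSketch {

    /** Hypothetical spec entry: encoding format plus operand-stack effect. */
    static final class Spec {
        final String mnemonic;
        final int operandBytes; // bytes following the opcode
        final int pops;         // stack slots consumed
        final int pushes;       // stack slots produced

        Spec(String mnemonic, int operandBytes, int pops, int pushes) {
            this.mnemonic = mnemonic;
            this.operandBytes = operandBytes;
            this.pops = pops;
            this.pushes = pushes;
        }
    }

    /** Assumed in-memory stand-in for a declarative specification. */
    static final Map<Integer, Spec> SPECS = new HashMap<>();
    static {
        SPECS.put(0x03, new Spec("iconst_0", 0, 0, 1));
        SPECS.put(0x10, new Spec("bipush",   1, 0, 1));
        SPECS.put(0x15, new Spec("iload",    1, 0, 1));
        SPECS.put(0x36, new Spec("istore",   1, 1, 0));
        SPECS.put(0x60, new Spec("iadd",     0, 2, 1));
        SPECS.put(0xac, new Spec("ireturn",  0, 1, 0));
        SPECS.put(0xb1, new Spec("return",   0, 0, 0));
    }

    /**
     * Generic analysis driven only by the table: decodes straight-line
     * code (no branches) and returns the maximum operand-stack depth.
     */
    static int maxStackDepth(byte[] code) {
        int depth = 0;
        int max = 0;
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF; // bytes are signed in Java
            Spec spec = SPECS.get(opcode);
            if (spec == null) {
                throw new IllegalArgumentException("unknown opcode " + opcode + " at pc " + pc);
            }
            depth -= spec.pops;
            if (depth < 0) {
                throw new IllegalStateException("stack underflow at pc " + pc);
            }
            depth += spec.pushes;
            max = Math.max(max, depth);
            pc += 1 + spec.operandBytes; // skip opcode and its operands
        }
        return max;
    }

    public static void main(String[] args) {
        // bipush 2; bipush 3; iadd; istore 1; return
        byte[] code = {0x10, 2, 0x10, 3, 0x60, 0x36, 1, (byte) 0xb1};
        System.out.println(maxStackDepth(code)); // prints 2
    }
}
```

The analysis never mentions a concrete instruction: supporting another opcode only requires another table entry. This is the property that would let such analyses be generated from a specification rather than written by hand.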