Link-time binary rewriting techniques for program compaction

Bjorn De Sutter, Bruno De Bus, Koen De Bosschere
2005 ACM Transactions on Programming Languages and Systems  
Small program size is an important requirement for embedded systems with limited amounts of memory. We describe how link-time compaction through binary rewriting can achieve code size reductions of up to 62% for statically bound languages such as C, C++, and Fortran, without compromising on performance. We demonstrate how the limited amount of information about a program at link time can be exploited to overcome overhead resulting from separate compilation. This is done with scalable, cost-effective, whole-program analyses, optimizations, and duplicate code and data elimination techniques. The discussed techniques are evaluated and their cost-effectiveness is quantified with SQUEEZE++, a prototype link-time compactor.

[...] memories can only be made as small as the programs that need to be stored in them, smaller programs imply cheaper, smaller, lighter, and more autonomous devices. Our goal is to produce the most compact applications while retaining the same or similar levels of functionality, performance, and other key criteria.

Developing compact applications is not simple, however. Object-orientation, component-based programming, and other modern software engineering techniques increase programmer productivity, improve software reliability, and shorten time-to-market by hiding lower-level issues from the programmer and by enabling code reuse. Unfortunately, this often comes at the expense of program size. Reusable code libraries, for example, are written with general applicability in mind and provide more functionality than is typically needed by any single application. Unless the unused functionality can be eliminated, an application will be larger than necessary. Moreover, program optimizations performed at compilation time, either on application code or on reusable library code, are limited because the whole program is not available for optimization. Again, the result is increased program size.

This article discusses link-time binary rewriting techniques to overcome the discrepancy between modern software engineering practices and the need for compact programs. The discussed techniques are applicable to programs written in statically bound languages such as Fortran, C, or C++. Their goal is to eliminate unnecessary computations and duplicated code and data from a program. Link-time compaction offers several potential advantages over compile-time optimization.
First, all code is available for inspection and compaction at link time, even for mixed-language programs. This includes library code that is statically linked with a program, even if this library code is distributed in a machine code format only. Link-time rewriting therefore requires no change to the often-used business models under which software is distributed in a machine code format. Second, at link time, machine-specific optimizations are possible because the link-time techniques are applied on assembly code. Finally, link-time rewriting for compaction only requires modifying the linker, while all other tools in program development chains, such as compilers, need not be modified.
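
The point about unused library functionality can be illustrated with a minimal sketch of whole-program unreachable-function elimination. This is not the SQUEEZE++ algorithm. It assumes a toy call graph with direct calls only, and every function name and byte size in it is hypothetical. A real link-time rewriter works on machine code and must also account for indirect calls and address-taken functions, which the sketch ignores.

```python
from collections import deque


def reachable_functions(call_graph, entry_points):
    """Return the set of functions reachable from the entry points.

    call_graph maps a function name to the list of functions it calls
    directly (assumption: no indirect calls, no address-taken functions).
    """
    seen = set(entry_points)
    worklist = deque(entry_points)
    while worklist:
        fn = worklist.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                worklist.append(callee)
    return seen


def compact(call_graph, sizes, entry_points):
    """Drop every function not reachable from the entry points.

    Returns (kept functions, removed functions, bytes saved).
    """
    keep = reachable_functions(call_graph, entry_points)
    removed = sorted(fn for fn in sizes if fn not in keep)
    saved = sum(sizes[fn] for fn in removed)
    return keep, removed, saved


# Hypothetical program: main uses only part of a statically linked library.
CALL_GRAPH = {
    "main": ["parse", "lib_print"],
    "parse": ["lib_strlen"],
    "lib_print": [],
    "lib_strlen": [],
    "lib_sort": ["lib_swap"],  # linked in, but never called from main
    "lib_swap": [],
}
SIZES = {  # hypothetical code sizes in bytes
    "main": 120,
    "parse": 200,
    "lib_print": 80,
    "lib_strlen": 40,
    "lib_sort": 300,
    "lib_swap": 24,
}

keep, removed, saved = compact(CALL_GRAPH, SIZES, ["main"])
print("removed:", removed, "bytes saved:", saved)
```

The sketch only works because the whole call graph is visible at once, which is the situation at link time; a compiler processing one module cannot tell that the sorting routines are never called.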
doi:10.1145/1086642.1086645 fatcat:je5zdjknkzh6lm7q6i4nj3z4su