Automatic generation of library bindings using static analysis

Tristan Ravitch, Steve Jackson, Eric Aderhold, Ben Liblit
2009 Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation - PLDI '09  
High-level languages are growing in popularity. However, decades of C software development have produced large libraries of fast, timetested, meritorious code that are impractical to recreate from scratch. Cross-language bindings can expose low-level C code to high-level languages. Unfortunately, writing bindings by hand is tedious and error-prone, while mainstream binding generators require extensive manual annotation or fail to offer the language features that users of modern languages have
more » ... me to expect. We present an improved binding-generation strategy based on static analysis of unannotated library source code. We characterize three high-level idioms that are not uniquely expressible in C's lowlevel type system: array parameters, resource managers, and multiple return values. We describe a suite of interprocedural analyses that recover this high-level information, and we show how the results can be used in a binding generator for the Python programming language. In experiments with four large C libraries, we find that our approach avoids the mistakes characteristic of hand-written bindings while offering a level of Python integration unmatched by prior automated approaches. Among the thousands of functions in the public interfaces of these libraries, roughly 40% exhibit the behaviors detected by our static analyses.
doi:10.1145/1542476.1542516 dblp:conf/pldi/RavitchJAL09 fatcat:eypoxwd4b5ebpoy5xh6qnr4jea