A simple typed intermediate language for object-oriented languages

Juan Chen, David Tarditi
2005 Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '05
Traditional class and object encodings are difficult to use in practical type-preserving compilers because of the complexity of the encodings. We propose a simple typed intermediate language for compiling object-oriented languages and prove its soundness. The key ideas are to preserve lightweight notions of classes and objects instead of compiling them away and to separate name-based subclassing from structure-based subtyping. The language can express standard implementation techniques for both dynamic dispatch and runtime type tests. It has decidable type checking even with subtyping between quantified types with different bounds. Because of its simplicity, the language is a more suitable starting point for a practical type-preserving compiler than traditional encoding techniques.

One reason compilers for object-oriented languages have not adopted type-preserving compilation is the complexity of traditional class and object encodings. A practical compiler requires simple, general and efficient type systems. First, compiler writers who are not type theorists should be able to understand the type system. Second, the type system needs to cover a large set of realistic language features and compiler transformations. Third, the type system needs to express standard implementation techniques without introducing extra runtime overhead. Traditional encodings are not a good match for these goals. We discuss these encodings more in Section 7.

This paper describes a simple typed intermediate language LILC (Low-level Intermediate Language with Classes) for compiling class-based object-oriented languages. LILC is lower level than JVML [25] or CIL [14]. The key ideas are to preserve lightweight notions of classes and objects instead of compiling them away and to separate name-based subclassing from structure-based subtyping. LILC divides types into two parts: one part uses class names to type objects and keeps the name-based class hierarchy; the other part uses record types and has structural subtyping. Each class has a corresponding record type that represents its object layout. Objects and records can be coerced to each other with no runtime overhead. Keeping classes and objects has a low cost because most interesting work, such as field fetching, method invocation and cast, is done on records.

Our approach simplifies the type system.
First, structural recursive types are not necessary because each record type can refer to any class name, including the class to which the record type corresponds. Second, it simplifies the bounded quantification that is needed to express inheritance. The bounds for type variables are in terms of subclassing, not subtyping as in traditional bounded quantification. Thus, the bounds must be classes or type variables, not arbitrary types. As a result, LILC has decidable type checking.

The contributions of this work include:
• LILC is simpler and more natural than traditional encodings. It is also sound and efficient.
• LILC can express standard implementation strategies for self application, dynamic type dispatch, and runtime type tests. LILC uses existential types and invariant array types to describe covariant source-level array types. It can also express runtime "store checks".
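
The split between name-based subclassing and structure-based subtyping described above can be illustrated with a small model. This is only a sketch under stated assumptions, not LILC's actual typing rules: `ClassDecl`, `ClassTable`, `record_of`, `record_subtype`, and the example classes are hypothetical names, and prefix-based record subtyping is assumed here for simplicity. The sketch shows a class hierarchy checked by name, a record type derived from each class, and a field that mentions its own class by name so that no structural recursive type is needed.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# A field is (field name, type name); a record type is a tuple of fields.
Field = Tuple[str, str]
Record = Tuple[Field, ...]


@dataclass(frozen=True)
class ClassDecl:
    """Hypothetical class declaration: a name, a superclass name, own fields."""
    name: str
    parent: Optional[str]   # None for the root class
    fields: Record          # fields declared in this class only


class ClassTable:
    """Hypothetical class table keeping the name-based hierarchy."""

    def __init__(self) -> None:
        self.decls: Dict[str, ClassDecl] = {}

    def add(self, decl: ClassDecl) -> None:
        self.decls[decl.name] = decl

    def is_subclass(self, sub: str, sup: str) -> bool:
        """Name-based subclassing: walk the declared superclass chain."""
        current: Optional[str] = sub
        while current is not None:
            if current == sup:
                return True
            current = self.decls[current].parent
        return False

    def record_of(self, name: str) -> Record:
        """Record type for a class: inherited fields first, then own fields.
        Field types are plain class names, so a class may mention itself."""
        decl = self.decls[name]
        inherited = self.record_of(decl.parent) if decl.parent else ()
        return inherited + decl.fields


def record_subtype(r1: Record, r2: Record) -> bool:
    """Structural subtyping on records (assumed rule: r2 is a prefix of r1)."""
    return r1[:len(r2)] == r2


def build_example_table() -> ClassTable:
    table = ClassTable()
    table.add(ClassDecl("Object", None, ()))
    table.add(ClassDecl("Point", "Object", (("x", "int"),)))
    table.add(ClassDecl("ColorPoint", "Point", (("color", "int"),)))
    # Same layout as Point, but unrelated by name.
    table.add(ClassDecl("Pair", "Object", (("x", "int"),)))
    # Self-reference through the class name, no recursive type binder.
    table.add(ClassDecl("Node", "Object", (("next", "Node"),)))
    return table
```

In this model, `Pair` and `Point` have identical record types, so `record_subtype` accepts them in both directions, while `is_subclass("Pair", "Point")` is false because the two names are unrelated in the hierarchy. Restricting quantifier bounds to class names (or type variables), as the text describes, means a bound check reduces to a walk like the one in `is_subclass`.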
doi:10.1145/1040305.1040309 dblp:conf/popl/ChenT05 fatcat:5kjy2rnkmvdmhnbplkcml3mt7m