Embedded parser generators

Jonas Duregård, Patrik Jansson
2012 SIGPLAN Notices
This thesis explores rapid experimental development of programming languages, with particular emphasis on effective semi-automatic testing. Our results are actualised in two Haskell libraries: BNFC-meta and Feat. BNFC-meta is an extension of the BNF Converter (BNFC) tool. As such, it is capable of building a complete compiler front end from a single high-level language specification. We merge this with the practice of embedding languages in Haskell, both by embedding BNFC itself and by embedding languages defined using BNFC-meta. Embedding is carried out by means of quasi-quotation, which enables use of the languages' concrete syntax inside Haskell code. A simple extension to the grammar formalism adds anti-quoting, which in turn allows Haskell code to be embedded in the concrete syntax of the embedded languages. The end user can thus seamlessly mix concrete and abstract syntax. Our automatic approach improves on existing manually defined Haskell anti-quoters by not polluting the AST datatypes. Our second major contribution, Feat (Functional Enumeration of Algebraic Types), automatically enables property-based testing on the large AST types generated by BNFC-meta and similar tools, but it is useful more generally for algebraic types. Feat is based on the mathematical notion of an enumeration as a bijective function from natural numbers to an enumerated set. This means that, unlike previous list-based enumeration methods, it is not intrinsically serial and can be used for both random and exhaustive testing. We describe a theory of functional enumeration as a simple algebra closed under sums, products, guarded recursion and bijections. We implement these ideas in a library and show that it compares favourably to existing tools when testing AST types.
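To make the idea of a functional enumeration concrete, the following is a minimal Haskell sketch of an algebra with sums, products, guarded recursion and mapping. It is not the Feat library's API: every name here (`Finite`, `Enumeration`, `unit`, `pay`, `plus`, `times`, `select`, `Tree`, `trees`) is a hypothetical choice made for illustration, and the sketch assumes that the two arguments of `plus` enumerate disjoint sets and that functions passed to `fmap` are injective.

```haskell
-- Illustrative sketch only; all names are hypothetical, not the Feat API.

-- A finite set, given by its cardinality and an indexing function
-- defined on 0 .. card - 1.
data Finite a = Finite { card :: Integer, index :: Integer -> a }

instance Functor Finite where
  fmap f (Finite c i) = Finite c (f . i)

emptyF :: Finite a
emptyF = Finite 0 (\_ -> error "index into empty set")

-- Disjoint union: indices of the first set come first.
plusF :: Finite a -> Finite a -> Finite a
plusF (Finite c1 i1) (Finite c2 i2) = Finite (c1 + c2) pick
  where pick n | n < c1    = i1 n
               | otherwise = i2 (n - c1)

-- Cartesian product: split the index with divMod.
timesF :: Finite a -> Finite b -> Finite (a, b)
timesF (Finite c1 i1) (Finite c2 i2) = Finite (c1 * c2) pick
  where pick n = let (q, r) = n `divMod` c2 in (i1 q, i2 r)

-- An enumeration is a list of finite parts; part k holds the values of size k.
newtype Enumeration a = Enumeration { parts :: [Finite a] }

instance Functor Enumeration where
  fmap f = Enumeration . map (fmap f) . parts

-- A single value of size 0.
unit :: a -> Enumeration a
unit x = Enumeration [Finite 1 (const x)]

-- Guard: increase every size by one. The first part is produced without
-- inspecting the argument, which is what makes recursive definitions productive.
pay :: Enumeration a -> Enumeration a
pay e = Enumeration (emptyF : parts e)

-- Sum of two enumerations, part by part (assumed disjoint).
plus :: Enumeration a -> Enumeration a -> Enumeration a
plus ex ey = Enumeration (zipLong (parts ex) (parts ey))
  where zipLong (a : as) (b : bs) = plusF a b : zipLong as bs
        zipLong as       []       = as
        zipLong []       bs       = bs

-- Product: part k pairs values whose sizes add up to k.
times :: Enumeration a -> Enumeration b -> Enumeration (a, b)
times ex ey = Enumeration [ part k | k <- [0 :: Int ..] ]
  where
    padded e = parts e ++ repeat emptyF
    part k = foldr plusF emptyF
               (zipWith timesF (take (k + 1) (padded ex))
                               (reverse (take (k + 1) (padded ey))))

-- Index into the whole enumeration, smallest sizes first.
-- Assumes a non-negative index that is in range.
select :: Enumeration a -> Integer -> a
select e = go (parts e)
  where
    go (Finite c i : rest) n
      | n < c     = i n
      | otherwise = go rest (n - c)
    go [] _ = error "select: index out of range"

-- Example type: binary trees, one size unit per constructor.
data Tree = Leaf | Node Tree Tree deriving (Eq, Show)

trees :: Enumeration Tree
trees = pay (plus (unit Leaf) (fmap (uncurry Node) (times trees trees)))
```

Because each part carries its cardinality, `select trees n` reaches the n-th tree by arithmetic on indices instead of walking a list of all earlier values, which is the property that allows both exhaustive enumeration by size and random sampling. In GHCi, `map card (take 8 (parts trees))` gives the number of trees of each size from 0 to 7.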
doi:10.1145/2096148.2034689 fatcat:yfziky2dafh7bfmser7apt36qa