Incremental forest: a DSL for efficiently managing filestores

Jonathan DiLorenzo, Richard Zhang, Erin Menzies, Kathleen Fisher, Nate Foster
Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016)
File systems are often used to store persistent application data, but manipulating file systems using standard APIs can be difficult for programmers. Forest is a domain-specific language that bridges the gap between the on-disk and in-memory representations of file system data. Given a high-level specification of the structure, contents, and properties of a collection of directories, files, and symbolic links, the Forest compiler generates tools for loading, storing, and validating that data. Unfortunately, the initial implementation of Forest offered few mechanisms for controlling cost. For example, the runtime system could load gigabytes of data even if only a few bytes were needed. This paper introduces Incremental Forest (iForest), an extension to Forest with an explicit delay construct that programmers can use to precisely control costs. We describe the design of iForest using a series of running examples, present a formal semantics in a core calculus, and define a simple cost model that accurately characterizes the resources needed to use a given specification. We propose skins, which allow programmers to modify the delay structure of a specification in a compositional way, and develop a static type system for ensuring compatibility between specifications and skins. We prove the soundness and completeness of the type system and a variety of algebraic properties of skins. We describe an OCaml implementation and evaluate its performance on applications developed in collaboration with watershed hydrologists.
* Work performed at Cornell University.
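To make the idea of an explicit delay concrete, the following is a minimal Python sketch of delayed loading with a byte-count cost meter. It is not iForest syntax or its OCaml API; the names `Delayed`, `CostMeter`, `load_directory`, and `demo` are hypothetical and introduced only for illustration, and bytes read is assumed here as a stand-in for the paper's cost model.

```python
import os
import tempfile


class Delayed:
    """A value whose loading is postponed until force() is called (hypothetical helper)."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    def force(self):
        # Run the loader at most once and cache its result.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


class CostMeter:
    """Counts bytes read from disk; an assumed stand-in for a cost model."""

    def __init__(self):
        self.bytes_read = 0

    def read_file(self, path):
        with open(path, "rb") as f:
            data = f.read()
        self.bytes_read += len(data)
        return data


def load_directory(path, meter, delay=True):
    """Map each regular file name in `path` to its contents, delayed or eager."""
    result = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue
        if delay:
            # Bind the path as a default argument so each closure keeps its own file.
            result[name] = Delayed(lambda p=full: meter.read_file(p))
        else:
            result[name] = meter.read_file(full)
    return result


def demo():
    """Return (eager_cost, delayed_cost) in bytes when only one small file is needed."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "small.txt"), "wb") as f:
            f.write(b"abc")
        with open(os.path.join(root, "large.bin"), "wb") as f:
            f.write(b"\0" * 10_000)

        # Eager loading reads every file up front, whether or not it is used.
        eager_meter = CostMeter()
        eager = load_directory(root, eager_meter, delay=False)
        _ = eager["small.txt"]

        # Delayed loading reads only the file that is actually forced.
        lazy_meter = CostMeter()
        lazy = load_directory(root, lazy_meter, delay=True)
        _ = lazy["small.txt"].force()

        return eager_meter.bytes_read, lazy_meter.bytes_read
```

Calling `demo()` compares the two strategies on a temporary directory holding a 3-byte file and a 10,000-byte file: the eager load pays for both files, while the delayed load pays only for the file that is forced.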
doi:10.1145/2983990.2984034 dblp:conf/oopsla/DiLorenzoZMFF16 fatcat:7tiuo6eumjhyxbmg47cvgylkdu