Environment analysis via ΔCFA

Matthew Might, Olin Shivers
2006 SIGPLAN Notices
We describe a new program-analysis framework, based on CPS and procedure-string abstractions, that can handle critical analyses which the k-CFA framework cannot. We present the main theorems concerning correctness, show an application analysis, and describe a running implementation.

Conventions

Boldface v and angle brackets ⟨v1, ..., vn⟩ denote vectors. Vectors may be implicitly promoted to sets with the obvious meaning. We implicitly lift functions element-wise over sets and tuples, and pointwise over functions. f|D is the function f restricted to domain D. S̄ is the set complement of set S. We assume natural definitions for the lattice components ⊤, ⊥, ⊔, and ⊓. For a lattice P(X), we define X ⊑_P(X) Y iff ∀x ∈ X : ∃y ∈ Y : x ⊑_X y; that is, ⊑_P(X) is not ⊆. A large, multi-line curly brace, { or }, indicates logical conjunction.

Partitioned CPS

Our analysis operates over a syntactically partitioned continuation-passing style (CPS) input [13], intended for use as an intermediate form generated from programs written in a direct-style λ-calculus language with user-level access to full, first-class continuations, such as Scheme or SML/NJ. By "partitioned," we mean that all the forms (variables, call arguments, calls and λ expressions) are statically marked as belonging to either the "user" world or the "continuation" world. We adopt the term "user world" because continuation forms cannot be expressed directly by the programmer in the original, direct-style source. (What Scheme programmers think of as continuations, that is, the values created by the call-with-current-continuation procedure, are, with respect to this partition, still user-world values. They just happen to be user-world procedures that internally capture a continuation-world value.)

When translating from direct-style code to CPS, each λ expression from the source maps to a "user" λ expression, while return points or evaluation contexts in the direct-style form are mapped to continuation λ expressions. The CPS conversion also provides two static constraints: only user procedures take continuation arguments, and every user procedure takes at least one. So continuations are never passed to continuations. Figure 2 shows the resulting grammar.

Code points are marked by means of unique labels attached to λ expressions and call sites. We assume two distinct sets of labels, one for user-world items and one for continuation-world items. This is how we mark our user/continuation partition.
(It also means that we can treat the two worlds uniformly simply by ignoring labels, which is convenient at times.) A user λ expression, ulam, is tagged with a user-world label; its formal parameters are partitioned into zero or more user-world parameters, u, and one or more continuation parameters, k. Having multiple continuation parameters allows us to encode conditional-control operators as functions and to easily encode multi-return function calls [11]. A continuation λ expression, clam, is marked with a continuation-world label, γ, and has only user-world formals, u. Call sites (ucall and ccall) are marked and partitioned in a similar way.

To improve precision, we also require the program to be alphatised, that is, no two bound variables have the same name. We use the function free to denote the free variables of a term. The function L_pr ∈ LAB → LAM + CALL maps labels to terms for a program pr. We use B_pr ∈ LAB → P(VAR) to map the label of a λ expression to the variables it binds. For instance, B_pr(ψ) = {x, y, k} if (λ_ψ (x y k) call) is in pr. For compactness, when ψ is a vector of labels ⟨ψ1, ..., ψn⟩, let B_pr(ψ) mean ⋃i B_pr(ψi). When the program pr is clear from context, we omit it from the notation.
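
As an illustration only, the partition and the helper functions described above can be sketched in Python. The class and function names (Var, Lam, Call, free, index, bound_by), the USER/CONT tags, and the string labels "psi" and "g1" are assumptions introduced for this sketch, not the paper's notation. The sketch follows the prose description rather than the grammar of Figure 2, and reading the lifted form of B as a union over the label vector is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Tuple, Union

USER, CONT = "user", "cont"  # hypothetical tags for the two worlds


@dataclass(frozen=True)
class Var:
    name: str
    world: str  # USER or CONT


@dataclass(frozen=True)
class Call:
    label: str
    world: str  # USER for a ucall, CONT for a ccall
    fn: "Exp"
    args: Tuple["Exp", ...]


@dataclass(frozen=True)
class Lam:
    label: str
    world: str  # USER for a ulam, CONT for a clam
    params: Tuple[Var, ...]
    body: Call

    def __post_init__(self):
        # Enforce the two static constraints stated in the text.
        conts = [p for p in self.params if p.world == CONT]
        if self.world == USER and not conts:
            raise ValueError("a user lambda needs at least one continuation parameter")
        if self.world == CONT and conts:
            raise ValueError("a continuation lambda takes only user-world parameters")


Exp = Union[Var, Lam]


def free(term):
    """Free variable names of a term (names are unique after alphatisation)."""
    if isinstance(term, Var):
        return frozenset({term.name})
    if isinstance(term, Lam):
        return free(term.body) - {p.name for p in term.params}
    out = free(term.fn)  # term is a Call
    for arg in term.args:
        out |= free(arg)
    return out


def index(term, L=None, B=None):
    """Build L (label -> term) and B (lambda label -> bound variable names)."""
    L = {} if L is None else L
    B = {} if B is None else B
    if isinstance(term, Lam):
        L[term.label] = term
        B[term.label] = frozenset(p.name for p in term.params)
        index(term.body, L, B)
    elif isinstance(term, Call):
        L[term.label] = term
        index(term.fn, L, B)
        for arg in term.args:
            index(arg, L, B)
    return L, B


def bound_by(B, labels):
    """B lifted over a vector of labels, read here as a union (an assumption)."""
    return frozenset().union(*(B[label] for label in labels))


# The text's example, (lambda_psi (x y k) call), with a hypothetical body (k x).
x, y, k = Var("x", USER), Var("y", USER), Var("k", CONT)
body = Call("g1", CONT, k, (x,))
lam = Lam("psi", USER, (x, y, k), body)
L, B = index(lam)
```

In this sketch a single class per syntactic form carries a world tag instead of using four separate classes, which mirrors the remark that the two worlds can be treated uniformly by ignoring the tag. Here B["psi"] contains the names x, y, and k, matching the example in the text.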
doi:10.1145/1111320.1111049 fatcat:kwuxc4ebzrdm5fsu4hxp6c7cyu