Non-linear reasoning for invariant synthesis

Zachary Kincaid, John Cyphert, Jason Breck, Thomas Reps
2017 Proceedings of the ACM on Programming Languages  
Automatic generation of non-linear loop invariants is a long-standing challenge in program analysis, with many applications. For instance, reasoning about exponentials provides a way to find invariants of digital-filter programs, and reasoning about polynomials and/or logarithms is needed for establishing invariants that describe the time or memory usage of many well-known algorithms. An appealing approach to this challenge is to exploit the powerful recurrence-solving techniques that have been developed in the field of computer algebra, which can compute exact characterizations of non-linear repetitive behavior. However, there is a gap between the capabilities of recurrence solvers and the needs of program analysis: (1) loop bodies are not merely systems of recurrence relations; they may contain conditional branches, nested loops, non-deterministic assignments, etc., and (2) a client program analyzer must be able to reason about the closed-form solutions produced by a recurrence solver (e.g., to prove assertions). This paper presents a method for generating non-linear invariants of general loops based on analyzing recurrence relations. The key components are an abstract domain for reasoning about non-linear arithmetic, a semantics-based method for extracting recurrence relations from loop bodies, and a recurrence solver that avoids closed forms that involve complex or irrational numbers. Our technique has been implemented in a program analyzer that can analyze general loops and mutually recursive procedures. Our experiments indicate that the technique is promising for non-linear assertion-checking and resource-bound generation.

Compositional recurrence analysis (CRA) [Farzan and Kincaid 2015] is another line of work, which focuses on over-approximate analysis of general loops rather than precise analysis of syntactically restricted loops. Recent work [Kincaid et al. 2017] extends the generality of CRA even further, showing how the approach can be applied to recursive procedures as well as loops. The key idea that makes CRA so broadly applicable is that it represents loop behavior using logical formulas, and uses semantics-based techniques to find implied recurrence relations. However, CRA's ability to reason about non-linear behavior is limited by the fact that it uses SMT solving, linear algebra, and polyhedral techniques to extract recurrences from loop bodies, and polynomial summation to solve them.
In particular, CRA is only capable of extracting recurrence relations that can be expressed in linear integer arithmetic and that have polynomial closed forms, effectively exploiting only a fraction of what recurrence solvers (e.g., the ones used in [Kovács 2008; Rodríguez-Carbonell and Kapur 2004]) are capable of. In this paper, we present extensions to the numerical-reasoning techniques underlying CRA, and demonstrate that these extensions enable CRA to establish many non-linear numerical invariants. The contributions of the paper are three-fold:

• We present the wedge abstract domain, a numerical abstract domain capable of reasoning about non-linear arithmetic. Just as convex polyhedra represent properties in the conjunctive fragment of linear arithmetic, wedges represent properties in the conjunctive fragment of non-linear arithmetic (including polynomials, exponentials, and logarithms). The deductive power of wedges is due to polyhedral and Gröbner-basis techniques, congruence closure, and simple inference rules for non-linear functions. The key operation supported by the domain is symbolic abstraction [Reps et al. 2004; Thakur and Reps 2012], which, given an arbitrary non-linear formula φ, computes a wedge that over-approximates φ. (See §4.)
• We present a semantics-based algorithm for extracting recurrence relations that are entailed by a loop-body formula. The algorithm is based on first over-approximating the loop body by a wedge, and then using techniques from linear algebra to extract recurrences from the wedge. The algorithm can extract recurrences involving non-linear arithmetic and interdependent program variables; the class of recurrences that can be extracted by this algorithm corresponds to C-finite sequences [Kauers and Paule 2011, §4.2]. (See §5.)
• We present an algorithm, OCRS, that is able to solve these recurrences, and find closed-form solutions that include polynomials, exponentials, and logarithms. OCRS is based on an automated and enhanced form of the discrete operational calculus of Berg [1967]. Classically, the closed forms of C-finite sequences involve algebraic irrational or algebraic complex numbers,¹ but OCRS avoids non-rational numbers by using what we call implicitly interpreted functions. Each implicitly interpreted function is associated with a term in the logic of OCRS that exactly characterizes the function, but outside of the recurrence solver (and in particular, within the wedge domain) an implicitly interpreted function is treated as an uninterpreted function symbol. (See §6.)

Our approach builds upon the recent work of Kincaid et al. [2017], which extended CRA so that it can analyze recursive programs using essentially the same approach that it uses to handle loops.

Organization. §2 illustrates the main features of our method via a series of examples. §3 presents relevant background material. §4 presents the wedge abstract domain. §5 describes the method used in our system for extracting a recurrence relation from a wedge. §6 presents OCRS. §7 presents experimental results. §8 discusses related work.

¹ An algebraic number is a complex number that is a root of a non-zero univariate polynomial with rational coefficients. Henceforth, we shorten "algebraic irrational" and "algebraic complex" to "irrational" and "complex," respectively.
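
To make the notion of a recurrence with a non-polynomial closed form concrete, here is a minimal Python sketch. It is an illustration only: the loop, the function names (`simulate`, `closed_form`, `agree`), and the hand-derived closed forms are assumptions introduced here, not taken from the paper, and the code does not implement the wedge domain, the recurrence-extraction algorithm, or OCRS. It simulates a hypothetical loop body that induces the recurrence x' = 2x + 1, y' = y + x, and checks candidate closed forms in the iteration count k against the simulation.

```python
def simulate(x0, y0, k):
    """Run the hypothetical loop body k times using exact integer arithmetic."""
    x, y = x0, y0
    for _ in range(k):
        # Simultaneous update: both right-hand sides read the old values.
        x, y = 2 * x + 1, y + x
    return x, y


def closed_form(x0, y0, k):
    """Hand-derived closed forms (an assumption of this sketch, checked below)."""
    x_k = 2**k * (x0 + 1) - 1              # exponential in k
    y_k = y0 + (x0 + 1) * (2**k - 1) - k   # exponential plus linear in k
    return x_k, y_k


def agree(max_k=20):
    """Compare closed forms with simulation for a few initial states."""
    starts = [(0, 0), (3, -2), (-1, 5), (-4, 7)]
    return all(
        simulate(x0, y0, k) == closed_form(x0, y0, k)
        for x0, y0 in starts
        for k in range(max_k + 1)
    )
```

Calling `agree()` compares the two functions on a small set of initial states and iteration counts; this is a sanity check, not a proof. The term 2^k in both closed forms is an example of behavior that lies outside the polynomial closed forms to which, according to the text above, the original CRA was limited.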
doi:10.1145/3158142 dblp:journals/pacmpl/KincaidCBR18 fatcat:rzj7q45oi5aihjxhduy7e6suqu