QL: Object-oriented Queries on Relational Data

Pavel Avgustinov, Oege De Moor, Michael Jones, Max Schäfer
2016 26 Leibniz International Proceedings in Informatics Schloss Dagstuhl-Leibniz-Zentrum für Informatik   unpublished
This paper describes QL, a language for querying complex, potentially recursive data structures. QL compiles to Datalog and runs on a standard relational database, yet it provides familiar-looking object-oriented features such as classes and methods, reinterpreted in logical terms: classes are logical properties describing sets of values, subclassing is implication, and virtual calls are dispatched dynamically by considering the most specific classes containing the receiver. Furthermore, types in QL are prescriptive and actively influence program evaluation rather than just describing it. In combination, these features enable the development of concise queries based on reusable libraries, which are written in a purely declarative style, yet can be efficiently executed even on very large data sets. In particular, we have used QL to implement static analyses for various programming languages, which scale to millions of lines of code.

1 Introduction

QL is a declarative, object-oriented logic programming language for querying complex, potentially recursive data structures encoded in a relational data model. It is a general-purpose query language, but its strong support for recursion and aggregates makes it particularly well suited for implementing static analyses, code queries and software metrics. Although this paper is not about static analysis per se, it is in this area that QL, being the technical basis of Semmle's engineering analytics platform, has seen most use so far, so we will use it as our main source of motivating examples.

A static analysis implemented in QL is simply a query run on a special database: the database contains a representation of the program to analyse (encoding, say, its abstract syntax tree or control flow graph), from which the query computes a set of result tuples. A bug finding analysis, for instance, could return pairs of source locations and error messages. Since the database describes the program as it was at one particular point in time, we refer to it as a snapshot database. A snapshot database is created by a language-specific extractor. We have built extractors for various different languages based on existing compiler frontends.

As our first example of a QL query, let us consider an analysis for finding useless expressions in JavaScript programs, i.e., pure (that is, side effect-free) expressions appearing in a void context where their value is immediately discarded.
Typically, this indicates a typo, for instance mistyping an assignment "x = 42;" as an equality check "x == 42;".
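
To make the idea of "a query run on a snapshot database" concrete, the following is a minimal sketch in Python with SQLite rather than in QL itself. Everything in it is a labeled assumption: the table names `expr` and `expr_stmt`, the set of side-effecting expression kinds, and the message text are hypothetical and do not reflect the actual database schema or libraries. It models an abstract syntax tree as relations, computes impurity recursively (an expression is treated as impure if it or any descendant has a side-effecting kind), and returns pairs of source location and message for pure expression statements.

```python
import sqlite3

# Hypothetical snapshot database: a tiny relational encoding of an AST.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE expr (id INTEGER PRIMARY KEY, kind TEXT, parent INTEGER);
    CREATE TABLE expr_stmt (location TEXT, expr INTEGER);
""")

# Three statements: "x = 42;", "x == 42;" and "f() + 1;".
db.executemany("INSERT INTO expr VALUES (?, ?, ?)", [
    (1, "assign", None), (2, "var", 1), (3, "literal", 1),
    (4, "eq", None), (5, "var", 4), (6, "literal", 4),
    (7, "add", None), (8, "call", 7), (9, "literal", 7),
])
db.executemany("INSERT INTO expr_stmt VALUES (?, ?)", [
    ("a.js:1", 1), ("a.js:2", 4), ("a.js:3", 7),
])

# Assumption: only assignments and calls have side effects in this toy model.
# Impurity propagates from an expression to its parent, so the relation
# is defined recursively, much like a recursive Datalog predicate.
QUERY = """
WITH RECURSIVE impure(id) AS (
    SELECT id FROM expr WHERE kind IN ('assign', 'call')
    UNION
    SELECT e.parent FROM expr e JOIN impure i ON e.id = i.id
    WHERE e.parent IS NOT NULL
)
SELECT s.location, 'This expression has no effect.'
FROM expr_stmt s
WHERE s.expr NOT IN (SELECT id FROM impure)
ORDER BY s.location
"""

# The result is a set of (source location, message) tuples.
results = db.execute(QUERY).fetchall()
print(results)
```

In this toy snapshot only the mistyped equality check on `a.js:2` is reported; the assignment and the statement containing a call are considered impure and are skipped.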