Answering FO+MOD Queries under Updates on Bounded Degree Databases

Christoph Berkholz, Jens Keppeler, Nicole Schweikardt
2018 ACM Transactions on Database Systems  
We investigate the query evaluation problem for fixed queries over fully dynamic databases, where tuples can be inserted or deleted. The task is to design a dynamic algorithm that immediately reports the new result of a fixed query after every database update. We consider queries in first-order logic (FO) and its extension with modulo-counting quantifiers (FO+MOD), and show that they can be efficiently evaluated under updates, provided that the dynamic database does not exceed a certain degree bound. In particular, we construct a data structure that allows us to answer a Boolean FO+MOD query and to compute the size of the query result within constant time after every database update. Furthermore, after every update we are able to immediately enumerate the new query result with constant delay between the output tuples. The time needed to build the data structure is linear in the size of the database. Our results extend earlier work on the evaluation of first-order queries on static databases of bounded degree and rely on an effective Hanf normal form for FO+MOD recently obtained by Heimberg, Kuske, and Schweikardt (LICS 2016).

[...] database. The data structure shall be designed in such a way that it quickly provides the query result, preferably in constant time (i.e., independent of the database size). We focus on the following flavours of query evaluation.

Testing: Decide whether a given tuple a is contained in ϕ(D).
Counting: Compute |ϕ(D)| (i.e., the number of tuples that belong to ϕ(D)).
Enumeration: Enumerate ϕ(D) with a bounded delay between the output tuples.

Here, as usual, ϕ(D) denotes the k-ary relation obtained by evaluating a k-ary query ϕ on a relational database D. For Boolean queries, all three tasks boil down to Answering: Decide if ϕ(D) ≠ ∅.

Compared to the dynamic descriptive complexity framework introduced by Patnaik and Immerman [17], which focuses on the expressive power of first-order logic on dynamic databases and has led to a rich body of literature (see [18] for a survey), we are interested in the complexity of query evaluation. The query language studied in this paper is FO+MOD, the extension of first-order logic FO with modulo-counting quantifiers of the form ∃^{i mod m} x ψ, expressing that the number of witnesses x that satisfy ψ is congruent to i modulo m. FO+MOD can be viewed as a subclass of SQL that properly extends the relational algebra.
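The semantics of the modulo-counting quantifier and the four tasks listed above can be illustrated with a minimal sketch. It evaluates a query naively by iterating over the whole domain, so it is not the data structure developed in the paper and gives none of its time guarantees. The names `exists_mod` and `evaluate`, as well as the toy relation `E`, are assumptions made only for this illustration.

```python
from itertools import product

def exists_mod(domain, psi, i, m):
    """True iff the number of x in domain satisfying psi is congruent to i modulo m."""
    witnesses = sum(1 for x in domain if psi(x))
    return witnesses % m == i % m

def evaluate(domain, phi, k):
    """Naively enumerate phi(D): all k-tuples over the domain that satisfy phi."""
    for tup in product(sorted(domain), repeat=k):
        if phi(*tup):
            yield tup

# Hypothetical toy database: one binary relation E over the domain {1, 2, 3, 4}.
E = {(1, 2), (1, 3), (2, 3), (4, 1)}
domain = {1, 2, 3, 4}

def phi(x):
    # phi(x): the number of y with E(x, y) is congruent to 0 modulo 2.
    return exists_mod(domain, lambda y: (x, y) in E, 0, 2)

result = list(evaluate(domain, phi, 1))  # Enumeration (naive, no delay guarantee)
is_member = (1,) in result               # Testing
size = len(result)                       # Counting
nonempty = size > 0                      # Answering
```

Here elements 1 and 3 have an even number of outgoing edges (two and zero), so they form the query result.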
Following [2], we say that a query evaluation algorithm is efficient if the update time is either constant or at most polylogarithmic (log^c n) in the size of the database. As a consequence, efficient query evaluation in the dynamic setting is only possible if the static problem (i.e., the setting without database updates) can be solved in linear or pseudo-linear (n^{1+ε}) time. Since this is not always possible, we provide a short overview of known results about first-order query evaluation on static databases and then proceed by discussing our results in the dynamic setting.

First-order query evaluation on static databases. The problem of deciding whether a given database D satisfies an FO-sentence ϕ is AW[*]-complete (parameterised by ||ϕ||) and it is therefore generally believed that the evaluation problem cannot be solved in time f(||ϕ||)·||D||^c for any computable f and constant c (here, ||ϕ|| and ||D|| denote the size of the query and the database, respectively). For this reason, a long line of research focused on increasing classes of sparse instances ranging from databases of bounded degree [19] (where every domain element occurs only in a constant number of tuples in the database) to classes that are nowhere dense [9]. In particular, Boolean first-order queries can be evaluated on classes of databases of bounded degree in linear time f(||ϕ||)·||D||, where the constant factor f(||ϕ||) is 3-fold exponential in ||ϕ|| [19, 7]. As a matter of fact, Frick and Grohe [7] showed that the 3-fold exponential blow-up in terms of the query size is unavoidable assuming FPT ≠ AW[*]. Durand and Grandjean [5] and Kazana and Segoufin [11] considered the task of enumerating the result of a k-ary first-order query on bounded degree databases and showed that after a linear time preprocessing phase the query result can be enumerated with constant delay. This result was later extended to classes of databases of bounded expansion [12].
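The bounded-degree condition described above (every domain element occurs in only a constant number of tuples) can be made concrete with a small sketch. It follows the informal description given in this text; the formal definition used in the paper may be stated differently. The function name `degree` and the toy database `db` are assumptions for illustration only.

```python
from collections import Counter

def degree(relations):
    """Largest number of tuples in which any single domain element occurs."""
    occurrences = Counter()
    for tuples in relations.values():
        for tup in tuples:
            for element in set(tup):  # count each tuple once per element
                occurrences[element] += 1
    return max(occurrences.values(), default=0)

# Hypothetical toy database with a binary relation E and a unary relation P.
db = {
    "E": {(1, 2), (1, 3), (2, 3), (4, 1)},
    "P": {(1,), (3,)},
}
max_occurrences = degree(db)
```

In this example element 1 occurs in three tuples of E and one tuple of P, so the degree in this sense is 4.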
Kazana and Segoufin [12] also showed that counting the number of result tuples of a k-ary first-order query on databases of bounded expansion (and hence also on databases of bounded degree) can be done in time f(||ϕ||)·||D||. In [6] an analogous result was obtained for classes of databases of low degree (i.e., degree at most ||D||^{o(1)}) and pseudo-linear time f(||ϕ||)·||D||^{1+ε}; the paper also presented an algorithm for enumerating the query results with constant delay after pseudo-linear time preprocessing.

Our contribution. We extend the known linear time algorithms for first-order logic on classes of databases of bounded degree to the more expressive query language FO+MOD.
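To give a feel for what testing and counting under updates mean for a modulo-counting query, the following toy sketch maintains the result size of one fixed query, ϕ(x) = "x has an odd number of outgoing E-edges", while edges are inserted and deleted. This is only an illustration for a single hand-picked query: it is not the general construction of the paper, which relies on a Hanf normal form, and it does not cover enumeration. The class name `OddOutDegreeCounter` is hypothetical, and constant time here means expected constant time of Python's hash-based sets and dictionaries.

```python
class OddOutDegreeCounter:
    """Maintains |phi(D)| for phi(x) = 'x has an odd number of outgoing E-edges'."""

    def __init__(self):
        self.edges = set()    # current content of the relation E
        self.out_degree = {}  # element -> number of outgoing edges
        self.count = 0        # number of elements with odd out-degree

    def _change(self, x, delta):
        new = self.out_degree.get(x, 0) + delta
        self.out_degree[x] = new
        # A change by one always flips the parity of the out-degree of x.
        self.count += 1 if new % 2 == 1 else -1

    def insert(self, x, y):
        if (x, y) not in self.edges:
            self.edges.add((x, y))
            self._change(x, +1)

    def delete(self, x, y):
        if (x, y) in self.edges:
            self.edges.remove((x, y))
            self._change(x, -1)

    def test(self, x):
        """Testing: is x in phi(D)?"""
        return self.out_degree.get(x, 0) % 2 == 1


counter = OddOutDegreeCounter()
counter.insert(1, 2)
counter.insert(1, 3)
counter.insert(2, 3)
counter.delete(1, 2)
```

After these four updates, elements 1 and 2 each have exactly one outgoing edge, so `counter.count` is 2 and `counter.test(1)` holds.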
doi:10.1145/3232056 fatcat:43y7di3muvh3vg6vtzbvij4xdm