Optimal Sparsification for Some Binary CSPs Using Low-Degree Polynomials

Bart M. P. Jansen, Astrid Pieterse
2019 ACM Transactions on Computation Theory  
This paper analyzes to what extent it is possible to efficiently reduce the number of clauses in NP-hard satisfiability problems without changing the answer. Upper and lower bounds are established using the concept of kernelization. Existing results show that if NP ⊄ coNP/poly, no efficient preprocessing algorithm can reduce n-variable instances of cnf-sat with d literals per clause to equivalent instances with O(n^{d−ε}) bits for any ε > 0. For the Not-All-Equal sat problem, a compression to
size Õ(n^{d−1}) exists. We put these results in a common framework by analyzing the compressibility of binary CSPs. We characterize constraint types based on the minimum degree of multivariate polynomials whose roots correspond to the satisfying assignments, obtaining (nearly) matching upper and lower bounds in several settings. Our lower bounds show that not just the number of constraints, but also the encoding size of individual constraints plays an important role. For example, for Exact Satisfiability with unbounded clause length it is possible to efficiently reduce the number of constraints to n + 1, yet no polynomial-time algorithm can reduce to an equivalent instance with O(n^{2−ε}) bits for any ε > 0, unless NP ⊆ coNP/poly.

The vast majority of the currently known results in this direction are negative [10, 18, 19, 20], stating that no nontrivial sparsification is possible under plausible complexity-theoretic assumptions. For example, Dell and van Melkebeek [10] obtained such a result for CNF-Satisfiability with clauses of size at most d (d-cnf-sat), for each fixed d ≥ 3. Assuming NP ⊄ coNP/poly, there is no polynomial-time algorithm that compresses any n-variable instance of d-cnf-sat to an equivalent instance with O(n^{d−ε}) bits for any ε > 0. Since there are O(n^d) possible clauses of size at most d over n variables, the trivial compression scheme that outputs a bitstring of length O(n^d), denoting for each possible clause whether it occurs in the instance or not, is optimal up to n^{o(1)} factors.

A problem for which nontrivial polynomial-time sparsification is possible was recently discovered by the current authors [20]. Any n-variable instance of the Not-All-Equal CNF-Satisfiability problem with clauses of size at most d (d-nae-sat) can efficiently be compressed to an equivalent instance with O(n^{d−1}) clauses, which can be encoded in O(n^{d−1} log n) bits.
The preprocessing algorithm is based on a linear-algebraic lemma by Lovász [27] to identify clauses that are implied by others, allowing a reduction from Θ(n^d) clauses to O(n^{d−1}). This sparsification for d-nae-sat forms the starting point for this work. Since d-cnf-sat and d-nae-sat can both be seen as constraint satisfaction problems (CSPs) with a binary domain, it is natural to ask whether the positive results for d-nae-sat extend to other binary CSPs. The difference between d-cnf-sat and d-nae-sat shows that the type of constraints that one allows affects the compressibility of the resulting CSP. The goal of this paper is to understand how the optimal compression size for a binary CSP depends on the type of legal constraints, with the aim of obtaining matching upper and lower bounds.

Before presenting our results, we give an example to illustrate our methods. Consider the NP-complete Exact d-CNF-Satisfiability (Exact d-sat) problem, which asks whether there is a truth assignment that satisfies exactly one literal in each clause; the clauses have size at most d. While there are Θ(n^d) different clauses that can occur in an instance with n variables, the exact nature of the problem makes it possible to reduce any instance to an equivalent one with n + 1 clauses. A clause such as x_1 ∨ x_3 ∨ ¬x_5 naturally corresponds to an equality constraint of the form x_1 + x_3 + (1 − x_5) = 1, since a 0/1-assignment to the variables satisfies exactly one literal of the clause if and only if it satisfies the equality. To find redundant clauses, transform each of the m clauses into an equality to obtain a system of equalities Ax = b, where A is an m × n matrix, x is the column vector (x_1, …, x_n), and b is an integer column vector. Using Gaussian elimination, one can efficiently compute a basis B for the row space of the extended matrix (A|b): a set of equalities such that every equality can be written as a linear combination of equalities in B.
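The transformation and basis computation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes a DIMACS-like clause encoding in which the integer +i stands for x_i and −i for ¬x_i, and the function names clause_to_equality, row_basis, and reduce_exact_sat are hypothetical. Exact rational arithmetic is used so that the Gaussian elimination has no rounding issues.

```python
from fractions import Fraction


def clause_to_equality(clause, n):
    """Return the row (a_1, ..., a_n | b) of (A|b) for one clause.

    Assumed encoding: +i means x_i, -i means NOT x_i (1-based).
    A positive literal contributes x_i; a negative literal contributes
    (1 - x_i), whose constant 1 is moved to the right-hand side.
    """
    row = [Fraction(0)] * (n + 1)
    row[n] = Fraction(1)  # exactly one literal must be true
    for lit in clause:
        i = abs(lit) - 1
        if lit > 0:
            row[i] += 1
        else:
            row[i] -= 1
            row[n] -= 1
    return row


def row_basis(rows):
    """Return indices of rows that form a basis of the row space."""
    pivots = []  # pairs (pivot column, reduced row)
    basis = []
    for idx, row in enumerate(rows):
        r = list(row)
        # Eliminate the entries of r in all pivot columns found so far.
        for col, p in pivots:
            if r[col] != 0:
                factor = r[col] / p[col]
                r = [a - factor * b for a, b in zip(r, p)]
        lead = next((j for j, v in enumerate(r) if v != 0), None)
        if lead is not None:  # r is independent of the earlier rows
            pivots.append((lead, r))
            basis.append(idx)
    return basis


def reduce_exact_sat(clauses, n):
    """Keep only the clauses whose equalities belong to the basis."""
    rows = [clause_to_equality(c, n) for c in clauses]
    return [clauses[i] for i in row_basis(rows)]


# x1 + x2 = 1, x2 + x3 = 1, x1 + (1 - x3) = 1: the third equality is
# the first minus the second, so the third clause is redundant.
print(reduce_exact_sat([[1, 2], [2, 3], [1, -3]], 3))  # [[1, 2], [2, 3]]
```

The sketch selects original rows rather than arbitrary combinations of them, so the output is a sub-instance of the input with at most n + 1 clauses.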
Since (A|b) has n + 1 columns, its rank is at most n + 1 and the basis B contains at most n + 1 equalities. To perform data reduction, remove all clauses from the Exact d-sat instance whose corresponding equalities do not occur in B. If an assignment satisfies f_1(x) = b_1 and f_2(x) = b_2, then it also satisfies their sum f_1(x) + f_2(x) = b_1 + b_2, and any linear combination of the satisfied equalities. Since any equality not in B can be written as a linear combination of equalities in B, a truth assignment satisfying all clauses from B must necessarily also satisfy the remaining clauses, which shows the correctness of the data reduction procedure. The resulting instance can be encoded in O(n log n) bits, as each of the remaining n + 1 clauses has d ∈ O(1) literals.

Our results

Our positive results are generalizations of the linear-algebraic data reduction tool for binary CSPs presented above. They reveal that the Õ(n)-bit compression for Exact d-sat, the Õ(n^{d−1})-bit compression for d-nae-sat, and the O(n^d)-bit compression for d-cnf-sat
doi:10.1145/3349618 fatcat:p7f56ucksrbczgjnlz63fdf67m