### The PCP theorem by gap amplification

Irit Dinur
2006 Proceedings of the thirty-eighth annual ACM symposium on Theory of computing - STOC '06
The PCP theorem [3, 2] says that every language in NP has a witness format that can be checked probabilistically by reading only a constant number of bits from the proof. The celebrated equivalence of this theorem and inapproximability of certain optimization problems, due to [12], has placed the PCP theorem at the heart of the area of inapproximability. In this work we present a new proof of the PCP theorem that draws on this equivalence. We give a combinatorial proof for the NP-hardness of approximating a certain constraint satisfaction problem, which can then be reinterpreted to yield the PCP theorem. Our approach is to consider the unsat value of a constraint system, which is the smallest fraction of unsatisfied constraints, ranging over all possible assignments for the underlying variables. We describe a new combinatorial amplification transformation that doubles the unsat value of a constraint system, with only a linear blowup in the size of the system. The amplification step causes an increase in alphabet size that is corrected by a (standard) PCP composition step. Iterative application of these two steps yields a proof for the PCP theorem. The amplification lemma relies on a new notion of "graph powering" that can be applied to systems of binary constraints. This powering amplifies the unsat value of a constraint system provided that the underlying graph structure is an expander. We also extend our amplification lemma towards construction of assignment testers (alternatively, PCPs of Proximity), which are slightly stronger objects than PCPs. We then construct PCPs and locally-testable codes whose length is linear up to a polylog factor, and whose correctness can be probabilistically verified by making a constant number of queries. Namely, we prove $\mathrm{SAT} \in \mathrm{PCP}_{\frac{1}{2},1}[\log_2(n \cdot \mathrm{poly}\log n), O(1)]$.

* Hebrew University. Email: dinuri@cs.huji.ac.il. Supported by the Israel Science Foundation.

• If $x \notin L$ then for any proof $\pi$, $\Pr[V^{\pi}(x) \text{ accepts}] \le \frac{1}{2}$.

The PCP theorem says that every language in NP has a verifier that uses at most $O(\log n)$ random bits and reads only $O(1)$ bits from the proof. In other words, $\mathrm{NP} \subseteq \mathrm{PCP}[O(\log n), O(1)]$.

This theorem was a great surprise, as it completely revises our concept of a proof. Rather than the classical notion of a proof as a sequential object that, if erroneous in even one place, can easily prove a false statement, the PCP theorem provides a new proof notion that is more robust, and must be erroneous in many places when attempting to prove a falsity.

Historically, the class $\mathrm{PCP}[r, q]$ stemmed out of the celebrated notion of interactive proofs [20, 4] and the class IP. The original motivation for defining IP was cryptographic, but it soon led to a list of remarkable complexity-theoretic results, including for example IP=PSPACE (see [24, 32]). We will not give a detailed historic account, which can be found in, say, [1]. Let us just mention that an exciting sequence of papers (see [6, 14, 5]) led to the following theorem: the class of all languages with exponential-sized proofs is equal to the class of all languages that can be verified by a (randomized) polynomial-time verifier. At this point attempts were made to "scale down" this result so as to characterize the class NP in similar terms, through suitable restriction of the verifier. This was especially motivated by the discovery of [12] that connected such a scale-down to an inapproximability result for the clique number (see below). This scale-down was achieved partially in [3] and completed in [2], and came to be known as the PCP theorem. The techniques that led to the proof were mainly algebraic, including low-degree extension over finite fields, low-degree test, parallelization through curves, a sum-check protocol, and the Hadamard and quadratic functions encodings.

#### PCP and Inapproximability

As mentioned above, the discovery of the PCP theorem came hand in hand with the beautiful and surprising connection, discovered by Feige et al. [12], between PCP characterizations of NP and the hardness of approximating the clique number in a graph. Predating these developments, the situation regarding approximation problems was unclear. There was no clue why different approximation problems seem to exhibit different approximation behavior.
The PCP theorem implied, for the first time, that numerous problems (including, for example, max-3-SAT) are hard to approximate. This has had a tremendous impact on the study of combinatorial optimization problems, and today the PCP theorem stands at the heart of nearly all hardness-of-approximation results.

The connection to inapproximability is best described through constraint satisfaction problems. Let us begin by defining a constraint.

Definition 1.1 Let $V = \{v_1, \ldots, v_n\}$ be a set of variables, and let $\Sigma$ be a finite alphabet. A $q$-ary constraint $(C, i_1, \ldots, i_q)$ consists of a $q$-tuple of indices $i_1, \ldots, i_q \in [n]$ and a subset $C \subseteq \Sigma^q$ of "acceptable" values. A constraint is satisfied by a given assignment $a : V \to \Sigma$ iff $(a(v_{i_1}), \ldots, a(v_{i_q})) \in C$.

The constraint satisfaction problem (CSP) is the problem of, given a system of constraints $\mathcal{C} = \{c_1, \ldots, c_n\}$ over a set of variables $V$, deciding whether there is an assignment for the variables that satisfies every constraint. This problem is clearly NP-complete as it generalizes many well-known NP-complete problems such as 3-SAT and 3-colorability. For example, in the equivalent of the 3-colorability problem, the alphabet is $\Sigma = \{1, 2, 3\}$ and the binary constraints are of the form $(C, i_1, i_2)$ where $C = \{(a, b) \in \Sigma \times \Sigma : a \neq b\}$.

Proposition 1.4 Given a constraint graph $G = \langle (V, E), \Sigma, \mathcal{C} \rangle$ with $|\Sigma| = 3$, it is NP-hard to decide if $\mathrm{UNSAT}(G) = 0$.

Proof: We reduce from graph 3-colorability. Given a graph $G$, let the alphabet be $\Sigma = \{1, 2, 3\}$ for the three colors, and equip the edges with inequality constraints. Clearly, $G$ is 3-colorable if and only if $\mathrm{UNSAT}(G) = 0$.

Observe that in case $\mathrm{UNSAT}(G) > 0$ it must be that $\mathrm{UNSAT}(G) \ge 1/|G|$. Therefore, it is actually NP-hard to distinguish between the cases (i) $\mathrm{UNSAT}(G) = 0$ and (ii) $\mathrm{UNSAT}(G) \ge 1/|G|$.

Our main theorem is the aforementioned 'gap amplification step', where a graph $G$ is converted into a new graph $G'$ whose unsat value is doubled.

Theorem 1.5 (Main) There exists $\Sigma_0$ such that the following holds.
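For any finite alphabet $\Sigma$ there exist $C > 0$ and $0 < \alpha < 1$ such that, given a constraint graph $G = \langle (V, E), \Sigma, \mathcal{C} \rangle$, one can construct, in polynomial time, a constraint graph $G' = \langle (V', E'), \Sigma_0, \mathcal{C}' \rangle$ such that

• $\mathrm{size}(G') \le C \cdot \mathrm{size}(G)$.

• (Completeness:) If $\mathrm{UNSAT}(G) = 0$ then $\mathrm{UNSAT}(G') = 0$.

• (Soundness:) $\mathrm{UNSAT}(G') \ge \min(2 \cdot \mathrm{UNSAT}(G), \alpha)$.

Lemma 1.8 (Composition Lemma - Informal statement) Assume the existence of an assignment tester $\mathcal{P}$, with constant rejection probability $\varepsilon > 0$, and alphabet $\Sigma_0$, $|\Sigma_0| = O(1)$. There exists $\beta_3 > 0$ that depends only on $\mathcal{P}$, such that given any constraint graph $G = \langle (V, E), \Sigma, \mathcal{C} \rangle$, one can compute, in linear time, the constraint graph $G' = G \circ \mathcal{P}$, such that $\mathrm{size}(G') = c(\mathcal{P}, |\Sigma|) \cdot \mathrm{size}(G)$, and $\beta_3 \cdot \mathrm{UNSAT}(G) \le \mathrm{UNSAT}(G') \le \mathrm{UNSAT}(G)$.

For the sake of self-containedness, we include a construction of an assignment tester $\mathcal{P}$ in Section 7.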
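
To make the unsat value from Proposition 1.4 concrete, the following minimal sketch computes it by brute force for a small constraint graph whose edges carry inequality constraints over three colors. The function names `unsat` and `not_equal` and the two example graphs are illustrative assumptions, not taken from the paper, and the exhaustive search takes exponential time, so it is only meant for tiny instances.

```python
from fractions import Fraction
from itertools import product


def unsat(num_vertices, edges, alphabet, constraint):
    """Smallest fraction of violated edge constraints over all assignments.

    Hypothetical helper: every edge (u, v) carries the same binary
    constraint, given as a predicate on the two assigned symbols.
    """
    best = Fraction(1)
    for assignment in product(alphabet, repeat=num_vertices):
        violated = sum(
            1 for (u, v) in edges
            if not constraint(assignment[u], assignment[v])
        )
        best = min(best, Fraction(violated, len(edges)))
    return best


def not_equal(a, b):
    """Inequality constraint used in the 3-colorability reduction."""
    return a != b


COLORS = (1, 2, 3)
triangle = [(0, 1), (1, 2), (0, 2)]                     # 3-colorable
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # not 3-colorable

print(unsat(3, triangle, COLORS, not_equal))  # 0
print(unsat(4, k4, COLORS, not_equal))        # 1/6
```

The complete graph on four vertices is not 3-colorable, so at least one of its six edges is violated by every assignment. Its unsat value of 1/6 is the smallest positive value possible for a graph with six edges, matching the observation that a nonzero unsat value is at least the reciprocal of the graph's size.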
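
The abstract's remark that iterating the two steps yields the PCP theorem can be traced with a short calculation. The sketch below assumes the doubling guarantee has the form $\mathrm{UNSAT}(G') \ge \min(2 \cdot \mathrm{UNSAT}(G), \alpha)$, and that the constants $C \ge 1$ and $\alpha$ are chosen to work for both the initial alphabet and $\Sigma_0$. The names $G_i$, $s$ and $k$ are introduced here for illustration only.

```latex
% G_0 = G, and G_{i+1} is the graph obtained from G_i by one application
% of Theorem 1.5; write s = size(G) and k = \lceil \log_2 s \rceil.
\begin{align*}
\mathrm{size}(G_i) &\le C^{i} \cdot s, \\
\mathrm{UNSAT}(G_i) &\ge \min\bigl(2^{i} \cdot \mathrm{UNSAT}(G),\ \alpha\bigr)
  && \text{(induction on } i\text{)}, \\
\mathrm{UNSAT}(G) = 0 &\implies \mathrm{UNSAT}(G_k) = 0, \\
\mathrm{UNSAT}(G) \ge \tfrac{1}{s} &\implies
  \mathrm{UNSAT}(G_k) \ge \min\bigl(\tfrac{2^{k}}{s},\ \alpha\bigr) = \alpha, \\
\mathrm{size}(G_k) &\le C^{k} \cdot s \le C \cdot s^{\,1 + \log_2 C}.
\end{align*}
```

Under these assumptions, about $\log_2 s$ rounds turn the gap between $0$ and $1/s$ from Proposition 1.4 into a gap between $0$ and the constant $\alpha$, while the size stays polynomial in $s$.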