Average-Case Complexity

Andrej Bogdanov and Luca Trevisan
2006 Foundations and Trends® in Theoretical Computer Science  
We survey the average-case complexity of problems in NP. We discuss various notions of good-on-average algorithms, and present completeness results due to Impagliazzo and Levin. Such completeness results establish the fact that if a certain specific (but somewhat artificial) NP problem is easy-on-average with respect to the uniform distribution, then all problems in NP are easy-on-average with respect to all samplable distributions. Applying the theory to natural distributional problems remains an outstanding open question.
We review some natural distributional problems whose average-case complexity is of particular interest and that do not yet fit into this theory. A major open question is whether the existence of hard-on-average problems in NP can be based on the P ≠ NP assumption or on related worst-case assumptions. We review negative results showing that certain proof techniques cannot prove such a result. While the relation between worst-case and average-case complexity for general NP problems remains open, there has been progress in understanding the relation between different "degrees" of average-case complexity. We discuss some of these "hardness amplification" results.

A completeness result

Having given the definitions of computational problem and of reduction, we will present a completeness result [53]. We consider the bounded halting problem BH, where on input (M, x, 1^t) we have to determine whether the non-deterministic Turing machine M accepts input x within t steps. This problem is readily seen to be NP-complete. We show that for every distributional problem (L, D), where L is in NP and D is a polynomial-time computable distribution, there is a reduction from (L, D) to (BH, U_BH), where U_BH is a reasonable formalization of the notion of a "uniformly chosen" random input for BH.

Informally, the reduction maps an input x into the triple (M′, C(x), 1^t), where C is a (carefully chosen) injective polynomial-time computable encoding function; M′ is a non-deterministic machine that first recovers x from C(x) and then simulates the non-deterministic polynomial-time Turing machine that decides whether x ∈ L (recall that L is in NP); and t is a polynomial upper bound on the running time of M′ (a toy sketch of this mapping appears below). The main claim in the analysis of the reduction is that, for x selected from D, the string C(x) is "approximately" uniformly distributed. Technically, we show that the distribution of C(x) is dominated by the uniform distribution. This will follow from choosing C to be an information-theoretically optimal compression scheme.

The completeness result implies that if (BH, U_BH) has a good-on-average algorithm (according to one of the possible definitions), then all problems (L, D), where L is in NP and D is polynomial-time computable, also have good-on-average algorithms. The proof uses the fact that every polynomial-time computable distribution D admits a polynomial-time computable optimal compression scheme. Many natural distributions are polynomial-time computable, but there are a number of important exceptions. The output of a pseudorandom generator, for example, defines a distribution that is not optimally compressible in polynomial time and, hence, is not polynomial-time computable.
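To make the shape of this reduction concrete, here is a minimal Python sketch of the mapping x -> (M′, C(x), 1^t), written under our own naming assumptions (reduce_to_bounded_halting, verifier, encode, decode, and time_bound are all illustrative). The constructed "machine" M′ is modeled as a Python closure rather than as the description of a non-deterministic Turing machine, so this is a sketch of the idea, not the survey's actual construction.

# Illustrative sketch of the reduction x -> (M', C(x), 1^t) described above.
# The "machine" M' is modeled as a closure; a real reduction would output the
# description of a non-deterministic Turing machine instead.

from typing import Callable, Tuple

def reduce_to_bounded_halting(
    x: str,
    verifier: Callable[[str, str], bool],  # NP verifier for L: (input, witness) -> accept?
    encode: Callable[[str], str],          # injective, polynomial-time computable C
    decode: Callable[[str], str],          # inverse of C
    time_bound: Callable[[int], int],      # polynomial upper bound t on the running time of M'
) -> Tuple[Callable[[str, str], bool], str, int]:
    def m_prime(encoded_input: str, witness: str) -> bool:
        original = decode(encoded_input)    # step 1: recover x from C(x)
        return verifier(original, witness)  # step 2: simulate the machine deciding L
    return m_prime, encode(x), time_bound(len(x))

# Example with the identity encoding and a toy verifier:
#   machine, cx, t = reduce_to_bounded_halting(
#       "0101", lambda s, w: w == s, lambda s: s, lambda s: s, lambda n: n ** 2)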
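The domination claim can be summarized in one calculation. Assume (the constant c here is our own placeholder for the "O(1)" slack of an optimal code) that |C(x)| ≤ log2(1/D(x)) + c whenever D(x) > 0. Then, ignoring the polynomially many possible output lengths, a uniformly random string of length |C(x)| equals C(x) with probability

    Pr_U[C(x)] = 2^{-|C(x)|} ≥ 2^{-c} · D(x),

and since C is injective, the distribution of C(x) assigns to every string at most a 2^c factor more weight than the uniform distribution does, which is exactly the domination condition.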
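Such an encoding is typically built from the cumulative distribution function: x is encoded as (roughly) the shortest binary string z such that the dyadic rational 0.z lands inside x's interval of the CDF, an interval of width D(x), so |z| is about log2(1/D(x)). The Python toy below, with names of our own choosing, illustrates only this coding idea on a finite list of exact probabilities; it is not the polynomial-time statement about distributions on strings.

from fractions import Fraction
from itertools import count
import math

def compress(i, probs):
    """Encode index i of a finite distribution probs (Fractions summing to 1,
    probs[i] > 0) as the shortest bit string z with F(i-1) < 0.z <= F(i),
    where F is the cumulative distribution; then |z| <= log2(1/probs[i]) + O(1)."""
    lo = sum(probs[:i], Fraction(0))                  # F(i-1)
    hi = lo + probs[i]                                # F(i)
    for k in count(1):                                # try code lengths 1, 2, ...
        m = math.floor(lo * 2 ** k) + 1               # smallest multiple of 2^-k above lo
        if m < 2 ** k and Fraction(m, 2 ** k) <= hi:  # ... that still lies in (lo, hi]
            return format(m, "b").zfill(k)            # its k-bit binary expansion

def decompress(z, probs):
    """Recover the index: 0.z lies in exactly one CDF interval, so the code is injective."""
    point = Fraction(int(z, 2), 2 ** len(z))
    lo = Fraction(0)
    for i, p in enumerate(probs):
        if lo < point <= lo + p:
            return i
        lo += p
    raise ValueError("not a codeword of this distribution")

# Example:
#   probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
#   [compress(i, probs) for i in range(4)]   # likelier items get shorter codes
#   decompress(compress(2, probs), probs)    # -> 2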
doi:10.1561/0400000004