ECCC-Report TR08-103

ECCC-Report TR08-103https://eccc.weizmann.ac.il/report/2008/103Comments and Revisions published for TR08-103en-usSun, 23 Nov 2008 00:06:27 +0200 Paper TR08-103 | Regularity, Boosting, and Efficiently Simulating Every High-Entropy Distribution | Luca Trevisan, Madhur Tulsiani, Salil Vadhan https://eccc.weizmann.ac.il/report/2008/103We show that every high-entropy distribution is indistinguishable from an efficiently samplable distribution of the same entropy. Specifically, we prove that if $D$ is a distribution over $\{ 0,1\}^n$ of min-entropy at least $n-k$, then for every $S$ and $\epsilon$ there is a circuit $C$ of size at most $S\cdot \poly(\epsilon^{-1}, 2^k)$ that samples a distribution of entropy at least $n-k$ that is $\epsilon$-indistinguishable from $D$ by circuits of size $S$. Stated in a more abstract form (where we refer to indistinguishability by arbitrary families of distinguishers rather than bounded-size circuits), our result implies (a) the Weak Szemer\'edi Regularity Lemma of Frieze and Kannan (b) a constructive version of the Dense Model Theorem of Green, Tao and Ziegler with better quantitative parameters (polynomial rather than exponential in the distinguishing probability $\epsilon$), and (c) the Impagliazzo Hardcore Set Lemma. It appears to be the general result underlying the known connections between ``regularity'' results in graph theory, ``decomposition'' results in additive combinatorics, and the Hardcore Lemma in complexity theory. We present two proofs of our result, one in the spirit of Nisan's proof of the Hardcore Lemma via duality of linear programming, and one similar to Impagliazzo's ``boosting'' proof. A third proof by iterative partitioning, which gives the complexity of the sampler to be exponential in $1/\epsilon$ and $2^k$, is also implicit in the Green-Tao-Ziegler proofs of the Dense Model Theorem. Sun, 23 Nov 2008 00:06:27 +0200https://eccc.weizmann.ac.il/report/2008/103