ECCC-Report TR22-052https://eccc.weizmann.ac.il/report/2022/052Comments and Revisions published for TR22-052en-usMon, 18 Apr 2022 21:36:21 +0300
Paper TR22-052
| Verifying The Unseen: Interactive Proofs for Label-Invariant Distribution Properties |
Tal Herman,
Guy Rothblum
https://eccc.weizmann.ac.il/report/2022/052Given i.i.d. samples from an unknown distribution over a large domain $[N]$, approximating several basic quantities, including the distribution's support size, its entropy, and its distance from the uniform distribution, requires $\Theta(N / \log N)$ samples [Valiant and Valiant, STOC 2011].
Suppose, however, that we can interact with a powerful but untrusted prover, who knows the entire distribution (or a good approximation of it). Can we use such a prover to approximate (or rather, to approximately {\em verify}) such statistical quantities more efficiently? We show that this is indeed the case: the support size, the entropy, and the distance from the uniform distribution, can all be approximately verified via a 2-message interactive proof, where the communication complexity, the verifier's running time, and the sample complexity are $\widetilde{O}({\sqrt{N}})$. For all these quantities, the sample complexity is tight up to $\polylog N$ factors (for any interactive proof, regardless of its communication complexity or verification time).
More generally, we give a tolerant interactive proof system with the above sample and communication complexities for verifying a distribution's proximity to any label-invariant property (any property that is invariant to re-labeling of the elements in the distribution's support). The verifier's running time in this more general protocol is also $\widetilde{O}({\sqrt{N}})$, under a mild assumption about the complexity of deciding, given a compact representation of a distribution, whether it is in the property or far from it.Mon, 18 Apr 2022 21:36:21 +0300https://eccc.weizmann.ac.il/report/2022/052