On the Complexity of Estimating the Effective Support Size

Oded Goldreich

Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

REPORTS > DETAIL:

Revision(s):

Revision #1 to TR19-088 | 26th October 2021 18:24

On the Complexity of Estimating the Effective Support Size

Revision #1 Authors: Oded Goldreich
Accepted on: 26th October 2021 18:24
Downloads: 649

Keywords:

Abstract:

Loosely speaking, the effective support size of a distribution is the size of the support of a distribution that is close to it (in totally variation distance).
We study the complexity of estimating the effective support size of an unknown distribution when given samples of the distributions as well as an evaluation oracle (which returns the probability that the queried element appears in the distribution).
In this context, we present several algorithms that exhibit a trade-off between the quality of the approximation and the complexity of obtaining it, and leave open the question of their optimality.

In particular, for any constant $\beta>1$, we present an algorithm that, on input $\epsilon>0$ and oracle access to a distribution $\cal D$, uses $O(1/\epsilon^{1.001})$ samples and queries, and outputs a number $\widetilde{n}$ such that $\cal D$ is $\epsilon$-far from any distribution that has support of size $\widetilde{n}$ but is $\beta\cdot\epsilon$-close to a distribution that has support size $f\cdot\widetilde{n}$, where $f=O(\log\log\log\log\log({\widetilde{n}}/\epsilon))$.
(Indeed, 1.001 stands for any constant larger than 1, and $\log\log\log\log\log$ stands for any constant iterations of the logarithmic function.)

Changes to previous version:

The original presentation contained several inaccuracies and minor errors, which are corrected in the current version. Most importantly, the original Claim 2.1.2 (Part 2) was off by a factor of $\beta-1$, but this has negligible effect on the main results.
In addition, the current provides more technical details.

Paper:

TR19-088 | 16th June 2019 18:32

On the Complexity of Estimating the Effective Support Size

TR19-088 Authors: Oded Goldreich
Publication: 16th June 2019 18:33
Downloads: 1269

Keywords:

distribution testing, Effective support size, Evaluation oracle

Abstract:

ISSN 1433-8092 | Imprint