TR16-026 Authors: Anindya De, Michael Saks, Sijian Tang

Publication: 1st March 2016 15:38

Downloads: 968

Keywords:

In the noisy population recovery problem of Dvir et al., the goal is to learn

an unknown distribution $f$ on binary strings of length $n$ from noisy samples. For some parameter $\mu \in [0,1]$,

a noisy sample is generated by flipping each coordinate of a sample from $f$ independently with

probability $(1-\mu)/2$.

We assume an upper bound $k$ on the size of the support of the distribution, and the

goal is to estimate the probability of any string to within some given error $\varepsilon$. It is known

that the algorithmic complexity and sample complexity of this problem are polynomially related to each other.

We show that for $\mu > 0$, the sample complexity (and hence the algorithmic complexity)

is bounded by a polynomial in $k$, $n$ and $1/\varepsilon$ improving upon the previous best result of $\mathsf{poly}(k^{\log\log k},n,1/\varepsilon)$ due to Lovett and Zhang.

Our proof combines ideas from Lovett and Zhang with a noise attenuated version of Mobius inversion. In turn, the latter crucially uses the construction of robust local inverse due to Moitra and Saks.