TR17-075 Authors: Clement Canonne, Ilias Diakonikolas, Alistair Stewart

Publication: 29th April 2017 21:31

We study the general problem of testing whether an unknown discrete distribution belongs to a given family of distributions. More specifically, given a class of distributions $\mathcal{P}$ and sample access to an unknown distribution $\mathbf{P}$, we want to distinguish (with high probability) between the case that $\mathbf{P} \in \mathcal{P}$ and the case that $\mathbf{P}$ is $\epsilon$-far, in total variation distance, from every distribution in $\mathcal{P}$. This is the prototypical hypothesis testing problem that has received significant attention in statistics and, more recently, in theoretical computer science.
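
The testing problem above can be written out in standard notation. This is a sketch under assumptions: the domain symbol $\Omega$ is introduced here for illustration, and the success probability $2/3$ is the conventional choice rather than something fixed by the text, which only says "with high probability".

$$
d_{\mathrm{TV}}(\mathbf{P}, \mathbf{Q}) = \frac{1}{2} \sum_{x \in \Omega} \left| \mathbf{P}(x) - \mathbf{Q}(x) \right|,
\qquad
d_{\mathrm{TV}}(\mathbf{P}, \mathcal{P}) = \inf_{\mathbf{Q} \in \mathcal{P}} d_{\mathrm{TV}}(\mathbf{P}, \mathbf{Q}).
$$

Under this reading, a tester draws independent samples from $\mathbf{P}$ and must accept with probability at least $2/3$ when $\mathbf{P} \in \mathcal{P}$, and reject with probability at least $2/3$ when $d_{\mathrm{TV}}(\mathbf{P}, \mathcal{P}) > \epsilon$. Any behavior is allowed in the intermediate regime.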

The sample complexity of this general problem depends on the underlying family $\mathcal{P}$. We are interested in designing sample-optimal and computationally efficient algorithms for this task. The main contribution of this work is a new and simple testing technique that is applicable to distribution families whose Fourier spectrum approximately satisfies a certain sparsity property. As the main applications of our Fourier-based testing technique, we obtain the first non-trivial testers for two fundamental families of discrete distributions: Sums of Independent Integer Random Variables (SIIRVs) and Poisson Multinomial Distributions (PMDs). Our testers for these families are nearly sample-optimal and computationally efficient. We also obtain a tester with improved sample complexity for discrete log-concave distributions. To the best of our knowledge, ours is the first use of the Fourier transform in the context of distribution testing.
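
To make the Fourier sparsity property concrete, the following Python sketch computes the discrete Fourier transform of a sum of independent Bernoulli variables (the simplest kind of SIIRV) and measures how much of its squared Fourier magnitude sits on a few coefficients, compared with an arbitrary distribution on the same domain. It is only an illustration and not the tester from this work: the helper names `siirv_pmf` and `fourier_energy_fraction`, the choice of 200 components, the parameter range, and the cutoff of 31 coefficients are all assumptions made for the example.

```python
import numpy as np


def siirv_pmf(component_pmfs):
    """PMF of a sum of independent integer random variables.

    Each entry of component_pmfs is the PMF of one summand on {0, 1, ..., k-1};
    the PMF of the sum is obtained by repeated convolution.
    """
    pmf = np.array([1.0])
    for p in component_pmfs:
        pmf = np.convolve(pmf, np.asarray(p, dtype=float))
    return pmf


def fourier_energy_fraction(pmf, num_coeffs):
    """Fraction of the squared DFT magnitude carried by the
    num_coeffs largest Fourier coefficients of pmf."""
    energy = np.abs(np.fft.fft(pmf)) ** 2
    largest = np.sort(energy)[::-1][:num_coeffs]
    return largest.sum() / energy.sum()


rng = np.random.default_rng(0)

# Sum of 200 independent Bernoulli variables with different biases
# (illustrative parameters, not taken from the source).
biases = rng.uniform(0.2, 0.8, size=200)
pbd = siirv_pmf([[1.0 - p, p] for p in biases])

# An arbitrary distribution on the same domain, for comparison.
arbitrary = rng.dirichlet(np.ones(pbd.size))

pbd_frac = fourier_energy_fraction(pbd, 31)
arbitrary_frac = fourier_energy_fraction(arbitrary, 31)
print(f"structured sum: {pbd_frac:.4f}, arbitrary distribution: {arbitrary_frac:.4f}")
```

In this toy setting the structured sum places almost all of its Fourier energy on a small set of coefficients, while the arbitrary distribution spreads a substantial share across the rest. This is the kind of structure a Fourier-based approach can exploit.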