ECCC-Report TR10-157

ECCC-Report TR10-157https://eccc.weizmann.ac.il/report/2010/157Comments and Revisions published for TR10-157en-usTue, 19 Apr 2011 11:15:39 +0300 Revision 1 | Testing Properties of Collections of Distributions | Dana Ron, Reut Levi, Ronitt Rubinfeld https://eccc.weizmann.ac.il/report/2010/157#revision1We propose a framework for studying property testing of collections of distributions, where the number of distributions in the collection is a parameter of the problem. Previous work on property testing of distributions considered single distributions or pairs of distributions. We suggest two models that differ in the way the algorithm is given access to samples from the distributions. In one model the algorithm may ask for a sample from any distribution of its choice, and in the other the choice of the distribution is random. Our main focus is on the basic problem of distinguishing between the case that all the distributions in the collection are the same (or very similar), and the case that it is necessary to modify the distributions in the collection in a non-negligible manner so as to obtain this property. We give almost tight upper and lower bounds for this testing problem, as well as study an extension to a clusterability property. One of our lower bounds directly implies a lower bound on testing independence of a joint distribution, a result which was left open by previous work.Tue, 19 Apr 2011 11:15:39 +0300https://eccc.weizmann.ac.il/report/2010/157#revision1 Paper TR10-157 | Testing Properties of Collections of Distributions | Dana Ron, Reut Levi, Ronitt Rubinfeld https://eccc.weizmann.ac.il/report/2010/157We propose a framework for studying property testing of collections of distributions, where the number of distributions in the collection is a parameter of the problem. Previous work on property testing of distributions considered single distributions or pairs of distributions. We suggest two models that differ in the way the algorithm is given access to samples from the distributions. In one model the algorithm may ask for a sample from any distribution of its choice, and in the other the choice of the distribution is random. Our main focus is on the basic problem of distinguishing between the case that all the distributions in the collection are the same (or very similar), and the case that it is necessary to modify the distributions in the collection in a non-negligible manner so as to obtain this property. We give almost tight upper and lower bounds for this testing problem, as well as study an extension to a clusterability property. One of our lower bounds directly implies a lower bound on testing independence of a joint distribution, a result which was left open by previous work.Sun, 24 Oct 2010 15:58:18 +0200https://eccc.weizmann.ac.il/report/2010/157