ECCC-Report TR17-006
https://eccc.weizmann.ac.il/report/2017/006
Comments and Revisions published for TR17-006
Mon, 30 Oct 2017 22:17:20 +0200
Revision 2
Testing Ising Models
Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath
https://eccc.weizmann.ac.il/report/2017/006#revision2
Mon, 30 Oct 2017 22:17:20 +0200
Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution? Similarly, is it possible to distinguish whether $p$ equals a given distribution $q$ versus $p$ and $q$ being far from each other? These problems of testing independence and goodness-of-fit have received enormous attention in statistics, information theory, and theoretical computer science, with sample-optimal algorithms known in several interesting regimes of parameters. Unfortunately, it has also been understood that these problems become intractable in large dimensions, necessitating exponential sample complexity.
Motivated by the exponential lower bounds for general distributions as well as the ubiquity of Markov Random Fields (MRFs) in the modeling of high-dimensional distributions, we initiate the study of distribution testing on structured multivariate distributions, and in particular the prototypical example of MRFs: the Ising Model. We demonstrate that, in this structured setting, we can avoid the curse of dimensionality, obtaining sample and time efficient testers for independence and goodness-of-fit. One of the key technical challenges we face along the way is bounding the variance of functions of the Ising model.
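For reference, the abstract does not spell out the Ising model. The following is the standard textbook definition, with the symbols $V$, $\theta_{uv}$, $\theta_v$, and $Z(\theta)$ chosen here for illustration rather than taken from the report. For $x \in \{-1,+1\}^V$:

$$p_\theta(x) = \frac{1}{Z(\theta)} \exp\Big( \sum_{u < v} \theta_{uv} x_u x_v + \sum_{v \in V} \theta_v x_v \Big)$$

Under this parameterization, $p_\theta$ is a product distribution exactly when every $\theta_{uv} = 0$, which is one way to read the independence-testing question posed in the abstract.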
Revision 1
Testing Ising Models
Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath
https://eccc.weizmann.ac.il/report/2017/006#revision1
Mon, 10 Apr 2017 18:06:08 +0300
Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution? Similarly, is it possible to distinguish whether $p$ equals a given distribution $q$ versus $p$ and $q$ being far from each other? These problems of testing independence and goodness-of-fit have received enormous attention in statistics, information theory, and theoretical computer science, with sample-optimal algorithms known in several interesting regimes of parameters. Unfortunately, it has also been understood that these problems become intractable in large dimensions, necessitating exponential sample complexity.
Motivated by the exponential lower bounds for general distributions as well as the ubiquity of Markov Random Fields (MRFs) in the modeling of high-dimensional distributions, we initiate the study of distribution testing on structured multivariate distributions, and in particular the prototypical example of MRFs: the Ising Model. We demonstrate that, in this structured setting, we can avoid the curse of dimensionality, obtaining sample and time efficient testers for independence and goodness-of-fit. Along the way, we develop new tools for establishing concentration of functions of the Ising model, using the exchangeable pairs framework developed by Chatterjee, and improving upon this framework. In particular, we prove tighter concentration results for multi-linear functions of the Ising model in the high-temperature regime.
Paper TR17-006
Testing Ising Models
Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath
https://eccc.weizmann.ac.il/report/2017/006
Sun, 15 Jan 2017 23:49:03 +0200
Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution? Similarly, is it possible to distinguish whether $p$ equals a given distribution $q$ versus $p$ and $q$ being far from each other? These problems of testing independence and goodness-of-fit have received enormous attention in statistics, information theory, and theoretical computer science, with sample-optimal algorithms known in several interesting regimes of parameters. Unfortunately, it has also been understood that these problems become intractable in large dimensions, necessitating exponential sample complexity.
Motivated by the exponential lower bounds for general distributions as well as the ubiquity of Markov Random Fields (MRFs) in the modeling of high-dimensional distributions, we initiate the study of distribution testing on structured multivariate distributions, and in particular the prototypical example of MRFs: the Ising Model. We demonstrate that, in this structured setting, we can avoid the curse of dimensionality, obtaining sample and time efficient testers for independence and goodness-of-fit. Along the way, we develop new tools for establishing concentration of functions of the Ising model, using the exchangeable pairs framework developed by Chatterjee, and improving upon this framework. In particular, we prove tighter concentration results for multi-linear functions of the Ising model in the high-temperature regime.