We describe a general technique that yields the first {\em Statistical Query lower bounds} for
a range of fundamental high-dimensional learning problems involving
Gaussian distributions. Our main results are for the problems of
(1) learning Gaussian mixture models (GMMs), and (2) robust (agnostic) learning of a single unknown Gaussian distribution.
For each of these problems, we show a {\em super-polynomial gap} between the (information-theoretic)
sample complexity and the computational complexity of {\em any} Statistical Query algorithm for the problem.
Statistical Query (SQ) algorithms are a class of algorithms
that are only allowed to query expectations of functions of the distribution rather than directly access samples.
This class of algorithms is quite broad:
a wide range of algorithmic techniques in machine learning are known to
be implementable using SQs. Moreover, for the unsupervised learning problems studied in this paper, all known algorithms with non-trivial performance guarantees are SQ algorithms or can easily be implemented using SQs.
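To make the oracle model concrete, the following is a minimal Python sketch of a STAT($\tau$) oracle simulated from samples; the function names and the $O(1/\tau^2)$ sample budget are illustrative choices, not constructions from the paper.

```python
import random

def stat_oracle(draw_sample, f, tau):
    """Simulated STAT(tau) oracle: given a query f with values in [-1, 1],
    return an estimate of E[f(x)] accurate to within additive tau (w.h.p.),
    using an empirical mean over O(1/tau^2) fresh samples.
    (Illustrative sketch; names and constants are not from the paper.)"""
    n = max(1, int(4 / tau ** 2))          # Hoeffding-style sample budget
    return sum(f(draw_sample()) for _ in range(n)) / n

# Usage: SQ access to a standard Gaussian -- the algorithm never sees raw
# samples, only tau-accurate answers to bounded queries.
random.seed(0)
clamp = lambda x: max(-1.0, min(1.0, x))   # queries must take values in [-1, 1]
estimate = stat_oracle(lambda: random.gauss(0.0, 1.0), clamp, tau=0.05)
```

Any algorithm phrased purely in terms of such query answers is an SQ algorithm, which is what makes the lower bounds in this model meaningful for a broad class of learners.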
Our SQ lower bound for Problem (1)
is qualitatively matched by known learning algorithms for GMMs.
At a conceptual level, this result implies that -- as far as SQ algorithms are concerned -- the computational complexity
of learning GMMs is inherently exponential
{\em in the dimension of the latent space} -- even though there
is no such information-theoretic barrier.
Our lower bound for Problem (2) implies that the accuracy of the robust learning algorithm
in~\cite{DiakonikolasKKLMS16} is essentially best possible among all polynomial-time SQ algorithms.
On the positive side, we also give a new (SQ) learning algorithm for Problem (2) achieving
the information-theoretically optimal accuracy, up to a constant factor,
whose running time essentially matches our lower bound.
Our algorithm relies on a filtering technique generalizing~\cite{DiakonikolasKKLMS16}
that removes outliers based on higher-order tensors.
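The degree-2 (covariance-based) special case of such a filter can be sketched as follows; the variance threshold and the removal fraction are illustrative constants rather than the paper's parameters, and the higher-order version replaces the empirical covariance with higher moment tensors.

```python
import numpy as np

def filter_step(X, threshold=9.0):
    """One iteration of spectral filtering (the degree-2 case of a
    tensor-based filter): find the direction of largest empirical variance;
    if that variance is too large, remove the points with the largest
    squared projections along it.  Constants are illustrative only."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]                           # top eigenvector
    if eigvals[-1] <= threshold:                 # variance small: accept X
        return X, False
    scores = ((X - mu) @ v) ** 2                 # outlier scores along v
    keep = scores < np.quantile(scores, 0.95)    # drop the top 5% of scores
    return X[keep], True

# Usage: 95% standard Gaussian inliers plus 5% far-away outliers; iterate
# until the empirical covariance no longer has a large eigenvalue.
rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((950, 10)),
               8.0 + rng.standard_normal((50, 10))])
changed = True
while changed:
    X, changed = filter_step(X)
```

The key property of this style of filter is that each iteration provably removes more outliers than inliers, so the surviving sample behaves like a clean one.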
Our SQ lower bounds are attained via a unified moment-matching technique that may be of broader interest. This technique yields nearly-tight lower bounds for a number of related unsupervised estimation problems.
Specifically, for the problems of (3) robust covariance estimation in spectral norm,
and (4) robust sparse mean estimation, we establish a quadratic {\em statistical--computational tradeoff} for SQ algorithms,
matching known upper bounds. Finally, our technique can be used to obtain tight sample complexity
lower bounds for high-dimensional {\em testing} problems. Specifically, for the classical problem of robustly {\em testing} an unknown mean (known covariance) Gaussian, our technique implies
an information-theoretic sample lower bound that scales {\em linearly} in the dimension.
Our sample lower bound matches the sample complexity of the corresponding robust {\em learning} problem and separates robust testing from standard (non-robust) testing. This separation is surprising because no such gap exists for the corresponding learning problem.