A Time-Space Lower Bound for a Large Class of Learning Problems

Ran Raz

Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

REPORTS > DETAIL:

Paper:

TR17-020 | 12th February 2017 04:57

A Time-Space Lower Bound for a Large Class of Learning Problems

TR17-020 Authors: Ran Raz
Publication: 12th February 2017 05:03
Downloads: 2909

Keywords:

branching program, learning, lower bounds, time-space tradeoff

Abstract:

We prove a general time-space lower bound that applies for a large class of learning problems and shows that for every problem in that class, any learning algorithm requires either a memory of quadratic size or an exponential number of samples.

Our result is stated in terms of the norm of the matrix that corresponds to the learning problem. Let $X$, $A$ be two finite sets. Let $M: A \times X \rightarrow \{-1,1\}$ be a matrix. The matrix $M$ corresponds to the following learning problem: An unknown element $x \in X$ was chosen uniformly at random. A learner tries to learn $x$ from a stream of samples, $(a_1, b_1), (a_2, b_2) \ldots$, where for every $i$, $a_i \in A$ is chosen uniformly at random and $b_i = M(a_i,x)$.

Let $\sigma_{\max}$ be the largest singular value of $M$ and note that always $\sigma_{\max} \leq |A|^{\frac{1}{2}}\cdot |X|^{\frac{1}{2}}$.

We show that if $\sigma_{\max} \leq |A|^{\frac{1}{2}}\cdot |X|^{\frac{1}{2} -
\varepsilon}$, then any learning algorithm for the corresponding learning problem requires either a memory of size at least $\Omega \left( (\varepsilon n)^2 \right) $ or at least $2^{\Omega(\varepsilon n)}$ samples, where $n = \log_2 |X|$.

As a special case, this gives a new proof for the time-space lower bound for
parity learning [Raz 2016].

ISSN 1433-8092 | Imprint