If a concept class can be represented with a certain amount of memory, can it be efficiently learned with the same amount of memory? What concepts can be efficiently learned by algorithms that extract only a few bits of information from each example? We introduce a formal framework for studying these questions, and investigate the relationship between the fundamental resources of memory or communication and the sample complexity of the learning task. We relate our memory-bounded and communication-bounded learning models to the well-studied statistical query model. This connection can be leveraged to obtain both upper and lower bounds. We show several strong lower bounds on learning parity functions with bounded communication: for example, any multi-round multiparty protocol for learning parity functions over inputs of length $n$, in which each party receives a list of at most $n/4$ examples but is limited to at most $n/16$ bits of communication, requires an exponential number of parties. We also give the first upper bounds on solving generic sparse linear regression problems with limited memory.
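To make the parity-learning task concrete, here is a minimal sketch of the standard unrestricted baseline, not the protocols or bounds of this work. It assumes examples are pairs $(x, \langle x, s \rangle \bmod 2)$ with $x$ and the hidden string $s$ encoded as $n$-bit integers, and recovers $s$ by Gaussian elimination over GF(2). The learner stores up to $n$ reduced rows of $n+1$ bits each, so it uses on the order of $n^2$ bits of memory even though $s$ itself takes only $n$ bits to write down. The names `parity_label` and `learn_parity` are hypothetical and chosen for illustration.

```python
def parity_label(x, s):
    """Inner product of the bit vectors x and s modulo 2 (both given as ints)."""
    return bin(x & s).count("1") % 2


def learn_parity(examples, n):
    """Recover the hidden n-bit string s from (x, label) pairs.

    Illustrative baseline only: incremental Gaussian elimination over GF(2).
    Keeps at most n rows of n+1 bits, i.e. about n^2 bits of memory.
    Returns None if the examples do not span all n dimensions.
    """
    # basis[i] holds a reduced (row, label) pair whose highest set bit is i.
    basis = [None] * n
    for x, y in examples:
        for i in reversed(range(n)):
            if not (x >> i) & 1:
                continue
            if basis[i] is None:
                basis[i] = (x, y)  # new independent equation
                break
            bx, by = basis[i]
            x ^= bx  # eliminate bit i and keep reducing
            y ^= by
    if any(b is None for b in basis):
        return None  # not enough independent examples yet

    # Back-substitution: row i involves bit i plus lower bits that are already known.
    s = 0
    for i in range(n):
        row, y = basis[i]
        lower = row ^ (1 << i)
        bit = y ^ parity_label(lower, s)
        s |= bit << i
    return s
```

Feeding `learn_parity` a few more than $n$ uniformly random labeled examples recovers $s$ with high probability. The questions in the abstract concern what happens when a learner cannot afford this much memory, or may extract only a few bits from each example.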
Substantial clean-up of writing.