__
TR01-098 | 19th November 2001 00:00
__

#### On Learning Correlated Boolean Functions Using Statistical Query

TR01-098
Authors:

Ke Yang
Publication: 11th December 2001 11:00

Downloads: 1893

Keywords:

**Abstract:**
In this paper, we study the problem of using statistical

query (SQ) to learn highly correlated boolean functions, namely, a

class of functions where any

pair agree on significantly more than a fraction 1/2 of the inputs.

We give a limit on how well one can approximate all the functions

without making any query, and then we show that beyond this limit,

the number of statistical queries the algorithm has to make

increases with the ``extra'' advantage the algorithm gains

in learning the functions. Here the advantage is defined to be the

probability the algorithm agrees with the target function minus the

probability the algorithm doesn't agree.

An interesting consequence of our results is that the class of

booleanized linear functions over a finite field

($f(_{\vec{a}}(\vec{x}) = 1$ iff $\phi(\vec{a}\cdot\vec{x}) = 1$,

where $\phi:GF_p\mapsto\{-1,1\}$ is an arbitrary boolean function

the maps any elements in $GF_p$ to $\pm1$). This result is

useful since the hardness of learning booleanized linear functions

over a finite field is related to the security of certain

cryptosystems (\cite{B01}). In particular, we prove that the class

of linear threshold functions over a finite field

($f(_{\vec{a}, b}(\vec{x}) = 1$ iff $\vec{a}\cdot\vec{x}\ge b$) cannot

be learned efficiently using statistical query. This contrasts with

Blum et al.'s result \cite{BFK+96} that linear threshold functions

over reals (perceptrons) are learnable using SQ model.

Finally, we describe a PAC-learning algorithm that learns a class of

linear threshold functions in time that is provably impossible for

statistical query algorithms to learn the class. With properly

chosen parameters, this class of linear threshold functions can

become an example of PAC-learnable, but not SQ-learnable functions

that are not parity functions.