On Learning Correlated Boolean Functions Using Statistical Query

Ke Yang

Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

REPORTS > DETAIL:

Paper:

TR01-098 | 19th November 2001 00:00

On Learning Correlated Boolean Functions Using Statistical Query

TR01-098 Authors: Ke Yang
Publication: 11th December 2001 11:00
Downloads: 3591

Keywords:

blind computation, Linear Threshold Functions, statistical query

Abstract:

In this paper, we study the problem of using statistical
query (SQ) to learn highly correlated boolean functions, namely, a
class of functions where any
pair agree on significantly more than a fraction 1/2 of the inputs.
We give a limit on how well one can approximate all the functions
without making any query, and then we show that beyond this limit,
the number of statistical queries the algorithm has to make
increases with the ``extra'' advantage the algorithm gains
in learning the functions. Here the advantage is defined to be the
probability the algorithm agrees with the target function minus the
probability the algorithm doesn't agree.

An interesting consequence of our results is that the class of
booleanized linear functions over a finite field
($f(_{\vec{a}}(\vec{x}) = 1$ iff $\phi(\vec{a}\cdot\vec{x}) = 1$,
where $\phi:GF_p\mapsto\{-1,1\}$ is an arbitrary boolean function
the maps any elements in $GF_p$ to $\pm1$). This result is
useful since the hardness of learning booleanized linear functions
over a finite field is related to the security of certain
cryptosystems (\cite{B01}). In particular, we prove that the class
of linear threshold functions over a finite field
($f(_{\vec{a}, b}(\vec{x}) = 1$ iff $\vec{a}\cdot\vec{x}\ge b$) cannot
be learned efficiently using statistical query. This contrasts with
Blum et al.'s result \cite{BFK+96} that linear threshold functions
over reals (perceptrons) are learnable using SQ model.

Finally, we describe a PAC-learning algorithm that learns a class of
linear threshold functions in time that is provably impossible for
statistical query algorithms to learn the class. With properly
chosen parameters, this class of linear threshold functions can
become an example of PAC-learnable, but not SQ-learnable functions
that are not parity functions.

ISSN 1433-8092 | Imprint