We study the polynomial reconstruction problem for low-degree
multivariate polynomials over finite fields. In the GF[2] version of this problem, we are given a set of points on the hypercube and target values $f(x)$ for each of these points, with the promise that there is a polynomial over GF[2] of degree at most $d$ that agrees with $f$ on a $1-\epsilon$ fraction of the points. Our goal is to find a degree-$d$ polynomial that has good agreement with $f$. We show that it is NP-hard to find a polynomial that agrees with $f$ on more than a $1-2^{-d}$ fraction of the points. This holds even under the stronger promise that the polynomial fitting the data is in fact linear, whereas the algorithm is allowed to output a polynomial of degree $d$. Previously, the only known hardness of approximation (or even NP-completeness) was for the case $d = 1$, which follows from a celebrated result of Håstad.
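In symbols (the sample size $m$ and the polynomial names $p$, $q$ below are our notation), the problem is: given points $x_1, \dots, x_m \in \{0,1\}^n$ with target values $f(x_1), \dots, f(x_m)$, under the promise that
\[
  \Pr_{i \in [m]}\bigl[\,p(x_i) = f(x_i)\,\bigr] \;\ge\; 1 - \epsilon
  \quad\text{for some polynomial } p \text{ over GF[2] of degree at most } d,
\]
output a degree-$d$ polynomial $q$ maximizing $\Pr_{i \in [m]}\bigl[\,q(x_i) = f(x_i)\,\bigr]$.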
In the setting of computational learning, our result shows the hardness of (non-proper) agnostic learning of parities, where the learner is allowed a low-degree polynomial as a hypothesis. This is the first non-proper hardness result for this central problem in computational learning. Our result suggests limitations on learning circuit classes via polynomial reconstruction. Our results extend to multivariate polynomial reconstruction over any finite field.
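Here a parity is a degree-$1$ polynomial $\chi_S(x) = \sum_{i \in S} x_i \pmod 2$ for some $S \subseteq [n]$ (this gloss and the notation are ours), and the non-proper agnostic learner, given examples labeled by an arbitrary $f$, must output a hypothesis $h$ of degree at most $d$ whose agreement with $f$ comes close to that of the best parity:
\[
  \Pr_{x}\bigl[\,h(x) = f(x)\,\bigr] \;\approx\; \max_{S \subseteq [n]} \Pr_{x}\bigl[\,\chi_S(x) = f(x)\,\bigr].
\]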