Weizmann Logo
ECCC
Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

Login | Register | Classic Style



REPORTS > DETAIL:

Paper:

TR21-155 | 13th November 2021 20:47

Learning generalized depth-three arithmetic circuits in the non-degenerate case

RSS-Feed

Abstract:

Consider a homogeneous degree $d$ polynomial $f = T_1 + \cdots + T_s$, $T_i = g_i(\ell_{i,1}, \ldots, \ell_{i, m})$ where $g_i$'s are homogeneous $m$-variate degree $d$ polynomials and $\ell_{i,j}$'s are linear polynomials in $n$ variables. We design a (randomized) learning algorithm that given black-box access to $f$, computes black-boxes for the $T_i$'s. The running time of the algorithm is $\text{poly}(n, m, d, s)$ and the algorithm works under some \emph{non-degeneracy} conditions on the linear forms and the $g_i$'s, and some additional technical assumptions $n \ge (md)^2, s \le n^{ d/4} $. The non-degeneracy conditions on $\ell_{i,j}$'s constitute non-membership in a variety, and hence are satisfied when the coefficients of $\ell_{i,j}$'s are chosen uniformly and randomly from a large enough set. The conditions on $g_i$'s are satisfied for random polynomials and also for natural polynomials common in the study of arithmetic complexity like determinant, permanent, elementary symmetric polynomial, iterated matrix multiplication. A particularly appealing algorithmic corollary is the following: Given black-box access to an $f = \mbox{Det}_{r}(L^{(1)}) + \ldots + \mbox{Det}_{r}(L^{(s)})$, where $L^{(k)} = (\ell_{i,j}^{(k)})_{i,j}$ with $\ell_{i,j}^{(k)}$'s being linear forms in $n$ variables chosen randomly, there is an algorithm which in time $\poly(n, r)$ outputs matrices $(M^{(k)})_k$ of linear forms \text{s.t.} there exists a permutation $\pi: [s] \rightarrow [s]$ with $\mbox{Det}_{r}(M^{(k)}) = \mbox{Det}_{r}(L^{(\pi(k))})$.

Our work follows the works of Kayal-Saha(STOC'19) and Garg-Kayal-Saha(FOCS'20) which use lower bound methods in arithmetic complexity to design average case learning algorithms. It also vastly generalizes the result in \cite{Kayal-Saha'19} about learning depth-three circuits, which is a special case where each $g_i$ is just a monomial. At the core of our algorithm is the partial derivative method which can be used to prove lower bounds for generalized depth-three circuits. To apply the general framework in \cite{Kayal-Saha'19, Garg-Kayal-Saha'20}, we need to establish that the non-degeneracy conditions arising out of applying the framework with the partial derivative method are satisfied in the random case. We develop simple but general and powerful tools to establish this, which might be useful in designing average case learning algorithms for other arithmetic circuit models.



ISSN 1433-8092 | Imprint