ECCC-Report TR20-045https://eccc.weizmann.ac.il/report/2020/045Comments and Revisions published for TR20-045en-usTue, 16 Jun 2020 12:16:58 +0300
Revision 1
| Learning sums of powers of low-degree polynomials in the non-degenerate case |
Ankit Garg,
Neeraj Kayal,
Chandan Saha
https://eccc.weizmann.ac.il/report/2020/045#revision1We develop algorithms for writing a polynomial as sums of powers of low degree polynomials. Consider an $n$-variate degree-$d$ polynomial $f$ which can be written as
$$f = c_1Q_1^{m} + \ldots + c_s Q_s^{m},$$
where each $c_i\in \mathbb{F}^{\times}$, $Q_i$ is a homogeneous polynomial of degree $t$, and $t m = d$. In this paper, we give a $\text{poly}((ns)^t)$-time learning algorithm for finding the $Q_i$'s given (black-box access to) $f$, if the $Q_i's$ satisfy certain non-degeneracy conditions and $n$ is larger than $d^2$. The set of degenerate $Q_i$'s (i.e., inputs for which the algorithm does not work) form a non-trivial variety and hence if the $Q_i$'s are chosen according to any reasonable (full-dimensional) distribution, then they are non-degenerate with high probability (if $s$ is not too large). This problem generalizes symmetric tensor decomposition, which corresponds to the $t = 1$ case and is widely studied, having many applications in machine learning. Our algorithm (for $t=2$) allows us to solve the moment problem for mixtures of zero-mean Gaussians in the non-degenerate case.
Our algorithm is based on a scheme for obtaining a learning algorithm for an arithmetic circuit model from a lower bound for the same model, provided certain non-degeneracy conditions hold. The scheme reduces the learning problem to the problem of decomposing two vector spaces under the action of a set of linear operators, where the spaces and the operators are derived from the input circuit and the complexity measure used in a typical lower bound proof. The non-degeneracy conditions are certain restrictions on how the spaces decompose. Such a scheme is present in a rudimentary form in an earlier work of Kayal and Saha. Here, we make it more general and detailed, and potentially applicable to learning other circuit models.
An exponential lower bound for the representation above (also known as homogeneous $\Sigma \wedge \Sigma \Pi^{[t]}$ circuits) is known using the shifted partials measure. However, the number of linear operators in shifted partials is exponential and also the non-degeneracy condition emerging out of this measure is unlikely to be satisfied by a random $\Sigma \wedge \Sigma \Pi^{[t]}$ circuit when the number of variables is large with respect to the degree. We bypass this hurdle by proving a lower bound (which is nearly as strong as the previous bound) using a novel variant of the partial derivatives measure, namely affine projections of partials (APP). The non-degeneracy conditions appearing from this new measure are satisfied by a random $\Sigma \wedge \Sigma \Pi^{[t]}$ circuit. The APP measure could be of independent interest for proving other lower bounds.Tue, 16 Jun 2020 12:16:58 +0300https://eccc.weizmann.ac.il/report/2020/045#revision1
Paper TR20-045
| Learning sums of powers of low-degree polynomials in the non-degenerate case |
Ankit Garg,
Neeraj Kayal,
Chandan Saha
https://eccc.weizmann.ac.il/report/2020/045We develop algorithms for writing a polynomial as sums of powers of low degree polynomials. Consider an $n$-variate degree-$d$ polynomial $f$ which can be written as
$$f = c_1Q_1^{m} + \ldots + c_s Q_s^{m},$$
where each $c_i\in \mathbb{F}^{\times}$, $Q_i$ is a homogeneous polynomial of degree $t$, and $t m = d$. In this paper, we give a $\text{poly}((ns)^t)$-time learning algorithm for finding the $Q_i$'s given (black-box access to) $f$, if the $Q_i's$ satisfy certain non-degeneracy conditions and $n$ is larger than $d^2$. The set of degenerate $Q_i$'s (i.e., inputs for which the algorithm does not work) form a non-trivial variety and hence if the $Q_i$'s are chosen according to any reasonable (full-dimensional) distribution, then they are non-degenerate with high probability (if $s$ is not too large). This problem generalizes symmetric tensor decomposition, which corresponds to the $t = 1$ case and is widely studied, having many applications in machine learning. Our algorithm (for $t=2$) allows us to solve the moment problem for mixtures of zero-mean Gaussians in the non-degenerate case.
Our algorithm is based on a scheme for obtaining a learning algorithm for an arithmetic circuit model from a lower bound for the same model, provided certain non-degeneracy conditions hold. The scheme reduces the learning problem to the problem of decomposing two vector spaces under the action of a set of linear operators, where the spaces and the operators are derived from the input circuit and the complexity measure used in a typical lower bound proof. The non-degeneracy conditions are certain restrictions on how the spaces decompose. Such a scheme is present in a rudimentary form in an earlier work of Kayal and Saha. Here, we make it more general and detailed, and potentially applicable to learning other circuit models.
An exponential lower bound for the representation above (also known as homogeneous $\Sigma \wedge \Sigma \Pi^{[t]}$ circuits) is known using the shifted partials measure. However, the number of linear operators in shifted partials is exponential and also the non-degeneracy condition emerging out of this measure is unlikely to be satisfied by a random $\Sigma \wedge \Sigma \Pi^{[t]}$ circuit when the number of variables is large with respect to the degree. We bypass this hurdle by proving a lower bound (which is nearly as strong as the previous bound) using a novel variant of the partial derivatives measure, namely affine projections of partials (APP). The non-degeneracy conditions appearing from this new measure are satisfied by a random $\Sigma \wedge \Sigma \Pi^{[t]}$ circuit. The APP measure could be of independent interest for proving other lower bounds.Wed, 15 Apr 2020 10:50:58 +0300https://eccc.weizmann.ac.il/report/2020/045