Loading jsMath...
Weizmann Logo
ECCC
Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

Login | Register | Classic Style



REPORTS > DETAIL:

Revision(s):

Revision #1 to TR20-045 | 16th June 2020 12:16

Learning sums of powers of low-degree polynomials in the non-degenerate case

RSS-Feed




Revision #1
Authors: Ankit Garg, Neeraj Kayal, Chandan Saha
Accepted on: 16th June 2020 12:16
Downloads: 502
Keywords: 


Abstract:

We develop algorithms for writing a polynomial as sums of powers of low degree polynomials. Consider an n-variate degree-d polynomial f which can be written as

f = c_1Q_1^{m} + \ldots + c_s Q_s^{m},

where each c_i\in \mathbb{F}^{\times}, Q_i is a homogeneous polynomial of degree t, and t m = d. In this paper, we give a \text{poly}((ns)^t)-time learning algorithm for finding the Q_i's given (black-box access to) f, if the Q_i's satisfy certain non-degeneracy conditions and n is larger than d^2. The set of degenerate Q_i's (i.e., inputs for which the algorithm does not work) form a non-trivial variety and hence if the Q_i's are chosen according to any reasonable (full-dimensional) distribution, then they are non-degenerate with high probability (if s is not too large). This problem generalizes symmetric tensor decomposition, which corresponds to the t = 1 case and is widely studied, having many applications in machine learning. Our algorithm (for t=2) allows us to solve the moment problem for mixtures of zero-mean Gaussians in the non-degenerate case.

Our algorithm is based on a scheme for obtaining a learning algorithm for an arithmetic circuit model from a lower bound for the same model, provided certain non-degeneracy conditions hold. The scheme reduces the learning problem to the problem of decomposing two vector spaces under the action of a set of linear operators, where the spaces and the operators are derived from the input circuit and the complexity measure used in a typical lower bound proof. The non-degeneracy conditions are certain restrictions on how the spaces decompose. Such a scheme is present in a rudimentary form in an earlier work of Kayal and Saha. Here, we make it more general and detailed, and potentially applicable to learning other circuit models.

An exponential lower bound for the representation above (also known as homogeneous \Sigma \wedge \Sigma \Pi^{[t]} circuits) is known using the shifted partials measure. However, the number of linear operators in shifted partials is exponential and also the non-degeneracy condition emerging out of this measure is unlikely to be satisfied by a random \Sigma \wedge \Sigma \Pi^{[t]} circuit when the number of variables is large with respect to the degree. We bypass this hurdle by proving a lower bound (which is nearly as strong as the previous bound) using a novel variant of the partial derivatives measure, namely affine projections of partials (APP). The non-degeneracy conditions appearing from this new measure are satisfied by a random \Sigma \wedge \Sigma \Pi^{[t]} circuit. The APP measure could be of independent interest for proving other lower bounds.



Changes to previous version:

Fixed a minor bug in the statement and proof of Corollary 3.1.


Paper:

TR20-045 | 15th April 2020 09:13

Learning sums of powers of low-degree polynomials in the non-degenerate case





TR20-045
Authors: Ankit Garg, Neeraj Kayal, Chandan Saha
Publication: 15th April 2020 10:50
Downloads: 566
Keywords: 


Abstract:

We develop algorithms for writing a polynomial as sums of powers of low degree polynomials. Consider an n-variate degree-d polynomial f which can be written as

f = c_1Q_1^{m} + \ldots + c_s Q_s^{m},

where each c_i\in \mathbb{F}^{\times}, Q_i is a homogeneous polynomial of degree t, and t m = d. In this paper, we give a \text{poly}((ns)^t)-time learning algorithm for finding the Q_i's given (black-box access to) f, if the Q_i's satisfy certain non-degeneracy conditions and n is larger than d^2. The set of degenerate Q_i's (i.e., inputs for which the algorithm does not work) form a non-trivial variety and hence if the Q_i's are chosen according to any reasonable (full-dimensional) distribution, then they are non-degenerate with high probability (if s is not too large). This problem generalizes symmetric tensor decomposition, which corresponds to the t = 1 case and is widely studied, having many applications in machine learning. Our algorithm (for t=2) allows us to solve the moment problem for mixtures of zero-mean Gaussians in the non-degenerate case.

Our algorithm is based on a scheme for obtaining a learning algorithm for an arithmetic circuit model from a lower bound for the same model, provided certain non-degeneracy conditions hold. The scheme reduces the learning problem to the problem of decomposing two vector spaces under the action of a set of linear operators, where the spaces and the operators are derived from the input circuit and the complexity measure used in a typical lower bound proof. The non-degeneracy conditions are certain restrictions on how the spaces decompose. Such a scheme is present in a rudimentary form in an earlier work of Kayal and Saha. Here, we make it more general and detailed, and potentially applicable to learning other circuit models.

An exponential lower bound for the representation above (also known as homogeneous \Sigma \wedge \Sigma \Pi^{[t]} circuits) is known using the shifted partials measure. However, the number of linear operators in shifted partials is exponential and also the non-degeneracy condition emerging out of this measure is unlikely to be satisfied by a random \Sigma \wedge \Sigma \Pi^{[t]} circuit when the number of variables is large with respect to the degree. We bypass this hurdle by proving a lower bound (which is nearly as strong as the previous bound) using a novel variant of the partial derivatives measure, namely affine projections of partials (APP). The non-degeneracy conditions appearing from this new measure are satisfied by a random \Sigma \wedge \Sigma \Pi^{[t]} circuit. The APP measure could be of independent interest for proving other lower bounds.



ISSN 1433-8092 | Imprint