# Collapsing Exact Arithmetic Hierarchies 

Nikhil Balaji and Samir Datta<br>Chennai Mathematical Institute<br>\{nikhil, sdatta\}@cmi.ac.in


#### Abstract

We provide a uniform framework for proving the collapse of the hierarchy $\mathrm{NC}^{1}(\mathcal{C})$ for an exact arithmetic class $\mathcal{C}$ of polynomial degree. These hierarchies collapse all the way down to the third level of the $\mathrm{AC}^{0}$-hierarchy, $\mathrm{AC}_{3}^{0}(\mathcal{C})$. Our main collapsing exhibits are the classes $$
\mathcal{C} \in\left\{\mathrm{C}_{=} \mathrm{NC}^{1}, \mathrm{C}_{=} \mathrm{L}, \mathrm{C}=\mathrm{SAC}^{1}, \mathrm{C}=\mathrm{P}\right\}
$$ $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{L}\right)$ and $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{P}\right)$ are already known to collapse $[1,18,19]$. We reiterate that our contribution is a framework that works for all these hierarchies. Our proof generalizes a proof from [8] where it is used to prove the collapse of the $A C^{0}\left(C_{=} N C^{1}\right)$ hierarchy. It is essentially based on a polynomial degree characterization of each of the base classes.


## 1 Introduction

Collapsing hierarchies has been an important activity for structural complexity theorists through the years $[12,21,14,23,18,17,4,11]$. We provide a uniform framework for proving the collapse of the $N C^{1}$ hierarchy over an exact arithmetic class. Using our method, such a hierarchy collapses all the way down to the $\mathrm{AC}_{3}^{0}$ closure of the class.

Our main collapsing exhibits are the $N C^{1}$ hierarchies over the classes $C_{=} N C^{1}$, $C_{=} L, C_{=} S A C^{1}, C_{=} P$. Two of these hierarchies, viz. $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{L}\right), \mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{P}\right)$, are already known to collapse ( $[1,19,18]$ ) while a weaker collapse is known for a third one viz. that of $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathrm{NC}^{1}\right)$. We reiterate that our contribution is a simple proof that works for all these hierarchies. Our proof is a generalization of a proof from [8] who used it to prove the collapse of the $\mathrm{C}_{=} \mathrm{NC}^{1}$ hierarchy and is essentially based on a polynomial degree characterization of each of the corresponding arithmetic classes.

The most well known amongst the exact arithmetic circuit hierarchy collapses is the collapse of the $N C^{1}$ hierarchy over $C_{=} L$. This was proved by using the linear algebraic properties of $C_{=} L$ an elegant and non-trivial argument by Allender, Beals, and Ogihara [1]. We find it remarkable that our proof does not use any linear algebra (apart from the characterization of GapL functions as being exactly the functions computed by weakly skew circuits [24], which involves a certain amount of linear algebra) to prove the collapse to $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{L}\right)$.

The collapse of the $N C^{1}$ hierarchy over $C_{=} P$ to a constant level of the $A C^{0}$ hierarchy follows easily from the results by Ogihara from [19,18]. Our proof is quite orthogonal to the proofs there.

We would like to point out a notational quirk that we carry over from [8] by consistently calling the finite level of $\mathrm{AC}^{0}$ circuit to which we collapse the various classes as
$\mathrm{AC}_{3}^{0}$ where we actually mean two layers of Boolean gates i.e $\wedge, \vee$. The reason for the subscript 3 being that we include negation gates while counting the depth contrary to popular usage.

### 1.1 Historical Perspective

Counting classes have been studied from the early days of complexity theory. Gill and Simon [10,22] were the first to introduce the class PP(Probabilistic polynomial time) which consists of all languages decidable by a nondeterministic Turing machine in polynomial time in which at least half of the computation paths lead to acceptance. This class gained importance due to Toda's theorem (which states that a polynomial time machine with access to a PP oracle is at least as powerful as the entire polynomial hierarchy) and Beigel, Reingold and Spielman [3] who proved that it is closed under intersection.
$C_{=} P$ is the humbler cousin of $P P$ though it happens to be one of our main protagonists. It was introduced by Simon [22] and consists of languages where the acceptance and rejection probability is the same.

Fenner, Fortnow and Kurtz[9] introduced the notion of Gap to aid the study of structural complexity of these counting classes. For a class $\mathcal{C}$, captured by nondeterministic Turing machines, denote by $\operatorname{Gap\mathcal {C}}$ the class of functions expressible as the difference between the number of accepting and rejecting computations of some nondeterministic Turing machine. Then $C_{=} P$ is just those languages which have a zero gap (with an NP-machine).

The class $C_{=} L$ and the $N C^{1}, A C^{0}$ hierarchies over it have also received attention in literature $[1,20,13]$. In [1] it was first proved that the $N C^{1}$ hierarchy over $C=L$ collapses all the way down to the first level of the hierarchy, $\mathrm{L}^{\mathrm{C}=\mathrm{L}}$. Further collapse to $\mathrm{C}_{=} \mathrm{L}$ is not known because this class is not known to be closed under complement.

The class $\mathrm{C}_{=} \mathrm{NC}^{1}$ was probably mentioned explicitly for the first time in [6] where it is shown that (a uniform version of) it is contained in Logspace. Various hierarchies over this class including the Boolean, $\mathrm{AC}^{0}$, and the arithmetic hierarchy were studied in [8].

To the extent of our knowledge, the only previous known occurence of the class $C_{=} S A C^{1}$ is in the context of deciding the properties of monomials computed by arithmetic circuits[16]. The classes SAC ${ }^{1}$ and its arithmetic analogs $\# S A C^{1}, \mathrm{GapSAC}^{1}$ have been actively investigated in literature [28,2,15].

### 1.2 Our Results

In this paper, we extend and generalize the framework used in [8] to prove that the $N C^{1}\left(C_{=} \mathcal{K}\right)$ for a $N C^{1}$-well-behaved class ${ }^{1} \mathcal{K}$ collapses to $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$. As a result, we obtain several results as corollaries:

1. $A C^{0}\left(C_{=} N C^{1}\right)=N C^{1}\left(C_{=} N C^{1}\right)=A C_{3}^{0}\left(C_{=} N C^{1}\right)$, improving on [8].
2. $A C^{0}\left(C_{=} \mathrm{L}\right)=\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{L}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{L}\right)$, which gives an alternative proof of [1]
3. $A C^{0}\left(C_{=} S A C^{1}\right)=\operatorname{NC}^{1}\left(\mathrm{C}_{=} \mathrm{SAC}^{1}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{SAC}^{1}\right)$.

[^0]4. $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathrm{P}\right)=\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{P}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{P}\right)$, which gives an alternative proof of [18]

We prove that such a collapse can be made DLOGTIME-uniform. Note that we need to prove this strict uniformity of our collapse in order to exploit Venkateswaran's characterization of NP as exactly those languages that are decidable by DLOGTIMEuniform semi-unbounded circuit families of exponential size.

### 1.3 Proof Idea

The unifying feature of the aforementioned classes $\left(C_{=}=N C^{1}, C_{=} L, C_{=} P, C_{=} S A C^{1}\right)$ is that roughly speaking, they can be viewed as languages accepted by the $\mathrm{C}_{=}$-version of an arithmetic circuit family of polynomial degree. Thus if we consider arithmetic formulas, arithmetic weakly skew circuits, arithmetic circuits of polynomial size and arithmetic circuits of exponential size respectively, then imposing the polynomial degree bound gives us exactly the classes GapNC ${ }^{1}$, GapL, GapSAC ${ }^{1}$ and GapP [6,24,26,27]. Applying the $C_{=}$operator to these arithmetic classes yields our candidate exact arithmetic classes. We also tacitly use that the class of circuits are closed under composition with formulas.

The other important point to notice is that the [8] proof which just shows the collapse of the $A C^{0}\left(C_{=} N C^{1}\right)$ hierarchy to the third level can be extended to a collapse of $N C^{1}\left(C_{=} N C^{1}\right)$. This follows by observing that the level by level collapse outlined in [8] does not blow up the degree to more than a polynomial value even though we need to perform it logarithmically many times. This is because of the Cook-Wilson relativization, which can be interpreted as saying that if the oracle circuits have polynomial degree, so is the final circuit. This idea can then be generalized to any circuit family of polynomial degree.

A simple but crucial observation makes it possible to extend the results from bounded fan-in circuits in [8] to circuits where this restriction does not hold e.g. in the cases of $C_{=} S A C^{1}$ and $C_{=} P$. This observation concerning Vandermonde Determinants is described in Proposition 1 and used in the proof of Lemma 1.

### 1.4 Organization of the paper

We present preliminaries in Section 2. In Section 3, we state our main result and list some immediate corollaries of our result. We discuss possible future directions to our work in Section 5.

## 2 Preliminaries

We mention the standard complexity classes that we will use. For definitions and important results regarding these classes, we refer the reader to a standard text like [29].

An arithmetic circuit is a directed acyclic graph(DAG) with nodes labelled by $\{\times,+\} \cup X \cup\{-1\}$, where $X$ is the set of input variables. Note that -1 is the only constant necessary, and we can use it to generate $1=(-1) \times(-1), 0=(-1)+1$, and use $-1,0,1$ to generate any integer.

Given an arithmetic circuit $C$, the formal degree of the circuit is defined inductively as follows: Every input variable and the constant -1 have degree ${ }^{2} 1$. If $C_{1}$ and $C_{2}$ are two subcircuits of degree $d_{1}$ and $d_{2}$ fed in to a $\times$ gate (respectively + gate), then the degree of the $\times$ gate $(+$ gate $)$ is $d_{1}+d_{2}$ (respectively $\max \left(d_{1}, d_{2}\right)$ ).

An $(m \times m)$ Vandermonde matrix $V$ is one where the $(i, j)$-th entry of $V, V_{i j}=i^{j}$.
Fact 1 The absolute value of determinant of $V$ is equal to $\prod_{i<j ; i, j \leq m}(j-i)=\prod_{k=1}^{m} k$ !
It is important to note that the determinant of an $(m \times m)$-Vandermonde matrix (with entries that are indeterminates) can be computed by an arithmetic circuit of degree $m$. But since we start out with just the constant -1 , we will have to construct these numbers $1, \ldots, m$ and use them to find the the entries of the matrix, which can be done via repeated squaring and addition with a circuit of size at most $O\left(\mathrm{~m}^{2}\right)$, depth $O(\log m)$ and degree $O\left(m^{2} \log m\right)$, for every entry of the matrix. Also, the value of this determinant is at most $\prod_{k=1}^{m} k!\leq(m!)^{m} \leq m^{m^{2}}$ which can be computed by an arithmetic circuit of size at most $O\left(\binom{m}{2} m^{2}\right)=O\left(m^{4}\right)$, depth $O(\log m)$ and degree $O\left(m^{4} \log m\right)$.

We start with the usual definitions of $\mathrm{AC}^{0}$ and $\mathrm{NC}^{1}$ hierarchies over Boolean complexity classes.

Definition 1. Let $\mathcal{C}$ be a Boolean complexity class. $\operatorname{AC}^{0}(\mathcal{C})$ is the class of languages recognized by $\mathrm{AC}^{0}$ circuits with additional oracle gates( of unbounded fan-in) for $\mathcal{C}$.

Defining the $N C^{1}$ hierarchy naively as above will yield a circuit where there could be $O(\log n)$ many oracle gates on any path, which will stand in contrast to the definition of $\mathrm{NC}^{1}$ as that of circuits with bounded fan-in gates. The definition due to Cook and Wilson $[7,30]$ gives a reasonable model, which avoids these problems:

Definition 2. (Cook-Wilson) Let $\mathcal{C}$ be a Boolean complexity class. $\mathrm{NC}^{1}(\mathcal{C})$ is the class of languages recognized by $\mathrm{NC}^{1}$ circuits with additional oracle gates for $\mathcal{C}$, where an oracle gate of fan-in $k$ is charged $\log k$ towards the depth of the circuit.

We note that the small blob chains property defined in [8] is essentially modelled by the Cook-Wilson relativization for $\mathrm{NC}^{1}$ as given in Definition 2. An easy inclusion follows from the definitions above: For any boolean complexity class $\mathcal{C}$, we have $\mathrm{AC}^{0}(\mathcal{C}) \subseteq \mathrm{NC}^{1}(\mathcal{C})$. Next, we abstract the classes of base circuits over which we will consider various hierarchies:

Definition 3. Let $\mathcal{K}$ be a class of arithmetic circuits. Then $\mathrm{C}_{=} \mathcal{K}$ is the class of languages recognized by the circuits from the class $\mathcal{K}$ with an additional gate at the top that compares the output to zero to produce a Boolean output.

We will abuse notation to identify $\mathrm{C}_{=} \mathcal{K}$ with the class of circuits recognizing the class of represented languages.

[^1]Definition 4. Let $\mathcal{K}_{1}, \mathcal{K}_{2}$ be classes of circuits. Then $\mathcal{K}_{1} \circ \mathcal{K}_{2}$ is the class consisting of circuits with a circuit from $\mathcal{K}_{1}$ at the top, all of whose inputs are circuits from $\mathcal{K}_{2}$. We say that the class $\mathcal{K}_{2}$ is closed under composition with $\mathcal{K}_{1}$ if $\mathcal{K}_{2} \circ \mathcal{K}_{1} \subseteq \mathcal{K}_{2}$.

Definition 5. (Cook-Wilson for Arithmetic Classes) Let $\mathcal{K}$ be an arithmetic complexity class of polynomial formal degree. A special case when $\mathcal{K}$ is composed with itself $O(\log n)$ many times is denoted by,

$$
\mathcal{K}^{(l)}=\overbrace{\mathcal{K} \circ \mathcal{K} \ldots \circ \mathcal{K}}^{O(\log n) \text { times }}
$$

where along any path in the $\mathcal{K}^{(l)}$ circuit, the product of the degrees of $\mathcal{K}$ circuits is bounded by a polynomial in the number of input variables to the $\mathcal{K}^{(l)}$ circuit.

Throughout the paper, we use a very specific type of $A C^{0}$ circuit, namely, an $A C_{3}^{0}$ circuit which is essentially a boolean circuit of depth 3 consisting of an $\vee$ gate at the root, followed by a layer of $\wedge$ gates, which are fed by $\mathrm{C}_{=} \mathcal{K}$ or $\operatorname{coC}_{=} \mathcal{K}$ oracle gates. We will refer to such circuits as $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuits.

Definition 6 (Toda[24]). A gate in a circuit is said to be weakly skew if for any multiplication gate $\alpha$ with children $\beta$ and $\gamma$,one of the two sub-circuits $C_{\beta}$ or $C_{\gamma}$ is only connected to the rest of the circuit by the wire going to $\alpha$. A weakly skew circuit is one where all the multiplication gates are weakly skew.

A simple consequence of this definition is that every formula is a weakly skew circuit.

## 3 Exact Arithmetic Hierarchies

Definition 7. Some examples of natural arithmetic circuit classes are as follows:

- $\mathcal{K}_{\text {form }}$ are formulas of polynomial size
- $\mathcal{K}_{w s k e w}$ are weakly skew circuits of polynomial size
- $\mathcal{K}_{\text {poly }}$ are circuits of polynomial degree and polynomial size
- $\mathcal{K}_{\text {exp }}$ are circuits of polynomial degree and exponential $\left(=2^{n^{O(1)}}\right)$ size

Throughout this paper, we will be interested in uniform versions of the classes above. Unless mentioned otherwise, all the arithmetic circuit families we mention are DLOGTIME-uniform.

Proposition 1. We prove the following for the aforementioned circuit classes:

1. $\mathrm{C}_{=} \mathcal{K}_{\text {form }}=\mathrm{C}_{=} \mathrm{NC}^{1}$
2. $\mathrm{C}_{=} \mathcal{K}_{\text {wskew }}=\mathrm{C}=\mathrm{L}$
3. $\mathrm{C}_{=} \mathcal{K}_{\text {poly }}=\mathrm{C}_{=} \mathrm{SAC}^{1}$
4. $\mathrm{C}_{=} \mathcal{K}_{\text {exp }}=\mathrm{C}_{=} \mathrm{P}$

Proof. Some of the above follow from well-known results:

1. Follows from the simulation of formulas by arithmetic straight-line programs due to Ben-Or and Cleve [5].
2. Follows from Toda's characterization of determinant using weakly skew circuits[24].
3. Follows from Venkateswaran's uniform circuit characterization of $S A C^{1}$ as consisting of languages recognized by circuit families having polynomial degree and polynomial size[27,26].
4. Follows from Venkateswaran's uniform circuit characterization of NP as consisting of languages recognized by circuit families having polynomial degree and exponential size.[27]

Notice that skew circuits which yield the usual definition of GapL are not closed under composition with formulas (because formulas are not necessarily skew circuits). Thus we have to use the alternative(and equivalent) definition of GapL in terms of weakly skew circuits and in this case all formulas are indeed weakly skew. One of the key requirements in our collapse is the ability to compose two different circuit families.
Definition 8. Given an arithmetic class $\mathcal{K}$, consider the following:
(i) $\mathcal{K}_{\text {form }} \subseteq \mathcal{K}$
(ii) $\mathcal{K} \subseteq \mathcal{K}_{\text {exp }}$.
(iii) $\mathcal{K} \circ \mathcal{K} \subseteq \mathcal{K}$
(iv) $\mathcal{K}^{(l)} \subseteq \mathcal{K}$.

We call $\mathcal{K}, \mathrm{AC}^{0}$-well-behaved if it satisfies $(i),($ ii $),($ (iii $)$ and $\mathrm{NC}^{1}$-well-behaved if it satisfies $(i),(i i),(i v)$.
It is easy to see that every $N C^{1}$-well-behaved class of circuits $\mathcal{K}$ is also $A C^{0}$-wellbehaved. Theorem 1 below already entails a collapse of $A C^{0}\left(C_{=} \mathcal{K}\right)$ by the observation that $N C^{1}$-well-behaved implies $A C^{0}$-well-behaved. We show that if one is interested in the collapse of $A C^{0}\left(C_{=} \mathcal{K}\right)$ then it is sufficient for $\mathcal{K}$ to be $A C^{0}$-well-behaved, which is a weaker notion.

Our main goal in this paper is to study the $A C^{0}$ and $N C^{1}$ reducibilities to $C_{=} \mathcal{K}$. $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)\left(\right.$ respectively $\left.\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)\right)$ is the class consisting of languages which can be decided by constant depth(respectively $O(\log n)$-depth), polynomial size circuits consisting of $\wedge, \vee, \neg$ and $C_{=} \mathcal{K}$ gates. Our main theorem is the following:

Theorem 1. For every $\mathrm{NC}^{1}$-well-behaved class $\mathcal{K}$ the exact hierarchy $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$ collapses to $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$.

Proof. First we recall some terminology from [8]. Let $C$ be a circuit from $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$. Then a blob consists of the subcircuit rooted at some equality gate $g$ in the circuit where the equality gates in the sub-circuit are replaced by new formal variables. In other words it consists of a single oracle gate. A blob chain consists of a root to leaf path in the $\mathrm{NC}^{1}$ circuit and includes all the equality gates (aka oracle gates) along the path.

We need a generalization of a lemma from [8] which shows that a circuit of two blob layers i.e. a blob with its input blobs can be replaced by a shallow Boolean circuit $\left(\mathrm{AC}_{3}^{0}\right)$ with a single layer of blobs below it. In this "flattening" process the degree and the size of the circuit increase by an extent we explicate in the lemma.

In [8] this lemma was proved in the context of $\mathrm{C}_{=} N C^{1}$ but can be generalized to any $C_{=} \mathcal{K}$ since the only property of GapNC ${ }^{1}$ used in its proof was that it was a class of arithmetic circuits closed under composition with logarithmic depth polynomial size formulas; and this is true for all well-behaved $\mathcal{K}$ by definition.

Lemma 1. (Rephrased from [8]) Let $C_{0}(\boldsymbol{x}), C_{1}(\boldsymbol{y}), \ldots, C_{m}(\boldsymbol{y})$ be $m+1$ circuits from $\mathrm{C}_{=} \mathcal{K}$. Let $f: \boldsymbol{y} \mapsto C_{0}(\boldsymbol{C}(\boldsymbol{y}))$ be the function computed by feeding in the circuits $C_{i}$ for $i>0$ as inputs of $C_{0}$. Then, $f$ is also computed by the following $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit:

$$
\bigvee_{i=1}^{m} z_{i}(\boldsymbol{y})=\bigvee_{i=1}^{m}\left[S_{i, 0}\left(\boldsymbol{C}^{\prime}(\boldsymbol{y})\right) \wedge S_{i, 1}\left(\boldsymbol{C}^{\prime}(\boldsymbol{y})\right) \wedge B_{i}\left(\boldsymbol{C}^{\prime}(\boldsymbol{y})\right)\right]
$$

Notice that, we have used abbreviations, $\boldsymbol{x}=\left(x_{1}, \ldots, x_{m}\right), \boldsymbol{y}=\left(y_{1}, \ldots, y_{n}\right)$, and similarly $\boldsymbol{C}, \boldsymbol{C}^{\prime}$ to simplify notation. More importantly, $C_{i}^{\prime}$ are functions in $\mathcal{K}, S_{i, 0}$ is a $\mathrm{C}_{=} \mathcal{K}$ circuit, $S_{i, 1}$ is a $\mathrm{C}_{\neq} \mathcal{K}$ circuit, and $B_{i}$ from $\mathrm{C}_{\neq} \mathcal{K}$. Further, the size and the degree of the entire circuit is bounded by $O\left(m s_{0}\right)+O\left(m^{8} s\right)$ and $O\left(m^{9} d+d_{0} d\right)$ respectively. Here $s_{0}$ is the size of $C_{0}$, $s$ is the sum of sizes of the $C_{i}$ 's (for $\left.i>0\right), d_{0}$ is the degree of $C_{0}$ and $d$ is the sum of degrees of $C_{i}$ 's for $i>0$.
Emulating [8] we can use Lemma 1 to collapse two levels of blobs into one with an $\mathrm{AC}_{3}^{0}$ circuit at the top. The $\mathrm{AC}_{3}^{0}$ circuit can be converted to an element of $\mathcal{K}_{\text {form }}$ (the proof of the lemma shows that the $\mathrm{AC}_{3}^{0}$ circuit is an unambiguous formula and straightaway arithmetization will preserve the value). Assuming we start with a $\mathrm{AC}_{k}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit, then the closure of $\mathcal{K}$ under composition with $\mathcal{K}_{\text {form }}$ allows us to convert the circuit to the form $\mathrm{AC}_{k-1}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ i.e. with depth one lesser than earlier. Since we need to repeat this operation $k$ times the size remains bounded by $n^{O(k)}$ times the size of the original circuit and the degree by the degree of the original circuit raised to a power which is $O(k)$. Notice that this seems to work only for $\mathrm{AC}^{0}$ relativizations because the naïve upper bounds on the size and degree become quasipolynomial when we repeat the collapse logarithmically many times as necessitated by $\mathrm{NC}^{1}$ relativizations. We show in Lemma 3 that this not the case.

Proof. (of Lemma 1)
Our proof is a quantitative version of the one used in [8]. There, given an $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit, one applies Lemma $1 O(k)$ times, where $k$ is the depth of the $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit, to get the required $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit.

Notice that the $\mathrm{AC}_{3}^{0}$ circuit defined by:
$\bigvee_{i=1}^{m} z_{i}(\boldsymbol{y})=\bigvee_{i=1}^{m}\left[S_{i, 0}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right) \wedge S_{i, 1}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right) \wedge B_{i}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right)\right]$
implements the predicate that asserts that there is a subset of circuits from $C_{1}(\boldsymbol{y}), \ldots C_{m}(\boldsymbol{y})$ that all evaluate to 1 on $\boldsymbol{y}$. To this end, one implements the following three clauses $S_{i, 0}$, $S_{i, 1}$ and $B_{i}$ which evaluate to 1 simultaneously if and only if there is a subset of circuits $\left\{C_{j}(\boldsymbol{y})\right\}$ of size $i$, for $i, j \in[m]$ such that on input $\boldsymbol{y}$ all these $i$ circuits evaluate to a 1 and the circuit $B_{i}\left(C_{1}(\boldsymbol{y}), \ldots, C_{m}(\boldsymbol{y})\right)$ also evaluates to a 1 . We use Tzamaret's construction[25] to implement the functions $S_{i, 0}$ and $S_{i, 1}$ :

Let $Y=\left\{C_{1}^{\prime}(\boldsymbol{y}), C_{2}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right\}$. Consider the polynomial $R_{i}(Y)=\sum_{S \subseteq Y ;|S|=i} \prod_{j \in S} C_{j}^{\prime}$ where $C_{j}^{\prime}$ come from the set $Y$ consisting of $m$ variables (We assume $R_{0}(Y)=1$ ). Notice that the product of polynomials, $\prod_{i=1}^{l}\left(x_{i}+z\right)=\sum_{j=0}^{m} R_{i}(Y) z^{j}$. We can recover the polynomials $R_{i}(Y)$, given values to the variables in $Y$ by evaluating the polynomials in $z$ at $(m+1)$ distinct values of $z$, via interpolation.

$$
V \times\left[\begin{array}{c}
R_{0}(Y) \\
R_{1}(Y) \\
\vdots \\
R_{m}(Y)
\end{array}\right]=\left[\begin{array}{c}
\prod_{i=1}^{m} C_{i}^{\prime}(\boldsymbol{y}) \\
\prod_{i=1}^{m}\left(C_{i}^{\prime}(\boldsymbol{y})+1\right) \\
\vdots \\
\prod_{i=1}^{m}\left(C_{i}^{\prime}(\boldsymbol{y})+m\right)
\end{array}\right]
$$

where $V$ is a $(m+1) \times(m+1)$ Vandermonde matrix(and hence invertible) where for $0 \leq i, j \leq m, V_{i j}=i^{j}$. Note that $V$ can be precomputed, since it is a constant. Now we can compute the symmetric polynomial $R_{i}$ as follows:

$$
R_{i}\left(C_{1}^{\prime}, \ldots, C_{m}^{\prime}\right)=\sum_{j=0}^{m} V_{i j}^{-1} \prod_{k=1}^{m}\left(C_{k}^{\prime}(\boldsymbol{y})+j\right)
$$

Notice that, here $S_{i, 0}\left(\right.$ respectively $\left.S_{i, 1}\right)$ are $C_{=} \mathcal{K}\left(\right.$ respectively $\left.C_{\neq} \mathcal{K}\right)$ functions respectively where the underlying $\mathcal{K}$ functions are respectively $R_{i}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right)$ and $R_{i+1}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right)$. The third clause $B_{i}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right.$ is actually emulating our original circuit whereas the other two clauses are essentially used to eliminate the $=$ tests. Also, the $\mathrm{AC}_{3}^{0}$ circuit is actually a formula which makes the $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit unambiguous and hence one can use the fact that $\mathcal{K}_{\text {form }} \subseteq \mathcal{K}$ to infer that $\mathcal{K} \circ \mathcal{K}_{\text {form }} \subseteq \mathcal{K}$. We further claim that the predicate $\bigvee_{i=1}^{m} z_{i}(\boldsymbol{y})$ above is true for precisely one value of $i \in[m]$.
Claim. For all $\boldsymbol{y} \in\{0,1\}^{n}$,

$$
C_{0}\left(C_{1}(\boldsymbol{y}), C_{2}(\boldsymbol{y}), \ldots, C_{m}(\boldsymbol{y})\right) \neq 0 \Longleftrightarrow \exists i \in[m]: z_{i}(\boldsymbol{y})
$$

Proof. For a given $\boldsymbol{y}$, let $r_{y}$ denote the number of circuits $C_{i}^{\prime}$ that evaluate to a non-zero value at $\boldsymbol{y}$. Then,

1. $S_{i, 0}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right)$ is 0 when $i>r_{y}$ because the symmetric polynomial on $m$ variables, where each monomial has exactly $i$ variables, is always 0 when $i>r_{y}$, since at least one of the variables in the product is 0 .
2. $S_{i, 1}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right)$ is 0 exactly when $i<r_{\boldsymbol{y}}$, because of the same reason as above - the monomials are made up from only the non-zero variables, and hence the polynomial is always non-zero.
3. Hence, $z_{i}(\boldsymbol{y})$ is false whenever $i \neq r_{\boldsymbol{y}}$.

So, there is exactly one choice of $i$, equal to the number of $C_{i} \mathrm{~s}$ which are non-zero, where the first two clauses of the predicate evaluate to true for all strings $y$. What happens to the third clause $B_{i}$ at this particular $i$ ? One wants this clause to evaluate to a 0 whenever $C_{0}\left(C_{1}(\boldsymbol{y}), C_{2}(\boldsymbol{y}), \ldots, C_{m}(\boldsymbol{y})\right)$ is 0 and otherwise evaluate to 1 .

Let $G(\boldsymbol{y})=\prod_{C_{j}^{\prime}(\boldsymbol{y}) \neq 0} C_{j}^{\prime}(\boldsymbol{y})$. If $\forall i \in[m], C_{i}^{\prime}(\boldsymbol{y})=0$, define $G(\boldsymbol{y})=1$. Hence, $G(\boldsymbol{y})$ is always non-zero by our definition. Plug in $G(\boldsymbol{y})$ instead of 1 and $-G(\boldsymbol{y})$ instead of -1 in $C_{0}, \ldots, C_{m}$, and weigh each of the $C_{i}$ s by $G(\boldsymbol{y})$ to eliminate the $=$ gate. This results in a circuit of the form $C_{0}\left(C_{1}^{\prime}, \ldots, C_{m}^{\prime}\right)$ where $C_{1}^{\prime}, \ldots, C_{m}^{\prime}$ are functions from the class $\mathcal{K}$. Since the initial $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$ was of polynomial degree, the predicate implemented blows up the output of the circuit by a number which is at most polynomially many bits long. Since $B_{i}\left(C_{1}^{\prime}(\boldsymbol{y}), \ldots, C_{m}^{\prime}(\boldsymbol{y})\right)$ is exactly the function $C_{0}\left(C_{1}(\boldsymbol{y}), C_{2}(\boldsymbol{y}), \ldots, C_{m}(\boldsymbol{y})\right)$, the claim follows.

One is left with the task of constructing functions in class $\mathcal{K}$ which can implement the functions $S_{i, 0}$ and $S_{i, 1}$.

Claim. For all $i \in[m]$, there exist $\mathcal{K}$ functions $N_{i}$ and $D_{i}$ such that, $S_{i, 0}\left(C_{1}(\boldsymbol{y})\right.$, $\left.C_{2}(\boldsymbol{y}), \ldots, C_{m}(\boldsymbol{y})\right)=N_{m}^{i}(\boldsymbol{y}) / D_{m}^{i}(\boldsymbol{y})$.

Proof. The symmetric functions are constructed using the expression:

$$
R_{i}\left(C_{1}^{\prime}, \ldots, C_{m}^{\prime}\right)=\sum_{j=0}^{m} V_{i j}^{-1} \prod_{k=1}^{m}\left(C_{k}^{\prime}(\boldsymbol{y})+j\right)
$$

The implementation of $S_{i, 0}$ and $S_{i, 1}$ via Tzamaret's technique requires constructing and inverting a $(m \times m)$-Vandermonde matrix, which requires size $O\left(m^{4}\right)$ from the analysis in Section 2. Since $V$ is an integer matrix, each of the entries of $V^{-1}$ can be written as ratio of integers(the denominator being $\operatorname{det}(V)=\prod_{i<j}(j-i)$ ). So, to compute $R_{i}\left(C_{1}^{\prime}, \ldots, C_{m}^{\prime}\right)$, we have $D_{i}(\boldsymbol{y})=\operatorname{det}(V)$ and $N_{i}(\boldsymbol{y})=\sum_{j=0}^{m} V_{i j}^{-1} \prod_{i=1}^{m}\left(C_{i}(\boldsymbol{y})+\right.$ $j$ ). We can multiply the whole equation by $\operatorname{det}(V)$ and this is equivalent to computing $\operatorname{det}(V) N_{i}(\boldsymbol{y})=\sum_{j=0}^{m} \operatorname{det}(V) V_{i j}^{-1} \prod_{i=1}^{m}\left(C_{i}(\boldsymbol{y})+j\right)$.

Naïvely, it would seem like the determinants of the Vandermonde matrices get multiplied, at each level, leading to an explosion in the degree. But, notice that by Fact 1, at every level, the denomianator will be just $\max _{i \in[m]} \operatorname{det}\left(V_{i}\right)$, where $V_{i}$, the Vandermonde matrix of dimension $(i \times i)$, is the denominator that percolates up as a result of previous collapses. This will remove the need to carry around the symmetric function as a numerator-denominator pair. Since we replace 1 with $G(\boldsymbol{y})$ and -1 with $-G(\boldsymbol{y})$ in our circuit $B_{i}$, in order to bound the size and the degree, one has to make sure that the size and degree of the expression $\operatorname{det}(V) N_{i}(\boldsymbol{y})$ is bounded by a polynomial in the number of input variables. The size of the circuit resulting from the collapse is $O\left(m s_{0}+m^{8} s\right)$ : $S_{i, 0}$ and $S_{i, 1}$ contribute $O\left(m^{8} s\right)$ each for every value of $i$, and considering the fact that the boolean OR is over $m$ values of $i$, their total contribution to the size is $O\left(m^{9} s\right)$. The clause $B_{i}$ contributes $O\left(m s_{0}+m s\right)$ : the circuit at the top, $C_{0}$ contributes $s_{0}$ for each AND gate(totally $O\left(m s_{0}\right)$, and its inputs are of size $s$, adding up to $O(m s)$ size. Hence the total size of the circuit resulting from a single collapse is $O\left(m^{8} s+m s_{0}\right)$. In case of the degree, one really has to bound $\max _{i \in[m]}$ degree $\left[z_{i}(\boldsymbol{y})\right]$. The degree of the circuit is at most $O\left(m^{8} \log ^{2} m d\right)$ (this is in order to account for $V^{-1}$ which is written as a ratio of determinant of cofactor matrices and $\operatorname{det}(V)$ ). Hence the degree of the whole circuit obtained as a result of a single collapse is at most (degree of $S_{i, 0}+$ degree of $S_{i, 1}+$
degree of $\left.B_{i}\right) \leq O\left(m^{8} \log ^{2} m d\right)+O\left(m^{8} \log ^{2} m d\right)+O\left(d_{0} d\right) \leq O\left(m^{8} \log ^{2} m d+d_{0} d\right)$ , where $d_{0}$ is the degree of $C_{0}$ and $d$ is the sum of degrees of $C_{i} \mathrm{~s}$. Thus we get the size and degree bounded as claimed in Lemma 1.

Now we show how to do a careful analysis of the collapse in case of $\mathrm{NC}^{1}$ relativizations which allows us to conclude that the size and degree do not become too large. The key is to use basic counting along with Cook-Wilson condition on $N C^{1}$-relativization which allows us to conclude the following:

Proposition 2. Consider an $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit $C$ with size, height and number of inputs $s(C), h(C)$ and $n(C)$ respectively. Then,

- the number of distinct maximal blob chains is upper bounded by $s(C)$.
- the product of the fan-ins of the oracle gates along a blob-chain is upper bounded by $(n(C))^{c_{1}}$ for some constant $c_{1}$ depending on the circuit family; (a consequence of the Cook-Wilson property)

Next we bound the total size of the flattened circuit. Lemma 2 shows that, for the $\mathcal{K}^{(l)}$ circuits obtained as a result of the collapse, indeed $\mathcal{K}^{(l)}=\mathcal{K}$.

Lemma 2. The size of the flattened circuit is upper bounded by: $n^{c} s^{2}$, where $s$ is the sum of sizes of the circuits corresponding to oracle gates and c is a constant depending on the family of circuits (and $n$ is the number of inputs to the original circuit).

Proof. We use the recurrence:

$$
\begin{align*}
S(g) & =3 f_{g}^{9} \sum_{h: \operatorname{parent}(h)=g} S(h)+s_{g} f_{g}  \tag{1}\\
& \leq 3 f_{g}^{9} \sum_{h: \operatorname{parent}(h)=g} S(h)+s_{g}^{2}\left(\text { Since } f_{g} \leq s_{g}\right) \tag{2}
\end{align*}
$$

Here $S(g)$ is the total size of the sub-circuit below gate $g$ after flattening, $s_{g}$ is the original size of the $\mathcal{K}$-circuit for just the gate $g$ and $f_{g}$ is the fan-in of $g$. Let us further denote by $h_{g}$ the height of $g$. Then the recurrence has the solution:

$$
S(g)=3^{h_{g}} \sum_{g^{\prime}: g^{\prime} \text { is a descendant of } g}\left(\left(\prod_{h \in B\left(g, g^{\prime}\right)} f_{h}^{9}\right) s_{g^{\prime}}^{2}\right)
$$

where $B\left(g, g^{\prime}\right)$ denotes the blob-chain between $g$ and $g^{\prime}$.
Let $g_{0}$ be the root of the circuit. From Proposition 2, the products of the fan-ins in the sum above is upper bounded by $n^{c_{1}}$, where $n=n(C)$ is the number of inputs of the circuit. Also, $h_{g_{0}}$ is bounded by $c_{2}^{\prime} \log n$ for some constant $c_{2}^{\prime}$, since the reduction is an $\mathrm{NC}^{1}$ reduction. Hence, $S\left(g_{0}\right) \leq n^{c} \sum_{g} s_{g}{ }^{2}$.

Finally we bound the degree of the flattened circuit, which is crucial to proving our collapse:

Lemma 3. The degree of the flattened circuit is bounded by $n^{c^{\prime}}$ where $c^{\prime}$ is a constant depending on the original circuit family.
Proof. The degree of a well-behaved circuit on $t$ inputs is at most $t^{c_{1}^{\prime}}$ for some constant $c_{1}^{\prime}$ depending on the circuit family (since a circuit family is a subclass of $\mathcal{K}_{\text {exp }}$ circuits which have a polynomial degree. Now, we have the recurrence:

$$
D(g)=f_{g}^{9} d_{g} \sum_{h: \operatorname{parent}(h)=g} D(h)
$$

where we define $D(g)$ to be the degree of the flattened circuit at $g$ and $d_{g}$ its original degree as a circuit in $\mathcal{K}$. This has the solution:

$$
D(g)=\sum_{B: B \text { is a maximal blob-chain rooted at } g}\left(\prod_{h: h \in B}\left(f_{h}^{9} f_{h}^{c_{1}^{\prime}}\right)\right)
$$

Thus, $D\left(g_{0}\right) \leq s(C) n^{\left(c_{1}^{\prime}+1\right) c_{1}} \leq n^{c^{\prime}}$, using the bounds from Proposition 2 for the first inequality and observing that the fan-in of any oracle gate and the size of the $N C^{1}$ circuit is bounded by $n$.

Hence we have proved that for each equality gate in the final circuit, the degree remains polynomial via composition with formulas which means the circuit rooted at the equality gate is in $\mathrm{C}_{=} \mathcal{K}$, and the overall circuit in $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$. This completes the proof of the theorem.

We now document some consequences of Theorem 1:
Corollary 1. $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{NC}^{1}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{NC}^{1}\right)$
Proof. Formulas are $\mathrm{AC}^{0}$-well-behaved since they are clearly closed under composition with formulas and have polynomial degree. The fact that they are $N C^{1}$-well-behaved follows from Definition 5, namely the arithmetic Cook-Wilson property.
Corollary 2. $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{L}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{L}\right)$
Proof. Weakly skew circuits are closed under composition with $\mathcal{K}_{\text {form }}$ since $\mathcal{K}_{\text {form }}$ circuits are weakly skew(by [24]). Also, the polynomials computable by weakly skew circuits of polynomial size are also computable by determinants of polynomial sized matrices by [24]. Hence the former have polynomial degrees. Thus weakly skew circuits are $A C^{0}$-well-behaved. To prove that weakly skew circuits are $N C^{1}$-well-behaved, it is sufficient to note that, weakly skew circuits composed with themselves $O(\log n)$-many times, remain weakly skew due to the arithmetic Cook-Wilson in Definition 5.

Corollary 3. $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{SAC}^{1}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{SAC}^{1}\right)$
Proof. Formulas of polynomial size have polynomial degree, thus polynomial degree circuits of polynomial size are closed under composition with them completing the proof that they are $\mathrm{AC}^{0}$-well-behaved. Similarly, polynomial degree circuits composed with themselves $O(\log n)$-many times still have polynomial formal degree, and hence are $\mathrm{NC}^{1}$-well-behaved.

Corollary 4. $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathrm{P}\right)=\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathrm{P}\right)$
Proof. This also follows exactly from the same reasoning as the one for $\mathrm{C}_{=} \mathrm{SAC}^{1}$ circuits.

## 4 Uniformity

We have proved that $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$ collapses to $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$. In this section we show that such an $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ family is DLOGTIME-uniform, when the original $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit family is DLOGTIME-uniform.

Following [29], we call our $\mathrm{AC}_{3}^{0}$ circuit uniform, if the following language can be decided in $O(\log n)$ time by a deterministic turing machine (random access to its tape) $: L_{D C}=\{\langle y, g, p, b\rangle\}$, where

- $|y|=n$
- $g$ is the number of a gate $v$ in $C_{n}$
- $p \in\{0,1\}^{*}$ such that if $p=\epsilon$, then $b$ encodes the type of the gate from the basis of the circuit family and if $p$ is the binary representation of $k$, then $b$ is the number of the $k$-th predecessor gate to $v$ wrt to a fixed ordering on the gates in the circuit.
We encode every gate $g$ in the given circuit $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right) C$ by a concatenation of two labels $\left(g_{e}, g_{i}\right) \cdot g_{e}$ is an external encoding which assigns labels lexicographically to every $\mathrm{C}_{=} \mathcal{K}$ gate in $C . g_{i}$ is an internal encoding which assigns labels to the,$+ \times,=$ gates which constitute the $\mathrm{C}=\mathcal{K}$ gates. More precisely, given $\langle y, g, p, b\rangle$, we have $y=1^{n}$, $g=\left(g_{e}, g_{i}\right) . g_{e}$ is a $O(\log n)$ length string (since the underlying circuit is a $\mathrm{NC}^{1}$ circuit the total number of bits needed to uniquely address a gate in the circuit is $O(\log n)+\max _{\pi} \sum_{g \in \pi} \log \operatorname{fanin}(g)=O(\log n)$, where the maximum is taken over all paths $\pi$ in the circuit) by the definition of oracle circuits in the Cook-Wilson model). $g_{i}$ is of length $O(\log n)$ (since each of the $\mathrm{C}_{=} \mathcal{K}$ gates is poly-sized) ${ }^{3}$.

We will analyze a typical scenario in our collapse - Given a circuit of the form $C=C_{1} \circ C_{2} \circ C_{3}$ (see Figure 1A), where $C_{1}, C_{2}$ and $C_{3}$ are $\mathrm{C}_{=} \mathcal{K}$ circuits, we collapse $C_{2}$ and $C_{3}$ to obtain a circuit $C^{\prime}=C_{1} \circ C_{2}^{\prime} \circ C_{3}^{\prime}$ (see Figure 1B). Here $C_{2}^{\prime}$ is a circuit of the form $\vee \circ \wedge$ and $C_{3}^{\prime}$ is a layer of $\mathrm{C}_{=} \mathcal{K}$ circuits. We prove that the collapse of $C$ to $C^{\prime}$ can be made DLOGTIME-uniform and then show that under the Cook-Wilson property, such collapses can be combined to make the collapse of $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$, DLOGTIMEuniform.

Claim. For every $\mathrm{NC}^{1}$-well-behaved class $\mathcal{K}$, DLOGTIME-uniform $\mathrm{C}_{=} \mathcal{K} \circ \mathrm{C}_{=} \mathcal{K} \circ$ $\mathrm{C}_{=} \mathcal{K}=$ DLOGTIME-uniform $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$.

Proof. Assuming the DLOGTIME-uniformity and the aforementioned labelling scheme of $C$, we will now see how to decide the connection language of $C^{\prime}$ when we have access to the connection language of $C$. First we explain how we label $C^{\prime}$ from $C$ : The

[^2]

Fig. 1: (A) A circuit of the form $C_{1} \circ C_{2} \circ C_{3}=\mathrm{C}_{=} \mathcal{K} \circ \mathrm{C}_{=} \mathcal{K} \circ \mathrm{C}_{=} \mathcal{K}$ (B) The collapsed circuit obtained from the circuit $C_{1} \circ C_{2} \circ C_{3}$ above of the form $C_{1} \circ C_{2}^{\prime} \circ C_{3}^{\prime}=$ $\mathrm{C}_{=} \mathcal{K} \circ \vee \circ \wedge \circ C_{=} \mathcal{K}$.
labels of $C_{1}$ carry over from $C$ to $C^{\prime}$. For the $\vee \circ \wedge$ circuit $C_{2}^{\prime}$ we give new set of external and internal labels similar to $C$. This takes $O(\log n)$ many bits. There are two kinds of circuits in $C_{3}^{\prime}$ : circuits computing the symmetric functions, which are built via interpolation by symmetric polynomials, using the Vandermonde matrix and circuits of the form $C_{=} \mathcal{K} \circ \mathcal{K}$ (see Lemma 1). From Fact 1, it is clear that the entries of this matrix can be computed by circuits of polynomial size, which are labelled using $O(\log n)$ bits. The circuit of the form $\mathrm{C}_{=} \mathcal{K} \circ \mathcal{K}$ in $C_{3}^{\prime}$ is essentially $C_{2}$ to which the inputs are symmetric functions of $C_{3}$. Even here we assign a concatenation of external and internal encoding of the circuits from $C_{2}$ and $C_{3}$ respectively, which are $O(\log n)$ bits long.

In each of $C_{1}, C_{2}^{\prime}, C_{3}^{\prime}$, we identify three kinds of gates which we call $\alpha_{0}, \alpha_{\text {con }}, \alpha_{\text {mid }}$. $\alpha_{0}$ gates are either the output gate or the input gates of $C^{\prime} . \alpha_{\text {con }}$ are gates which connect $C_{i}$ to $C_{i+1}$ for $i \in\{1,2\}$. For example, the root gate of $C_{1}$, the $\vee$ gate in $C_{2}^{\prime}$ and the $=$ gate in $C_{3}^{\prime}$ are $\alpha_{\text {con }}$ type gates - their parents and children are not of type $\alpha_{\text {con }} . \alpha_{\text {mid }}$ gates are those whose parents and children are both $\alpha_{\text {mid }}$ or $\alpha_{\text {con }}$. The internal gates of $C_{1}$ and $C_{3}$ are gates of type $\alpha_{m i d}$. Note that all the gates in $C^{\prime}$ belong to one of these three categories.

So now we are ready to describe the label of a gate in $C^{\prime}$. It is of the form $\langle y, g, p, b\rangle$ as above where $g$ is now the concatenation of the external labels of the path to the nearest $=$ ancestor to the gate and the internal label of the gate in $C$. For example, the label of a gate in $C_{3}^{\prime}$ would consist of the concatenation of external labels of $C_{1}, C_{2}^{\prime}$, and the internal label of the gate in $C_{3}^{\prime}$. If $p$ is $\epsilon$, then $b$ is the type of the gate, which could be $\wedge, \vee, C_{=} \mathcal{K}$. Else the binary number encoded by the string $p$ points to the position of the gate according to our fixed ordering. Note that at the end of one collapse, the label lengths increase by atmost $O(\log n)$ bits(namely the bits required to internally label the $\vee \circ \wedge$ system arising out of the collapse. With the above labelling convention, now there are two tasks to be accomplished:

1. Given a label of a gate $g$ in $C^{\prime}$, we have to verify if it is indeed a valid label.
2. Given labels of gates $g, h$ is $g$ a parent/child of $h$ ?

The validity of a label is easily checked: One has to check if the concatenation of labels leads to a valid external and internal encoding by querying the DLOGTIME machine for the original circuit $C$ and check if it is a $\alpha_{0}, \alpha_{m i d}$ or $\alpha_{\text {con }}$ gate. For example, given two gates, deciding if one is the parent/child of the other can be done by checking if the label of one of them is a prefix of the other, and if yes, verifying if the gate types of the parent and child is valid according to the relation between $\alpha$ s specified above.

Now we prove that such a labelling convention leads to easily checking the validity of a label and connections in the collapse from Theorem 1.

Claim. For every $\mathrm{NC}^{1}$-well-behaved class $\mathcal{K}$, DLOGTIME-uniform $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)=$ DLOGTIMEuniform $\mathrm{AC}_{3}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$.

Proof. To label the final $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit, we concatenate the labels of the intermediate collapses. This might seem like it requires labels of length $O\left(\log ^{2} n\right)$ since the circuit we started out with had labels of length $O(\log n)$ and the $\mathrm{NC}^{1}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit requires $O(\log n)$ collapses to reduce it to an $\mathrm{AC}^{0}\left(\mathrm{C}_{=} \mathcal{K}\right)$ circuit. But just as in the proof of Theorem 1, the Cook-Wilson property ensures that the length of the labels is $O(\log n)$. This is because, even though we do the collapse as many as $O(\log n)$ times, the increase in label lengths is only due to the $\vee \circ \wedge$ block. Recall that the number of $\wedge$ gates created in 1 is exactly equal to the fan-in of the oracle gate above, and the Cook-Wilson relativization model bounds the product of fan-ins along any path to a polynomial in the number of inputs. This also bounds the number of new gates created, and hence they can be labelled by $O(\log n)$-many bits.

## 5 Conclusion and Open Problems

We provide sufficient conditions on arithmetic circuits under which the NC ${ }^{1}$ hierarchy over the corresponding exact arithmetic class collapses. Natural extensions can include proving similar results for more powerful reducibilities like Boolean Formula reductions or even Logarithmic Boolean Formula reductions (which do not satisfy the "small-blob-chains" property or equivalently do not follow Cook-Wilson relativization). An appropriately defined notion of $S A C^{1}$-reductions is another intriguing possibility.

It may also be an interesting idea to give similar uniform proofs for the $\mathrm{NC}^{1}$ hierarchies over the better known classes $\mathrm{PNC}^{1}, \mathrm{PL}, \mathrm{PSAC}^{1}, \mathrm{PP}$ where we already know collapses for the second[17] and the last [4] hierarchies and also of the $A C^{0}\left(P N C^{1}\right)$ hierarchy [8].

## Acknowledgements

We thank Eric Allender, Vikraman Arvind, Sourav Chakraborty, Meena Mahajan, Prajakta Nimbhorkar and B.V.Raghavendra Rao for helpful discussions. The second author would like to thank the organisers of the Dagstuhl seminar on "Algebraic and Combinatorial Methods in Computational Complexity" where many of the previously mentioned discussions took place. We thank the anonymous referees for helpful comments.

## References

1. Allender, E., Beals, R., Ogihara, M.: The complexity of matrix rank and feasible systems of linear equations. Computational Complexity 8(2), 99-126 (1999)
2. Allender, E., Jiao, J., Mahajan, M., Vinay, V.: Non-commutative arithmetic circuits: depth reduction and size lower bounds. Theoretical Computer Science 209(1), 47-86 (1998)
3. Beigel, R., Reingold, N., Spielman, D.A.: PP is closed under intersection. Journal of Computer and System Sciences 50(2), 191-202 (1995)
4. Beigel, R., Fu, B.: Circuits over pp and pl. J. Comput. Syst. Sci. 60(2), 422-441 (2000)
5. Ben-Or, M., Cleve, R.: Computing algebraic formulas using a constant number of registers. SIAM Journal on Computing 21, 54-58 (1992)
6. Caussinus, H., McKenzie, P., Thérien, D., Vollmer, H.: Nondeterministic NC ${ }^{1}$ computation. Journal of Computer and System Sciences 57, 200-212 (1998), preliminary version in Proceedings of the 11th IEEE Conference on Computational Complexity, 1996, 12-21
7. Cook, S.: A taxonomy of problems with fast parallel algorithms. Information and Control 64, 2-22 (1985)
8. Datta, S., Mahajan, M., Rao, B.V.R., Thomas, M., Vollmer, H.: Counting classes and the fine structure between $\mathrm{NC}^{1}$ and L . In: Proceedings of the 35 th international conference on Mathematical foundations of computer science. pp. 306-317. MFCS'10 (2010)
9. Fenner, S.A., Fortnow, L.J., Kurtz, S.A.: Gap-definable counting classes. Journal of Computer and System Sciences 48(1), 116-148 (Feb 1994)
10. Gill, J.: Computational complexity of probabilistic turing machines. SIAM Journal on Computing 6(4), 675-695 (1977)
11. Gottlob, G.: Collapsing oracle-tape hierarchies. In: IEEE Conference on Computational Complexity. pp. 33-42 (1996)
12. Hemachandra, L.: The strong exponential hierarchy collapses. In: Structure in Complexity Theory Conference. IEEE Computer Society (1987)
13. Hoang, T.M., Thierauf, T.: The complexity of the characteristic and the minimal polynomial. Theor. Comput. Sci. 295, 205-222 (2003)
14. Immerman, N.: Nondeterministic space is closed under complementation. SIAM Journal on Computing 17(5), 935-938 (Oct 1988)
15. Limaye, N., Mahajan, M., Rao, B.: Arithmetizing classes around nc 1 and 1. STACS 2007 pp. 477-488 (2007)
16. Mengel, S.: Conjunctive Queries, Arithmetic Circuits and Counting Complexity. Ph.D. thesis, Universität Paderborn (2012)
17. Ogihara, M.: The PL hierarchy collapses. SIAM J. Comput. 27(5), 1430-1437 (1998)
18. Ogihara, M.: Equivalence of $N C^{k}$ and $A C^{k-1}$ closures of $N P$ and Other Classes. Inf. Comput. 120(1), 55-58 (1995)
19. Ogiwara, M.: Generalized theorems on relationships among reducibility notions to certain complexity classes. Mathematical Systems Theory 27(3), 189-200 (1994)
20. Santha, M., Tan, S.: Verifying the determinant in parallel. Computational Complexity 7(2), 128-151 (1998)
21. Schöning, U., Wagner, K.W.: Collapsing oracle hierarchies, census functions and logarithmically many queries. In: STACS. pp. 91-97 (1988)
22. Simon, J.: On some central problems in computational complexity (1975)
23. Szelepcsényi, R.: The method of forced enumeration for nondeterministic automata. Acta Informatica 26(3), 279-284 (1988)
24. Toda, S.: Classes of arithmetic circuits capturing the complexity of computing the determinant. IEICE Transactions on Information and Systems E75-D, 116-124 (1992)
25. Tzamaret, I.: Studies in Algebraic and Propositional Proof Complexity. Ph.D. thesis, Tel Aviv University (2008)
26. Venkateswaran, H.: Properties that characterize LogCFL. Journal of Computer and System Sciences 42, 380-404 (1991)
27. Venkateswaran, H.: Circuit definitions of nondeterministic complexity classes. SIAM J. on Computing 21, 655-670 (1992)
28. Vinay, V.: Counting auxiliary pushdown automata and semi-unbounded arithmetic circuits. In: Proceedings of 6th Structure in Complexity Theory Conference. pp. 270-284 (1991)
29. Vollmer, H.: Introduction to Circuit Complexity: A Uniform Approach. Springer-Verlag New York Inc. (1999)
30. Wilson, C.B.: Relativized circuit complexity. J. Comput. Syst. Sci. 31(2), 169-181 (1985)

[^0]:    ${ }^{1}$ see Definitions 4, 8

[^1]:    ${ }^{2}$ Note the non-standard convention to account for the degree of a constant. We do this so as to account for the size and degree contribution of the Vandermonde matrices, which we will use extensively in our results.

[^2]:    ${ }^{3}$ Note that from [27] $C_{=} P$ is exactly characterized by DLOGTIME-uniform semi-unbounded circuits of $2^{n^{O(1)}}$ size and logarithmic depth and in this case the labels of the gate themselves will be strings of length $n^{O(1)}$

