# Skew Circuits of Small Width 

Nikhil Balaji ${ }^{1}$, Andreas Krebs ${ }^{2}$, and Nutan Limaye ${ }^{3}$<br>1 Chennai Mathematical Institute, Chennai, India. nikhil@cmi.ac.in<br>2 University of Tübingen, Germany. mail@krebs-net.de<br>3 Indian Institute of Technology, Bombay, India. nutan@cse.iitb.ac.in


#### Abstract

A celebrated result of Barrington (1985) proved that polynomial size, width- 5 branching programs (BP) are equivalent in power to a restricted form of branching programs - polynomial sized width5 permutation branching programs ( PBP ), which in turn capture all of $\mathrm{NC}^{1}$. On the other hand it is known that width-3 PBPs require exponential size to compute the AND function. No such lower bound is known for width- 4 PBPs , however it is widely conjectured that width- 4 PBPs will not capture all of $N C^{1}$. In this work, we investigate the region inside width- 5 branching programs by comparing them with bounded width skew circuits.

It is well known that branching programs of bounded width have the same power as skew circuit of bounded width. The naive approach converts a BP of width $w$ to a skew circuit of width $w^{2}$. We improve this bound and show that BP of width $w \geq 5$ can be converted to a skew circuit of width 7. This also implies that skew circuits of bounded width are equal in power to skew circuits of width 7 . For the other way, we prove that for any $w \geq 2$, a skew circuit of width $w$ can be converted into an equivalent branching program of width $w$.

We prove that width-2 skew circuits are not universal while width-3 skew circuits are universal. We show that any polynomial sized CNF or DNF is computable by width 3 skew circuits of polynomial size. We prove that a width-3 skew circuit computing Parity requires exponential size. This gives an exponential separation between the power of width-3 skew circuits and width-4 skew circuits.


1998 ACM Subject Classification Dummy classification - please refer to http://www.acm.org/ about/class/ccs98-html

## 1 Introduction

The Boolean circuit complexity class NC ${ }^{1}$ consists of Boolean functions computable by polynomial sized logarithmic depth circuits. Basic arithmetic operations like addition, multiplication and division are known to be implementable in $N C^{1}$. $N C^{1}$ itself is contained in Logspace. All regular languages have uniform $N C^{1}$ families deciding them and there is a regular language which is $N C^{1}$-hard. Over the years, several useful characterizations of $N C^{1}$ have emerged: $N C^{1}$ contains exactly those regular languages that are characterized by having a monoid containing a non-solvable group. They are also equally expressive as Branching Programs of constant width. Our interest in $N C^{1}$ is motivated by the celebrated result of Barrington [1], that Branching Programs of width 5 are sufficient to capture $N C^{1}$ in its entirety.

Branching programs have been pivotal to our understanding of computation with limited resources. They were first defined in [6] and formally studied by Masek in his Master's thesis [7]. Borodin et al.[3] proved that $\mathrm{AC}^{0}$ is contained in the class of functions computed by bounded width branching programs and conjectured that Majority cannot be computed by them. In a surprising result, Barrington showed that in fact, width 5 branching programs can compute all of $N C^{1}$ and hence the Majority function.

© Nikhil Balaji, Andreas Krebs and Nutan Limaye;
licensed under Creative Commons License CC-BY
Leibniz International Proceedings in Informatics
LIPICS Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

After the strong lower bound results of [9, 10], the question of proving lower bounds for $\mathrm{NC}^{1}$ gained a lot of attention. However this has turned out to be a notorious open problem and we still do not know any non-trivial lower bounds for $\mathrm{NC}^{1}$. Given this scenario, the branching program characterization of $\mathrm{NC}^{1}$ has provided an avenue to understand the power of classes that reside inside $\mathrm{NC}^{1}$. Though proving lower bounds for width 5 branching programs is equivalent to proving lower bounds for $\mathrm{NC}^{1}$, it is conceivable that proving lower bounds for width 4 branching programs is easier. In this regard, it is known [2] that width 3 branching programs of a restricted type (permutation branching programs) require exponential size to compute the AND function. It is worthwhile to contrast this against the situation at width 5 , where permutation branching programs are known to be as powerful as general branching programs of width 5 and hence $N C^{1}$ itself.

It is known that bounded width branching programs can be equivalently thought of as bounded width skew circuits (see for example [8]). Here, we take a closer look at this relationship. The folklore construction converts a polynomial size branching program of width $w$ into a polynomial size skew circuit of width $w^{2}$. We improve this construction and show that any bounded width branching program of width greater than or equal to 5 can be converted into an equivalent skew circuit of width 7 . We also study the conversion of skew circuits into branching programs. Here, the known construction converts a skew circuit of width $w$ into a branching program of width $w+1$ [8]. We improve this construction and prove that a polynomial size skew circuit of width $w$ can be converted into a polynomial size branching program of width $w$. These results prove that width 7 skew circuits of polynomial size characterize $\mathrm{NC}^{1}$.

These structural results allow us to examine the set of languages in $N C^{1}$ by varying the width of skew circuits between 1 and 7 . Like for permutation branching programs, some natural questions arise for bounded width skew circuits. We start by examining the power of width 2 skew circuits. We first observe that they are not universal. We prove that width 2 skew circuits of any size cannot compute parity of two bits.

We then study the power of width 3 skew circuits. Recall that a CNF (DNF) is an AND (OR) of ORs (ANDs) of variables, i.e. in a CNF the AND gate is (possibly) non-skew. We implement a CNF by a width 3 skew circuit. Formally, we prove that any $k$-CNF or any $k$-DNF of size $s$ has width 3 skew circuits of length $O(s k)$. Given that any Boolean function on $n$ variables has a CNF of exponential (in $n$ ) size, this also proves that width 3 skew circuits are universal.

We consider the problem of proving lower bound for width 3 skew circuits. A natural candidate is a function which has no polynomial sized CNF or DNF. It is known that Parity is one such function. We prove that Parity requires width 3 skew circuits of exponential size. We observe that Parity and Approximate Majority have respectively, linear and polynomial size width 4 skew circuits. This separates width 3 skew circuits from width 4 skew circuits.

The rest of the paper is organized as follows: In Section 2, we introduce the branching programs, permutation branching programs and skew circuits, and present some obvious containments. In Section 3, we present our improved results of simulation of branching programs by skew circuits and vice versa. In Section 4, we explore the skew circuit classes of width 1 to 7 and present some upper and lower bounds. In Section 5, we give an exponential separation between skew circuits of width-3 and width-4 via the Parity function. We discuss some questions that remain unanswered by our work in Section 6.

## 2 Preliminaries

In this section we introduce some notations and preliminaries which we will use in the rest of the paper.

A directed acyclic graph $G=(V, E)$ is called layered if the vertex set of the graph can be partitioned, $V=V_{1} \cup \ldots \cup V_{\ell}$ in such a way that for each edge $e=(u, v)$ there exists $1 \leq i<\ell$ such that $u \in V_{i}$ and $v \in V_{i+1}$. Given a layered graph $G$ the length of the graph is the number of layers in it and the width of the graph is the maximum over $i \in[\ell],\left|V_{i}\right|$.
$\rightarrow$ Definition 1. (Branching Programs) A Deterministic Branching Program (BP) is a layered directed acyclic graph $G$ with the following properties:

- There is a designated source vertex $s$ in the first layer (of in-degree 0 ) and a sink vertex $t$ (of out-degree 0 ) in the last layer.
- The edges are labelled by an element of $X \cup\{0,1\}$, where $X$ is the set of input variables to the branching program.

The branching program naturally computes a boolean function $f(X)$, where $f(X)=1$ if and only if there is path from $s$ to $t$ in which each edge is labelled by a true literal or a constant 1 on input $X$. The length (width) of the BP is the length (respectively, width) of the underlying layered DAG.

We will denote the class of languages accepted by width-w BP by $\mathrm{BP}^{w}$. Barrington $[2,1]$ defined a restricted notion of branching programs called the Permutation Branching Program (PBP):

- Definition 2. (Permutation Branching Programs as a graph) A width-w PBP is a layered width $w$ BP in which the following conditions hold:
- There are designated source vertices $s_{1}, s_{2}, \ldots, s_{w}$ in the first layer, say layer 1 (of indegree 0 ) and sink vertices $t_{1}, t_{2}, \ldots, t_{w}$ (of out-degree 0 ) in the last layer, layer $\ell$.
- Each layer has exactly $w$ vertices.
- In each layer $1 \leq i<\ell$, all the edges are labelled by a unique variable, say $x_{j_{i}}$.
- In each layer $1 \leq i \leq \ell$ and $b \in\{0,1\}$, the edges activated when $x_{j_{i}}=b$ forms a permutation/matching, say $\theta_{i, b}$.

The permutation branching program naturally computes a boolean function $f(X)$, where $f(X)=1$ if and only if there is path from $s_{1}$ to $t_{1}, s_{2}$ to $t_{2}$, and so on till $s_{w}$ to $t_{w}$, where in each path each edge is labelled by a true literal or a constant 1 on input $X$. We will refer to the class of languages accepted by polynomial sized width- $w \mathrm{PBP}$ by $\mathrm{PBP}^{w}$.

The above definition of PBP can be rephrased as follows:

- Definition 3. (Permutation Branching Programs as a set of instructions) A width- $w$ length- $\ell$ PBP is a program given by a set of $\ell$ instructions in which for any $1 \leq i \leq \ell$, the $i$ th instruction is a three tuple $\left\langle j_{i}, \theta^{i}, \sigma^{i}\right\rangle$, where $j_{i}$ is an index from $\{1,2, \ldots,|X|\}, \theta^{i}, \sigma^{i}$ are permutations of $\{1,2, \ldots, w\}$. The output of the instruction is $\theta^{i}$ if $x_{j_{i}}=1$ and it is $\sigma^{i}$ if $x_{j_{i}}=0$. The output of the program on input $x$ is the product of the output of each instruction of the program on $x$.

We say that a permutation branching program computes a function $f$ if there exists a fixed permutation $\pi \neq i d$ such that for every $x$ such that $f(x)=1$ the program outputs $\pi$ and for every $x$ such that $f(x)=0$ the program outputs $i d^{1}$.

[^0]It is easy to see that the above two definitions of PBP are equivalent.

- Definition 4. (Skew Circuits) An AND gate is called skew if all but one of its children are input variables. A Boolean circuit in which all the AND gates are skew is called a skew circuit.

We assume that the skew circuits are layered. The width of the circuit is the maximum number of gates in any layer. The layer may have AND, OR or input gates. Each type of gate contributes towards the width. We assume that the fan-in of the AND gates is bounded by 2 and there are no NOT gates (negations appear only for the input variables) ${ }^{2}$. We denote the class of languages decided by width- $w$ skew circuits by $\mathrm{SK}^{w}$. The following lemma summarises some well known connections between $\mathrm{BP}^{w}, \mathrm{PBP}^{w}, \mathrm{SK}^{w}$.

- Lemma 5. Let $w \in \mathbb{N}$. Then

1. For any $w, \mathrm{PBP}^{w} \subseteq \mathrm{BP}^{w}$.
2. For any $w \geq 5, \mathrm{BP}^{w} \subseteq \mathrm{PBP}^{w}[1]$.
3. For any $w, \mathrm{BP}^{w}$ is contained in $\mathrm{SK}^{w^{2}}$ (see for example [8]).

Proof Sketch. The first part follows from the definitions of $\mathrm{PBP}^{w}$ and $\mathrm{BP}^{w}$. The celebrated result of Barrington shows the converse for $w \geq 5$. The proof of 3 involves converting a layered DAG $G$ into a circuit $C_{G}$ such that on any input $x$ there is a directed path from $s$ to $t$ in $G$ if and only if $C_{G}$ evaluates to 1 on $x$. This is done by converting every node of $G$ into an OR gate and every edge of $G$ into an AND gate. Let $e=(u, v)$ be an edge in $G$ which connects vertex $u$ to vertex $v$ and has label $x_{e}$. Let $\mathrm{OR}_{u}$ and $\mathrm{OR}_{v}$ be the OR gates in $C_{G}$ corresponding to $u, v$ and let $\mathrm{AND}_{e}$ be the AND gate in $C_{G}$ corresponding to $e$. In $C_{G}$, output of $\mathrm{AND}_{e}$ feeds into $\mathrm{OR}_{v}$ and $\mathrm{AND}_{e}$ receives $\mathrm{OR}_{u}$ and $x_{e}$ as its inputs. It is easy to see that this construction ensures that on any input $x$ there is a directed path from $s$ to $t$ in $G$ if and only if $C_{G}$ evaluated to 1 on $x$. Moreover, the circuit $C_{G}$ thus constructed is a skew circuit. In a layered graph $G$ if there are at most $w$ nodes per layer, then there are at most $w^{2}$ edges between any two consecutive layers. Therefore, in $C_{G}$ any layer can have at most $w^{2}$ gates.

- Definition 6. (Approximate Majority) Approximate Majority ApproxMaj ${ }_{a, n}:\{0,1\}^{n} \rightarrow$ $\{0,1\}$ is the promise problem defined as:

ApproxMaj $_{a, n}(x):= \begin{cases}0, & \text { if } x \text { has Hamming weight at most } a \\ 1, & \text { if } x \text { has Hamming weight at least } n-a\end{cases}$

## 3 Branching Programs and Skew Circuits

Here we analyze the conversion from branching programs to skew circuits and vice versa.

### 3.1 Branching Programs to Skew Circuits

First we discuss the following folklore conversion of branching programs to skew circuits:

[^1]- Lemma 7. (Folklore) Let $f:\{0,1\}^{n} \rightarrow\{0,1\}$ be a Boolean function computed by a width$w$ length- $\ell$ branching program with at most $k$ edges between any two consecutive layers and with the additional property that each layer reads at most one variable or its negation. Then there is a skew circuit of width max $\{w+2, k\}$ and size $O((k+w) \ell)$ computing $f$.

Proof. Given a branching program $B$ of width $w$, we convert it to a skew circuit $C$ as follows:

1. For every node $g$ in $B$ (except $s$ ), the circuit $C$ has an OR gate $\vee_{g}$. The node corresponding to $s$ in $C$ is the gate that computes the constant function 1. The output gate of $C$ is the node corresponding to $t$ in $B$.
2. Suppose there are incoming edges $e_{1}, e_{2}, \ldots, e_{k}$ to the node $g$ from gates $g_{1}, g_{2}, \ldots, g_{k}$ respectively. From our assumption they read the same input variable or its negation. For these wires we create AND gates $\wedge_{1}, \wedge_{2}, \ldots, \wedge_{k}$ which feed into $\vee_{g}$ and each AND gate $\wedge_{i}$ receives two inputs: $g_{i}$ and the varibale (or its negation) labeling the edge $e_{i}$, respectively.

Every vertex in the BP gives rise to an OR gate in the skew circuit. And every wire in the BP gives rise to an AND gate in the skew circuit. As every wire in any layer reads the same variable or its negation, we need to add two vertices corresponding to this variable and its negation on the layer below the AND layer, i.e. in the OR layer just below it. Therefore, the width of the OR layer is at most $w+2$, whereas the width of the AND layer is at most $k$. This immediately yields width and size bounds of $\max \{w+2, k\}$ and $O((k+w) \ell)$ respectively. It is easy to see that for every $x \in\{0,1\}^{n}, f(x)=B(x)=C(x)$.

### 3.2 Permutation Branching Programs to skew circuits

A permutation $\theta$ is called a transposition if either it is the identity permutation or there exists $i \neq j$ such that $\theta(i)=j, \theta(j)=i$ and for all $k \neq i \neq j, \theta(k)=k$. We call a transposition non-trivial if it is not the identity permutation, trivial otherwise.

- Definition 8. (Transposition Branching Programs, TBP) A width-w length- $\ell$ TBP is a program given by a set of $\ell$ instructions in which for any $1 \leq i \leq \ell$, the $i$ th instruction is a three tuple $\left\langle j_{i}, \theta^{i}, \sigma^{i}\right\rangle$, where $j_{i}$ is an index from $\{1,2, \ldots,|X|\}, \theta_{i}, \sigma_{i}$ are transpositions of $\{1,2, \ldots, w\}$. The output of the instruction is $\theta_{i}$ if $x_{j_{i}}=1$ and it is $\sigma_{i}$ if $x_{j_{i}}=0$. The output of the program on input $x$ is the product of the output of each instruction of the program on $x$.
- Lemma 9. Given a width-w PBP of length $\ell$ there is an equivalent width-w TBP of length $O(w \ell)$.

Proof. It is known (see for example [5]) that any permutation of $\{1,2, \ldots, w\}$ can be written as a product of $W$ transpositions of $\{1,2, \ldots, w\}$, where $W=O(w)$. Let $P$ be a width- $w$ PBP of length $\ell$. Consider the $i$ th instruction in the program, say $\left\langle j_{i}, \theta_{i}, \sigma_{i}\right\rangle$. We know that we can write $\theta_{i}$ as a product of $W$ transpositions, i.e. $\theta_{i}=t_{i, 1} \cdot t_{i, 2} \ldots t_{i, W}$, where for $1 \leq j \leq W t_{i, j}$ is a transposition. Similarly, we have $\sigma_{i}=s_{i, 1} \cdot s_{i, 2} \ldots s_{i, W}$, where $s_{i, j}$ is a transposition for $1 \leq j \leq W$.

To give a TBP equivalent to $P$, we replace every instruction $\left\langle j_{i}, \theta_{i}, \sigma_{i}\right\rangle$ in $P$ by the following: $\left\langle j_{i}, t_{i, 1}, i d\right\rangle \cdot\left\langle j_{i}, t_{i, 2}, i d\right\rangle \ldots\left\langle j_{i}, t_{i, W}, i d\right\rangle \cdot\left\langle j_{i}, i d, s_{i, 1}\right\rangle \cdot\left\langle j_{i}, i d, s_{i, 2}\right\rangle \ldots\left\langle j_{i}, i d, s_{i, W}\right\rangle$. By a simple inductive argument we can prove that the the transposition branching program thus obtained is equivalent to $P$. As $W=O(w)$, the upper bound on the length of the resulting branching program follows.

We defined TBPs as a set of instructions. Like in the case of PBPs, the definition of TBPs can be rephrased in terms of the underlying DAG. We observe the following about the DAG resulting from TBPs.

A width- $w$ TBP is a layered width $w$ PBP in which the following conditions hold:

- There are designated source vertices $s_{1}, s_{2}, \ldots, s_{w}$ in the first layer, say layer 1 (of indegree 0 ) and sink vertices $t_{1}, t_{2}, \ldots, t_{w}$ (of out-degree 0 ) in the last layer, layer $\ell$.
- Each layer has exactly $w$ vertices.
- In each layer $1 \leq i<\ell$, all the edges are labelled by a unique variable, say $x_{j_{i}}$.
- In each layer $1 \leq i \leq \ell$, one of the following holds:
- either the edges corresponding to $x_{j_{i}}=1$ form a non-trivial transposition and the edges cooresponding to $x_{j_{i}}=0$ form the identity permutation
- or the edges corresponding to $x_{j_{i}}=0$ form a non-trivial transposition and the edges cooresponding to $x_{j_{i}}=1$ form the identity permutation
- Remark 10. As a result of the above properties of the TBP the total number of distinct edges between any two layers in a width- $w$ TBP is at most $w+2$ : there are $w$ edges corresponding to the identity permutation, 2 edges corresponding to the transposition of two elements, and $w-2$ edges corresponding to the identity maps for all but the two transposed elements. The $w-2$ last edges overlap with the $w$ edges corresponding to the identity permutation.
- Lemma 11. $\mathrm{PBP}^{w} \subseteq \mathrm{SK}^{w+2}$

Proof. Given a $\mathrm{PBP}^{w}$ for $L$ of size $s$, by Lemma 9 we know that $L$ also has a width- $w$ TBP. By Remark 10 the underlying DAG for the TBP has at most $w+2$ edges between any two consecutive layers. Using Lemma 7 we get a skew circuit of width $w+2$ for $L$. Note that the size of such a circuit is $O(w s)$

Using Barrington's characterization of $\mathrm{NC}^{1}$ and Lemma 11 we get the following: $\mathrm{NC}^{1}=$ $\mathrm{BP}^{5}=\mathrm{PBP}^{5} \subseteq \mathrm{SK}^{7}$

### 3.3 Skew Circuits to Branching Programs

In this section we start from a skew circuit of bounded width and convert it into a branching program of bounded width. Formally, we prove the following:

- Theorem 12. If $C$ is a skew circuit of width $w$ and length $\ell$ then there is a branching program $P$ of width $w$ and size $O(w \ell)$ computing the same function as $C$.

Proof. Recall that in a skew circuit $C$, AND gates have fan-in 2 and at least one child is an input variable whereas OR gates have arbitrary fan-in and arbitrary predecessors.

Given a skew circuit $C$ of width $w$ and length $\ell_{0}$ we will construct a branching program $P$ of width $w$ that will recognize the same language. Let $G_{i 1}, \ldots, G_{i w}$ be the gates of $C$ on layer $i$ for $i=1, \ldots, \ell_{0}$.

Let $X=\left\{\ell_{i_{1}}, \ell_{i_{2}}, \ldots, \ell_{i_{L}}\right\}(|X|=L)$ be the set of layers on which there is at least one input gate. Without loss of generality we assume that in each $j \in[L]$ the gate $G_{\ell_{i_{j}} w}$ is an input gate. (There may be other input gates as well.)

We will construct a branching program of length $s=L+2$ and width $w$. The nodes in the branching program in layer $\ell_{i_{j}} \in X$ will be called $N_{j 0}, \ldots, N_{j(w-1)}$. The node $N_{00}$ is the initial node and the node $N_{(s-1)(w-1)}$ will be the target node.

The nodes $N_{11}, \ldots, N_{(s-1)(w-1)}$ will by our construction compute the value of the nodes in a layer in $X$. More formally, for every input $x$, the gate $G_{\ell_{j} c}$ in layer $\ell_{j}$ of the circuit (and layer $j$ in $X$ ) evaluated to 1 iff the node $N_{j c}$ can be reached from the initial node. Since the gate $G_{i w}$ in $X$ is an input gate we will not add corresponding gate in the branching program. We have completely specified the vertex set of the branching program $P$.

We now describe the edge set of $P$. We add an edge from $N_{(j-1) 0}$ to $N_{j 0}$ labeled by 1 for every $1 \leq j \leq s-1$. This ensures that all nodes $N_{j 0}$ are always reachable from the initial node.

Suppose that the layer $\ell_{j}$ and $\ell_{j}+1$ are both in $X$, i.e. $\ell_{j+1}=\ell_{j}+1$, then the edges between the nodes in the layer $\ell_{j}$ and $\ell_{j+1}$ in the branching program are easy to state. A node $N_{j+1 c}$ is connected to $N_{j d}$ if there is an edge between the corresponding gates $G_{\ell_{j+1} c}$ and $G_{\ell_{j} d}$. Also the edge in the branching program is labeled by 1 if the gate $G_{\ell_{j} d}$ is an OR gate, and labeled by the variable $x_{i}$ (or its negtion $\neg x_{i}$ ) if $G_{\ell_{j} d}$ is an AND gate querying $x_{i}$, resp. $\neg x_{i}$. If an OR gate in $\ell_{j}$ is connected to an input gate, we generate an edge to $N_{j 0}$ labeled by the literal queried by the input gate.

Now assume that the layer $\ell_{j}$ is in $X$ and $\ell_{j+1}$ is the next layer in $X$ and $\ell_{j+1}>\ell_{j}+1$. Then in the skew circuit, no input gates occur strictly between the layers $\ell_{j}$ and $\ell_{j+1}$. This implies that there are no AND gates in the layers $\ell_{j}+2, \ldots, \ell_{j+1}$. Hence the functions computed by the gates in layer $\ell_{j+1}$ are ORs of some gates in layer $\ell_{j}+1$. In layer gates in layer $\ell_{j}+1$ are ORs of either ANDs of gates in layer $\ell_{j}+1$ and an input variable or ORs of directly gates in layer $\ell_{j}+1$. Therefore, we add the following edges in the branching program: a node $N_{(j+1) c}$ is connected to $N_{j d}$ if the OR function computed by $G_{\ell_{j+1} c}$ has $G_{\ell_{j} d}$ as one of the inputs. This edge in the branching program is labeled by 1 if this was a direct OR, it is labeled by the variable $x_{i}$ (or its negtion $\neg x_{i}$ ) it it was an 'or' of an 'and' querying $x_{i}$ resp. $\neg x_{i}$. e.

It is easy to verify by induction on the layers that $N_{j c}$ is reachable from the inital gate if the corresponding gate evaluates true. Finally we add an edge from the node corresponding to the output gate to $N_{(s-1)(w-1)}$.

Putting together Lemma 11 and Theorem 12 we get the following corollary:

- Corollary 13. $\mathrm{NC}^{1}=\mathrm{BP}^{5}=\mathrm{PBP}^{5}=\mathrm{SK}^{7}$


## 4 Width $\leq 7$ skew circuits

In this section we study the structure of the languages in $N C^{1}$ by investigating properties of skew circuits of width 7 or less. By definition $\mathrm{SK}^{i} \subseteq \mathrm{SK}^{i+1}$ for $1 \leq i \leq 6$.

We start by proving that width 2 circuits are not universal.

- Lemma 14. A width 2 skew circuit of arbitrary size cannot compute Parity of 2 bits.

Proof. Let $f=\left(x_{1} \wedge \neg x_{2}\right) \vee\left(\neg x_{1} \wedge x_{2}\right)$. To show that $f \notin \mathrm{SK}^{2}$, firstly notice that fixing the values of both the variables is necessary and sufficient to make $f$ a constant function. Let $C$ be a $\mathrm{SK}^{2}$ circuit of minimal length computing $f$. The output gate of $C$ cannot have a constant as its input: suppose the output is an $\operatorname{AND}(\mathrm{OR})$ gate and receives 0 (resp. 1) as input then the circuit computes a constant function. On the other hand, if the output is an $\operatorname{AND}(\mathrm{OR})$ gate and receives 1 (resp. 0) as input then such an output gate is redundant, contradicting the minimality of the circuit.

Now, the output cannot be an AND gate because being skew, it can be fixed by fixing one of its inputs, which contradicts the fact that $C$ computes $f$. Therefore the output gate of $C$, say $g_{0}$, is an OR gate. Let $g_{1}, g_{2}$ be its two inputs. a) Both the children of $g_{0}$ cannot be OR since we can collapse such a layer and still compute the same function contradicting minimality; b) Both the children cannot be AND: suppose they are both AND gates, the next layer must have at least one input variable, say $x_{i}$. Let the other gate on that layer be $h_{1}$. If both query the variable, then by setting that variable to 0 , we will make the output zero, which will contradict the assumption that $C$ computes $f$. Suppose one of them, say $g_{1}$, does not query $x_{1}$ then the output of the circuit is $g_{0}=\left(h_{1} \wedge x_{i}\right) \vee h_{1}=h_{1}$. This contradicts the minimality of the circuit. c) Even one of the children of $g_{0}$ cannot be a variable, else by setting that variable we can fix the output of the circuit. d) Due to cases (a), (b) and (c) we are only left with a case in where one of the inputs to $g_{0}$ is an AND and the other is OR, say $g_{1}, g_{2}$ resp. In the layer just below this layer, there will have to be an input gate in a minimal circuit and this will have to be queried by both $g_{1}, g_{2}$. This variable can now be fixed (to make $g_{1}=1$ and therefore $g_{0}=1$ ) so that the output of the circuit is fixed. Therefore, no width 2 skew circuit computes $f$. This proves that width 2 circuits are not universal.

Recall that a $k$-DNF of size $s$ on $n$ variables is an OR of $s$ terms, where each term is an AND of at most $k$ literals from $\left\{x_{1}, x_{2}, \ldots, x_{n}, \neg x_{1}, \neg x_{2}, \ldots, \neg x_{n}\right\}$. Similarly, $k$-CNF of size $s$ on $n$ variables is an AND of $s$ clauses, where each clause is an OR of at most $k$ literals from $\left\{x_{1}, x_{2}, \ldots, x_{n}, \neg x_{1}, \neg x_{2}, \ldots, \neg x_{n}\right\}$.

- Lemma 15. Let $f$ be a $k$-DNF of size $s$ on $n$ variables. Then $f$ has a width- 3 skew circuit of length $O(s k)$.

Proof. A term is an AND of at most $k$ literals. It is easy to see that any such term can be computed by a skew circuit of width 2 and length $O(k)$. Suppose $C_{1}, C_{2}, \ldots, C_{s}$ are width- 2 skew circuits of length $\ell_{1}, \ell_{2}, \ldots, \ell_{s}$, respectively computing the $s$ terms in the $k$ DNF of size $s$. We need to compute an OR of these. This can be done using the one extra width: stagger circuits $C_{1}, C_{2}, \ldots, C_{s}$ one after the other. This requires width 2 and length $L:=k \cdot \sum_{i=1}^{s} \ell_{i}=O(s k)$. Now add one OR gate per layer alongside these $C_{i} \mathrm{~s}$. Let us call them $g_{1}, g_{2}, \ldots, g_{L}$. Feed output of $g_{i}$ to $g_{i+1}$ for all $1 \leq i \leq L$ and feed output of $C_{i}$ to $g_{\ell_{i}+1}$. It is easy to see that $g_{L+1}$ computes $f$. (See Figure 1.)


Figure 1 Width 3 skew circuits for DNFs

As any Boolean function on $n$ variables has an $n$-DNF of size at most $2^{n}$, we get the following corollary.

- Corollary 16. Let $f:\{0,1\}^{n} \rightarrow\{0,1\}$. Then $f$ can be computed by width-3 skew circuit of length $O\left(n 2^{n}\right)$, i.e. width-3 skew circuits are universal.
- Lemma 17. Let $f$ be a $k$-CNF of size $s$ on $n$ variables. Then $f$ has a width-3 skew circuit of length $O(s k)$.

Proof. Note that in a CNF, the top AND gate gets clauses as inputs. That is, the AND gate is not skew. However, it is still possible to get a skew circuit for CNFs. We prove this by induction on the number of clauses, i.e. $s$. The base case is $s=1$. This is just an OR of literals, which is computable by width- 2 skew circuit of length $O(k)$. Let $f_{i}\left(x_{1}, \ldots, x_{n}\right)=C_{1} \wedge \ldots \wedge C_{i}$ be computable by width-3 skew circuit of length $O(i k)$. Now $f_{i+1}=f_{i} \wedge C_{i+1}$ (where $C_{i+1}=x_{j_{1}} \vee x_{j_{2}} \ldots \vee x_{j_{k}}$ ) is computed as follows: $f_{i+1}=$ $\left(\ldots\left(\left(f_{i} \wedge x_{j_{1}}\right) \vee\left(f_{i} \wedge x_{j_{2}}\right) \ldots\left(f_{i} \wedge x_{j_{k}}\right)\right) \ldots\right)$. Note that we need width 1 each for the $f_{i}$, the AND gate and the input variable. (Even though we require width 3 to compute $f_{i}$, after the computation, it just requires width 1 to carry around the value of the function to the next stage). (See Figure 2.)


Figure 2 Width 3 skew circuits for CNFs
By a result of Viola [11], it is known that Approximate Majority is computable by Puniform depth-3 circuits of polynomial size with alternating AND/OR layers with the output gate being an OR gate. This along with Lemma 17 above yields:

- Corollary 18. Approx $\mathrm{Maj}_{a, n}$ has skew circuits of width 4 and polynomial length.

Razborov[9] and Smolensky[10] show that Parity does not have constant depth circuit of polynomial size. It also implies that Parity does not have polynomial size CNFs or DNFs . However, we now show that Parity has skew circuits of width 4 and polynomial size.

- Lemma 19. Parity on $n$ variables has a skew circuit of width 4 and length $O(n)$.

Proof. This is an easy observation which comes from the fact that Parity has a branching program of width 2 and length $O(n)$. This fact along with part 3 of Lemma 5 proves the result. (Figure 3 shows the width 4 skew circuit for Parity.)


Figure 3 Width-4 skew circuit for Parity

## 5 Parity and $\mathrm{SK}^{3}$

In this section we prove that Parity does not have polynomial length width-3 skew circuits. As Parity has width 4 skew circuits of linear length (Lemma 19), this separates $\mathrm{SK}^{3}$ from $S K^{4}$.

- Theorem 20. $\mathrm{SK}^{3} \subsetneq \mathrm{SK}^{4}$.

In order to prove this we first show that any width three skew circuit computing Parity can be converted into a normal form. We then show that any polynomial sized circuit of that normal form cannot compute Parity.

- Lemma 21. Let $C$ be a Boolean width 3 skew circuit with size $s(n)$ computing Parity ${ }_{n}$. The circuit $C$ can be converted into another circuit $D$ such that $D$ computes Parity of at least $n-2$ bits and has the following structure:

1. The top gate of $D$ is an $O R$ of at most $3 s(n)^{2}$ disjoint skew circuits, say $C_{1}, C_{2}, \ldots, C_{m}$, where $m \leq 3 s(n)^{2}$.
2. The sum of sizes of all $C_{i} s$ is at most $O\left(s(n)^{3}\right)$
3. At most two of these circuits have width 3 and all the other have width at most 2 .

- Lemma 22. Let $D$ be a circuit satisfying properties 1,2,3 from Lemma 21 and computing Parity. Then there exists a constant $c$ such that the size of $D$ is at least $2^{n} / n^{c}$.

Using Lemma 21 and Lemma 22 it is easy to see that the lower bound for Parity follows.

### 5.1 Proof of Lemma 21

Let $C$ be a width three circuit computing Parity of $n$ bits of size $s(n)$. The top gate of $C$ cannot be AND. This is because, by fixing the input wires of the AND gate, we could fix the output of the circuit, however, Parity of $n$ bits cannot be fixed by fixing $<n$ bits. Therefore, we can assume that the top gate is an OR gate.

Proposition 23. Let $C$ be any width three skew circuit computing Parity of $n$ bits. Let $k$ be the highest layer in $C$ consisting of only AND gates, say $g_{k}^{1}, g_{k}^{2}, g_{k}^{3}$. We can convert this
into another width 3 skew circuit $D$ computing Parity of at least $n-2$ bits such that no layer of $D$ contains three AND gates.

Proof. Let $k$ be the highest layer (closest to the output gate) with all three AND gates, say $g_{k}^{1}, g_{k}^{2}, g_{k}^{3}$. Note that, due to skewness of the circuit, the layer $k+1$ cannot have any AND gates. If any one of the gates at layer $k$ has fan-in 1 , then we can replace that gate by an OR gate. Therefore, we will assume that each gate has fan-in 2. Let gates at layer below, i.e. $k-1$, be $g_{k-1}^{1}, g_{k-1}^{2}, g_{k-1}^{3}$. As we are dealing with skew circuits, at least one of $g_{k-1}^{1}, g_{k-1}^{2}, g_{k-1}^{3}$ is an input gate. Suppose there is only one input gate, then all gates at layer $k$ must read this input bit. And therefore, the three gates at layer $k$ can be fixed by fixing this input bit. If there are two input gates, then either both read the same literal (a variable and its negation) or they read two distinct input bits. In the latter case, by fixing the two distinct bits, all the three gates can be fixed. Finally, if the two variables being read are $x$ and $\neg x$ then fixing $x$ may not fix all three gates. For example, suppose $g_{k-1}^{1}=x, g_{k-1}^{2}=\neg x, g_{k-1}^{3}=g$, and $g_{k}^{1}=(g$ AND $x)$ and $g_{k}^{2}=(g$ AND $\neg x)$. Then fixing $x$ to any value will not fix both $g_{k}^{1}$ and $g_{k}^{2}$. However, in this case, note that gates at layer $k$ feed into OR gates. Therefore, in this case, such an AND layer is redundant. Therefore, any such layer consisting of only AND gates can be completely removed from the circuit by fixing at most 2 bits.

Therefore, we will now assume that we have a circuit $D$ which computes parity of at least $n-2$ bits in which no layer consists of only AND gates.

Let $G=(V, E)$ denote the DAG underlying the circuit $D$. Let $X \subseteq V$ denote the subset of gates which have a path to the top gate via only OR gates. Note that all the vertices in this set are themselves OR gates. We refer to the set of vertices $X$ as ORSET.

Let $X_{i n} \subseteq V$ denote the set of vertices in $V \backslash X$ which have an edge incident from it to some vertex in $X$. Similarly, let $X_{\text {out }} \subseteq V$ denote the set of vertices in $V \backslash X$ which have an edge incident to them from some vertex in $X$.
(a) Disconnect all edges incident to the set $X_{\text {out }}$ from $X$. Let the new dangling wires be labelled with the constant 0 input.
(b) If after step (a) any AND gates receives constant input 0 then delete the gate and if any OR gate receives constant input 0 then delete this input of the OR gate.
(c) In the graph obtained after steps (a) and (b), consider $V \backslash X$. This disconnects the DAG $G$ and gives rise to some connected components, say $X_{1}, X_{2}, \ldots, X_{\ell}$. For each $i \in[\ell]$ and edge $(u, v)$ such that $u \in X_{i}$ and $v \in X$ let $C_{i, u}$ be the subgraph (DAG) of $X_{i}$ with $u$ as its sink.

- Proposition 24. The step (a) above does not change the function computed by the circuit.

Proof. Let $u$ be a gate in $X$ which feeds into some gate in $X_{\text {out }}$ say $v$. Step (a) above disconnects $u$ from $v$ and feeds value 0 to $v$. Suppose $g_{u}$, the OR gate corresponding to $u$ evaluates to 1 , then the circuit will evaluate to 1 irrespective of all other gates. On the other hand, if it evaluates to 0 , then that is the value that we have fed into $v$ and therefore, the modification in Step (a) does not change the output of the circuit in either case.

Note that $V \backslash X$ is partitioned by $X_{i}$ s. Therefore, $\ell \leq s(n)$. Also, $\cup_{i=1}^{m} X_{i} \subseteq V \backslash X$. Therefore, $\sum_{i=1}^{m}\left|X_{i}\right| \leq s(n)$. The number of edges in $G$ is at most $3 s(n)$. Therefore, the total number of edges between any $X_{i}$ and $X$ is at most $3 s(n)$. Therefore, the number of circuits $C_{u, i}$ is at most $s(n)^{2}$ and each such circuit is of size at most $s(n)$. This proves parts 1,2 of Lemma 21.

We will now prove part 3 of Lemma 21. The top gate of $D$ is the same as the top gate of $C$, and it is an OR gate. Therefore, this gate is in the ORSET. Now as long as there are OR gates on the layers, we have at least one gate per layer in the ORSET. Finally, if there is a layer with no OR gates, then this layer must have at least one input variable (as $D$ does not have any layer with three AND gates). The other two gates at this layer be $g, h$. Let $X_{g}, X_{h}$ be two DAGs rooted at $g$ and $h$, respectively, and $C_{g}$ and $C_{h}$ be the two corresponding circuits. These are possibly width three circuits. However, all other connected components of $V \backslash X$ are of width at most 2 due to step (b) above. This gives Part 3 of Lemma 21.

### 5.2 Proof of Lemma 22

Let $D$ be the circuit given by Lemma 21. Let $n_{0}$ denote the number of unfixed variables. Let $C_{1}, C_{2}, \ldots, C_{m}$ be circuits given by Lemma 21. We know that at most two circuits among these have width 3 . Let us assume without loss of generality that the two circuits are $C_{1}, C_{2}$. The output gates of these circuits are AND gates, say $G_{1}$ and $G_{2}$, respectively. These being skew circuits, all but one of the inputs of the AND gates are input gates. We will first prove the following proposition.

- Proposition 25. By fixing at most two variables both $G_{1}$ and $G_{2}$ can be set to zero.

Proof. Let $G_{1}, G_{2}, \ldots, G_{m}$ be top gates of the circuits $C_{1}, C_{2}, \ldots, C_{m}$. Note that $G_{i}$ s are AND gates because if they were OR gates they would have been in the ORSET. Say they appear on layers $k_{1}, k_{2}, \ldots, k_{m}$ in the original width 3 circuit $C$. If both $C_{1}, C_{2}$ are width 3 circuits, then it is easy to see that $k_{1} \leq k_{i}$ for $3 \leq i \leq m$ and $k_{2} \leq k_{i}$ for $3 \leq i \leq m$. Let us assume without loss of generality $k_{1} \leq k_{2}$, i.e the gate $G_{1}$ appears on a lower layer (closer to Layer 1) than $G_{2}$.

Let $x_{i}, Y_{1}, Y_{2}$ be the gates on the layer $k_{1}-1$. As $G_{1}$ is an AND gate, it must query one variable. Let that be $x_{i}$. Let the other input to $G_{1}$ be $Y_{1}$ without loss of generality.

Now note that $Y_{2}$ must be connected to the ORSET via $G_{2}$. If it is not connected via $G_{2}$ then it is easy to see that $C_{2}$ cannot have width $3 .\left(\because k_{2}>k_{1}\right.$ and ORSET extends all the way upto $k_{1}+1$, therefore, the circuit $C_{2}$, which starts at $k_{2}$ and is disjoint from the ORSET and disjoint from the path which connected $Y_{2}$ to the ORSET, can have width at most 2.)

Our proof for the proposition is a case analysis of the various settings for $x_{i}, Y_{1}$, and $Y_{2}$.
(a) Suppose input gate of $G_{2}$ reads a variable $y \neq \neg x_{i}$, then set $x_{i}=0, y=0$ which will eliminate both $G_{1}, G_{2}$ by setting at most two variables as desired.
(b) Suppose $y=\neg x_{i}$. Here, by minimality of the circuit we can assume that the output of the entire circuit can be written as $\vee_{i=3}^{O(m)} C_{i} \vee\left(Y_{1} \wedge x_{i}\right) \vee\left(Y_{2} \wedge \neg x_{i}\right)$. To handle this we need to look at the next layer below $k_{1}-1$ that reads an input variable. Say the variable read is $x_{j}$ and the layer number is $k_{0}$. Let the other two gates on this layer be $Z_{1}, Z_{2}$. Here again, by observing that there cannot be any redundant gates in the minimal circuit (and using distributity of AND/ORs) it is easy to see that $x_{j} \neq x_{i}$. If either of $Y_{1}, Y_{2}$ is an AND gate, then $k_{0}=k_{1}-1$ else both are OR gates.
Therefore, $Y_{1}, Y_{2}$ are ORs over $\left\{\left(x_{j} \wedge Z_{1}\right), Z_{1}, Z_{2}\right\}$. (The analysis for the case of ORs over $\left\{\left(x_{j} \wedge Z_{2}\right), Z_{1}, Z_{2}\right\}$ is symmetric. The case of ORs over simply $\left\{x_{j}, Z_{1}, Z_{2}\right\}$ cannot arise as otherwise $Y_{1}$ or $Y_{2}$ will be connected to the ORSET directly, but this is not possible as we have that both are connected to the ORSET via ANDs.)
(i) If both $Y_{1}$ and $Y_{2}$ do not query $x_{j} \wedge Z_{1}$, then the AND is redundant, which is not possible in a minimal circuit.
(ii) If $Y_{2}\left(Y_{1}\right)$ queries only $x_{j} \wedge Z_{1}$. Then by setting $x_{i}=0, x_{j}=0\left(\neg x_{i}=0, x_{j}=0\right.$, respectively) we can set both $G_{1}$ and $G_{2}$ to zero.
(iii) If $Y_{2}$ queries $\left(x_{j} \wedge Z_{1}\right)$ and $Z_{1}$, but not $Z_{2}$. Then again the AND is redundant. (The case that $Y_{1}$ queries $\left(x_{j} \wedge Z_{1}\right)$ and $Z_{1}$, but not $Z_{2}$ is similar.)
(iv) If $Y_{2}$ queries $\left(x_{j} \wedge Z_{1}\right)$ and $Z_{2}$, but not $Z_{1}$ and $Y_{1}$ queries $Z_{2}\left(Y_{1}\right.$ may query $Z_{1}$ and $x_{j} \wedge Z_{1}$ as well) then $G_{1} \vee G_{2}=\left(\left(\left(x_{j} \wedge Z_{1}\right) \vee Z_{2}\right) \wedge x_{i}\right) \vee\left(Z_{2} \wedge x_{i}\right)=\left(\left(x_{j} \wedge Z_{1}\right) \wedge x_{i}\right) \vee$ $\left(Z_{2} \wedge x_{i}\right) \vee\left(Z \wedge \neg x_{i}\right)=\left(\left(x_{j} \wedge Z_{1}\right) \wedge x_{i}\right) \vee Z_{2}$, i.e. $Z_{2}$ directly feeds into the ORSET, but this contradicts minimality. (The case that $Y_{1}$ queries $\left(x_{j} \wedge Z_{1}\right)$ and $Z_{2}$, but not $Z_{1}$ and $Y_{2}$ queries $Z_{2}$ is similar).
(v) If $Y_{2}$ queries $\left(x_{j} \wedge Z_{1}\right)$ and $Z_{2}$, but not $Z_{1}$ and $Y_{1}$ queries $Z_{1}$ but nothing else then $Y_{1}$ is redundant.

This exhausts all the cases and finally by setting at most 2 variables we have managed to eliminate both $G_{1}, G_{2}$. This finishes the proof of the proposition.

Let $N$ denote the number of variables which were not set. Here, $N \geq n_{0}-2$. The new circuit, say $D^{\prime}$, is now an OR of $C_{3}, C_{4}, \ldots, C_{m}$ and by our assumption, it computes Parity of $N$ variables. We will show that OR of polynomially many polynomial size width- 2 skew circuits cannot compute Parity.

Let us fix some notation. Let $L_{\oplus}$ denote the 1 set of parity, i.e. $L_{\oplus}=\left\{x \in\{0,1\}^{N} \mid\right.$ $\operatorname{Parity}(x)=1\}$. We know that $\left|L_{\oplus}\right|=2^{N-1}$. For any Boolean circuit $C$, let $L_{C}$ denote $\left\{x \in\{0,1\}^{N} \mid C(x)=1\right\}$. Note that as $D^{\prime}$ above is an OR of $C_{3}, C_{4} \ldots, C_{m}$, we have $L_{D^{\prime}}=\cup_{i=3}^{m} L_{C_{i}}$.

- Definition 26. We say that a Boolean circuit $C \alpha$-approximates a function $f:\{0,1\}^{n} \rightarrow$ $\{0,1\}$ if the following conditions hold:
- $\forall x \in\{0,1\}^{n}$, if $f(x)=0$ then $C(x)=0$, i.e. $C$ has no false positives.
- The ratio of $|\{x \mid C(x)=1\}|$ to $|\{x \mid f(x)=1\}|$ is at least $\alpha$

For the sake of contradiction we have assumed that $D^{\prime}$ computes parity of $N$ bits. Assuming this and from the fact that $L_{D^{\prime}}=\cup_{i=3}^{m} L_{C_{i}}$, we get that there exists an $i \in$ $\{3,4, \ldots, m\}$ such that $C_{i} 1 / m$-approximates parity of $N$ bits. We will now prove that no such $C_{i}$ exists, which will give us the contradiction. Formally, we prove the following:

- Claim 27. Let $D^{\prime}$ and $C_{3}, C_{4}, \ldots, C_{m}$ be defined as above. There does not exists $i \in$ $\{3,4, \ldots, m\}$ such that $C_{i} 1 / m$-approximates Parity of $N$ bits.

Proof. Suppose there exists a $C_{i}$ which $1 / m$-approximates parity of $N$ bits. Recall that $C_{i}$ is a width 2 skew circuit. Let the last layer be $L$ and the first layer be 1 . Let $\ell_{i_{1}}, \ell_{i_{2}}, \ldots, \ell_{i_{t}}$ be the layers in which there is one input gate, with $\ell_{i_{1}}$ being closest to layer 1 and $\ell_{i_{t}}$ being closest to layer $L$. (Note that, we can assume without loss of generality that layer 1 is the only layer which has two input gates.) Let the variables queried by these gates be $x_{i_{1}}, x_{i_{2}}, \ldots, x_{i_{t}}$, respectively. Let $h_{i_{t+1}}$ denote the output gate in layer $L$. Similarly, let $h_{i_{1}}$ be the gate in layer $\ell_{i_{1}}$ (other than the input gate), $h_{i_{2}}$ be the gate in layer $\ell_{i_{2}}$ (other than the input gate) and so on till $h_{i_{t}}$ be the gate in layer $\ell_{i_{t}}$ (other than the input gate).

As there are no NOT gates in the circuit, $h_{i_{j+1}}$ is a monotone function of $x_{i_{j}}, h_{i_{j}}$ for every $1 \leq j \leq t$. There is a unique value of $x_{i_{j}}$, say $b_{i_{j}} \in\{0,1\}$, such that by setting $x_{i_{j}}=b_{i_{j}}, h_{i_{j+1}}$ becomes a non-trivial function of $h_{i_{j}}$. (This is because, there are at most 6 different monotone functions on two bits, two of which cannot occur in a minimal circuit. And the other four (AND, OR, NAND, NOR) have this property.)

Note that, the setting of $x_{i_{t}}=b_{i_{t}}$ will not fix the value of $h_{i_{t}}$. Suppose $h_{i_{t}}$ gets fixed due to this setting. In that case, value of $h_{i_{t+1}}$ will also get fixed. Suppose the value of $h_{i_{t+1}}$ becomes 1 , then for all settings of $x \neq x_{i_{t}}, h_{i_{t+1}}$ will continue to have value 1 . But we have assumed that $C_{i}$ has no false positives. Therefore, this is not possible. On the other hand, if the value of $h_{i_{t+1}}$ gets fixed to 0 , then for all settings of variables $x \neq x_{i_{t}}$ the circuit will output 0 . That is, for $2^{N-1}$ different inputs the circuit will output 0 . However, we have assumed that the circuit outputs 1 for at least $2^{N-1} / \mathrm{m}$ many inputs.

Assuming $x_{i_{t}}=b_{i_{t}}$ and $x_{i_{t}} \neq x_{i_{t-1}}$, we will repeat this argument for $x_{i_{t-1}}$. Let $x_{i_{t-1}}=$ $b_{i_{t-1}}$ be the setting of $x_{i_{t-1}}$ which makes $h_{i_{t}}$ a function of $h_{i_{t-1}}$. Suppose this setting of $x_{i_{t-1}}$ fixes $h_{i_{t}}$ then that will inturn fix $h_{i_{t+1}}$. As before, to avoid false positives, the value of $h_{i_{t+1}}$ cannot be fixed to 1 . And to ensure that the circuit evaluated to 1 on at least $2^{N-1} / m$ inputs, it cannot be fixed to 0 .

In this way, we can repeat the argument for $k$ distinct variables as long as $k<(N-$ 1) $-\lceil\log m\rceil$. Let $k_{0}$ be such that $k_{0}=\omega(\log m)$ and $k_{0}<(N-1)-\lceil\log m\rceil$. We fix $k_{0}$ distinct variables as above. But now note that any other setting of these $k_{0}$ variables fixes the value of $h_{i_{t+1}}$ to 0 . Therefore, the circuit can be 1 on at most $O\left(2^{N-k_{0}}\right)$ inputs. But this contradicts our assumption that $h_{i_{t+1}}$ evaluated to 1 on at least $2^{N} / m$ inputs from $L_{\oplus}$.

## 6 Discussion

The above study provides a wide range of interesting questions, answers to which may improve our understanding of functions in $N C^{1}$. Namely, the questions regarding lower bounds for width $k$ skew circuits for $4 \leq k \leq 6$. Some of these questions could be more tractable than the daunting question of proving lower bounds for $\mathrm{NC}^{1}$ circuits. We conjecture that Majority is a function with respect to which width 4 circuits will have exponential lower bound. Towards proving such a result, it may be interesting to obtain a normal form for width 4 skew circuits. It may also be possible that any function in $\mathrm{NC}^{1}$ is computable by width $k$ circuits for $4 \leq k \leq 6$. Such a result will tighten the connection between branching programs and skew circuits.

## References

1 D.A. Barrington. Bounded-width polynomial-size branching programs can recognize exactly those languages in $N C^{1}$. Journal of Computer and System Sciences, 38:150-164, 1989.
2 David A Barrington. Width-3 permutation branching programs. Technical Memo MIT/LCS/TM-293, Massachusetts Institute of Technology, Laboratory for Computer Science, 1985.
3 Allan Borodin, Danny Dolev, Faith E Fich, and Wolfgang Paul. Bounds for width two branching programs. SIAM Journal on Computing, 15(2):549-560, 1986.
4 Alex Brodsky. An impossibility gap between width-4 and width-5 permutation branching programs. Information Processing Letters, 94(4):159-164, May 2005.
5 Israel Nathan Herstein. Topics in algebra. John Wiley \& Sons, 2006.
6 Chang-Yeong Lee. Representation of switching circuits by binary-decision programs. Bell System Technical Journal, 38(4):985-999, 1959.
7 William Joseph Masek. A fast algorithm for the string editing problem and decision graph complexity. PhD thesis, Massachusetts Institute of Technology, 1976.
8 BV Raghavendra Rao. A study of width bounded arithmetic circuits and the complexity of matroid isomorphism. [HBNI TH 17], 2010.

9 Alexander A Razborov. Lower bounds on the size of bounded depth circuits over a complete basis with logical addition. Mathematical Notes, 41(4):333-338, 1987.
10 Roman Smolensky. Algebraic methods in the theory of lower bounds for boolean circuit complexity. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 77-82. ACM, 1987.
11 Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337-375, 2009.


[^0]:    1 This is often called the strong acceptance condiction. Other notions of acceptance have been studied in the literature. See for example [4]

[^1]:    2 This assumption is not without loss of generality. However, we will see that when a branching program is converted into a skew circuits, exactly this type of skew circuits arise.

