TR22-159 Authors: Songhua He, Periklis Papakonstantinou

Publication: 18th November 2022 21:13

Downloads: 152

Keywords:

Deep neural networks are the dominant machine learning model. We show that this model is missing a crucial complexity parameter. Today, the standard neural network (NN) model is a circuit whose gates (neurons) are ReLU units. The complexity of a NN is quantified by the depth (number of layers) and the size (number of neurons = depth times width). This work shows that this alone is insufficient, resulting in NNs with unreasonable computing power. We show that the correct way to talk about the size complexity of a NN is beside the number of neurons to consider the precision (or magnitude) of the weights of the ReLU units. The main message of this work is that if the precision of the weights is not considered in the complexity of the NN then one can engineer weights to "buy" exponentially many neurons for free. In summary, we make three theoretical contributions, potentially affecting many theoretical works on NNs.

1. Every function $f:\{0,1\}^n\to\{0,1\}$ can be computed with $O(\sqrt{2^n})$ many neurons and constant fan-in per neuron; i.e.~exponential times less than Shannon's classic lower bound for usual combinatorial circuits.

2. We give a new definition of circuit size that takes into account the precision/magnitude of the weights. Under this new definition of size we asymptotically match Shannon's bound for NNs.

3. We complement the above results showing that P-uniform NNs decide exactly P.