TR15-039 Authors: Anup Rao, Makrand Sinha

Publication: 16th March 2015 23:47

We study the complexity of parallelizing streaming algorithms (or equivalently, branching programs). If $M(f)$ denotes the minimum average memory required to compute a function $f(x_1,x_2, \dots, x_n)$, how much memory is required to compute $f$ on $k$ independent streams that arrive in parallel? We show that when the inputs (updates) are sampled independently from some domain $\mathcal{X}$ and $M(f) = \Omega(n)$, then computing the value of $f$ on $k$ streams requires average memory at least $\Omega\left(k \cdot \frac{M(f)}{n}\right)$.
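To make the model concrete, the following is a minimal illustrative sketch (not from the paper; the names `parallel_eval` and `majority` are our own) of $k$ streams whose updates arrive in parallel in round-robin order. Here $f$ is majority, whose natural one-pass algorithm keeps one counter per stream, so the state grows linearly in $k$, which is the kind of memory scaling the lower bound concerns.

```python
# Illustrative sketch of the k-parallel-streams model (assumed names,
# not the paper's construction). Updates x_1, ..., x_n of each of k
# streams arrive interleaved; the algorithm must output f on every
# stream. With f = majority, the natural algorithm keeps a running
# counter per stream, so its memory grows with k.

def majority(stream):
    """f: 1 if more than half the entries are 1, else 0."""
    return int(sum(stream) > len(stream) / 2)

def parallel_eval(streams):
    """Process k streams arriving in parallel (round-robin order),
    maintaining one counter per stream as the working memory."""
    k = len(streams)
    n = len(streams[0])
    counters = [0] * k  # per-stream state: k counters in memory at once
    # round-robin arrival: update t of every stream, then update t+1, ...
    for t in range(n):
        for i in range(k):
            counters[i] += streams[i][t]
    return [int(c > n / 2) for c in counters]

streams = [[1, 1, 0, 1], [0, 0, 1, 0], [1, 0, 1, 1]]
print(parallel_eval(streams))          # per-stream majorities
print([majority(s) for s in streams])  # agrees with direct evaluation
```

The sketch only shows the input model and the naive memory cost; the paper's result is a lower bound saying that for hard $f$ (with $M(f) = \Omega(n)$), a linear-in-$k$ dependence of the memory is unavoidable up to the $M(f)/n$ factor.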

Our results are obtained by defining new ways to measure the information complexity of streaming algorithms. We define two such measures: the \emph{transitional} and \emph{cumulative} information content. We prove that any streaming algorithm with transitional information content $I$ can be simulated using average memory $\mathcal{O}(n(I+1))$. On the other hand, if every algorithm with cumulative information content $I$ can be simulated with average memory $\mathcal{O}(I+1)$, then computing $f$ on $k$ streams requires average memory at least $\Omega(k \cdot (M(f)-1))$.