On Parallelizing Streaming Algorithms

Anup Rao; Makrand Sinha

Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

REPORTS > DETAIL:

Paper:

TR15-039 | 16th March 2015 23:41

On Parallelizing Streaming Algorithms

TR15-039 Authors: Anup Rao, Makrand Sinha
Publication: 16th March 2015 23:47
Downloads: 2265

Keywords:

direct sum, information complexity, Streaming Algorithms

Abstract:

We study the complexity of parallelizing streaming algorithms (or equivalently, branching programs). If $M(f)$ denotes the minimum average memory required to compute a function $f(x_1,x_2, \dots, x_n)$ how much memory is required to compute $f$ on $k$ independent streams that arrive in parallel? We show that when the inputs (updates) are sampled independently from some domain $\mathcal{X}$ and $M(f) = \Omega(n)$, then computing the value of $f$ on $k$ streams requires average memory at least $\Omega\left(k \cdot \frac{M(f)}{n}\right)$.

Our results are obtained by defining new ways to measure the information complexity of streaming algorithms. We define two such measures: the \emph{transitional} and \emph{cumulative} information content. We prove that any streaming algorithm with transitional information content $I$ can be simulated using average memory $\mathcal{O}(n(I+1))$. On the other hand, if every algorithm with cumulative information content $I$ can be simulated with average memory $\mathcal{O}(I+1)$, then computing $f$ on $k$ streams requires average memory at least $\Omega(k \cdot (M(f)-1))$.

ISSN 1433-8092 | Imprint