Revision #1 Authors: Omri Weinstein, David Woodruff

Accepted on: 22nd December 2021 12:10

Downloads: 40

Keywords:

We study $k$-party set disjointness in the simultaneous message-passing model, and show that even if each element $i\in[n]$ is guaranteed to either belong to all $k$ parties or to at most $O(1)$ parties in expectation (and to at most $O(\log n)$ parties with high probability), then $\Omega(n \min(\log 1/\delta, \log k) / k )$ communication is required by any $\delta$-error communication protocol for this problem (assuming $k = \Omega(\log n)$).

We use the strong promise of our lower bound, together with a recent characterization of turnstile streaming algorithms as linear sketches, to obtain new lower bounds for the well-studied problem in data streams of approximating the frequency moments. We obtain a space lower bound of $\Omega(n^{1-2/p} \epsilon^{-2} \log M \log 1/\delta)$ bits for any algorithm giving a $(1+\epsilon)$-approximation to the $p$-th moment $\sum_{i=1}^n |x_i|^p$ of an $n$-dimensional vector $x\in\{\pm M\}^n$ with probability $1-\delta$, for any $\delta \geq 2^{-o(n^{1/p})}$.

Our lower bound improves upon a prior $\Omega(n^{1-2/p} \epsilon^{-2} \log M)$ lower bound which did not capture the dependence on $\delta$, and our bound is optimal whenever $\epsilon \leq 1/\textrm{poly}(\log n)$. This is the first example of a lower bound in data streams which uses a characterization in terms of linear sketches to obtain stronger lower bounds than obtainable via the one-way communication model; indeed, our set disjointness lower bound provably cannot hold in the one-way model.

The bound in Lemma 6.2 was fixed, thanks to Yi Li who pointed it out. The fix does not affect (asymptotically) any of the main results.

TR15-083 Authors: Omri Weinstein, David Woodruff

Publication: 15th May 2015 19:02

Downloads: 1429

Keywords:

We study $k$-party set disjointness in the simultaneous message-passing model, and show that even if each element $i\in[n]$ is guaranteed to either belong to all $k$ parties or to at most $O(1)$ parties in expectation (and to at most $O(\log n)$ parties with high probability), then $\Omega(n \min(\log 1/\delta, \log k) / k )$ communication is required by any $\delta$-error communication protocol for this problem (assuming $k = \Omega(\log n)$).

We use the strong promise of our lower bound, together with a recent characterization of turnstile streaming algorithms as linear sketches, to obtain new lower bounds for the well-studied problem in data streams of approximating the frequency moments. We obtain a space lower bound of $\Omega(n^{1-2/p} \epsilon^{-2} \log M \log 1/\delta)$ bits for any algorithm giving a $(1+\epsilon)$-approximation to the $p$-th moment $\sum_{i=1}^n |x_i|^p$ of an $n$-dimensional vector $x\in\{\pm M\}^n$ with probability $1-\delta$, for any $\delta \geq 2^{-o(n^{1/p})}$.

Our lower bound improves upon a prior $\Omega(n^{1-2/p} \epsilon^{-2} \log M)$ lower bound which did not capture the dependence on $\delta$, and our bound is optimal whenever $\epsilon \leq 1/\textrm{poly}(\log n)$. This is the first example of a lower bound in data streams which uses a characterization in terms of linear sketches to obtain stronger lower bounds than obtainable via the one-way communication model; indeed, our set disjointness lower bound provably cannot hold in the one-way model.