Weizmann Logo
ECCC
Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

Login | Register | Classic Style



REPORTS > DETAIL:

Revision(s):

Revision #1 to TR15-083 | 22nd December 2021 12:10

The Simultaneous Communication of Disjointness with Applications to Data Streams

RSS-Feed




Revision #1
Authors: Omri Weinstein, David Woodruff
Accepted on: 22nd December 2021 12:10
Downloads: 122
Keywords: 


Abstract:

We study $k$-party set disjointness in the simultaneous message-passing model, and show that even if each element $i\in[n]$ is guaranteed to either belong to all $k$ parties or to at most $O(1)$ parties in expectation (and to at most $O(\log n)$ parties with high probability), then $\Omega(n \min(\log 1/\delta, \log k) / k )$ communication is required by any $\delta$-error communication protocol for this problem (assuming $k = \Omega(\log n)$).

We use the strong promise of our lower bound, together with a recent characterization of turnstile streaming algorithms as linear sketches, to obtain new lower bounds for the well-studied problem in data streams of approximating the frequency moments. We obtain a space lower bound of $\Omega(n^{1-2/p} \epsilon^{-2} \log M \log 1/\delta)$ bits for any algorithm giving a $(1+\epsilon)$-approximation to the $p$-th moment $\sum_{i=1}^n |x_i|^p$ of an $n$-dimensional vector $x\in\{\pm M\}^n$ with probability $1-\delta$, for any $\delta \geq 2^{-o(n^{1/p})}$.

Our lower bound improves upon a prior $\Omega(n^{1-2/p} \epsilon^{-2} \log M)$ lower bound which did not capture the dependence on $\delta$, and our bound is optimal whenever $\epsilon \leq 1/\textrm{poly}(\log n)$. This is the first example of a lower bound in data streams which uses a characterization in terms of linear sketches to obtain stronger lower bounds than obtainable via the one-way communication model; indeed, our set disjointness lower bound provably cannot hold in the one-way model.



Changes to previous version:

The bound in Lemma 6.2 was fixed, thanks to Yi Li who pointed it out. The fix does not affect (asymptotically) any of the main results.


Paper:

TR15-083 | 14th May 2015 10:27

The Simultaneous Communication of Disjointness with Applications to Data Streams


Abstract:

We study $k$-party set disjointness in the simultaneous message-passing model, and show that even if each element $i\in[n]$ is guaranteed to either belong to all $k$ parties or to at most $O(1)$ parties in expectation (and to at most $O(\log n)$ parties with high probability), then $\Omega(n \min(\log 1/\delta, \log k) / k )$ communication is required by any $\delta$-error communication protocol for this problem (assuming $k = \Omega(\log n)$).

We use the strong promise of our lower bound, together with a recent characterization of turnstile streaming algorithms as linear sketches, to obtain new lower bounds for the well-studied problem in data streams of approximating the frequency moments. We obtain a space lower bound of $\Omega(n^{1-2/p} \epsilon^{-2} \log M \log 1/\delta)$ bits for any algorithm giving a $(1+\epsilon)$-approximation to the $p$-th moment $\sum_{i=1}^n |x_i|^p$ of an $n$-dimensional vector $x\in\{\pm M\}^n$ with probability $1-\delta$, for any $\delta \geq 2^{-o(n^{1/p})}$.

Our lower bound improves upon a prior $\Omega(n^{1-2/p} \epsilon^{-2} \log M)$ lower bound which did not capture the dependence on $\delta$, and our bound is optimal whenever $\epsilon \leq 1/\textrm{poly}(\log n)$. This is the first example of a lower bound in data streams which uses a characterization in terms of linear sketches to obtain stronger lower bounds than obtainable via the one-way communication model; indeed, our set disjointness lower bound provably cannot hold in the one-way model.



ISSN 1433-8092 | Imprint