Revision #2 Authors: Paul Beame, Dang-Trinh Huynh-Ngoc

Accepted on: 5th May 2008 00:00

Downloads: 3642

Keywords:

Revision #1 Authors: Paul Beame, Dang-Trinh Huynh-Ngoc

Accepted on: 1st May 2008 00:00

Downloads: 3045

Keywords:

TR08-024 Authors: Paul Beame, Trinh Huynh

Publication: 11th March 2008 07:16

Downloads: 3049

Keywords:

Recently, an extension of the standard data stream model has been introduced in which an algorithm can create and manipulate multiple read/write streams in addition to its input data stream. Like the data stream model, the most important parameter for this model is the amount of internal memory used by such an algorithm. The other key parameters are the number of streams the algorithm uses and the number of passes it makes on these streams. We consider how the addition of these multiple read/write streams impacts the ability of algorithms to approximate the frequency moments of the input stream.

We show that any randomized read/write stream algorithm with a fixed number of streams and a sub-logarithmic number of passes that approximates the k-th frequency moment F_k of an input sequence of length of at most N from {1,...,N} within a constant factor requires space \Omega(N^{1-4/k-\delta}) for any \delta > 0. For comparison, it is known that with a single read-only data stream there is a randomized constant-factor approximation for F_k using \tilde O(N^{1-2/k}) space and that there is a deterministic algorithm computing F_k exactly using 3 read/write streams, O(log N) passes, and O(log N) space. Our lower bounds also apply to (1+\epsilon)-approximations of F_k for \epsilon \ge 1/N.