Pseudo-deterministic Streaming

Shafi Goldwasser; Ofer Grossman; Sidhanth Mohanty; David Woodruff

Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

REPORTS > DETAIL:

Paper:

TR19-177 | 6th December 2019 07:21

Pseudo-deterministic Streaming

TR19-177 Authors: Shafi Goldwasser, Ofer Grossman, Sidhanth Mohanty, David Woodruff
Publication: 6th December 2019 16:44
Downloads: 1884

Keywords:

pseudo-deterministic, Streaming Algorithms

Abstract:

A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy hitters, approximate counting, $\ell_2$ approximation, finding a nonzero entry in a vector (for turnstile algorithms) are not pseudo-deterministic. For example, in the instance of finding a nonzero entry in a vector, for any known low-space algorithm $A$, there exists a stream $x$ so that running $A$ twice on $x$ (using different randomness) would with high probability result in two different entries as the output.

In this work, we study whether it is inherent that these algorithms output different values on different executions. That is, we ask whether these problems have low-memory pseudo-deterministic algorithms. For instance, we show that there is no low-memory pseudo-deterministic algorithm for finding a nonzero entry in a vector (given in a turnstile fashion), and also that there is no low-dimensional pseudo-deterministic sketching algorithm for $\ell_2$ norm estimation. We also exhibit problems which do have low memory pseudo-deterministic algorithms but no low memory deterministic algorithm, such as outputting a nonzero row of a matrix, or outputting a basis for the row-span of a matrix.

We also investigate multi-pseudo-deterministic algorithms: algorithms which with high probability output one of a few options. We show the first lower bounds for such algorithms. This implies that there are streaming problems such that every low space algorithm for the problem must have inputs where there are many valid outputs, all with a significant probability of being outputted.

ISSN 1433-8092 | Imprint