We study graph computations in an enhanced data streaming setting, where a space-bounded client reading the edge stream of a massive graph may delegate some of its work to a cloud service. We seek algorithms that allow the client to verify a purported proof sent by the cloud service that ... more >>>
We give new algorithms in the annotated data streaming setting---also known as verifiable data stream computation---for certain graph problems. This setting is meant to model outsourced computation, where a space-bounded verifier limited to sequential data access seeks to overcome its computational limitations by engaging a powerful prover, without needing to ... more >>>
We develop a paradigm for studying multi-player deterministic communication,
based on a novel combinatorial concept that we call a {\em strong fooling
set}. Our paradigm leads to optimal lower bounds on the per-player
communication required for solving multi-player $\textsc{equality}$
problems in a private-message setting. This in turn gives a ...
more >>>
Set cover, over a universe of size $n$, may be modelled as a
data-streaming problem, where the $m$ sets that comprise the instance
are to be read one by one. A semi-streaming algorithm is allowed only
$O(n \text{ poly}\{\log n, \log m\})$ space to process this ...
more >>>
In the setting of streaming interactive proofs (SIPs), a client (verifier) needs to compute a given function on a massive stream of data, arriving online, but is unable to store even a small fraction of the data. It outsources the processing to a third party service (prover), but is unwilling ... more >>>
We introduce {\em online interactive proofs} (OIP), which are a hierarchy of communication complexity models that involve both randomness and nondeterminism (thus, they belong to the Arthur--Merlin family), but are {\em online} in the sense that the basic communication flows from Alice to Bob alone. The complexity classes defined by ... more >>>
The \textsc{equality} problem is usually one's first encounter with
communication complexity and is one of the most fundamental problems in the
field. Although its deterministic and randomized communication complexity
were settled decades ago, we find several new things to say about the
problem by focusing on two subtle aspects. The ...
more >>>
The central goal of data stream algorithms is to process massive streams of data using sublinear storage space. Motivated by work in the database community on outsourcing database and data stream processing, we ask whether the space usage of such algorithms can be further reduced by enlisting a more powerful ... more >>>
We study the communication complexity of evaluating functions when the input data is randomly allocated (according to some known distribution) amongst two or more players, possibly with information overlap. This naturally extends previously studied variable partition models such as the best-case and worst-case partition models. We aim to understand whether ... more >>>
We prove an optimal $\Omega(n)$ lower bound on the randomized
communication complexity of the much-studied
Gap-Hamming-Distance problem. As a consequence, we
obtain essentially optimal multi-pass space lower bounds in the
data stream model for a number of fundamental problems, including
the estimation of frequency moments.
The Gap-Hamming-Distance problem is a ... more >>>
The deterministic space complexity of approximating the length of the longest increasing subsequence of
a stream of $N$ integers is known to be $\widetilde{\Theta}(\sqrt N)$. However, the randomized
complexity is wide open. We show that the technique used in earlier work to establish the $\Omega(\sqrt
N)$ deterministic lower bound fails ...
more >>>
This paper makes three main contributions to the theory of communication complexity and stream computation. First, we present new bounds on the information complexity of AUGMENTED-INDEX. In contrast to analogous results for INDEX by Jain, Radhakrishnan and Sen [J. ACM, 2009], we have to overcome the significant technical challenge that ... more >>>
The Gap-Hamming-Distance problem arose in the context of proving space
lower bounds for a number of key problems in the data stream model. In
this problem, Alice and Bob have to decide whether the Hamming distance
between their $n$-bit input strings is large (i.e., at least $n/2 +
\sqrt n$) ...
more >>>
We consider the $k$-layer pointer jumping problem in the one-way
multi-party number-on-the-forehead communication model. In this problem,
the input is a layered directed graph with each vertex having outdegree
$1$, shared amongst $k$ players: Player~$i$ knows all layers {\em
except} the $i$th. The players must communicate, in the order
$1,2,\ldots,k$, ...
more >>>
We consider the approximate nearest neighbour search problem on the
Hamming Cube $\b^d$. We show that a randomised cell probe algorithm that
uses polynomial storage and word size $d^{O(1)}$ requires a worst case
query time of $\Omega(\log\log d/\log\log\log d)$. The approximation
factor may be as loose as $2^{\log^{1-\eta}d}$ for any ...
more >>>