__
TR18-129 | 13th July 2018 00:46
__

#### Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation

**Abstract:**
We show optimal lower bounds for spanning forest computation in two different models:

* One wants a data structure for fully dynamic spanning forest in which updates can insert or delete edges amongst a base set of $n$ vertices. The sole allowed query asks for a spanning forest, which the data structure should successfully answer with some given (potentially small) constant probability $\epsilon>0$. We prove that any such data structure must use $\Omega(n\log^3 n)$ bits of memory.

* There is a referee and $n$ vertices in a network sharing public randomness, and each vertex knows only its neighborhood; the referee receives no input. The vertices each send a message to the referee who then computes a spanning forest of the graph with constant probability $\epsilon>0$. We prove the average message length must be $\Omega(\log^3 n)$ bits.

Both our lower bounds are optimal, with matching upper bounds provided by the AGM sketch \cite{AhnGM12} (which even succeeds with probability $1 - 1/\mathrm{poly}(n)$). Furthermore, for the first setting we show optimal lower bounds even for low failure probability $\delta$, as long as $\delta > 2^{-n^{1-\epsilon}}$.