We study certain combinatorial aspects of list-decoding, motivated by the exponential gap between the known upper bound (of $O(1/\gamma)$) and lower bound (of $\Omega_p(\log (1/\gamma))$) for the list-size needed to decode up to radius $p$ with rate $\gamma$ away from capacity, i.e., $1-h(p)-\gamma$ (here $p\in (0,1/2)$ and $\gamma > 0$).
Our main result is the following:
We prove that in any binary code $C \subseteq \{0,1\}^n$ of rate $1-h(p)-\gamma$, there must exist a set $\mathcal{L} \subset C$ of $\Omega_p(1/\sqrt{\gamma})$ codewords such that the average distance of the points in $\mathcal{L}$ from their centroid is at most $pn$. In other words, there must exist $\Omega_p(1/\sqrt{\gamma})$ codewords with low ``average radius.'' The standard notion of list-decoding corresponds to working with the maximum distance of a collection of codewords from a center instead of average distance. The average-radius form is in itself quite natural; for instance, the classical Johnson bound in fact implies average-radius list-decodability.
The remaining results concern the standard notion of list-decoding, and help clarify the current state of affairs regarding combinatorial bounds for list-decoding:
1. We give a short simple proof, over all fixed alphabets, of the above-mentioned $\Omega_p(\log (1/\gamma))$ lower bound. Earlier, this bound followed from a complicated, more general result of Blinovsky.
2. We show that one {\em cannot} improve the $\Omega_p(\log (1/\gamma))$ lower bound via techniques based on identifying the zero-rate regime for list decoding of constant-weight codes (this is a typical approach for negative results in coding theory, including the $\Omega_p(\log (1/\gamma))$ list size lower bound). On a positive note, our $\Omega_p(1/\sqrt{\gamma})$ lower bound for the strong form of list decoding does circumvent this barrier.
3. We show a ``reverse connection'' showing that constant-weight codes for list decoding imply general codes for list decoding with higher rate. This shows that the best possible list-size, as a function of the gap $\gamma$ of the rate to the capacity limit, is the same up to constant factors for both constant-weight codes (with weight bounded away from p by a constant) and general codes.
4. We give simple second moment based proofs that w.h.p. a list-size of $\Omega_p(1/\gamma)$ is needed for list decoding {\em random} codes from errors as well as erasures, at rates which are $\gamma$ away from the corresponding capacities. For {\em random linear} codes, the corresponding list size bounds are $\Omega_p(1/\gamma)$ for errors and $\exp(\Omega_{p}(1/\gamma))$ for erasures.
The proof of the main lower bound for average-radius list decoding has been simplified, and the presentation has been improved in several places.
We prove the following results concerning the combinatorics of list decoding, motivated by the exponential gap between the known upper bound (of $O(1/\gamma)$) and lower bound (of $\Omega_p(\log (1/\gamma))$) for the list-size needed to decode up to radius $p$ with rate $\gamma$ away from capacity, i.e., $1-h(p)-\gamma$ (here $p\in (0,1/2)$ and $\gamma > 0$).
We prove that in any binary code $C \subseteq \{0,1\}^n$ of rate $1-h(p)-\gamma$, there must exist a set $\mathcal{L} \subset C$ of $\Omega_p(1/\sqrt{\gamma})$ codewords such that the average distance of the points in $\mathcal{L}$ from their centroid is at most $pn$. In other words, there must exist $\Omega_p(1/\sqrt{\gamma})$ codewords with low ``average radius.'' The motivation for this result is that it gives a list-size lower bound for a strong notion of list decoding; this strong form has been implicitly been used in the previous negative results for list decoding. (The usual notion of list decoding corresponds to replacing average radius by the minimum radius of an enclosing Hamming ball.)
The remaining results are for the usual notion of list decoding:
1. We give a short simple proof, over all fixed alphabets, of the above-mentioned $\Omega_p(\log (1/\gamma))$ lower bound due to Blinovsky.
2. We show that one {\em cannot} improve the $\Omega_p(\log (1/\gamma))$ lower bound via techniques based on identifying the zero-rate regime for list decoding of constant-weight codes (this is a typical approach for negative results in coding theory, including the $\Omega_p(\log (1/\gamma))$ list size lower bound). On a positive note, our $\Omega_p(1/\sqrt{\gamma})$ lower bound for the strong form of list decoding does circumvent this barrier.
3. We show a ``reverse connection'' showing that constant-weight codes for list decoding imply general codes for list decoding with higher rate. This shows that the best possible list-size, as a function of the gap $\gamma$ of the rate to the capacity limit, is the same up to constant factors for both constant-weight codes and general codes.
4. We give simple second moment based proofs that w.h.p. a list-size of $\Omega_p(1/\gamma)$ is needed for list decoding {\em random} codes from errors as well as erasures, at rates which are $\gamma$ away from the corresponding capacities. For {\em random linear} codes, the corresponding list size bounds are $\Omega_p(1/\gamma)$ for errors and $\exp(\Omega_{p}(1/\gamma))$ for erasures.