Efficient Low-Redundancy Codes for Correcting Multiple Deletions

Joshua Brakensiek; Venkatesan Guruswami; Samuel Zbarsky

Electronic Colloquium on Computational Complexity

Under the auspices of the Computational Complexity Foundation (CCF)

REPORTS > DETAIL:

Revision(s):

Revision #1 to TR15-116 | 20th May 2019 01:41

Efficient Low-Redundancy Codes for Correcting Multiple Deletions

Revision #1 Authors: Joshua Brakensiek, Venkatesan Guruswami, Samuel Zbarsky
Accepted on: 20th May 2019 01:41
Downloads: 785

Keywords:

Abstract:

We consider the problem of constructing binary codes to recover from $k$-bit deletions with efficient encoding/decoding, for a fixed $k$. The single deletion case is well understood, with the Varshamov-Tenengolts-Levenshtein code from 1965 giving an asymptotically optimal construction with $\approx 2^n/n$ codewords of length $n$, i.e., at most $\log n$ bits of redundancy. However, even for the case of two deletions, there was no known explicit construction with redundancy less than $n^{\Omega(1)}$.

For any fixed $k$, we construct a binary code with $c_k \log n$ redundancy that can be decoded from $k$ deletions in $O_k(n \log^4 n)$ time. The coefficient $c_k$ can be taken to be $O(k^2 \log k)$, which is only quadratically worse than the optimal, non-constructive bound of $O(k)$. We also indicate how to modify this code to allow for a combination of up to $k$ insertions and deletions.

Changes to previous version:

The claim that linear $k$-bit deletion codes must have rate at most about $1/(k+1)$ made in the appendix of the previous version is false, and is retracted in this version.

Paper:

TR15-116 | 21st July 2015 13:34

Efficient Low-Redundancy Codes for Correcting Multiple Deletions

TR15-116 Authors: Joshua Brakensiek, Venkatesan Guruswami, Samuel Zbarsky
Publication: 21st July 2015 13:34
Downloads: 2373

Keywords:

Edit Distance, error correction, explicit construction, hashing, Pseudorandomness

Abstract:

We also note that among linear codes capable of correcting $k$ deletions, the $(k+1)$-fold repetition code is essentially the best possible.

ISSN 1433-8092 | Imprint