We consider the problem of constructing binary codes to recover from k-bit deletions with efficient encoding/decoding, for a fixed k. The single deletion case is well understood, with the Varshamov-Tenengolts-Levenshtein code from 1965 giving an asymptotically optimal construction with \approx 2^n/n codewords of length n, i.e., at most \log n bits of redundancy. However, even for the case of two deletions, there was no known explicit construction with redundancy less than n^{\Omega(1)}.
For any fixed k, we construct a binary code with c_k \log n redundancy that can be decoded from k deletions in O_k(n \log^4 n) time. The coefficient c_k can be taken to be O(k^2 \log k), which is only quadratically worse than the optimal, non-constructive bound of O(k). We also indicate how to modify this code to allow for a combination of up to k insertions and deletions.
The claim that linear k-bit deletion codes must have rate at most about 1/(k+1) made in the appendix of the previous version is false, and is retracted in this version.
We consider the problem of constructing binary codes to recover from k-bit deletions with efficient encoding/decoding, for a fixed k. The single deletion case is well understood, with the Varshamov-Tenengolts-Levenshtein code from 1965 giving an asymptotically optimal construction with \approx 2^n/n codewords of length n, i.e., at most \log n bits of redundancy. However, even for the case of two deletions, there was no known explicit construction with redundancy less than n^{\Omega(1)}.
For any fixed k, we construct a binary code with c_k \log n redundancy that can be decoded from k deletions in O_k(n \log^4 n) time. The coefficient c_k can be taken to be O(k^2 \log k), which is only quadratically worse than the optimal, non-constructive bound of O(k). We also indicate how to modify this code to allow for a combination of up to k insertions and deletions.
We also note that among linear codes capable of correcting k deletions, the (k+1)-fold repetition code is essentially the best possible.