TR04-002 Authors: Troy Lee, Dieter van Melkebeek, Harry Buhrman

Publication: 11th January 2004 21:09

The language compression problem asks for succinct descriptions of the
strings in a language $A$ such that the strings can be efficiently
recovered from their descriptions when given a membership oracle for $A$.
We study randomized and nondeterministic decompression schemes and
investigate how close we can get to the information-theoretic lower bound
of $\log |A^{=n}|$ for the description length of strings of length $n$.
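
As a point of reference, the information-theoretic bound is easy to meet if efficiency is ignored: describe a string by its index in the lexicographic enumeration of $A^{=n}$. The Python sketch below illustrates this under stated assumptions. The names `members_of_length`, `compress`, `decompress`, and the toy oracle `even_parity` are hypothetical, and the decompressor is assumed to be given $n$ separately. It makes $2^n$ oracle queries, which is exactly the cost that efficient schemes have to avoid.

```python
from itertools import product
from math import ceil, log2
from typing import Callable, List

Oracle = Callable[[str], bool]  # membership oracle for the language A


def members_of_length(oracle: Oracle, n: int) -> List[str]:
    """Enumerate A^{=n} in lexicographic order using 2^n oracle queries."""
    candidates = ("".join(bits) for bits in product("01", repeat=n))
    return [s for s in candidates if oracle(s)]


def compress(x: str, oracle: Oracle) -> str:
    """Describe x by its index in A^{=n}, padded to ceil(log2 |A^{=n}|) bits."""
    members = members_of_length(oracle, len(x))
    width = max(1, ceil(log2(len(members))))  # use 1 bit when |A^{=n}| = 1
    return format(members.index(x), "b").zfill(width)


def decompress(description: str, n: int, oracle: Oracle) -> str:
    """Recover x from its index; n is assumed to be known to the decompressor."""
    return members_of_length(oracle, n)[int(description, 2)]


def even_parity(s: str) -> bool:
    """Toy oracle (an assumption for illustration): strings with an even number of 1s."""
    return s.count("1") % 2 == 0


if __name__ == "__main__":
    d = compress("0110", even_parity)
    print(d, decompress(d, 4, even_parity))
```

For the toy oracle, $|A^{=4}| = 8$, so every member of length 4 gets a 3-bit description.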

Using nondeterminism alone, we can achieve the information-theoretic
lower bound up to an additive term of $O(\sqrt{\log |A^{=n}|} \log n)$;
using both nondeterminism and randomness, we can make do with an excess
term of $O(\log^3 n)$. With randomness alone, we show a lower bound of
$n - \log |A^{=n}| - O(\log n)$ on the description length of strings in
$A$ of length $n$, and a lower bound of $2 \log |A^{=n}| - O(1)$ on the
length of any program that distinguishes a given string of length $n$ in
$A$ from any other string. The latter lower bound is tight up to an
additive term of $O(\log n)$.
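
To get a feel for how these bounds compare, consider the illustrative case $|A^{=n}| = 2^{n/2}$ (a density chosen here as an assumption, not taken from the paper). The block below only substitutes $\log |A^{=n}| = n/2$ into the formulas quoted above, leaving the hidden constants unspecified; it does not say for which languages the lower bounds are attained.

```latex
\begin{align*}
\text{information-theoretic bound:} \quad
  & \log |A^{=n}| = n/2 \\
\text{nondeterminism alone (upper bound):} \quad
  & n/2 + O\bigl(\sqrt{n/2}\,\log n\bigr) = n/2 + O(\sqrt{n}\,\log n) \\
\text{nondeterminism and randomness (upper bound):} \quad
  & n/2 + O(\log^3 n) \\
\text{randomness alone (lower bound):} \quad
  & n - n/2 - O(\log n) = n/2 - O(\log n) \\
\text{distinguishing programs (lower bound):} \quad
  & 2 \cdot n/2 - O(1) = n - O(1)
\end{align*}
```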

The key ingredients for our upper bounds are relativizable hardness
versus randomness tradeoffs based on the Nisan-Wigderson pseudorandom
generator construction.