ECCC-Report TR95-022https://eccc.weizmann.ac.il/report/1995/022Comments and Revisions published for TR95-022en-usTue, 02 May 1995 17:33:29 +0300
Paper TR95-022
| Pattern-Matching for Strings with Short Descriptions |
Marek Karpinski,
Wojciech Rytter,
Ayumi Shinohara
https://eccc.weizmann.ac.il/report/1995/022We consider strings which are succinctly described. The description
is in terms of straight-line programs in which the constants are
symbols and the only operation is the concatenation. Such
descriptions correspond to the systems of recurrences or to
context-free grammars generating single words. The descriptive
size of a string is the length $n$ of a straight-line program (or
size of a grammar) which defines this string. Usually the strings
of descriptive size $n$ are of exponential length.
{\em Fibonacci \/} and {\em Thue-Morse words \/} are examples of
such strings. We show that for a pattern $P$ and text $T$ of
descriptive sizes $m, n$, an occurrence of $P$ in $T$ can be found
(if there is any) in time polynomial with respect to $n$. This is
nontrivial, since the actual lengths of $P$ and $T$ could be
exponential, and none of the known string-matching algorithms is
directly applicable. Our first tool is the periodicity lemma, which
allows to represent some sets of exponentially many positions in
terms of feasibly many arithmetic progressions. The second tool is
arithmetics: a simple application of Euclid algorithm. Hence a
textual problem for exponentially long strings is reduced here to
simple arithmetics on integers with (only) linearly many bits. We
present also an NP-complete version of the pattern-matching for
shortly described strings.
Tue, 02 May 1995 17:33:29 +0300https://eccc.weizmann.ac.il/report/1995/022