Revision #1 Authors: Bodo Manthey, Till Tantau

Accepted on: 28th May 2008 00:00

Downloads: 2276

Binary search trees are a fundamental data structure and their height plays a key role in the analysis of divide-and-conquer algorithms like quicksort. We analyze their smoothed height under additive uniform noise: An adversary chooses a sequence of n real numbers in the range [0,1]; each number is individually perturbed by adding a value drawn uniformly at random from an interval of size d; and the resulting numbers are inserted into a search tree. An analysis of the smoothed tree height as a function of n and d lies at the heart of our paper: We prove that the smoothed height of binary search trees is $\Theta (\sqrt{n/d} + \log n)$, where $d \ge 1/n$ may depend on n. Our analysis starts with the simpler problem of determining the smoothed number of left-to-right maxima in a sequence, for which we establish the same matching bounds, $\Theta (\sqrt{n/d} + \log n)$. We also apply our findings to the performance of the quicksort algorithm and prove that the smoothed number of comparisons made by quicksort is $\Theta(\frac{n}{d+1} \sqrt{n/d} + n \log n)$.
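The perturbation model described in the abstract can be explored empirically. The following is a minimal sketch, not the authors' method: it assumes the noise interval is [0, d], counts height in nodes on the longest root-to-leaf path, and uses the sorted sequence i/n as one natural adversarial input (the paper's adversary takes the worst case over all sequences). The function names `perturb`, `left_to_right_maxima`, `bst_height`, and `estimate_smoothed` are hypothetical helpers introduced here for illustration.

```python
import random


def perturb(seq, d, rng):
    """Add independent uniform noise to each value (assumed interval: [0, d])."""
    return [x + rng.uniform(0.0, d) for x in seq]


def left_to_right_maxima(seq):
    """Count elements that are strictly larger than all elements before them."""
    count = 0
    best = float("-inf")
    for x in seq:
        if x > best:
            count += 1
            best = x
    return count


def bst_height(seq):
    """Height, in nodes, of the unbalanced BST obtained by inserting seq in order."""
    root = None  # each node is a list [key, left_child, right_child]
    height = 0
    for x in seq:
        if root is None:
            root = [x, None, None]
            height = 1
            continue
        node, depth = root, 1
        while True:
            idx = 1 if x < node[0] else 2  # 1 = left subtree, 2 = right subtree
            depth += 1
            if node[idx] is None:
                node[idx] = [x, None, None]
                break
            node = node[idx]
        height = max(height, depth)
    return height


def estimate_smoothed(seq, d, trials=20, seed=0):
    """Monte Carlo estimate of (mean tree height, mean left-to-right maxima)."""
    rng = random.Random(seed)
    total_height = 0
    total_maxima = 0
    for _ in range(trials):
        noisy = perturb(seq, d, rng)
        total_height += bst_height(noisy)
        total_maxima += left_to_right_maxima(noisy)
    return total_height / trials, total_maxima / trials


if __name__ == "__main__":
    n = 2000
    sorted_input = [i / n for i in range(n)]  # one adversarial choice (assumption)
    for d in (1 / (2 * n), 0.01, 1.0, 100.0):
        height, maxima = estimate_smoothed(sorted_input, d)
        print(f"d={d:g}: height ~ {height:.1f}, left-to-right maxima ~ {maxima:.1f}")
```

Running the script for increasing d should show both quantities falling from n toward logarithmic size, in line with the stated $\Theta (\sqrt{n/d} + \log n)$ bound; the constants observed depend on the chosen input and the number of trials.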

TR07-039 Authors: Bodo Manthey, Till Tantau

Publication: 26th April 2007 17:46

Downloads: 2360

Binary search trees are a fundamental data structure and their height plays a key role in the analysis of divide-and-conquer algorithms like quicksort. Their worst-case height is linear; their average height, whose exact value is one of the best-studied problems in average-case complexity, is logarithmic. We analyze their smoothed height under additive noise: An adversary chooses a sequence of n real numbers in the range [0,1]; each number is individually perturbed by adding a random value from an interval of size d; and the resulting numbers are inserted into a search tree. The expected height of this tree is called smoothed tree height. If d is very small, namely for d < 1/n, the smoothed tree height is the same as the worst-case height; if d is very large, the smoothed tree height approaches the logarithmic average-case height. An analysis of what happens between these extremes lies at the heart of our paper: We prove that the smoothed height of binary search trees is $\Theta (\sqrt{n/d} + \log n)$, where $d \ge 1/n$ may depend on n. This implies that the logarithmic average-case height becomes manifest only for $d \in \Omega (n/\log^2 n)$. For the analysis, we first prove that the smoothed number of left-to-right maxima in a sequence is also $\Theta (\sqrt{n/d} + \log n)$. We apply these findings to the performance of the quicksort algorithm, which needs $\Theta(n^2)$ comparisons in the worst case and $\Theta(n \log n)$ on average, and prove that the smoothed number of comparisons made by quicksort is $\Theta(\frac{n}{d+1} \sqrt{n/d} + n \log n)$. This implies that the average-case becomes manifest already for $d \in \Omega(\sqrt[3]{n/\log^2 n})$.
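The two thresholds for d can be checked by comparing the terms of each bound. The derivation below is a sketch of that arithmetic and not taken from the paper; for the quicksort bound it assumes $d \ge 1$, so that $d + 1 \in \Theta(d)$.

```latex
% Tree height: the average-case term \log n dominates once
\begin{align*}
  \sqrt{n/d} \le \log n
  &\iff n/d \le \log^2 n
   \iff d \ge \frac{n}{\log^2 n}.
\end{align*}
% Quicksort (assuming d \ge 1, hence d + 1 = \Theta(d)):
\begin{align*}
  \frac{n}{d+1}\sqrt{n/d} = \Theta\!\left(\frac{n^{3/2}}{d^{3/2}}\right),
  \qquad
  \frac{n^{3/2}}{d^{3/2}} \le n \log n
  &\iff d^{3/2} \ge \frac{\sqrt{n}}{\log n}
   \iff d \ge \sqrt[3]{\frac{n}{\log^2 n}}.
\end{align*}
```

Both results agree with the stated conditions $d \in \Omega (n/\log^2 n)$ for the tree height and $d \in \Omega(\sqrt[3]{n/\log^2 n})$ for quicksort.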