Binary search trees are a fundamental data structure and their height
plays a key role in the analysis of divide-and-conquer algorithms like
quicksort. Their worst-case height is linear; their average height,
whose exact value is one of the best-studied problems in average-case
complexity, is logarithmic. We analyze their smoothed height under
additive noise: An adversary chooses a sequence of n real numbers in the
range [0,1]; each number is individually perturbed by adding a value
drawn uniformly at random from an interval of size d; and the resulting
numbers are inserted into a search tree. The expected height of this
tree is called the smoothed
tree height. If d is very small, namely for d < 1/n, the smoothed tree
height is the same as the worst-case height; if d is very large, the
smoothed tree height approaches the logarithmic average-case height. An
analysis of what happens between these extremes lies at the heart of our
paper: We prove that the smoothed height of binary search trees is
\Theta (\sqrt{n/d} + \log n), where d \ge 1/n may depend on n. This
implies that the logarithmic average-case height becomes manifest only
for d \in \Omega (n/\log^2 n). For the analysis, we first prove that
the smoothed number of left-to-right maxima in a sequence is also
\Theta (\sqrt{n/d} + \log n). We apply these findings to the
performance of the quicksort algorithm, which needs \Theta(n^2)
comparisons in the worst case and \Theta(n \log n) on average, and
prove that the smoothed number of comparisons made by quicksort is
\Theta(\frac{n}{d+1} \sqrt{n/d} + n \log n). This implies that the
average-case behavior becomes manifest already for
d \in \Omega(\sqrt[3]{n/\log^2 n}).
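
The two thresholds above follow from the Theta-bounds by a short
back-of-the-envelope calculation (our own rearrangement, stated here
only to make the claims easy to verify): the height bound
\Theta (\sqrt{n/d} + \log n) is dominated by its logarithmic term
exactly when \sqrt{n/d} \le \log n, and for d \ge 1 the quicksort bound
satisfies \frac{n}{d+1} \sqrt{n/d} \le (n/d)^{3/2}, so it is dominated
by the n \log n term once (n/d)^{3/2} \le n \log n. Rearranging and
ignoring constant factors,
\[
  \sqrt{n/d} \le \log n \;\Longleftrightarrow\; d \ge \frac{n}{\log^{2} n},
  \qquad
  \Bigl(\frac{n}{d}\Bigr)^{3/2} \le n \log n
  \;\Longleftrightarrow\; d \ge \sqrt[3]{\frac{n}{\log^{2} n}}.
\]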
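
The perturbation model itself is easy to simulate. The following sketch
is our own illustration (the function names, the noise interval [0, d],
and the choice of a sorted adversarial input are assumptions made for
this example, not code from the paper): it perturbs a sorted sequence
in [0,1], inserts the result into an unbalanced binary search tree, and
reports the tree height together with the number of left-to-right
maxima.

# A minimal simulation sketch of the perturbation model described above.
# Illustrative only: the sorted adversarial input, the noise interval
# [0, d], and all names below are choices made for this example.

import random


def bst_height(values):
    """Insert values in order into an unbalanced binary search tree
    (iteratively, to avoid recursion limits) and return its height."""
    root = None            # each node is a list [key, left, right]
    height = 0
    for x in values:
        if root is None:
            root, depth = [x, None, None], 1
        else:
            node, depth = root, 1
            while True:
                depth += 1
                i = 1 if x < node[0] else 2
                if node[i] is None:
                    node[i] = [x, None, None]
                    break
                node = node[i]
        height = max(height, depth)
    return height


def left_to_right_maxima(values):
    """Count elements that exceed everything appearing before them."""
    count, best = 0, float("-inf")
    for x in values:
        if x > best:
            count, best = count + 1, x
    return count


def smoothed_instance(n, d):
    """Adversarial sequence (sorted, a worst case for the tree height),
    each entry perturbed by independent uniform noise from [0, d]."""
    return [i / n + random.uniform(0.0, d) for i in range(n)]


if __name__ == "__main__":
    n = 2000
    for d in (1 / n, 0.1, 1.0, 100.0):
        seq = smoothed_instance(n, d)
        print(f"d = {d:8.4f}: height = {bst_height(seq):4d}, "
              f"left-to-right maxima = {left_to_right_maxima(seq):4d}")

For a fixed n, increasing d from 1/n toward n should move the reported
height from nearly linear toward logarithmic, in line with the
\Theta (\sqrt{n/d} + \log n) behavior described above.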