A natural algorithmic scheme in online game playing is called `follow-the-leader', first proposed by Hannan in the 1950's. Simply stated, this method advocates the use of past history to make future predictions, by using the optimal strategy so far as the strategy for the next game iteration. Randomized variations on this method for the special case of linear regret functions have been rigorously analyzed and have found numerous applications in machine learning and game theory. It was a long standing open problem whether the `follow the leader' method attains any non-trivial regret guarantees for the case of concave regret functions. This question is significant since 'follow-the-leader' is a natural deterministic method, easy to implement and computationally efficient.
We introduce a new analysis technique and show that a deterministic variant of this method has optimal regret. This result is applicable to a variety of online optimization scenarios, including regret minimization for Lipschitz regret functions, universal portfolios management, online convex optimization and online utility maximization. For the well studied universal portfolio management problem, our algorithm combines optimal regret with computational efficiency. For the general setting, our algorithm achieves exponentially lower regret than previous algorithms.
Our analysis shows a surprising connection between interior point methods and online optimization using follow-the-leader.