TR11-115 Authors: Varun Kanade, Thomas Steinke

Publication: 19th August 2011 03:52

We study the online decision problem in which the set of available actions varies over time, also known as the sleeping experts problem. We consider the setting in which performance is compared against the best ordering of actions in hindsight. In this paper, both the payoff function and the availability of actions are adversarial. Kleinberg et al. (2008) gave a computationally efficient no-regret algorithm in the setting where payoffs are stochastic. Kanade et al. (2009) gave an efficient no-regret algorithm in the setting where action availability is stochastic.

However, the question of whether a computationally efficient no-regret algorithm exists in the adversarial setting was posed as an open problem by Kleinberg et al. (2008). We show that such an algorithm would imply an efficient algorithm for PAC learning DNF, a long-standing and important open problem. We also show that a related problem, the gambling problem, posed as an open problem by Abernethy (2010), is related to agnostically learning halfspaces, albeit under restricted distributions.
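
To make the benchmark in the abstract concrete, the following Python sketch computes regret with respect to the best ordering of actions in hindsight. It is an illustration only, not an algorithm from the paper: the function names, the toy data, and the representation of a round as an (available set, payoff table) pair are all assumptions made here, every round is assumed to have at least one available action, and the search over orderings is brute force and therefore exponential in the number of actions.

```python
from itertools import permutations


def ordering_payoff(ordering, rounds):
    """Total payoff of always playing the highest-ranked available action."""
    total = 0.0
    for available, payoffs in rounds:
        # First action in the ordering that is awake in this round
        # (assumes every round has at least one available action).
        choice = next(a for a in ordering if a in available)
        total += payoffs[choice]
    return total


def best_ordering_in_hindsight(actions, rounds):
    """Brute-force search over all orderings; for illustration only."""
    return max(permutations(actions), key=lambda o: ordering_payoff(o, rounds))


def regret(played, actions, rounds):
    """Payoff of the best ordering in hindsight minus the payoff earned."""
    earned = sum(payoffs[a] for a, (_, payoffs) in zip(played, rounds))
    best = best_ordering_in_hindsight(actions, rounds)
    return ordering_payoff(best, rounds) - earned


# Hypothetical toy instance: each round is (available actions, payoffs).
actions = ["a", "b", "c"]
rounds = [
    ({"a", "b"}, {"a": 1.0, "b": 0.0}),
    ({"b", "c"}, {"b": 0.0, "c": 1.0}),
    ({"a", "c"}, {"a": 0.0, "c": 1.0}),
]
played = ["b", "b", "a"]  # choices of some hypothetical learner

print(best_ordering_in_hindsight(actions, rounds))
print(regret(played, actions, rounds))
```

In this toy instance the ordering (c, a, b) picks the paying action in every round, so a learner that earns nothing has regret equal to the number of rounds. A no-regret algorithm is one whose regret grows sublinearly in the number of rounds; the hardness results above concern whether this can be achieved efficiently when both the available sets and the payoffs are chosen adversarially.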