Suppose a decision maker has to purchase a commodity over time with
varying prices and demands. In particular, the price per unit might
depend on the amount purchased and this price function might vary from
step to step. The decision maker has a buffer of bounded size for
storing units of the commodity that can be used to satisfy demands at
later points in time. We seek for an algorithm deciding at which time
to buy which amount of the commodity so as to minimize the cost. This
kind of problem arises in many technological and economical settings
like, e.g., battery management in hybrid cars and economical caching
policies for mobile devices. A simplified but illustrative example is
a frugal car driver thinking about at which occasion to buy which
amount of gasoline.
We study this problem within a regret analysis. In particular, we
investigate the external regret obtained by the Weighted Majority
Algorithm applied to our problem. We show that the algorithm does
not achieve a reasonable regret bound if its random choices are
independent from step to step, that is, the regret for $T$ steps is
$\Omega(T)$. However, one can achieve regret $O(\sqrt{T})$ when
introducing dependencies in order to reduce the number of changes between
the chosen experts. If price functions satisfy a convexity condition
then one can even derive a deterministic, fractional variant of this
algorithm achieving the same regret bound.
Our more detailed bounds on the regret depend on the buffer size and
the number of available experts. The upper bounds are complemented by
a matching lower bound on the best possible external regret.