In the multi-armed bandit problem, a gambler must decide which of K
non-identical slot machines (arms) to play in each of a sequence of
trials so as to maximize the total reward.
This classical problem has received much attention because of the
simple model it provides of the trade-off between
exploration ...