Bandit Simulator
Regret, Total reward
Algorithm: Optimally Confident UCB, UCB
Arms: 2, 3, 4
Horizon: 100, 200, 500, 750, 1000, 2000, 5000
Display, Run Algorithm, Restart, New Data, New Problem