Course Schedule
Regular Weekly Schedule
Group | Type | Day and Hours | Location |
---|---|---|---|
01 | Lecture | Monday 13-16 | Orentstein 111 |
Resources
Introduction to Multi-Armed Bandits, by Aleksandrs Slivkins
Bandit Algorithms, by Tor Lattimore and Csaba Szepesvari
Detailed Schedule
Week | Date | Lecture topics | References | Lecture Scribe |
---|---|---|---|---|
1 | Oct 11 | Introduction and Stochastic Bandits | Introduction and Chapter 1 from Slivkins; Chapter 5.3 from Lattimore and Szepesvari; Sleeping bandits: Regret bounds for sleeping experts and bandits by Robert Kleinberg, Alexandru Niculescu-Mizil, Yogeshwer Sharma | Hebrew Notes 1 |
2 | Oct 18 | Lower Bounds for MAB | Chapter 2 from Slivkins; Sleeping bandits: Regret bounds for sleeping experts and bandits by Robert Kleinberg, Alexandru Niculescu-Mizil, Yogeshwer Sharma | Hebrew Notes 2 |
3 | Oct 25 | Bayesian bandits and the Thompson sampling algorithm | Chapter 3 from Slivkins; Near-optimal Regret Bounds for Thompson Sampling by Shipra Agrawal and Navin Goyal | Hebrew Notes 3 |
4 | Nov 1 | Lipschitz bandits | Chapter 4 from Slivkins; The Value of Knowing a Demand Curve: Bounds on Regret for On-line Posted-Price Auctions by Robert Kleinberg and Tom Leighton | Hebrew Notes 4 |
5 | Nov 8 | Adversarial costs: full feedback | Chapter 5 from Slivkins; Learning, Regret Minimization, and Equilibria (Section 4.3) by A. Blum and Y. Mansour; From External to Internal Regret (Section 7) by A. Blum and Y. Mansour | Hebrew Notes 5 |
6 | Nov 15 | Adversarial costs: MAB | Reduction: previous years' class notes; The Nonstochastic Multiarmed Bandit Problem by P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire; Explore No More: Improved High-Probability Regret Bounds for Non-Stochastic Bandits by G. Neu | Hebrew Notes 6 |
7 | Nov 22 | Linear costs | Full information: Efficient Algorithms for Online Decision Problems, Kalai and Vempala 2005; Reduction and barycentric spanner: Online Linear Optimization and Adaptive Routing, Awerbuch and Kleinberg 2008; Stochastic: Chapter 19 of Bandit Algorithms (Lattimore and Szepesvari); Additional resources: Slivkins Chapter 7; Stochastic Linear Optimization under Bandit Feedback, Dani, Hayes, Kakade 2008 | Hebrew Notes 7 |
8 | Nov 29 | Contextual Bandits | Lecture notes from the 2018 course; Efficient Algorithms for Adversarial Contextual Learning by Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire, ICML 2016 | Hebrew Notes 8 |
9 | Dec 6 | Zero-sum games and applications | Lecture notes from the 2018/19 and 2009/10 courses; Slivkins Chapter 9 | Hebrew Notes 9 |
10 | Dec 13 | Swap and correlated equilibrium | Lecture notes from 2018/19 | Hebrew Notes 10 |
11 | Dec 20 | Bandits with Knapsacks | Slivkins, Chapter 10; Lower bound: Bandits with Knapsacks by Ashwinkumar Badanidiyuru, Robert Kleinberg, Aleksandrs Slivkins; UCB-like solution: Bandits with Concave Rewards and Convex Knapsacks by Shipra Agrawal, Nikhil R. Devanur | Hebrew Notes 11 |
12 | Dec 27 | Bandits with incentives | | Notes 12 |
13 | Jan 3 | Gittins Index | | Hebrew Notes 13 |