
Teaching

August 2018: Slides from the lecture at the reinforcement learning summer school are here. They cover the basics of stochastic bandits and learning in episodic Markov decision processes.

March 2018: Slides from the AAAI tutorial on bandits are here (finite-armed bandits and linear bandits).

Fall 2016: B551 Elements of Artificial Intelligence. Material is available on Canvas. Results of the Connect4 tournament. Results of the prisoner’s dilemma tournament.