\( \newcommand{\argmax}{\operatorname{arg\,max}\limits} \) \( \newcommand{\P}[1]{\mathbf{P} \left\{ #1\right\}} \) \( \newcommand{\E}{\mathbf{E}} \) \( \newcommand{\R}{\mathbb{R}} \) \( \newcommand{\set}[1]{\left\{#1\right\}} \) \( \newcommand{\floor}[1]{\left \lfloor {#1} \right\rfloor} \) \( \newcommand{\ceil}[1]{\left \lceil {#1} \right\rceil} \) \( \newcommand{\logp}{\log_{+}\!} \) \( \let\epsilon\varepsilon\)

About me

I am a postdoc at the University of Alberta, working with Csaba Szepesvari as part of the RLAI group. In the fall I will join the School of Informatics and Computing at Indiana University as an assistant professor.

Broadly, I am interested in machine learning, with a special focus on sequential decision making in the face of uncertainty.

You can contact me at firstname.lastname@gmail.com.

News

  • May 2016: Several lower bounds for adversarial finite-armed bandits have been missing from the literature. Sébastien Gerchinovitz and I have now added some (preprint): high-probability bounds, first-order bounds (in terms of the loss of the best arm) and second-order bounds (in terms of the quadratic variation); the general shape of these bound types is sketched below the list. There are also some impossibility results. For example, the presence of an arm that is optimal in every round cannot help, and neither can losses that lie in a small range. The latter results contrast with the full-information setting, where both properties lead to smaller regret.
  • May 2016: There is a new near-optimal algorithm for finite-armed subgaussian bandits. See the preprint or a brief description; a toy sketch of the setting appears after this list.
  • May 2016: Together with Jan Leike, Laurent Orseau and Marcus Hutter, I have a paper accepted at UAI on asymptotic results for a version of Thompson sampling in general reinforcement learning environments (http://arxiv.org/abs/1602.07905). A sketch of Thompson sampling in the simplest bandit case is given below.
  • April 2016: I have papers accepted at COLT (regret analysis of the Gittins index, http://arxiv.org/abs/1511.06014) and ICML (on “conservative” bandits to guarantee revenue while exploring, http://arxiv.org/abs/1602.04282).
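
To unpack the terminology in the first news item: writing \(n\) for the horizon, \(K\) for the number of arms, \(R_n\) for the regret, \(L_n^*\) for the cumulative loss of the best arm and \(Q_n\) for the quadratic variation of the losses (standard notation assumed here, not taken from the preprint), the three families of bounds have roughly the following shapes, up to constants and logarithmic factors:

\[ R_n \lesssim \sqrt{Kn} \;\text{ (worst-case)}, \qquad R_n \lesssim \sqrt{K L_n^*} \;\text{ (first-order)}, \qquad R_n \lesssim \sqrt{K Q_n} \;\text{ (second-order)}. \]

These displays only indicate the general form; see the preprint for the precise statements and matching lower bounds.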
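
For the second item, here is a minimal sketch of the finite-armed subgaussian bandit setting, played with the classical UCB1 index. This is not the algorithm from the preprint; the arm means, horizon and confidence width are illustrative choices only.

    import math
    import random

    def ucb(means, n):
        """Play n rounds of UCB1 against Gaussian (hence 1-subgaussian) arms.

        means : true mean reward of each arm (unknown to the algorithm)
        n     : horizon
        Returns the cumulative regret relative to the best arm.
        """
        k = len(means)
        counts = [0] * k        # number of times each arm was played
        totals = [0.0] * k      # cumulative reward of each arm
        best = max(means)
        regret = 0.0
        for t in range(1, n + 1):
            if t <= k:
                i = t - 1       # play each arm once to initialise
            else:
                # index = empirical mean + confidence width for 1-subgaussian noise
                i = max(range(k), key=lambda a: totals[a] / counts[a]
                        + math.sqrt(2.0 * math.log(t) / counts[a]))
            reward = random.gauss(means[i], 1.0)
            counts[i] += 1
            totals[i] += reward
            regret += best - means[i]
        return regret

    print(ucb([0.5, 0.4, 0.3], 10000))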
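
And a sketch of Thompson sampling in its simplest incarnation, the Bernoulli bandit with independent Beta(1, 1) priors. The UAI paper concerns a far more general reinforcement learning setting; this toy version is only meant to convey the basic idea of sampling from the posterior and acting greedily with respect to the sample.

    import random

    def thompson(means, n):
        """Thompson sampling with Beta(1, 1) priors on Bernoulli arms.

        means : true success probability of each arm (unknown to the algorithm)
        n     : horizon
        Returns the cumulative regret relative to the best arm.
        """
        k = len(means)
        wins = [1] * k          # Beta posterior parameters (successes + 1)
        losses = [1] * k        # Beta posterior parameters (failures + 1)
        best = max(means)
        regret = 0.0
        for _ in range(n):
            # sample a mean for each arm from its posterior, play the argmax
            samples = [random.betavariate(wins[a], losses[a]) for a in range(k)]
            i = max(range(k), key=lambda a: samples[a])
            if random.random() < means[i]:
                wins[i] += 1
            else:
                losses[i] += 1
            regret += best - means[i]
        return regret

    print(thompson([0.5, 0.4, 0.3], 10000))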