I am a postdoc at the University of Alberta working with Csaba Szepesvari as part of the RLAI group. In the fall I will join the Department of Informatics and Computing at Indiana University as an assistant professor.
Broadly I am interested in machine learning, but with a special focus on sequential decision making in the face of uncertainty.
You can contact me at email@example.com.
- June 2016: Finnian Lattimore (who happens to be my sister), Mark Reid and I have just released a new bandit paper to the arxiv. The idea is to show that causal information can sometimes be exploited when minimising the simple regret (pure exploration).
- June 2016: The paper with Jan, Marcus, Laurent and I won best student paper at UAI. Congratulations Jan!
- June 2016: Bandit strategies based on distinct phases of exploration followed by exploitation are pretty natural. Aurélien Garivier, Emilie Kaufmann and I show that such approaches are doomed to sub-optimality in the illustrative two-armed Gaussian bandit with known horizon.
- May 2016: There are missing lower bounds for adversarial finite-armed bandits. Sébastien Gerchinovitz and I have now added some (preprint). Specifically: high probability bounds, first-order bounds (in terms of the loss of the best arm) and second-order bounds (in terms of the quadratic variation). There are also some impossibility results. For example, the presence of an arm that is optimal in every round cannot help, and neither can losses that lie in a small range. The latter results are in contrast to the full-information setting where these things lead to smaller regret.
- May 2016: There is a new near-near-optimal algorithm for finite-armed subgaussian bandits. See the preprint or a brief description.
- May 2016: Together with Jan Leike, Laurent Orseau and Marcus Hutter, I have a paper on asymptotic results for a version of Thompson sampling in general reinforcement learning environments accepted at UAI (http://arxiv.org/abs/1602.07905).
- April 2016: I have papers accepted at COLT (regret analysis of the Gittins index, http://arxiv.org/abs/1511.06014) and ICML (on “conservative” bandits to guarantee revenue while exploring, http://arxiv.org/abs/1602.04282).