Thompson Sampling for Stochastic Bandits with Noisy Contexts: An Information-Theoretic Analysis

If you have a question about this talk, please contact Dr Varun Jog.

Decision-making under uncertainty is a practical challenge in areas such as control and robotics, clinical trials, communication, and ecology. An extensively studied decision-making framework is that of stochastic contextual bandits (CBs), which use side information, termed context, for sequential decision-making. Prior research on CBs has mostly focussed on models where the contexts are well-defined. This assumption, however, does not hold in real-world applications, where the contexts are either noisy or are indicative of predictive measurements. In this talk, we focus on noisy CBs, where the learner observes only a noisy, corrupted version of the true context through an unknown noise channel. We introduce a Thompson Sampling algorithm for Gaussian bandits with Gaussian context noise that can ‘approximate’ the action policy of an oracle that has access to the predictive distribution of the true context given the observed noisy context. Using information-theoretic tools, we study the Bayesian regret of the proposed algorithm.
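
For readers unfamiliar with the setting, the following is a minimal illustrative sketch of Thompson Sampling in a linear Gaussian bandit with a noisy context. It is not the algorithm presented in the talk. It makes simplifying assumptions that the abstract does not: the true context has a zero-mean Gaussian prior with known variance, the noise variance is known (the talk treats an unknown noise channel), each arm has its own Gaussian parameter vector, and the posterior-mean context is plugged into a standard Bayesian linear regression update. All names (`denoise_context`, `GaussianThompsonSampler`) are hypothetical.

```python
import numpy as np


def denoise_context(noisy_context, noise_var, prior_var=1.0):
    """Posterior mean of the true context given its noisy observation.

    Assumes (for illustration only) true context ~ N(0, prior_var * I)
    and observation = context + N(0, noise_var * I), both variances known.
    """
    shrink = prior_var / (prior_var + noise_var)
    return shrink * np.asarray(noisy_context, dtype=float)


class GaussianThompsonSampler:
    """Per-arm Bayesian linear regression with Thompson Sampling."""

    def __init__(self, n_arms, dim, reward_var=1.0, prior_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.reward_var = reward_var
        # Posterior precision matrix and precision-weighted mean for each arm.
        self.precision = np.stack([np.eye(dim) / prior_var for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def posterior(self, arm):
        """Return the Gaussian posterior (mean, covariance) of one arm."""
        cov = np.linalg.inv(self.precision[arm])
        return cov @ self.b[arm], cov

    def select_arm(self, context_estimate):
        """Sample one parameter vector per arm and act greedily on the samples."""
        scores = []
        for arm in range(len(self.b)):
            mean, cov = self.posterior(arm)
            theta = self.rng.multivariate_normal(mean, cov)
            scores.append(theta @ context_estimate)
        return int(np.argmax(scores))

    def update(self, arm, context_estimate, reward):
        """Conjugate Gaussian update, treating the context estimate as the regressor."""
        x = np.asarray(context_estimate, dtype=float)
        self.precision[arm] += np.outer(x, x) / self.reward_var
        self.b[arm] += reward * x / self.reward_var
```

In each round, a learner of this kind would call `denoise_context` on the observed context, pass the result to `select_arm`, and then call `update` with the observed reward. Using the denoised context as the regressor is a plug-in approximation chosen for brevity, not a claim about the method analysed in the talk.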

This talk is part of the Information Theory Seminar series.
