Optimal Sequential Inference and Decision Making: From Uniform A/B Testing to Gradient-Based Bandits

SCLW01 - Bridging Stochastic Control and Reinforcement Learning: Theories and Applications

Sequential data collection and decision making lie at the core of modern statistical learning, from anytime-valid A/B testing in online experiments to adaptive multi-armed bandit problems. This talk presents recent advances in the design and analysis of optimal algorithms for these two settings. In the first part, I will introduce a framework for anytime-valid, variance-adaptive inference in monotonic processes (such as cumulative distribution functions) that builds on the coin-betting paradigm of game-theoretic statistics and integrates PAC-Bayesian principles to yield tight hypothesis tests that are uniform not only in time but also in space. In the second part, I will focus on stochastic gradient bandits, a fundamental policy-gradient approach to online decision making, and present theoretical results showing how the learning rate governs the algorithm's regret, revealing sharp thresholds that separate logarithmic and polynomial regimes and depend on the (unknown) sub-optimality gap. (Based on joint work with E. Clerico and H. E. Flynn, and with D. Baudry, E. Johnson, S. Vary, and C. Pike-Burke.)
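As a concrete illustration of the coin-betting paradigm mentioned in the first part, here is a minimal Python sketch of an anytime-valid test built from a wealth process. It tests a single bounded mean with a fixed bet; the framework in the talk goes much further, covering whole monotonic processes such as CDFs with variance-adaptive bets and PAC-Bayesian uniformity in space. The function name betting_test, the fixed bet lam, and the [0, 1] boundedness are illustrative assumptions, not the talk's construction.

    def betting_test(xs, m0, alpha=0.05, lam=0.5):
        # Anytime-valid test of H0: E[X] = m0, for observations in [0, 1].
        # The wealth process below is a nonnegative martingale under H0, so
        # by Ville's inequality it ever exceeds 1/alpha with probability at
        # most alpha; rejecting when it does is therefore valid at any
        # data-dependent stopping time.
        wealth = 1.0
        for t, x in enumerate(xs, start=1):
            # Bet a fixed fraction lam on the "coin" x - m0, which has mean
            # zero under H0; with x in [0, 1] and 0 < lam < 1, the wealth
            # factor stays strictly positive.
            wealth *= 1.0 + lam * (x - m0)
            if wealth >= 1.0 / alpha:
                return t  # reject H0 at time t
        return None  # never enough evidence to reject

Inverting such tests over candidate values m0 gives a confidence sequence; replacing the fixed bet lam with a bet learned online is what yields the variance adaptivity discussed in the talk.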
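The object of study in the second part can likewise be stated in a few lines. Below is a minimal sketch of a stochastic gradient bandit with the standard softmax parameterisation and the usual unbiased score-function gradient estimate; the reward function pull, the arm count K, the horizon T, and the constant learning rate eta are placeholder assumptions. Per the talk, it is the size of eta relative to the unknown sub-optimality gap that separates the logarithmic and polynomial regret regimes.

    import numpy as np

    def gradient_bandit(pull, K, eta=0.1, T=10_000, seed=0):
        # Softmax policy over K arms, updated by stochastic gradient
        # ascent on the expected reward.
        rng = np.random.default_rng(seed)
        theta = np.zeros(K)              # logits of the policy
        for _ in range(T):
            pi = np.exp(theta - theta.max())
            pi /= pi.sum()               # softmax probabilities
            a = rng.choice(K, p=pi)      # sample an arm
            r = pull(a)                  # observe a stochastic reward
            grad = -r * pi               # unbiased estimate of the policy
            grad[a] += r                 # gradient: r * (one_hot(a) - pi)
            theta += eta * grad          # gradient ascent step
        return theta

For example, gradient_bandit(lambda a: float(np.random.random() < (0.4, 0.6)[a]), K=2) runs the sketch on two Bernoulli arms with sub-optimality gap 0.2 (illustrative numbers only).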

This talk is part of the Isaac Newton Institute Seminar Series.
