
Proximal Policy Optimization in the Fisher-Rao geometry



SCLW01 - Bridging Stochastic Control And Reinforcement Learning: Theories and Applications

Proximal Policy Optimization (PPO) is one of the most widely used algorithms in reinforcement learning, offering a practical policy gradient method with strong empirical performance. Despite its popularity, however, PPO lacks rigorous theoretical guarantees for policy improvement and convergence. The method employs a clipped surrogate objective, derived by linearising the value function in a flat geometric setting. In this talk, we introduce a refined surrogate objective based on the Fisher–Rao geometry, leading to a new variant, Fisher–Rao PPO (FR-PPO). Our approach provides robust theoretical guarantees, including monotonic policy improvement and sub-linear convergence rates, representing a substantial advance toward formal convergence results for the wider class of PPO algorithms. This talk is based on joint work with David Siska and Lukasz Szpruch.
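For context, the clipped surrogate objective mentioned in the abstract is the one introduced in the original PPO paper (Schulman et al., 2017), reproduced below in standard notation; the FR-PPO objective presented in the talk refines this construction via the Fisher–Rao geometry and is not reproduced here.

L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[ \min\!\left( r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\left( r_t(\theta),\, 1-\epsilon,\, 1+\epsilon \right) \hat{A}_t \right) \right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},

where \hat{A}_t is an estimate of the advantage function and \epsilon is the clipping parameter. The Fisher–Rao geometry referred to in the abstract is the Riemannian structure on probability distributions induced by the Fisher information metric, as opposed to the flat (Euclidean) geometry implicit in the standard linearisation.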

This talk is part of the Isaac Newton Institute Seminar Series.


 
