University of Cambridge > Talks.cam > Artificial Intelligence Research Group Talks (Computer Laboratory) > Wasserstein Natural Gradients for Reinforcement Learning
Wasserstein Natural Gradients for Reinforcement Learning
If you have a question about this talk, please contact Mateja Jamnik.

Policy gradient methods can learn complex behaviours in difficult reinforcement learning tasks, but they are often data-inefficient: they make slow progress and require frequent rollouts or simulations of the environment. A key to speeding these methods up is to incorporate the information geometry of policies into the optimisation. This can be done via trust regions (TRPO), additive penalties (PPO), or natural gradients. In this talk I present a new optimisation approach that can be applied both to policy optimisation and to evolution strategies for reinforcement learning. The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that exploits the geometry induced by a Wasserstein penalty to speed up optimisation. I will illustrate the differences between natural gradient descent schemes and discuss experiments on challenging tasks which demonstrate improvements in both computational cost and performance over advanced baselines.

This talk is largely based on https://arxiv.org/abs/2010.05380

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.
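The contrast between gradient preconditioners mentioned in the abstract can be illustrated with a toy sketch (this is not the estimator from the talk or paper). For a 1-D Gaussian policy N(mu, sigma^2), both the Fisher metric and the Wasserstein-2 metric have closed forms in (mu, sigma) coordinates, so the resulting "natural" steps can be compared directly; the function names below are illustrative choices, not an established API:

```python
import numpy as np

# Toy comparison of natural-gradient preconditioning for a 1-D Gaussian
# policy N(mu, sigma^2). Both metrics have closed forms in (mu, sigma)
# coordinates; this is an illustrative sketch, not the paper's method.

def fisher_metric(mu, sigma):
    # Fisher information matrix of N(mu, sigma^2) in (mu, sigma) coordinates.
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def wasserstein_metric(mu, sigma):
    # W2^2(N(mu1, s1^2), N(mu2, s2^2)) = (mu1 - mu2)^2 + (s1 - s2)^2,
    # so the induced metric in (mu, sigma) coordinates is the identity.
    return np.eye(2)

def natural_step(grad, metric, lr=0.1):
    # Precondition the Euclidean gradient by the inverse of the metric.
    return lr * np.linalg.solve(metric, grad)

# A narrow policy (small sigma) with a unit objective gradient.
mu, sigma = 0.0, 0.1
grad = np.array([1.0, 1.0])

fisher_step = natural_step(grad, fisher_metric(mu, sigma))  # shrinks as sigma -> 0
wng_step = natural_step(grad, wasserstein_metric(mu, sigma))  # stays well-scaled
```

The point of the toy example: as sigma shrinks, the Fisher-preconditioned step collapses, while the Wasserstein-preconditioned step keeps its scale, hinting at why a Wasserstein geometry can behave differently from the Kullback-Leibler geometry used in TRPO/PPO-style methods.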