
Wasserstein Natural Gradients for Reinforcement Learning


If you have a question about this talk, please contact Mateja Jamnik.

Join us on Zoom

Policy gradient methods can learn complex behaviours in difficult reinforcement learning tasks but are often data-inefficient: they make slow progress and require many rollouts or simulations of the environment. A key to speeding these methods up is to incorporate the information geometry of policies into the optimisation. This can be done via trust regions (TRPO), additive penalties (PPO), or natural gradients.
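For reference, the three schemes mentioned above differ in how the policy's KL geometry enters the update. These are the standard textbook forms (with $J$ the expected return, $\pi_\theta$ the policy, and $F$ the Fisher information matrix), not expressions taken from the talk itself:

```latex
% Trust region (TRPO): constrained step
\max_\theta \; J(\theta) \quad \text{s.t.} \quad
\mathrm{KL}\!\left(\pi_{\theta_{\text{old}}} \,\|\, \pi_\theta\right) \le \delta

% Additive penalty (PPO, penalty variant): soft constraint
\max_\theta \; J(\theta) - \beta \,
\mathrm{KL}\!\left(\pi_{\theta_{\text{old}}} \,\|\, \pi_\theta\right)

% Natural gradient: precondition by the Fisher information matrix
\theta_{t+1} = \theta_t + \eta \, F(\theta_t)^{-1} \nabla_\theta J(\theta_t)
```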

In this talk I present a new optimisation approach which can be applied to policy optimisation as well as to evolution strategies for reinforcement learning. The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that exploits the geometry induced by a Wasserstein penalty to speed up optimisation. I will illustrate the differences between natural gradient descent schemes and discuss experiments on challenging tasks which demonstrate improvements in both computational cost and performance over strong baselines.
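To make the general idea of metric-preconditioned descent concrete, here is a minimal sketch of a natural-gradient-style update where the inverse-metric product is computed with conjugate gradients, so the metric matrix never needs to be inverted explicitly. The toy objective, the placeholder metric `M`, and the step size are all invented for illustration; the actual WNG method estimates a Wasserstein metric from samples, which this sketch omits entirely:

```python
import numpy as np

def conjugate_gradient(mat_vec, b, iters=50, tol=1e-10):
    """Solve M x = b for SPD M, using only matrix-vector products."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - M x
    p = r.copy()          # search direction
    rs = r @ r
    for _ in range(iters):
        Mp = mat_vec(p)
        alpha = rs / (p @ Mp)
        x += alpha * p
        r -= alpha * Mp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Hypothetical ill-conditioned 2-D problem:
# J(theta) = -(theta - target)^T A (theta - target), maximised by ascent.
target = np.array([1.0, -2.0])
A = np.diag([50.0, 0.5])

# Placeholder metric: here the Hessian of -J, purely for illustration.
M = 2.0 * A

theta = np.zeros(2)
for _ in range(10):
    grad = 2.0 * A @ (target - theta)                    # gradient of J
    nat_grad = conjugate_gradient(lambda v: M @ v, grad)  # M^{-1} grad
    theta += 0.9 * nat_grad                               # ascent step

print(np.round(theta, 6))
```

With this choice of metric the preconditioned step is well scaled in every direction, so the iterates converge quickly despite the condition number of `A`; a plain gradient step with a single learning rate would crawl along the flat direction. Replacing `M` by a sampled Wasserstein metric is the substantive step the talk addresses.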

This talk is largely based on

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.




© 2006-2024, University of Cambridge.