Wasserstein Natural Gradients for Reinforcement Learning
- Speaker: Ferenc Huszár
- Date & Time: Tuesday 01 December 2020, 13:15 - 14:15
- Venue: Zoom
Abstract
Policy gradient methods can learn complex behaviours in difficult reinforcement learning tasks but often struggle with data-inefficiency: they make slow progress and require frequent rollouts or simulations of the environment. A key to speeding these methods up is to incorporate the information geometry of policies into the optimisation. This can be done via trust regions (TRPO), additive penalties (PPO), or natural gradients.
In this talk I present a new optimization approach which can be applied to policy optimisation as well as evolution strategies for reinforcement learning. The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that takes advantage of the geometry induced by a Wasserstein penalty to speed up optimization. I will illustrate the differences between natural gradient descent schemes and discuss experiments on challenging tasks which demonstrate improvements in both computational cost and performance over advanced baselines.
This talk is largely based on https://arxiv.org/abs/2010.05380
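As a rough illustration of how the choice of geometry changes a natural gradient step, the sketch below compares the Fisher and Wasserstein-2 metrics for a one-dimensional Gaussian parameterised by (mu, sigma). It is a toy example and not the estimator presented in the talk: the objective, the function names, and the starting point are all assumptions made for illustration. It relies on two standard closed forms: the Fisher information of N(mu, sigma^2) in (mu, sigma) coordinates is diag(1/sigma^2, 2/sigma^2), and the squared Wasserstein-2 distance between two 1-D Gaussians is (mu1 - mu2)^2 + (sigma1 - sigma2)^2, so its metric is the identity in these coordinates (up to a constant scaling convention).

```python
import numpy as np

def fisher_metric(mu, sigma):
    # Fisher information of N(mu, sigma^2) in (mu, sigma) coordinates.
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def wasserstein_metric(mu, sigma):
    # W2^2 between 1-D Gaussians is (mu1 - mu2)^2 + (sigma1 - sigma2)^2,
    # so the induced metric in (mu, sigma) coordinates is the identity.
    return np.eye(2)

def natural_gradient(grad, metric):
    # Precondition the Euclidean gradient by the inverse metric.
    return np.linalg.solve(metric, grad)

def objective_grad(mu, sigma, target=3.0):
    # Hypothetical toy objective:
    # E_{x ~ N(mu, sigma^2)}[(x - target)^2] = (mu - target)^2 + sigma^2.
    return np.array([2.0 * (mu - target), 2.0 * sigma])

# Assumed starting point: a narrow Gaussian far from the target.
mu0, sigma0 = 0.0, 0.1
g = objective_grad(mu0, sigma0)

fisher_step = natural_gradient(g, fisher_metric(mu0, sigma0))
wasserstein_step = natural_gradient(g, wasserstein_metric(mu0, sigma0))

print("Euclidean gradient:          ", g)
print("Fisher natural gradient:     ", fisher_step)
print("Wasserstein natural gradient:", wasserstein_step)
```

In this toy setting the Fisher-preconditioned update to the mean is scaled by sigma^2, so it becomes very small when the distribution is narrow, while the Wasserstein-preconditioned update is unaffected by sigma.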
Series
This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.