
Latent Action Space for Offline Reinforcement Learning


If you have a question about this talk, please contact Mateja Jamnik.

Join us on Zoom

The goal of offline reinforcement learning is to learn a policy from a fixed dataset, without further interaction with the environment. This setting will be an increasingly important paradigm for real-world applications of reinforcement learning such as robotics, in which data collection is slow and potentially dangerous. In this talk, we will discuss the challenges of applying existing off-policy algorithms to static datasets and the reasoning behind the objectives of offline RL. We will then introduce our approach, Policy in the Latent Action Space (PLAS), which naturally satisfies these objectives. Our method is evaluated on continuous control benchmarks in simulation and on a cloth-sliding task with a physical robot. We demonstrate that our method provides consistently competitive performance across various continuous control tasks and different types of datasets, outperforming previous offline reinforcement learning methods that rely on explicit constraints.
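The core idea described above — training the policy to output a latent code for a decoder learned from the dataset, rather than a raw action — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the linear "decoder" stands in for a conditional VAE decoder pre-trained on the offline dataset, and all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

state_dim, action_dim, latent_dim = 4, 2, 3

# Stand-in for a pre-trained VAE decoder: a fixed linear map from
# (state, latent) to action. In PLAS this would be a neural network
# trained on the offline dataset's (state, action) pairs, then frozen.
W_dec = rng.normal(scale=0.1, size=(state_dim + latent_dim, action_dim))

def decode(state, z):
    """Map a state and latent code to an action via the frozen decoder."""
    return np.concatenate([state, z]) @ W_dec

# Stand-in for the learned policy: it outputs a latent code squashed by
# tanh, so the latent always lies in [-1, 1]^latent_dim. Keeping the
# policy inside this bounded latent space is what implicitly keeps the
# decoded actions close to the dataset distribution, without an explicit
# divergence constraint in the objective.
W_pi = rng.normal(scale=0.5, size=(state_dim, latent_dim))

def policy(state):
    z = np.tanh(state @ W_pi)    # latent action in [-1, 1]
    return decode(state, z)      # decoded action passed to the critic/env

state = rng.normal(size=state_dim)
action = policy(state)
print(action.shape)
```

The key design choice this sketch highlights is that the constraint lives in the architecture (a bounded latent space feeding a dataset-trained decoder) rather than in a penalty term added to the RL objective.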

Bio: Wenxuan Zhou is a Ph.D. student at the Robotics Institute at Carnegie Mellon University, advised by Prof. David Held. Her research interests lie at the intersection of robotics and reinforcement learning. Previously, she received her Master’s degree in Robotics at CMU advised by Prof. Abhinav Gupta. Prior to that, she obtained her dual B.S. degrees in Electrical and Computer Engineering from Shanghai Jiao Tong University and Mechanical Engineering from the University of Michigan. She will be joining DeepMind as an intern in Summer 2021.

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.



© 2006-2024, University of Cambridge.