Optimal Control and Reinforcement Learning with Gaussian Process Models

If you have a question about this talk, please contact Zoubin Ghahramani.

Optimal control and reinforcement learning (RL) share the same objective: optimization of a long-term performance measure. While the system dynamics in optimal control problems are usually known, RL has a more general setup that includes possibly unknown environments. However, after learning a model, standard algorithms for optimal control can also be applied in the RL setting.

In this talk, a generalization of dynamic programming (DP) to continuous-valued state and action spaces is presented. The proposed algorithm (GPDP) combines Gaussian process (GP) models with DP and yields an approximately optimal closed-loop policy on the entire state space. We apply GPDP to the underactuated pendulum swing-up. For exactly known environments, we show that GPDP yields a close-to-optimal solution. Moreover, we show that GPDP can be applied successfully to stochastic optimal control problems.
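To make the idea concrete, here is a minimal sketch of GP-based dynamic programming on a toy one-dimensional problem. A GP regression model, fitted at a finite set of support states, generalizes the value function to the continuous state space so that Bellman backups can evaluate arbitrary successor states. The dynamics, cost, kernel, and grids below are illustrative assumptions, not the setup used in the talk.

```python
import numpy as np

def rbf_kernel(A, B, ell=0.5, sf=1.0):
    """Squared-exponential covariance between two sets of 1-D states."""
    d = A[:, None] - B[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell) ** 2)

def gp_predict(X, y, Xs, noise=1e-6):
    """Exact GP regression: posterior mean of y at test states Xs."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    return rbf_kernel(Xs, X) @ alpha

# Toy deterministic dynamics and stage cost (assumed, for illustration).
f = lambda x, u: np.clip(x + 0.1 * u, -1.0, 1.0)   # next state
c = lambda x, u: x**2 + 0.01 * u**2                # immediate cost
gamma = 0.9                                        # discount factor

X = np.linspace(-1.0, 1.0, 15)    # GP support states
U = np.linspace(-2.0, 2.0, 21)    # candidate (discretized) actions
V = np.zeros_like(X)              # value estimates at support states

for _ in range(50):               # approximate value iteration
    # Bellman backup: the GP value model evaluates V at the
    # continuous-valued successor states f(x, u).
    Q = np.array([[c(x, u) + gamma * gp_predict(X, V, np.array([f(x, u)]))[0]
                   for u in U] for x in X])
    V = Q.min(axis=1)

policy = U[Q.argmin(axis=1)]      # greedy closed-loop policy at support states
```

The key point mirrored here is that the GP lets the backup query the value function off the support grid, which is what frees DP from a fixed state discretization; the full algorithm additionally models the value function over actions and handles stochastic transitions.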

This talk is part of the Machine Learning @ CUED series.
