Optimal Control and Reinforcement Learning with Gaussian Process Models
If you have a question about this talk, please contact Zoubin Ghahramani.
Optimal control and reinforcement learning (RL) share the same objective: optimization of a long-term performance measure. While the system dynamics in optimal control problems are usually known, RL has a more general setup that includes possibly unknown environments. However, once a model has been learned, standard algorithms for optimal control can also be applied to RL.
In this talk, a generalization of dynamic programming (DP) to continuous-valued state and action spaces is presented. The proposed algorithm (GPDP) combines Gaussian process (GP) models with DP and yields an approximately optimal closed-loop policy on the entire state space. We apply GPDP to the underactuated pendulum swing-up. For exactly known environments, we show that GPDP yields a close-to-optimal solution. Moreover, we show that GPDP can be successfully applied to stochastic optimal control problems.
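As a rough illustration of the idea (a sketch only, not the implementation presented in the talk), the following Python code performs DP-style value iteration over a continuous state space, using GP regression to generalize the value function from a finite set of support states. The toy pendulum dynamics, reward, discount factor, action grid, support-state set, and the use of scikit-learn's GP regressor are all illustrative assumptions.

    # Minimal GPDP-style sketch: value iteration with a GP model of the
    # value function over continuous states. Illustrative assumptions
    # throughout; not the authors' implementation.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    def dynamics(s, a, dt=0.1):
        """Toy pendulum dynamics: state s = (angle, angular velocity)."""
        theta, omega = s
        omega_new = omega + dt * (np.sin(theta) + a)
        theta_new = theta + dt * omega_new
        return np.array([theta_new, omega_new])

    def reward(s, a):
        """Reward peaks at the upright position (angle 0), penalizing effort."""
        return np.cos(s[0]) - 0.01 * a ** 2

    # Support states on which the value function is learned (assumed grid).
    rng = np.random.default_rng(0)
    states = rng.uniform([-np.pi, -3.0], [np.pi, 3.0], size=(100, 2))
    actions = np.linspace(-2.0, 2.0, 9)   # discretized candidate actions
    gamma = 0.95                          # discount factor (assumed)

    V = np.zeros(len(states))
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-4)
    for it in range(30):
        # Fit a GP to the current value estimates at the support states,
        # so the value function can be queried at arbitrary successor states.
        gp = GaussianProcessRegressor(kernel=kernel).fit(states, V)
        V_new = np.empty_like(V)
        for i, s in enumerate(states):
            # Bellman backup: maximize reward plus discounted GP-predicted
            # value of the successor state over the candidate actions.
            q = [reward(s, a)
                 + gamma * gp.predict(dynamics(s, a).reshape(1, -1))[0]
                 for a in actions]
            V_new[i] = max(q)
        if np.max(np.abs(V_new - V)) < 1e-3:
            V = V_new
            break
        V = V_new

    # Greedy closed-loop action induced by the learned value function.
    gp = GaussianProcessRegressor(kernel=kernel).fit(states, V)
    s0 = np.array([np.pi, 0.0])  # hanging-down start state
    best_a = max(actions, key=lambda a: reward(s0, a)
                 + gamma * gp.predict(dynamics(s0, a).reshape(1, -1))[0])
    print("greedy action at start state:", best_a)

Because the value function is carried by a GP rather than a lookup table, the resulting policy is defined on the entire continuous state space, not just at the support states.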
This talk is part of the Machine Learning @ CUED series.