Optimal Reinforcement Learning for Gaussian Systems
If you have a question about this talk, please contact Carl Edward Rasmussen.
The exploration-exploitation trade-off is among the central challenges of reinforcement learning, and the optimal Bayesian solution is intractable in general. In this talk I will show, however, that if all beliefs are Gaussian processes, it is possible to make analytic statements about optimal learning of both rewards and transition dynamics for nonlinear, time-varying systems in continuous time and space, subject to a relatively weak restriction on the dynamics: the solution is described by an infinite-dimensional partial differential equation. An approximate finite-dimensional projection gives a first impression of how this result may be put to use.
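As background, the Bayes-optimal policy can be characterised by a Bellman equation over beliefs. The following is a minimal discrete-time sketch under standard notation (the belief b is the posterior, here a Gaussian process, over rewards and dynamics; a is an action; gamma is a discount factor; the expectation is over posterior belief updates). It is generic background rather than the continuous-time result of the talk:

    V(b) = \max_a \left[ r(b, a) + \gamma \, \mathbb{E}_{b' \mid b, a} \, V(b') \right]

Because b ranges over the space of Gaussian-process posteriors, this equation lives on an infinite-dimensional state space; the continuous-time analogue discussed in the talk takes the form of a partial differential equation on that space, which the finite-dimensional projection then approximates.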
This talk is part of the Machine Learning @ CUED series.