BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Machine Learning Journal Club
SUMMARY:Optimal Bayesian Reinforcement Learning on Trees - Philipp Hennig (University of Cambridge)
DTSTART;TZID=Europe/London:20090518T150000
DTEND;TZID=Europe/London:20090518T160000
UID:TALK18485AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/18485
DESCRIPTION:The "Q-Learning" algorithm is the classical solution to the so-called "Optimal" Reinforcement Learning Problem. Q-Learning uses samples of future rewards generated by a non-optimal policy to derive point estimates of the future rewards from the (unknown) optimal policy. \n\nIn the first part of this talk\, I will show that a Bayesian treatment\, in forcing us to explicitly define our assumptions\, reveals some interesting aspects of this problem that seem to have been overlooked so far. \n\nIn the second part\, I will introduce an algorithm that uses Expectation Propagation to generate beliefs over possible future rewards from the optimal policy if the Markov Environment forms a tree (i.e. "Bayesian Q-Learning on trees") and will show some preliminary results for its application to Game Trees.
LOCATION:TCM Seminar Room\, Cavendish Laboratory\, Department of Physics
CONTACT:Philipp Hennig
END:VEVENT
END:VCALENDAR