BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Machine Learning @ CUED
SUMMARY:Mean Field Approaches to Two Sequential Decision M
aking Problems. - Ramki Gummadi
DTSTART;TZID=Europe/London:20150608T170000
DTEND;TZID=Europe/London:20150608T180000
UID:TALK59759AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/59759
DESCRIPTION:In this presentation\, I will present some of my p
ast research on mean field approaches to sequentia
l decision making problems under uncertainty. The
talk has two parts.\n \nIn the first part\, we'll
consider multiarmed bandit games\, where much of t
he classical work focuses on arm rewards that are
stationary over time. By contrast\, I will present
a bandit model with multiple agents where the rew
ards obtained by an agent can also depend on how m
any other agents choose the same arm (as might be
the case in many competitive or cooperative scenar
ios). Such a dynamical system can be non-stationar
y due to the interdependent evolution of agents\,
which makes an analysis using typical equilibrium
concepts intractable. I will introduce a general m
odel of multiarmed bandit games\, and define a not
ion of equilibrium inspired mean field methods\, c
alled the mean field stationary state (MFSS). I wi
ll discuss three key results -- first\, we establ
ish existence of an MFSS under quite general condi
tions. Second\, we show under a contraction condit
ion that the MFE is unique\, and that the populati
on profile converges to it from any initial state.
Finally\, we show that under the contraction cond
ition\, MFE is a good approximation to the behavio
r of finite systems with many agents. The contract
ion condition requires that the agent population i
s sufficiently mixing and that the sensitivity of
the reward function to the population profile is l
ow enough. \n\nThe second part is motivated by dyn
amic ad auction markets (e.g Facebook\, Google\, B
ing) where ad-slots are auctioned off dynamically
at large scale. Using a mean field approximation\,
I will formulate the sequential optimization prob
lem facing a bidder participating in such repeated
auctions in the form of a very general stochastic
control model\, and present an approximate soluti
on to a limiting MDP that allows for efficient com
putation of the optimal bids. I will also discuss
how the optimal bid at any given time in a dynamic
setting reduces to that of an equivalent static a
uction with a shaded private valuation\, for a ver
y general class of auction formats that includes t
he second price auction as a special case.\n\nThis
is based on joint work with Ramesh Johari (Stanfo
rd)\, Peter Key (MSR Cambridge)\, Jia Yuan Yu (IBM
Research) and Alexandre Proutiere (KTH Stockholm)
LOCATION:Engineering Department\, CBL Room BE-438 - Skype
CONTACT:
END:VEVENT
END:VCALENDAR