BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Bellman Optimality of Average-Reward Robust Markov Decision Proces
 ses with a Constant Gain - Shengbo Wang (University of Southern California
 )
DTSTART:20251113T144000Z
DTEND:20251113T152000Z
UID:TALK238534@talks.cam.ac.uk
DESCRIPTION:Learning and optimal control under robust Markov decision proc
 esses (MDPs) have received increasing attention\, yet most existing theory
 \, algorithms\, and applications focus on finite-horizon or discounted mod
 els. The average-reward formulation\, while natural in many operations res
 earch and management contexts\, remains underexplored. This is primarily b
 ecause the dynamic programming foundations are technically challenging and
  only partially understood\, with several fundamental questions remaining 
 open. This paper steps toward a general framework for average-reward robus
 t MDPs by analyzing the constant-gain setting. We study the average-reward
  robust control problem with possible information asymmetries between the 
 controller and an S-rectangular adversary. Our analysis centers on the con
 stant-gain robust Bellman equation\, examining both the existence of solut
 ions and their relationship to the optimal average reward. Specifically\, 
 we identify when solutions to the robust Bellman equation characterize the
  optimal average reward and stationary policies\, and we provide sufficien
 t conditions ensuring solutions' existence. These findings expand t
 he dynamic programming theory for average-reward robust MDPs and lay a fou
 ndation for robust dynamic decision making under long-run average criteria
  in operational environments.
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
