
Stochastic Gradient Descent with Adaptive Data


  • Jing Dong (Columbia University)
  • Thursday 25 April 2024, 11:00-11:45
  • External.


TMLW02 - SGD: stability, momentum acceleration and heavy tails

Stochastic gradient descent (SGD) is a powerful optimization technique, particularly useful in online learning scenarios. Its convergence behavior is relatively well understood under the assumption that the data samples are independent and identically distributed (iid). However, applying online learning to policy optimization problems in operations research involves a distinct challenge: the policy changes the environment and thereby affects the data used to update the policy. The adaptively generated data stream consists of samples that are non-stationary, no longer independent of each other, and affected by previous decisions. The influence of previous decisions on the environment introduces estimation bias in the gradients, which is a potential source of instability for online learning. In this paper, we introduce simple criteria on the adaptively generated data stream that guarantee the convergence of SGD. We show that the convergence speed of SGD with adaptive data is largely similar to that in the classical iid setting, as long as the mixing time of the policy-induced dynamics is factored in. Our Lyapunov-function analysis allows one to translate existing stability analysis of systems studied in operations research into convergence rates for SGD, and we demonstrate this for queuing and inventory management problems. We also showcase how our result can be applied to study an actor-critic policy gradient algorithm. This is joint work with Ethan Che and Xin Tong.
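The setting in the abstract can be illustrated with a small toy simulation. The sketch below is not taken from the talk or the underlying paper; it is a minimal example under assumed dynamics, in which the environment state follows an AR(1) process whose stationary mean equals the current policy parameter `theta`, and the long-run cost is the squared distance of the state from a target. The function name `sgd_with_adaptive_data` and all parameters (`rho`, `sigma`, `eta0`, and so on) are hypothetical choices made for illustration. Because the state has not fully mixed under the latest `theta`, each gradient estimate is biased by earlier decisions, which is the effect the abstract describes.

```python
import random


def sgd_with_adaptive_data(theta0=0.0, target=2.0, rho=0.9, sigma=0.1,
                           eta0=0.1, steps=20000, seed=0):
    """Toy SGD loop in which the policy parameter shapes its own data stream.

    Assumed (illustrative) environment: x <- rho*x + (1-rho)*theta + noise,
    so for a fixed theta the stationary mean of x is theta and the long-run
    cost E[(x - target)^2] has gradient 2*(theta - target).
    """
    rng = random.Random(seed)
    theta = theta0
    x = theta0  # environment state, started at the initial policy's mean

    for t in range(steps):
        # Gradient estimate built from the current state. It is unbiased only
        # if x were drawn from the stationary law of the current theta; here x
        # still carries the influence of earlier values of theta.
        grad_estimate = 2.0 * (x - target)

        # Decreasing step size (an assumed schedule, proportional to 1/sqrt(t)).
        eta = eta0 / (1.0 + t) ** 0.5
        theta -= eta * grad_estimate

        # The environment reacts to the updated policy. A value of rho closer
        # to 1 means slower mixing and longer-lasting dependence on old policies.
        x = rho * x + (1.0 - rho) * theta + sigma * rng.gauss(0.0, 1.0)

    return theta


if __name__ == "__main__":
    print(sgd_with_adaptive_data())
```

In this toy model the mixing time of the environment scales roughly like 1/(1 - rho), so raising `rho` toward 1 is one way to see how slower policy-induced dynamics affect the iterates.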

This talk is part of the Isaac Newton Institute Seminar Series.


© 2006-2024 Talks.cam, University of Cambridge.