Stochastic Gradient Descent with Adaptive Data
Workshop: TMLW02 - SGD: stability, momentum acceleration and heavy tails
This talk is part of the Isaac Newton Institute Seminar Series.

Stochastic gradient descent (SGD) is a powerful optimization technique, particularly useful in online learning scenarios. Its convergence and effectiveness are relatively well understood under the assumption that the data samples are independent and identically distributed (iid). However, applying online learning to policy optimization problems in operations research involves a distinct challenge: the policy changes the environment and thereby affects the data used to update the policy. The adaptively generated data stream consists of samples that are non-stationary, no longer independent of each other, and affected by previous decisions. The influence of previous decisions on the environment introduces estimation bias in the gradients, which is a potential source of instability for online learning.

In this paper, we introduce simple criteria on the adaptively generated data stream that guarantee the convergence of SGD. We show that the convergence speed of SGD with adaptive data is largely similar to that in the classical iid setting, as long as the mixing time of the policy-induced dynamics is factored in. Our Lyapunov-function analysis allows one to translate existing stability analyses of systems studied in operations research into convergence rates for SGD, and we demonstrate this for queuing and inventory management problems. We also showcase how our result can be applied to study an actor-critic policy gradient algorithm.

This is joint work with Ethan Che and Xin Tong.
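
The feedback loop described in the abstract, where the decision being optimized shapes the data used to optimize it, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the talk or the paper: the AR(1)-style environment, the quadratic loss, the step-size schedule, and the names `sgd_with_adaptive_data`, `target`, `rho`, and `sigma` are all hypothetical.

```python
import random


def sgd_with_adaptive_data(target=3.0, rho=0.9, sigma=0.5, steps=20000, seed=0):
    """Toy SGD where the decision variable theta shifts the data distribution.

    Hypothetical environment (an assumption for illustration, not from the talk):
        x_{t+1} = rho * x_t + (1 - rho) * theta_t + sigma * noise
    For a fixed theta the stationary mean of x is theta, so the stationary
    loss E[(x - target)^2] has gradient 2 * (theta - target) in theta.
    """
    rng = random.Random(seed)
    theta, x = 0.0, 0.0
    for t in range(steps):
        # Gradient estimate from the current sample. It is biased because x
        # still reflects earlier values of theta (the chain has not mixed).
        grad = 2.0 * (x - target)
        # Decaying step size; the offset of 20 keeps the early steps small.
        theta -= grad / (t + 20)
        # The environment reacts to the updated decision, so the samples are
        # neither independent nor identically distributed.
        x = rho * x + (1.0 - rho) * theta + sigma * rng.gauss(0.0, 1.0)
    return theta


if __name__ == "__main__":
    print(sgd_with_adaptive_data())
```

In this sketch, values of `rho` closer to 1 make the environment mix more slowly, so consecutive gradient estimates are more strongly correlated. That is only a loose analogue of the mixing-time dependence mentioned in the abstract, not a reproduction of the paper's criteria or rates.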