Sharp Characterization and Control of Global Dynamics of SGDs with Heavy Tails
TMLW02 - SGD: stability, momentum acceleration and heavy tails

The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descent (SGD) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially further enhance this capability, it is imperative to go beyond traditional local convergence analysis and obtain a comprehensive understanding of the global dynamics of SGD within complex non-convex loss landscapes. In this talk, we characterize the global dynamics of SGD through the heavy-tailed large deviations and local stability framework. This framework systematically characterizes rare events in heavy-tailed dynamical systems; building on it, we characterize intricate phase transitions in the first exit times, which lead to heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, applying this framework to SGD, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noise during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance on test data.

This talk is part of the Isaac Newton Institute Seminar Series.
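The idea in the abstract of injecting and then truncating heavy-tailed noise can be illustrated with a minimal one-dimensional sketch. This is not the speaker's implementation: the function names (`heavy_tailed_noise`, `truncate`, `truncated_noisy_sgd`), the symmetric Pareto noise model, the tail index `alpha`, and the choice to clip the whole update step to `[-b, b]` are all assumptions made for illustration.

```python
import random


def heavy_tailed_noise(rng, alpha=1.5):
    """Symmetric Pareto-type noise (assumed model); infinite variance when alpha < 2."""
    # paretovariate returns values >= 1, so shift the magnitude to start at 0.
    magnitude = rng.paretovariate(alpha) - 1.0
    return magnitude if rng.random() < 0.5 else -magnitude


def truncate(step, b):
    """Clip a scalar update step to the interval [-b, b]."""
    return max(-b, min(b, step))


def truncated_noisy_sgd(grad, x0, lr, b, n_steps, alpha=1.5, seed=0):
    """Run 1-D SGD with injected heavy-tailed noise and a truncated step.

    grad: function returning the gradient of the loss at x (hypothetical input).
    b:    truncation threshold; no single update moves x by more than b.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Inject heavy-tailed noise into the gradient, then truncate the step.
        step = -lr * (grad(x) + heavy_tailed_noise(rng, alpha))
        x += truncate(step, b)
        path.append(x)
    return path


if __name__ == "__main__":
    # Toy quadratic loss f(x) = x^2 with gradient 2x, chosen only for the demo.
    path = truncated_noisy_sgd(lambda x: 2.0 * x, x0=1.0, lr=0.1, b=0.05, n_steps=200)
    print(f"final iterate: {path[-1]:.4f}")
```

Under these assumptions, truncation caps how far a single large noise draw can move the iterate, so leaving a wide basin would take several large jumps in a row while a narrow basin can still be left in one. The sketch only demonstrates the update rule and makes no claim about the quantitative results presented in the talk.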