Sharp Characterization and Control of Global Dynamics of SGDs with Heavy Tails

  • Speaker: Xingyu Wang (Northwestern University)
  • Time: Thursday 25 April 2024, 09:45-10:30
  • Venue: External.

TMLW02 - SGD: stability, momentum acceleration and heavy tails

The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descent (SGD) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially enhance this ability further, it is imperative to go beyond traditional local convergence analysis and obtain a comprehensive understanding of the global dynamics of SGD within complex non-convex loss landscapes. In this talk, we characterize the global dynamics of SGD through a framework of heavy-tailed large deviations and local stability. This framework systematically characterizes rare events in heavy-tailed dynamical systems; building on it, we characterize intricate phase transitions in the first exit times, which lead to heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, applying this framework to SGD, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noise during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance on test data.
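As a rough illustration of what "injecting and then truncating heavy-tailed noise" could look like, the sketch below runs one-dimensional SGD in which symmetric Pareto noise is added to the gradient and each update is then clipped to a fixed bound. This is a minimal sketch under stated assumptions, not the speaker's algorithm: the function names, the Pareto noise model, and the parameters `lr`, `bound`, and `alpha` are all hypothetical choices made for the example.

```python
import random


def truncate(value, bound):
    """Clip a scalar to the interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def truncated_heavy_tailed_sgd(grad, x0, lr=0.01, bound=0.5,
                               alpha=1.5, steps=1000, seed=0):
    """Run 1-D SGD with injected heavy-tailed noise and truncated steps.

    grad  : callable returning the gradient at x
    alpha : Pareto tail index (assumed noise model; smaller = heavier tail)
    bound : maximum magnitude of any single update (the truncation level)
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        # Injected noise: symmetric Pareto, an illustrative heavy-tailed choice.
        noise = rng.choice((-1.0, 1.0)) * rng.paretovariate(alpha)
        # Truncation: no single update may move x by more than `bound`.
        step = truncate(lr * (-grad(x) + noise), bound)
        x += step
        path.append(x)
    return path


if __name__ == "__main__":
    # Toy quadratic loss f(x) = x^2, so grad(x) = 2x (hypothetical example).
    trajectory = truncated_heavy_tailed_sgd(lambda x: 2.0 * x, x0=1.0)
    print(len(trajectory), trajectory[-1])
```

Because every update is capped at `bound`, a single large noise draw cannot carry the iterate arbitrarily far. The intuition this sketch is meant to convey is that leaving a wide basin then needs several large jumps, whereas a narrow basin can be left with fewer.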

This talk is part of the Isaac Newton Institute Seminar Series.
