If you have a question about this talk, please contact Elre Oldewage.
Empirically, it has been observed that overparameterized neural networks trained by stochastic gradient descent (SGD) generalize well, even in the absence of any explicit regularization. Because of overparameterization, there exist minima of the training loss that generalize poorly, but such bad minima are never encountered in practice. In recent years, a growing body of work suggests that the optimizer (SGD or a variant) implicitly regularizes the training process and steers it towards good minima that generalize well. In this presentation, we review three (non-exclusive) theories that aim to quantify this effect: 1) Minibatch noise in SGD avoids sharp minima that generalize poorly, 2) Gradient descent finds solutions with minimum norm, 3) SGD is equivalent to a regularized gradient flow. These theories may improve our understanding of optimization and generalization in overparameterized models.
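As a concrete illustration of the second theory, the following minimal sketch (NumPy, with a hypothetical random-data setup not taken from the talk) runs plain gradient descent from a zero initialization on an overparameterized linear regression problem and checks that the iterate approaches the minimum-l2-norm interpolating solution given by the pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                      # fewer samples than parameters: overparameterized
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Full-batch gradient descent on the squared loss, started at zero.
# Starting at zero keeps the iterates in the row space of X, which is
# why the limit is the minimum-norm interpolating solution.
w = np.zeros(d)
lr = 0.01
for _ in range(50_000):
    grad = X.T @ (X @ w - y) / n
    w -= lr * grad

w_min_norm = np.linalg.pinv(X) @ y  # minimum-l2-norm solution of X w = y

print("training residual:        ", np.linalg.norm(X @ w - y))
print("distance to min-norm sol.:", np.linalg.norm(w - w_min_norm))
```

For the third theory, one commonly cited form of the result (an assumption here, not a statement from the talk itself) is that gradient descent with step size eta approximately follows the gradient flow of a modified loss equal to the original loss plus (eta/4) times the squared norm of its gradient, i.e. an implicit penalty on sharp directions.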