Natural gradient in deep neural networks
If you have a question about this talk, please contact Robert Pinsler.
We introduce the natural gradient method for stochastic optimization and discuss whether and how it can be applied to deep neural networks. We motivate the natural gradient by showing that the performance of stochastic gradient descent depends heavily on the choice of parameterization and ignores the information geometry of the model. We show that this geometry is described by the Fisher information metric, and that steepest descent on the loss with respect to this metric yields the natural gradient, which is invariant to reparameterization of the model. We connect the natural gradient to second-order optimization methods and discuss possible applications to deep neural networks. In particular, we present K-FAC, a method that approximates the Fisher information matrix as block-diagonal across layers, with each block Kronecker-factorized, so that its inverse can be computed efficiently. This perspective connects a variety of different methods under a unified framework (e.g. adaptive gradients, batch normalization, whitening). We describe applications of K-FAC to both standard and convolutional neural networks, and compare it with state-of-the-art methods.
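As a rough illustration of the update discussed above (not material from the talk; all shapes, names, and damping values below are illustrative assumptions), the following NumPy sketch contrasts a plain SGD step with a natural gradient step using an explicit damped Fisher matrix, and then shows the K-FAC idea of preconditioning one layer's gradient with the two small Kronecker factors instead of the full block.

import numpy as np

rng = np.random.default_rng(0)

# --- Natural gradient with an explicit (damped) Fisher matrix ---
# theta: flattened parameters, grad: loss gradient,
# fisher: Fisher information matrix estimated from model samples.
d = 10
theta = rng.normal(size=d)
grad = rng.normal(size=d)
fisher = rng.normal(size=(d, d))
fisher = fisher @ fisher.T / d + 1e-3 * np.eye(d)  # symmetric PSD plus damping

lr = 0.1
theta_sgd = theta - lr * grad                          # plain SGD step
theta_ng = theta - lr * np.linalg.solve(fisher, grad)  # natural gradient step

# --- K-FAC-style update for one fully connected layer ---
# W has shape (out, in). K-FAC approximates this layer's Fisher block as
# A kron G, where A = E[a a^T] (layer inputs) and G = E[g g^T]
# (backpropagated pre-activation gradients). The Kronecker structure gives
# F^{-1} vec(dW) = vec(G^{-1} dW A^{-1}), so the large block is never formed.
n_out, n_in = 4, 6
dW = rng.normal(size=(n_out, n_in))   # gradient of the loss w.r.t. W
a = rng.normal(size=(128, n_in))      # batch of layer inputs
g = rng.normal(size=(128, n_out))     # batch of pre-activation gradients
A = a.T @ a / len(a) + 1e-3 * np.eye(n_in)
G = g.T @ g / len(g) + 1e-3 * np.eye(n_out)
dW_nat = np.linalg.solve(G, dW) @ np.linalg.inv(A)  # K-FAC preconditioned gradient

The point of the sketch is the cost difference: the explicit Fisher solve scales with the full parameter count, while the K-FAC step only ever inverts matrices of the layer's input and output dimensions.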
This talk is part of the Machine Learning Reading Group @ CUED series.