Information Geometry — Natural Gradient Descent

Zoom link available upon request (it is sent out on our mailing list, eng-mlg-rcc [at] …). Sign up to our mailing list for easier reminders.

Information geometry applies fundamental concepts of differential geometry to probability theory and statistics in order to study statistical manifolds: Riemannian manifolds in which each point corresponds to a probability distribution. Perhaps the most prominent application of information geometry in machine learning is natural gradient descent (NGD), which can be described as ‘gradient descent which respects the curvature of the statistical manifold’. In this session, we will look at NGD from different perspectives. In particular, we will consider NGD as 1) preconditioning the gradient with the inverse Fisher information matrix, 2) second-order optimisation with Newton’s method, 3) the result of stating gradient descent as a valid tensor equation, and (bonus) *) mirror descent in a dual Riemannian manifold. Previous knowledge of differential geometry is NOT required.
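
As a rough illustration of perspectives 1) and 2), the sketch below runs NGD on a small logistic regression problem. It is not taken from the talk or the reading: the function names (sigmoid, nll_grad, fisher, natural_gradient_descent), the damping term, and the synthetic data are all assumptions made for the example. The update is w ← w − lr · F(w)⁻¹ g(w), where g is the gradient of the mean negative log-likelihood and F is the Fisher information matrix of the model's predictive distribution, averaged over the observed inputs.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def nll_grad(w, X, y):
    """Gradient of the mean negative log-likelihood of logistic regression."""
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)


def fisher(w, X, damping=1e-6):
    """Fisher information matrix, averaged over the observed inputs.

    For a Bernoulli output with logit x^T w this is
    F = mean_i p_i (1 - p_i) x_i x_i^T.
    The small damping term is an assumption added for numerical stability.
    """
    p = sigmoid(X @ w)
    F = (X.T * (p * (1.0 - p))) @ X / X.shape[0]
    return F + damping * np.eye(X.shape[1])


def natural_gradient_descent(X, y, lr=1.0, steps=50):
    """Precondition each gradient step with the inverse Fisher matrix."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        g = nll_grad(w, X, y)
        # Solve F d = g rather than forming the inverse explicitly.
        w = w - lr * np.linalg.solve(fisher(w, X), g)
    return w


# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = (rng.uniform(size=500) < sigmoid(X @ w_true)).astype(float)

w_hat = natural_gradient_descent(X, y)
print("estimated weights:", w_hat)
print("gradient norm at solution:", np.linalg.norm(nll_grad(w_hat, X, y)))
```

For logistic regression the Hessian of the negative log-likelihood does not depend on the labels, so it coincides with the Fisher matrix computed above. With lr=1.0 the loop is therefore (up to the damping term) Newton's method, which is one simple case where perspectives 1) and 2) agree exactly; for general models the two differ.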

Reading (suggested, but not required):

Martens, J. (2020). “New Insights and Perspectives on the Natural Gradient Method”. Journal of Machine Learning Research, Volume 21, Issue 146. (Sections 5, 8–8.1, 9.2; 6 pages)

Raskutti, G., Mukherjee, S. (2015). “The information geometry of mirror descent”. IEEE Transactions on Information Theory, Volume 61, Issue 3. (Sections 1, 2; 3 pages)

This talk is part of the Machine Learning Reading Group @ CUED series.
