A Geometrical Perspective on Deep Neural Networks

Stanislav Fort (Stanford University; formerly Google Research)
Monday 13 January 2020, 13:00-14:00
LT2, Computer Laboratory, William Gates Building.

If you have a question about this talk, please contact Ben Day.

Deep neural networks trained with gradient descent have been extremely successful at learning solutions to a broad suite of difficult problems across a wide range of domains such as vision, gameplay, and natural language, many of which had previously been considered to require intelligence. Despite their tremendous success, however, we still do not have a detailed, predictive understanding of how these systems work. In this talk, I will focus on recent insights into the structure of neural network loss landscapes and how they are navigated by gradient descent during training. In particular, I will discuss a phenomenological approach to modelling their large-scale structure [1,2], and its consequences for ensembling, calibration and Bayesian methods in general [3]. In addition, I will make a connection to empirical observations about loss gradients and Hessians [4,5]. I will conclude with an outlook on several interesting open questions in understanding deep networks.

[1] Stanislav Fort, Adam Scherlis. “The Goldilocks Zone: Towards Better Understanding of Neural Network Loss Landscapes.” Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019. arXiv:1807.02581
[2] Stanislav Fort, Stanislaw Jastrzebski. “Large Scale Structure of Neural Network Loss Landscapes.” Advances in Neural Information Processing Systems 32 (NeurIPS 2019). arXiv:1906.04724
[3] Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan. “Deep Ensembles: A Loss Landscape Perspective.” arXiv:1912.02757
[4] Stanislav Fort, Paweł Krzysztof Nowak, Stanislaw Jastrzebski, Srini Narayanan. “Stiffness: A New Perspective on Generalization in Neural Networks.” arXiv:1901.09491
[5] Stanislav Fort, Surya Ganguli. “Emergent Properties of the Local Geometry of Neural Loss Landscapes.” arXiv:1910.05929

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.
