
A Geometrical Perspective on Deep Neural Networks


If you have a question about this talk, please contact Ben Day.

Deep neural networks trained with gradient descent have been extremely successful at learning solutions to difficult problems across a wide range of domains, such as vision, game playing, and natural language, many of which had previously been considered to require intelligence. Despite this tremendous success, however, we still lack a detailed, predictive understanding of how these systems work. In this talk, I will focus on recent insights into the structure of neural network loss landscapes and how gradient descent navigates them during training. In particular, I will discuss a phenomenological approach to modelling their large-scale structure [1,2] and its consequences for ensembling, calibration, and Bayesian methods in general [3]. In addition, I will make a connection to empirical observations about loss gradients and Hessians [4,5]. I will conclude with an outlook on several interesting open questions in understanding deep networks.
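
To make the loss-landscape picture concrete, here is a minimal, hypothetical sketch (not the speaker's code) of the kind of experiment behind the deep-ensembles perspective in [3]: evaluating the training loss along a straight line in weight space between two independently trained networks. A pronounced loss barrier along the segment suggests the two solutions occupy distinct modes of the landscape, which is part of why ensembling them helps. The toy data, architecture, and training settings are illustrative assumptions.

```python
# Hypothetical illustration, not the speaker's code: probe the loss along the
# line theta(alpha) = (1 - alpha) * theta_1 + alpha * theta_2 between two
# independently trained networks, in the spirit of the experiments in [3].
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 20)            # assumed: synthetic inputs, 20 features
y = (X[:, 0] > 0).long()            # assumed: a simple binary labelling rule

def make_model():
    # Small MLP; the architecture is an illustrative assumption.
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

def train(model, steps=200, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model

# Two runs from different random initialisations typically land in
# different modes of the loss landscape.
m1 = train(make_model())
m2 = train(make_model())

# Evaluate the loss at interpolated weights; a bump in the middle of the
# segment is the loss barrier separating the two modes.
probe = make_model()
loss_fn = nn.CrossEntropyLoss()
with torch.no_grad():
    for alpha in torch.linspace(0.0, 1.0, 11):
        for p, p1, p2 in zip(probe.parameters(), m1.parameters(), m2.parameters()):
            p.copy_((1 - alpha) * p1 + alpha * p2)
        print(f"alpha={alpha.item():.1f}  loss={loss_fn(probe(X), y).item():.4f}")
```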

  1. Stanislav Fort and Adam Scherlis. “The Goldilocks Zone: Towards Better Understanding of Neural Network Loss Landscapes.” Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019. arXiv:1807.02581
  2. Stanislav Fort and Stanislaw Jastrzebski. “Large Scale Structure of Neural Network Loss Landscapes.” Advances in Neural Information Processing Systems 32 (NeurIPS 2019). arXiv:1906.04724
  3. Stanislav Fort, Huiyi Hu, and Balaji Lakshminarayanan. “Deep Ensembles: A Loss Landscape Perspective.” arXiv:1912.02757
  4. Stanislav Fort, Paweł Krzysztof Nowak, Stanislaw Jastrzebski, and Srini Narayanan. “Stiffness: A New Perspective on Generalization in Neural Networks.” arXiv:1901.09491
  5. Stanislav Fort and Surya Ganguli. “Emergent Properties of the Local Geometry of Neural Loss Landscapes.” arXiv:1910.05929

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.
