
The structure of curvature in neural networks



Teams link available upon request (it is sent out on our mailing list, eng-mlg-rcc [at] lists.cam.ac.uk). Sign up to the mailing list via lists.cam.ac.uk to receive reminders.

The curvature of the loss function plays a pivotal role in numerous neural network applications, including second-order optimization, Bayesian deep learning, iterative pruning, and sharpness-aware minimization. However, the curvature matrix is typically intractable, containing O(p²) elements, where p denotes the number of parameters. Existing tractable approximations, such as block-diagonal and Kronecker-factored methods, are often inaccurate and lack theoretical guarantees. In this work, we introduce a theoretical framework that precisely characterizes the full structure of the curvature matrix by exploiting the intrinsic symmetries of neural networks, such as invariance under parameter permutations. For multi-layer perceptrons (MLPs), we show that the global curvature can be represented using only O(d² + L²) independent factors, where d is the number of input/output dimensions and L is the number of layers, a substantial reduction from the O(p²) entries of the full matrix. These factors can be estimated efficiently, enabling accurate curvature computations. We further present preliminary extensions of our theory to Transformers and recurrent neural networks (RNNs). To assess the practical impact of our framework, we apply second-order optimization to synthetic datasets, achieving substantially faster convergence than traditional optimization methods. Our findings offer new insights into the loss landscape of neural networks and open avenues for the development of more efficient methodologies in deep learning.
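
As a concrete illustration of the permutation symmetry the abstract appeals to, here is a minimal NumPy sketch (an assumption-labeled toy example, not the speakers' code): relabeling the hidden units of an MLP layer, together with the matching rows and columns of the adjacent weight matrices, leaves the network function, and hence the loss, unchanged.

import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer MLP (dimensions arbitrary, chosen for illustration):
# x -> h = relu(W1 @ x + b1) -> y = W2 @ h + b2
d_in, d_hidden, d_out = 4, 8, 3
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))
b2 = rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)  # elementwise ReLU
    return W2 @ h + b2

x = rng.normal(size=d_in)
perm = rng.permutation(d_hidden)  # relabel the hidden units

# Permute the rows of (W1, b1) and the matching columns of W2.
y_original = mlp(x, W1, b1, W2, b2)
y_permuted = mlp(x, W1[perm], b1[perm], W2[:, perm], b2)

assert np.allclose(y_original, y_permuted)  # the network function is unchanged

Because the loss is invariant under such reparameterizations, its Hessian at a permuted parameter vector equals the correspondingly permuted Hessian at the original point; constraints of this kind are what allow the full curvature matrix to be described by far fewer independent factors, which is the subject of the talk.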

This talk is part of the Machine Learning Reading Group @ CUED series.
