
The structure of curvature in neural networks



Teams link available upon request (it is sent out on our mailing list, eng-mlg-rcc [at] lists.cam.ac.uk). Sign up to the mailing list via lists.cam.ac.uk to receive reminders.

The curvature of the loss function plays a pivotal role in numerous neural network applications, including second-order optimization, Bayesian deep learning, iterative pruning, and sharpness-aware minimization. However, the curvature matrix is typically intractable, containing O(p²) elements, where p denotes the number of parameters. Existing tractable approximations, such as block-diagonal and Kronecker-factored methods, are often inaccurate and lack theoretical guarantees. In this work, we introduce a theoretical framework that precisely characterizes the full structure of the curvature matrix by exploiting the intrinsic symmetries of neural networks, such as invariance under parameter permutations. For multi-layer perceptrons (MLPs), we show that the global curvature can be represented using only O(d² + L²) independent factors, where d is the number of input/output dimensions and L is the number of layers, a substantial reduction from the O(p²) entries of the full matrix. These factors can be estimated efficiently, enabling accurate curvature computations. We further present preliminary extensions of our theory to Transformers and recurrent neural networks (RNNs). To assess the practical impact of our framework, we apply second-order optimization to synthetic datasets, achieving substantially faster convergence than traditional optimization methods. Our findings offer new insights into the loss landscape of neural networks and open avenues for the development of more efficient methodologies in deep learning.
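
As a concrete illustration of the permutation symmetry the abstract appeals to, here is a minimal NumPy sketch (an assumption-labeled toy example, not the speakers' code): relabeling the hidden units of an MLP layer, together with the matching rows and columns of the adjacent weight matrices, leaves the network function, and hence the loss, unchanged.

import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer MLP (dimensions arbitrary, chosen for illustration):
# x -> h = relu(W1 @ x + b1) -> y = W2 @ h + b2
d_in, d_hidden, d_out = 4, 8, 3
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))
b2 = rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)  # elementwise ReLU
    return W2 @ h + b2

x = rng.normal(size=d_in)
perm = rng.permutation(d_hidden)  # relabel the hidden units

# Permute the rows of (W1, b1) and the matching columns of W2.
y_original = mlp(x, W1, b1, W2, b2)
y_permuted = mlp(x, W1[perm], b1[perm], W2[:, perm], b2)

assert np.allclose(y_original, y_permuted)  # the network function is unchanged

Because the loss is invariant under such reparameterizations, its Hessian at a permuted parameter vector equals the correspondingly permuted Hessian at the original point; constraints of this kind are what allow the full curvature matrix to be described by far fewer independent factors, which is the subject of the talk.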

This talk is part of the Machine Learning Reading Group @ CUED series.
