The structure of curvature in neural networks
If you have a question about this talk, please contact .

Teams link available upon request (it is sent out on our mailing list, eng-mlg-rcc [at] lists.cam.ac.uk). Sign up to our mailing list for easier reminders via lists.cam.ac.uk.

The curvature of the loss function plays a pivotal role in numerous neural network applications, including second-order optimization, Bayesian deep learning, iterative pruning, and sharpness-aware minimization. However, the curvature matrix is typically intractable, containing O(p²) elements, where p denotes the number of parameters. Existing tractable approximations, such as block-diagonal and Kronecker-factored methods, often suffer from inaccuracy and lack theoretical guarantees.

In this work, we introduce a novel theoretical framework that precisely characterizes the full structure of the curvature matrix by exploiting the intrinsic symmetries of neural networks, such as invariance under parameter permutations. For Multi-Layer Perceptrons (MLPs), our approach demonstrates that the global curvature can be represented using only O(d² + L²) independent factors, where d is the number of input/output dimensions and L is the number of layers. This significantly reduces the computational complexity compared to the O(p²) elements of the full matrix. These factors can be efficiently estimated, enabling accurate curvature computations. We further present preliminary extensions of our theory to Transformers and Recurrent Neural Networks (RNNs).

To assess the practical impact of our framework, we apply second-order optimization to synthetic datasets, achieving substantially faster convergence than traditional optimization methods. Our findings offer new insights into the loss landscape of neural networks and open avenues for the development of more efficient methodologies in deep learning.

This talk is part of the Machine Learning Reading Group @ CUED series.
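To make the O(p²) scaling concrete, below is a minimal JAX sketch that forms the exact Hessian of a toy MLP loss. This is not the speaker's construction; the network sizes, data, and loss are invented purely for illustration of why the full curvature matrix becomes intractable as p grows.

```python
import jax
import jax.numpy as jnp

# Toy sizes (illustrative assumptions, not from the talk): d inputs/outputs,
# one hidden layer of width d, weights only (no biases).
d = 4
p = 2 * d * d                                   # total number of parameters

def mlp(theta, x):
    """Forward pass of the tiny MLP; theta is the flat parameter vector."""
    W1 = theta[:d * d].reshape(d, d)
    W2 = theta[d * d:].reshape(d, d)
    return W2 @ jnp.tanh(W1 @ x)

def loss(theta, x, y):
    """Squared-error loss whose curvature (Hessian) we examine."""
    return jnp.mean((mlp(theta, x) - y) ** 2)

key = jax.random.PRNGKey(0)
theta = jax.random.normal(key, (p,)) / jnp.sqrt(d)
x = jnp.ones(d)
y = jnp.zeros(d)

# The exact curvature matrix: p x p entries, i.e. O(p^2) storage.
H = jax.hessian(loss)(theta, x, y)
print(p, H.shape)                               # 32 parameters -> a 32 x 32 Hessian
```

For a constant-width MLP with L layers of width d, the parameter count grows roughly like L·d², so the full curvature matrix has on the order of L²·d⁴ entries; the abstract's claim is that exploiting parameter-permutation symmetry reduces this to O(d² + L²) independent factors.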