Towards a Theoretical Understanding of Deep Learning via the Minimum Description Length Principle

  • Dr Yoshinari Takeishi, Kyushu University
  • Wednesday 04 February 2026, 14:00-15:00
  • MR5, CMS Pavilion A.

If you have a question about this talk, please contact Prof. Ramji Venkataramanan.

Deep learning is a core machine learning technology that has driven the rapid improvement and broad adoption of artificial intelligence in recent years. It is based on learning with multilayer neural networks, and models with massive numbers of parameters, most notably large language models (LLMs), have shown remarkable performance. However, the theoretical foundations for why such large-scale models can be trained successfully and achieve high generalization performance are still incomplete, and many researchers are actively working on this problem. In particular, there is a gap between the insight from classical information criteria such as AIC and MDL, which suggests that preventing overfitting requires selecting a model of an appropriate size, and the empirical success of modern deep learning. Bridging this gap is an important challenge.

In this talk, we tackle these theoretical challenges in deep learning from the viewpoint of the Minimum Description Length (MDL) principle. We first focus on a simple two-layer neural network and present how one can obtain performance guarantees for an MDL estimator by leveraging a distinctive eigenvalue structure of the Fisher information matrix that we have recently identified. We then discuss prospects for extending this approach to more complex deep neural networks.

(The previous talk on 28 January will provide useful background, but this talk will be self-contained.)

This talk is part of the Information Theory Seminar series.
