Clustering Dynamics in Mean-Field Models of Transformers


RCL - Representing, calibrating & leveraging prediction uncertainty from statistics to machine learning

Transformers are a central architecture in modern deep learning, forming the backbone of large language models such as ChatGPT. In this talk, I will present a mathematical framework for studying how information—represented as “tokens”—evolves through the layers of such neural networks. Specifically, we consider a family of partial differential equations that describe how the distribution of tokens—modeled as particles interacting in a mean-field way—changes with depth. Numerical experiments reveal that, under certain conditions, these dynamics exhibit a metastable clustering phenomenon, where tokens group into well-separated clusters that evolve slowly over time. A rigorous analysis of this behavior uncovers a range of open questions and unexpected connections to analysis and geometry.
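As a rough illustration of the kind of interacting-particle dynamics described above, the sketch below simulates tokens as points on the unit sphere that drift toward a softmax-weighted average of the other tokens. The abstract does not state the equations, so everything here is an assumption made for illustration: the specific drift (a form of self-attention dynamics studied in the literature on mean-field models of transformers), the restriction to the sphere, the explicit Euler discretization, and the names `attention_step`, `simulate`, `beta`, and `dt`. It is not the speaker's model or code.

```python
import numpy as np


def attention_step(x, beta, dt):
    """One explicit Euler step of an assumed softmax-attention particle system.

    x    : (n_tokens, dim) array of unit vectors (tokens on the sphere)
    beta : inverse-temperature parameter of the attention weights (assumed)
    dt   : step size, playing the role of a small increment in depth
    """
    # Attention weights: row-wise softmax of beta * <x_i, x_j>.
    logits = beta * (x @ x.T)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Each token is pulled toward the weighted average of all tokens.
    drift = weights @ x

    # Keep only the component tangent to the sphere at each token.
    drift -= np.sum(drift * x, axis=1, keepdims=True) * x

    # Euler step, then renormalize back onto the unit sphere.
    x_new = x + dt * drift
    return x_new / np.linalg.norm(x_new, axis=1, keepdims=True)


def simulate(n_tokens=32, dim=3, beta=4.0, dt=0.1, n_steps=200, seed=0):
    """Run the assumed dynamics from random initial tokens on the sphere."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_tokens, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    for _ in range(n_steps):
        x = attention_step(x, beta, dt)
    return x


if __name__ == "__main__":
    tokens = simulate()
    # Pairwise inner products near 1 indicate tokens that have grouped together.
    print(np.round(tokens @ tokens.T, 2))
```

Inspecting the matrix of pairwise inner products at several values of `n_steps` is one simple way to look for groups of tokens that form and then change slowly; whether and when this happens depends on the assumed parameters.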


This talk is part of the Isaac Newton Institute Seminar Series.
