
The manifold hypothesis in science & AI


If you have a question about this talk, please contact Qingyuan Zhao.

The manifold hypothesis is a widely accepted tenet of machine learning which asserts that nominally high-dimensional data are in fact concentrated around a low-dimensional manifold. In this talk, I will show some real examples of manifold structure occurring in science and in AI (internal representations of LLMs), and discuss associated research questions, particularly around how observed topology and geometry might map to the real world or human perceptions. I will present a statistical model and associated theory which explains how complex hidden manifold structure might emerge from simple statistical assumptions (e.g. latent variables, correlation, stationarity), exposing different possible mathematical relationships between the manifold and the ground truth (e.g. homeomorphism, isometry), and elucidating the efficacy of popular combinations of tools for data exploration (e.g. PCA followed by t-SNE).
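As a rough illustration of the opening claim (not the model or theory presented in the talk), the following sketch generates nominally 50-dimensional data driven by a single hidden latent variable and uses PCA to show that almost all of the variance sits in a two-dimensional subspace containing a closed curve. The sample size, dimension, noise level and the cosine/sine feature map are all assumptions chosen for the example, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the talk).
n, p = 500, 50                               # n points in p nominal dimensions
z = rng.uniform(0.0, 2.0 * np.pi, size=n)    # hidden 1-D latent variable

# Every coordinate is a smooth function of the same latent z: a random
# linear mix of cos(z) and sin(z). The noiseless data therefore trace a
# closed curve (a linearly mapped circle) inside R^p.
mix = rng.normal(size=(2, p))
clean = np.column_stack([np.cos(z), np.sin(z)]) @ mix
x = clean + 0.05 * rng.normal(size=(n, p))   # small isotropic noise

# PCA via the SVD of the centred data matrix.
xc = x - x.mean(axis=0)
u, s, vt = np.linalg.svd(xc, full_matrices=False)
explained = s**2 / np.sum(s**2)              # variance share per component

top2 = explained[:2].sum()                   # share captured by 2 components
scores = xc @ vt[:2].T                       # 2-D PCA embedding of the points

print(f"variance explained by the first two components: {top2:.3f}")
```

Plotting `scores` would show the points lying close to an ellipse, a one-dimensional manifold, even though each observation has 50 coordinates. A nonlinear method such as t-SNE could be applied to `scores` afterwards; that step is omitted here to keep the sketch dependency-light.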

Papers:

Nick Whiteley, Annie Gray, Patrick Rubin-Delanchy. “Statistical exploration of the Manifold Hypothesis”. JRSSB (with discussion), to appear.

Alexander Modell, Patrick Rubin-Delanchy, Nick Whiteley. “The Origins of Representation Manifolds in Large Language Models”. arXiv:2505.18235.

This talk is part of the Statistics series.

