
Creativity in diffusion models: Insights from statistical physics and compositional grammars (Part 2)


If you have a question about this talk, please contact Sven Krippendorf.

How do generative AI systems, such as diffusion models, learn to create new data? I will argue that natural data – such as images or text – can be described as hierarchical compositions of features, which generative models learn to recombine in novel ways. To analyze this mechanism, I will introduce ensembles of synthetic grammars as models of structured data. Within this framework, I will present a theory of composition that predicts a phase transition in the generative dynamics of diffusion models, confirmed in modern architectures trained on real-world datasets. In the second part of the talk, I will discuss how such grammars can be learned: what statistical correlations a learner can exploit to infer grammatical structure and the sample complexity required to do so. This analysis shows that diffusion models progressively build internal representations corresponding to increasingly abstract latent variables, a procedure reminiscent of the renormalization group in physics.
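To make the idea of "ensembles of synthetic grammars" concrete, here is a minimal, hypothetical sketch (not the speaker's actual construction): a randomly drawn hierarchical grammar in which each symbol at one level expands into a fixed number of symbols at the level below, so that visible data are leaves of a tree of latent features. All function names and parameters are illustrative assumptions.

```python
import random

# Toy ensemble of synthetic hierarchical grammars (illustrative only):
# each symbol at level l expands into `branching` symbols at level l - 1,
# chosen among a few random production rules; level-0 symbols are the
# visible tokens of the generated sequence.

def make_grammar(levels, vocab, branching, rules_per_symbol, seed=0):
    """Randomly draw production rules for every level of the hierarchy."""
    rng = random.Random(seed)
    grammar = {}
    for level in range(1, levels + 1):
        for symbol in range(vocab):
            # Each symbol owns a few alternative expansions into lower-level symbols.
            grammar[(level, symbol)] = [
                tuple(rng.randrange(vocab) for _ in range(branching))
                for _ in range(rules_per_symbol)
            ]
    return grammar

def generate(grammar, levels, symbol=0, rng=None):
    """Expand a top-level symbol down to a flat sequence of visible tokens."""
    rng = rng or random.Random()
    seq = [symbol]
    for level in range(levels, 0, -1):
        seq = [tok for s in seq for tok in rng.choice(grammar[(level, s)])]
    return seq

grammar = make_grammar(levels=3, vocab=4, branching=2, rules_per_symbol=2, seed=1)
sample = generate(grammar, levels=3, rng=random.Random(2))
print(sample)       # a sequence of 2**3 = 8 visible tokens
```

In such a model, low-level tokens are strongly correlated with their siblings under a shared latent parent, which is the kind of statistical structure a learner can exploit to infer the hierarchy level by level.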

This talk is part of the DAMTP Data Intensive Science Seminar series.


© 2006–2025 Talks.cam, University of Cambridge.