Latent Concepts in Large Language Models
If you have a question about this talk, please contact Prof. Ramji Venkataramanan.

Large Language Models (LLMs) have achieved remarkable fluency and versatility, but understanding how they represent meaning internally remains a challenge. In this talk, we explore the emerging science of latent concepts in LLMs: the semantic abstractions implicitly encoded in their internal activations. We examine how concepts, such as truthfulness, formality, or sentiment, can be represented as low-dimensional structures, discovered through training dynamics, and understood through the lens of linear algebra and associative memory. We discuss the implications for interpretability, robustness, and control, including how concepts can be steered at test time to adjust model behavior without retraining.

Specifically, we explore empirical and theoretical evidence supporting the linear representation hypothesis, under which such concepts correspond to vectors or affine subspaces that emerge naturally from training dynamics and next-token prediction objectives. We further show that LLMs behave as associative memory systems, retrieving outputs based on latent similarity rather than logical inference. This behavior underlies phenomena such as context hijacking, where semantically misleading prompts can bias the model's response. We introduce formal latent concept models that unify these ideas, describe conditions under which concepts are identifiable, and propose learning algorithms for extracting interpretable, controllable representations. We argue that such latent concept modeling offers a principled framework for bridging representation learning with interpretability and model alignment, and a promising path toward safer, more controllable, and more trustworthy AI.

This talk is part of the Signal Processing and Communications Lab Seminars series.
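The claim that concepts "can be steered at test time without retraining" is commonly operationalised as activation steering. Below is a minimal sketch of that general technique, not the speaker's actual method: it assumes a PyTorch decoder-only model, a difference-of-means concept vector, and a hypothetical layer path (`model.transformer.h[10]`).

```python
# Illustrative sketch only, not code from the talk. One common reading of the
# linear representation hypothesis: a concept (e.g. formality) corresponds to a
# direction in activation space, estimable as the difference of mean hidden
# states over contrasting prompts; adding that direction at inference time
# then steers behaviour without retraining. Layer names below are hypothetical.
import torch

def concept_direction(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    """Difference-of-means concept vector.

    pos_acts / neg_acts: (n_prompts, hidden_dim) hidden states collected at one
    layer from concept-positive and concept-negative prompts.
    """
    v = pos_acts.mean(dim=0) - neg_acts.mean(dim=0)
    return v / v.norm()  # unit norm, so alpha below sets steering strength

def make_steering_hook(v: torch.Tensor, alpha: float):
    """Forward hook that shifts a layer's output along the concept vector v."""
    def hook(module, inputs, output):
        # Many decoder blocks return a tuple whose first element holds the
        # (batch, seq_len, hidden_dim) hidden states.
        if isinstance(output, tuple):
            return (output[0] + alpha * v,) + output[1:]
        return output + alpha * v
    return hook

# Hypothetical usage with a Hugging Face-style model:
#   v = concept_direction(pos_acts, neg_acts)
#   handle = model.transformer.h[10].register_forward_hook(make_steering_hook(v, 4.0))
#   ... generate steered text, then handle.remove() to restore the base model.
```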
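The associative-memory view can likewise be made concrete with a toy similarity-based readout. The sketch below illustrates latent-similarity retrieval in general (the same softmax readout used in attention and modern Hopfield networks); it is not the formal latent concept model the talk introduces, and all data in it are made up.

```python
# Illustrative sketch only: a minimal associative memory that retrieves stored
# values by latent similarity rather than logical inference, in the spirit of
# the abstract's claim. All patterns and values here are toy examples.
import numpy as np

def retrieve(query: np.ndarray, keys: np.ndarray, values: np.ndarray,
             beta: float = 8.0) -> np.ndarray:
    """Soft nearest-neighbour recall: keys (m, d), values (m, k), query (d,)."""
    scores = keys @ query                       # similarity in latent space
    w = np.exp(beta * (scores - scores.max()))  # numerically stable softmax
    w /= w.sum()
    return w @ values                           # similarity-weighted readout

# A query nudged toward the "wrong" key is recalled as the wrong value:
# a toy analogue of context hijacking by semantically misleading prompts.
keys = np.eye(3)                                # three stored latent patterns
values = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
misleading = np.array([0.2, 0.9, 0.0])          # closer to pattern 1 than 0
print(retrieve(misleading, keys, values))       # ~values[1], not values[0]
```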