Efficient Retrieval of Influential LLM Training Examples
- 👤 Speaker: Roger Grosse (University of Toronto)
- 📅 Date & Time: Tuesday 10 March 2026, 11:00 - 12:00
- 📍 Venue: GR05 (English Faculty Building, 9 West Road, Sidgwick Site) and online (https://cam-ac-uk.zoom.us/j/86890624365?pwd=oYGWpY7d5r3JOaUCaJXTD0sRECFxab.1)
Abstract
Attributing LLM outputs to the training examples that causally influence their behavior can give us visibility into LLMs’ opaque reasoning and help us understand subtle persona changes. Unfortunately, finding training data attribution algorithms which are both accurate and scalable has remained an elusive goal. I argue for separately studying an Estimation Problem (accurately estimating the causal effect of a training example) and a Retrieval Problem (efficiently finding the highest-scoring training examples). I then present a generic retrieval method for influential sequences which can be paired with a wide range of influence estimators (including EKFAC) and for which one can obtain high confidence about recall. I discuss how causal training data attribution can be used as a tool to assure LLM alignment.
Bio: Roger is an Associate Professor of Computer Science at the University of Toronto, Schwartz Reisman Chair in Technology and Society, and a founding member of the Vector Institute. He is also a Member of Technical Staff on the Alignment Science Team at Anthropic, where his work focuses on training data attribution. He holds a Schmidt Sciences AI2050 Senior Fellowship, Sloan Fellowship, and Canada CIFAR AI Chair. His research has focused on better understanding neural net training dynamics, and using this understanding to improve training speed, generalization, uncertainty estimation, and automatic hyperparameter tuning. He’s now focusing on applying our understanding of deep learning to AI alignment. Given how fast AI is progressing, the problem of ensuring AIs are robustly aligned with human values seems like the most important thing we can be working on now.
Series
This talk is part of the Language Technology Lab Seminars series.