BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Efficient Retrieval of Influential LLM Training Examples - Roger G
 rosse (University of Toronto)
DTSTART:20260310T110000Z
DTEND:20260310T120000Z
UID:TALK245500@talks.cam.ac.uk
CONTACT:Lucas Resck
DESCRIPTION:Abstract: Attributing LLM outputs to the training examples tha
 t causally influence their behavior can give us visibility into LLMs’ op
 aque reasoning and help us understand subtle persona changes. Unfortunatel
 y\, finding training data attribution algorithms which are both accurate a
 nd scalable has remained an elusive goal. I argue for separately studying 
 an Estimation Problem (accurately estimating the causal effect of a traini
 ng example) and a Retrieval Problem (efficiently finding the highest-scori
 ng training examples). I then present a generic retrieval method for influ
 ential sequences which can be paired with a wide range of influence estima
 tors (including EKFAC) and for which one can obtain high confidence about 
 recall. I discuss how causal training data attribution can be used as a to
 ol to assure LLM alignment.\n\nBio: Roger is an Associate Professor of Com
 puter Science at the University of Toronto\, Schwartz Reisman Chair in Tec
 hnology and Society\, and a founding member of the Vector Institute. He is
  also a Member of Technical Staff on the Alignment Science Team at Anthrop
 ic\, where his work focuses on training data attribution. He holds a Schmi
 dt Sciences AI2050 Senior Fellowship\, a Sloan Fellowship\, and a Canada C
 IFAR AI Chair. His research has focused on better understanding neu
 ral net tra
 ining dynamics\, and using this understanding to improve training speed\, 
 generalization\, uncertainty estimation\, and automatic hyperparameter tun
 ing. He's now focusing on applying our understanding of deep learning to A
 I alignment. Given how fast AI is progressing\, the problem of ensuring AI
 s are robustly aligned with human values seems like the most important thi
 ng we can be working on now.
LOCATION:GR05 (English Faculty Building\, 9 West Road\, Sidgwick Site) and
  online (https://cam-ac-uk.zoom.us/j/86890624365?pwd=oYGWpY7d5r3JOaUCaJXTD
 0sRECFxab.1)
END:VEVENT
END:VCALENDAR
