BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:CL-CompBio
SUMMARY:Understanding the loss landscapes of large neural
networks: scaling\, generalization\, and robustnes
s - Stanislav Fort\, Stanford University
DTSTART;TZID=Europe/London:20211015T160000
DTEND;TZID=Europe/London:20211015T170000
UID:TALK164239AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/164239
DESCRIPTION:Large deep neural networks trained with gradient d
escent have been extremely successful at learning
solutions to a broad suite of difficult problems a
cross a wide range of domains. Despite their treme
ndous success\, we still do not have a detailed\,
predictive understanding of how they work and what
makes them so effective. In this talk\, I will de
scribe recent efforts to understand the structure
of deep neural network loss landscapes and how gra
dient descent navigates them during training. In p
articular\, I will discuss a phenomenological appr
oach to modeling their large-scale structure using
high-dimensional geometry [1]\, the role of their
nonlinear nature in the early phases of training
[2]\, its effects on ensembling\, calibration\, an
d approximate Bayesian techniques [3]\, and the qu
estions of model scaling\, multi-modality\, pre-tr
aining and their connections to out-of-distributio
n robustness and generalization [4].\n\n[1] Stanis
lav Fort\, and Stanislaw Jastrzebski. "Large Scale
Structure of Neural Network Loss Landscapes." Neu
rIPS 2019. arXiv 1906.04724\n\n[2] Stanislav Fort
et al. "Deep learning versus kernel learning: an e
mpirical study of loss landscape geometry and the
time evolution of the Neural Tangent Kernel". Neur
IPS 2020. arXiv 2010.15110\n\n[3] Stanislav Fort\,
Huiyi Hu\, Balaji Lakshminarayanan. "Deep Ensembl
es: A Loss Landscape Perspective." arXiv 1912.0275
7\n\n[4] Stanislav Fort\, Jie Ren\, and Balaji Lak
shminarayanan. "Exploring the Limits of Out-of-Dist
ribution Detection." NeurIPS 2021. arXiv 2106.03004
\n
LOCATION:Department of Computer Science and Technology\, Le
cture Theatre 1
CONTACT:Pietro Lio
END:VEVENT
END:VCALENDAR