BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Peter Whittle Lecture
SUMMARY:Demystifying Deep Learning - Rob Nowak (U. Wisconsin)
DTSTART;TZID=Europe/London:20220317T170000
DTEND;TZID=Europe/London:20220317T180000
UID:TALK135121AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/135121
DESCRIPTION:Neural networks have made a startling comeback during the past decade\, rebranded as "deep learning." The empirical success of neural networks is phenomenal but poorly understood. The state-of-the-art seems to change every few months\, prompting some to call it alchemy and others to suggest that wholly new mathematical approaches are required to understand neural networks. Contrary to this\, I argue that deep learning can be understood by adapting standard nonparametric statistical theory and methods to the neural network setting. Our main result is this: neural networks are exact solutions to nonparametric estimation problems in "mixed variation" function spaces. The spaces\, characterized by notions of total variation in the Radon (transform) domain\, include multivariate functions that are very smooth in all but a small number of directions. Spatial inhomogeneity of this sort leads to a fundamental gap between the performance of neural networks and linear methods (which include kernel methods)\, explaining why neural networks can outperform classical methods for high-dimensional function estimation. Our theory provides new insights into the practices of "weight decay\," "overparameterization\," and adding linear connections and layers to network architectures. It yields a deeper understanding of the role of sparsity and (avoiding) the curse of dimensionality. And lastly\, the theory leads to new and improved neural network architectures and regularization methods.\n\nThis talk is based on joint work with Rahul Parhi.\n
LOCATION:Centre for Mathematical Sciences MR2
CONTACT:HoD Secretary\, DPMMS
END:VEVENT
END:VCALENDAR