BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Machine Learning Reading Group @ CUED
SUMMARY:Learning polynomials with Neural Networks - Aldo Pacchiano
  (Berkeley)
DTSTART;TZID=Europe/London:20160526T143000
DTEND;TZID=Europe/London:20160526T160000
UID:TALK66393AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/66393
DESCRIPTION:We study the effectiveness of learning low degree polynomials
  using neural networks by the gradient descent method. While neural
  networks have been shown to have great expressive power\, and gradient
  descent has been widely used in practice for learning neural networks\,
  few theoretical guarantees are known for such methods. In particular\,
  it is well known that gradient descent can get stuck at local minima\,
  even for simple classes of target functions. In this paper\, we present
  several positive theoretical results to support the effectiveness of
  neural networks. We focus on two-layer neural networks where the bottom
  layer is a set of non-linear hidden nodes\, and the top layer node is a
  linear function\, similar to Barron (1993). We show that for a randomly
  initialized neural network with sufficiently many hidden units\, the
  generic gradient descent algorithm learns any low degree polynomial\,
  assuming we initialize the weights randomly.
LOCATION:Engineering Department\, CBL Room 438
CONTACT:Yingzhen Li
END:VEVENT
END:VCALENDAR