BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Isaac Newton Institute Seminar Series
SUMMARY:Stochastic Quasi-Gradient Methods: Variance Reduct
 ion via Jacobian Sketching - Peter Richtarik (Univ
 ersity of Edinburgh\; King Abdullah University of 
 Science and Technology (KAUST)\; Moscow Institute 
 of Physics and Technology)
DTSTART;TZID=Europe/London:20190620T095000
DTEND;TZID=Europe/London:20190620T104000
UID:TALK126265AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/126265
DESCRIPTION:We develop a new family of variance reduced stocha
 stic gradient descent methods for minimizing the a
 verage of a very large number of smooth functions.
  Our method&mdash\;JacSketch&mdash\;is motivated b
 y novel de- velopments in randomized numerical lin
 ear algebra\, and operates by maintaining a stocha
 stic estimate of a Jacobian matrix composed of the
  gradients of individual functions. In each iterat
 ion\, JacSketch efficiently updates the Jacobian m
 atrix by first obtaining a random linear measureme
 nt of the true Jacobian through (cheap) sketching\
 , and then projecting the previous estimate onto t
 he solution space of a linear matrix equa- tion wh
 ose solutions are consistent with the measurement.
  The Jacobian estimate is then used to compute a v
 ariance-reduced unbiased estimator of the gradient
 \, followed by a stochastic gradient descent step.
  Our strategy is analogous to the way quasi-Newton
  methods maintain an estimate of the Hessian\, and
  hence our method can be seen as a stochastic q ua
 si-gradient method. Indeed\, quasi-Newton methods 
 project the current Hessian estimate onto a soluti
 on space of a linear equation consistent with a ce
 rtain linear (but non-random) measurement of the t
 rue Hessian. Our method can also be seen as stocha
 stic gradient descent applied to a controlled stoc
 hastic optimization reformulation of the original 
 problem\, where the control comes from the Jacobia
 n estimates. We prove that for smooth and strongly
  convex functions\, JacSketch converges linearly w
 ith a meaningful rate dictated by a single converg
 ence theorem which applies to general sketches. We
  also provide a refined convergence theorem which 
 applies to a smaller class of sketches\, featuring
  a novel proof technique based on a stochastic Lya
 punov function. This enables us to obtain sharper 
 complexity results for variants of JacSketch with 
 importance sampling. By specializing our general a
 pproach to specific sketching strategies\, JacSket
 ch reduces to the celebrated stochastic average gr
 adient (SAGA) method\, and<br><br>Co-authors: Robe
 rt Mansel Gower (Telecom ParisTech)\, Francis Bach
  (INRIA - ENS - PSL Research University)<br>
LOCATION:Seminar Room 1\, Newton Institute
CONTACT:INI IT
END:VEVENT
END:VCALENDAR