
Stochastic Quasi-Gradient Methods: Variance Reduction via Jacobian Sketching


  • Peter Richtarik (University of Edinburgh; King Abdullah University of Science and Technology (KAUST); Moscow Institute of Physics and Technology)
  • Thursday 20 June 2019, 09:50-10:40
  • Seminar Room 1, Newton Institute.

If you have a question about this talk, please contact INI IT.

ASCW03 - Approximation, sampling, and compression in high dimensional problems

We develop a new family of variance-reduced stochastic gradient descent methods for minimizing the average of a very large number of smooth functions. Our method, JacSketch, is motivated by novel developments in randomized numerical linear algebra, and operates by maintaining a stochastic estimate of a Jacobian matrix composed of the gradients of the individual functions. In each iteration, JacSketch efficiently updates the Jacobian estimate by first obtaining a random linear measurement of the true Jacobian through (cheap) sketching, and then projecting the previous estimate onto the solution space of a linear matrix equation whose solutions are consistent with the measurement. The Jacobian estimate is then used to compute a variance-reduced unbiased estimator of the gradient, followed by a stochastic gradient descent step.

Our strategy is analogous to the way quasi-Newton methods maintain an estimate of the Hessian, and hence our method can be seen as a stochastic quasi-gradient method. Indeed, quasi-Newton methods project the current Hessian estimate onto the solution space of a linear equation consistent with a certain linear (but non-random) measurement of the true Hessian. Our method can also be seen as stochastic gradient descent applied to a controlled stochastic optimization reformulation of the original problem, where the control comes from the Jacobian estimates.

We prove that for smooth and strongly convex functions, JacSketch converges linearly with a meaningful rate dictated by a single convergence theorem which applies to general sketches. We also provide a refined convergence theorem which applies to a smaller class of sketches, featuring a novel proof technique based on a stochastic Lyapunov function. This enables us to obtain sharper complexity results for variants of JacSketch with importance sampling. By specializing our general approach to specific sketching strategies, JacSketch reduces to the celebrated stochastic average gradient (SAGA) method.
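When the sketch samples a single column of the Jacobian per iteration, the sketch-and-project update reduces to a SAGA-style scheme: overwrite one stored gradient and correct the stochastic gradient by the old column and the running column average. A minimal illustrative sketch in Python, assuming a toy least-squares objective (the data `A`, `b`, the step size, and the function name `jacsketch_saga` are assumptions for illustration, not from the talk):

```python
import numpy as np

def jacsketch_saga(A, b, step, iters, seed=0):
    """Sketch of JacSketch with the single-column (SAGA-style) sketch,
    applied to f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    J = np.zeros((d, n))         # estimate of the Jacobian [grad f_1, ..., grad f_n]
    J_mean = J.mean(axis=1)      # running average of the Jacobian columns
    for _ in range(iters):
        i = rng.integers(n)                   # cheap random measurement: one column
        g_new = A[i] * (A[i] @ x - b[i])      # true gradient of f_i at the current x
        g = g_new - J[:, i] + J_mean          # variance-reduced unbiased gradient estimator
        J_mean += (g_new - J[:, i]) / n       # "project": replace column i, keep the rest
        J[:, i] = g_new
        x -= step * g                         # stochastic gradient descent step
    return x
```

For smooth, strongly convex problems such as this consistent least-squares example, the iterates converge linearly to the minimizer, in line with the general theorem described above.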

Co-authors: Robert Mansel Gower (Telecom ParisTech), Francis Bach (INRIA – ENS - PSL Research University)

This talk is part of the Isaac Newton Institute Seminar Series.



