[Special Statslab Seminar] Scalable stochastic optimization and large-scale data

If you have a question about this talk, please contact HoD Secretary, DPMMS.

Stochastic optimization is widely used in many areas, most recently in large-scale machine learning and data science, but its use in these areas is quite different from its use in more traditional areas of operations research, scientific computing, and statistics. In particular, second-order optimization methods have historically been ubiquitous, but in machine learning and data science they are rarely used compared to their first-order counterparts. Motivated by well-known problems of first-order methods, however, recent work has begun to experiment with second-order methods for machine learning problems.

By exploiting recent results from Randomized Numerical Linear Algebra, we establish improved bounds for algorithms that incorporate sub-sampling as a way to improve computational efficiency, while maintaining the original convergence properties of these algorithms. These results provide quantitative convergence guarantees for variants of Newton's method in which the Hessian and/or the gradient is uniformly or non-uniformly sub-sampled, under much weaker assumptions than prior work; they also include extensions to trust-region and cubic regularization algorithms for non-convex optimization problems.

When applied to complex machine learning tasks such as training deep neural networks, empirical results demonstrate that these methods perform quite well, both in ways that one would expect (e.g., leading to improved conditioning in the presence of so-called exploding/vanishing gradients) and in ways that are more surprising but more interesting (e.g., using so-called adversarial examples to architect the objective function surface to be more amenable to optimization algorithms).
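To make the sub-sampling idea concrete, here is a minimal sketch (not the speaker's algorithm) of one illustrative variant: a Newton-type iteration for L2-regularized logistic regression that uses the full gradient but estimates the Hessian from a uniform sub-sample of the data. The function name, sample size, and regularization constant are illustrative assumptions.

```python
import numpy as np

def subsampled_newton(X, y, n_iters=20, sample_frac=0.2, reg=1e-4, seed=0):
    """Newton-type method for L2-regularized logistic regression
    (labels in {0, 1}) with a uniformly sub-sampled Hessian.

    Illustrative sketch only: full gradient, Hessian estimated from a
    uniform sub-sample of the rows of X at each iteration.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    m = max(int(sample_frac * n), d)  # sub-sample size (assumption: >= d)
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))            # predicted probabilities
        grad = X.T @ (p - y) / n + reg * w          # full gradient
        idx = rng.choice(n, size=m, replace=False)  # uniform sub-sample
        Xs, ps = X[idx], p[idx]
        s = ps * (1.0 - ps)                         # per-example curvature weights
        # Sub-sampled Hessian: (1/m) * Xs^T diag(s) Xs + reg * I
        H = (Xs * s[:, None]).T @ Xs / m + reg * np.eye(d)
        w -= np.linalg.solve(H, grad)               # Newton step with estimated Hessian
    return w
```

Because only m of the n rows enter the Hessian, each iteration costs O(m d^2 + d^3) for the curvature work rather than O(n d^2), which is the computational gain the abstract's bounds quantify; the analysis question is how small m can be while preserving Newton-like convergence.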

This talk is part of the Statistics series.


© 2006-2019 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity