[Special Statslab Seminar] Scalable stochastic optimization and large-scale data
If you have a question about this talk, please contact HoD Secretary, DPMMS.

Stochastic optimization is widely used in many areas, most recently in large-scale machine learning and data science, but its use in these areas is quite different from its use in more traditional areas of operations research, scientific computing, and statistics. In particular, second order optimization methods have historically been ubiquitous, yet they are rarely used in machine learning and data science compared to their first order counterparts. Motivated by well-known problems with first order methods, however, recent work has begun to experiment with second order methods for machine learning problems.

By exploiting recent results from Randomized Numerical Linear Algebra, we establish improved bounds for algorithms that incorporate sub-sampling as a way to improve computational efficiency, while maintaining the original convergence properties of these algorithms. These results provide quantitative convergence guarantees for variants of Newton's method in which the Hessian and/or the gradient is uniformly or non-uniformly sub-sampled, under much weaker assumptions than prior work, and they include extensions to trust region and cubic regularization algorithms for non-convex optimization problems.

When applied to complex machine learning tasks such as training deep neural networks, empirical results demonstrate that these methods perform quite well, both in ways that one would expect (e.g., leading to improved conditioning in the presence of so-called exploding/vanishing gradients) and in ways that are more surprising but more interesting (e.g., using so-called adversarial examples to architect the objective function surface to be more amenable to optimization algorithms).

This talk is part of the Statistics series.
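To make the central idea of the abstract concrete, the following is a minimal sketch of a Newton iteration in which the gradient is computed exactly but the Hessian is uniformly sub-sampled, here for L2-regularized logistic regression. The sample size, damping parameter, and uniform sampling scheme are illustrative assumptions for this sketch, not details taken from the talk.

```python
# Sketch: one Newton step with a uniformly sub-sampled Hessian for
# L2-regularized logistic regression (assumed problem; not the speaker's code).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def subsampled_newton_step(w, X, y, s=256, lam=1e-3, rng=None):
    """Full gradient, Hessian estimated from s of the n examples."""
    rng = rng or np.random.default_rng()
    n, d = X.shape

    # Exact gradient of (1/n) sum_i log(1 + exp(-y_i x_i^T w)) + (lam/2)||w||^2
    margins = y * (X @ w)
    grad = -(X.T @ (y * sigmoid(-margins))) / n + lam * w

    # Sub-sampled Hessian: average the per-example Hessians of s random points
    idx = rng.choice(n, size=min(s, n), replace=False)
    p = sigmoid(X[idx] @ w)
    weights = p * (1.0 - p)                       # per-example curvature
    H = (X[idx].T * weights) @ X[idx] / len(idx) + lam * np.eye(d)

    # Newton direction computed from the cheap, sub-sampled Hessian
    return w - np.linalg.solve(H, grad)

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 20))
w_true = rng.standard_normal(20)
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(10_000))
w = np.zeros(20)
for _ in range(10):
    w = subsampled_newton_step(w, X, y, rng=rng)
```

The point of the sketch is the cost trade-off the abstract describes: the linear solve uses a Hessian built from s examples rather than all n, while the convergence analysis discussed in the talk concerns how large s must be (and under what sampling scheme) for the sub-sampled method to retain Newton-like behavior.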