Distributed stochastic optimization for deep learning
If you have a question about this talk, please contact Louise Segar.

Via Skype

We study the problem of how to distribute the training of large-scale deep learning models in a parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with that of the existing ADMM method in the round-robin scheme. An asynchronous and momentum variant of the EASGD method is applied to train deep convolutional neural networks for image classification on the CIFAR and ImageNet datasets. Our approach accelerates training and also achieves better test accuracy, while requiring far less communication than common baseline approaches such as the DOWNPOUR method.

We then investigate the limits on the speedup in the initial and asymptotic phases of mini-batch SGD, momentum SGD, and EASGD. We find that the spread of the input data distribution has a large impact on their initial convergence rate and stability region. We also find a surprising connection between momentum SGD and the EASGD method with a negative moving-average rate. A non-convex case is also studied to understand when EASGD can get trapped near a saddle point. Finally, we scale up the EASGD method by using a tree-structured network topology, and we show empirically its advantages and challenges. We also relate the EASGD and DOWNPOUR methods to the classical Jacobi and Gauss-Seidel methods, respectively, thus unifying a class of distributed stochastic optimization methods. (See https://arxiv.org/abs/1605.02216)

This talk is part of the Machine Learning @ CUED series.
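The abstract does not spell out the update rule, but the core idea is that each local worker takes SGD steps while being elastically pulled toward a shared center variable, which in turn moves toward the average of the workers. Below is a minimal sketch of a synchronous EASGD-style loop on a toy least-squares problem, assuming illustrative names and settings (toy_grad, workers, center, eta, rho, the toy data) that are not from the talk; it is an illustration of the general scheme, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 10))          # toy design matrix (assumption)
b = rng.normal(size=100)                # toy targets (assumption)

def toy_grad(x, idx):
    """Mini-batch gradient of 0.5 * ||A x - b||^2 on rows idx."""
    return A[idx].T @ (A[idx] @ x - b[idx])

p = 4                  # number of local workers
eta = 0.01             # learning rate
rho = 0.1              # elastic penalty linking workers to the center
alpha = eta * rho      # elastic averaging rate

workers = [np.zeros(10) for _ in range(p)]   # local variables x_i
center = np.zeros(10)                        # center variable x_tilde

for t in range(1000):
    new_workers = []
    for x in workers:
        idx = rng.choice(100, size=16, replace=False)  # each worker draws its own mini-batch
        g = toy_grad(x, idx)
        # local step: gradient plus elastic pull toward the center variable
        new_workers.append(x - eta * g - alpha * (x - center))
    # center step: moves toward the average of the local variables
    center = center + alpha * sum(x - center for x in workers)
    workers = new_workers

print("final loss:", 0.5 * np.linalg.norm(A @ center - b) ** 2)
```

In the asynchronous and momentum variants discussed in the talk, the workers communicate with the center at their own pace and add a momentum term to the local step; the synchronous loop above only illustrates the elastic coupling itself.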