DiCE: The Infinitely Differentiable Monte-Carlo Estimator
If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

Please note, this event may be recorded. Microsoft will own the copyright of any recording and reserves the right to distribute it as required.

The score function estimator is widely used for estimating gradients of stochastic objectives in Stochastic Computation Graphs (SCGs), e.g. in reinforcement learning and meta-learning. While deriving first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order gradients is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order of gradient involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms in higher-order gradient estimators. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct gradient estimators of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation for performing the requisite graph manipulations. We verify the correctness of DiCE both through a proof and through numerical evaluation of the DiCE gradient estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at this URL.

This talk is part of the Microsoft Research Cambridge, public talks series.
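The abstract's central claim, that a single objective can be differentiated repeatedly to give correct gradients of any order, can be illustrated with a minimal JAX sketch. It is not the speakers' released code. It assumes the MagicBox operator from the DiCE paper, exp(tau - stop_gradient(tau)), where tau is the log-probability of the sampled variable. The Bernoulli model, the toy cost, and all function names are hypothetical choices made for illustration. To keep the check deterministic, the expectation is computed by enumerating both outcomes instead of sampling.

```python
import jax
import jax.numpy as jnp
from jax.lax import stop_gradient


def magic_box(tau):
    # Evaluates to 1 in the forward pass, but each differentiation
    # multiplies the result by the gradient of tau.
    return jnp.exp(tau - stop_gradient(tau))


def log_prob(theta, x):
    # Hypothetical model: x ~ Bernoulli(sigmoid(theta)).
    p = jax.nn.sigmoid(theta)
    return x * jnp.log(p) + (1.0 - x) * jnp.log1p(-p)


def cost(x):
    # Hypothetical toy cost that depends only on the sampled value.
    return 3.0 * x + 1.0


def true_objective(theta):
    # Exact expected cost, used only as a reference.
    p = jax.nn.sigmoid(theta)
    return p * cost(1.0) + (1.0 - p) * cost(0.0)


def dice_objective(theta):
    # Expected DiCE objective. The outcome probabilities act as fixed
    # sampling weights, so they are wrapped in stop_gradient; all
    # dependence on theta flows through magic_box.
    total = 0.0
    for x in (0.0, 1.0):
        lp = log_prob(theta, x)
        weight = stop_gradient(jnp.exp(lp))
        total = total + weight * magic_box(lp) * cost(x)
    return total


def derivatives(fn, theta, max_order=3):
    # Apply jax.grad repeatedly and collect derivatives of order 1..max_order.
    values = []
    for _ in range(max_order):
        fn = jax.grad(fn)
        values.append(float(fn(theta)))
    return values


if __name__ == "__main__":
    theta = 0.3
    print("DiCE :", derivatives(dice_objective, theta))
    print("exact:", derivatives(true_objective, theta))
```

Under these assumptions, repeated calls to `jax.grad` on `dice_objective` should agree with the derivatives of the exact expectation up to floating-point error. In a sampled setting, the enumeration would be replaced by an average over drawn values of `x`.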