Backprop through the Void: Optimizing Control Variates for Black-Box Gradient Estimation.

Geoff Roeder (University of Toronto)
Monday 27 November 2017, 11:00-12:00
CBL Seminar Room.

Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still often the best strategy. We introduce a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables. Our method uses gradients of a neural network trained jointly with model parameters or policies, and is applicable in both discrete and continuous settings. We demonstrate this framework for training discrete latent-variable models. We also give an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.
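
To make the idea of a control variate concrete, here is a minimal sketch in Python with NumPy. It is not the method presented in the talk, which learns the control variate with a neural network. It assumes a much simpler setting: a single Bernoulli variable with parameter theta, a score-function (REINFORCE-style) gradient estimator, and a constant baseline as the control variate. The objective f, the function name score_function_grads, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def f(b):
    # Hypothetical black-box objective: only its values are used, never its gradient.
    return (b - 0.45) ** 2

def score_function_grads(theta, n_samples, baseline, rng):
    """Per-sample estimates of d/dtheta E[f(b)] with b ~ Bernoulli(theta)."""
    b = (rng.random(n_samples) < theta).astype(float)
    # d/dtheta log p(b | theta) for a Bernoulli variable with mean theta.
    score = b / theta - (1.0 - b) / (1.0 - theta)
    # A constant baseline leaves the estimator unbiased because E[score] = 0.
    return (f(b) - baseline) * score

rng = np.random.default_rng(0)
theta = 0.3

# Exact gradient, available here only because the toy problem is tiny:
# E[f(b)] = theta * f(1) + (1 - theta) * f(0), so the derivative is f(1) - f(0).
exact_grad = f(1.0) - f(0.0)

# Assumed baseline for the sketch: the true expected value of f.
expected_f = theta * f(1.0) + (1.0 - theta) * f(0.0)

plain = score_function_grads(theta, 200_000, 0.0, rng)
with_cv = score_function_grads(theta, 200_000, expected_f, rng)

print("exact gradient:      ", exact_grad)
print("plain mean, variance:", plain.mean(), plain.var())
print("CV mean, variance:   ", with_cv.mean(), with_cv.var())
```

Both estimators target the same gradient, but the baseline version should show a noticeably smaller variance. In practice the expected value of f is unknown, which is one reason to learn the control variate rather than fix it by hand.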

This talk is part of the Machine Learning @ CUED series.