AI + Pizza September 2018

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

Please note, this event may be recorded. Microsoft will own the copyright of any recording and reserves the right to distribute it as required.

Speaker one – Adria Garriga Alonso

Title – Deep Convolutional Networks as shallow Gaussian Processes

Abstract – We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters. The result is an extension of the theorem for dense networks due to Alex Matthews et al. (2018), also presented in this AI+Pizza series.

For a CNN, the equivalent kernel can be computed exactly and, unlike “deep kernels”, has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters.

This is joint work with Laurence Aitchison and Carl Rasmussen.
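To illustrate the kind of closed-form kernel recursion the abstract refers to, here is a minimal sketch of the dense-network (NNGP) case due to Matthews et al. (2018), propagating the ReLU arc-cosine kernel layer by layer. This is not the convolutional kernel from the talk itself, and all function names and hyperparameter values are illustrative.

```python
import numpy as np

def relu_kernel_step(kxx, kyy, kxy, sw2=1.0, sb2=0.1):
    # One layer of the ReLU (arc-cosine) kernel recursion: maps the
    # previous layer's kernel entries to the next layer's in closed form.
    theta = np.arccos(np.clip(kxy / np.sqrt(kxx * kyy), -1.0, 1.0))
    kxy_new = (sw2 / (2 * np.pi)) * np.sqrt(kxx * kyy) \
        * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) + sb2
    # Diagonal entries simplify (theta = 0): k -> sw2/2 * k + sb2.
    return sw2 / 2 * kxx + sb2, sw2 / 2 * kyy + sb2, kxy_new

def nngp_kernel(x, y, depth=3, sw2=1.0, sb2=0.1):
    # Base case: the input layer's kernel is a scaled inner product.
    d = x.size
    kxx = sw2 * (x @ x) / d + sb2
    kyy = sw2 * (y @ y) / d + sb2
    kxy = sw2 * (x @ y) / d + sb2
    for _ in range(depth):  # one closed-form step per hidden layer
        kxx, kyy, kxy = relu_kernel_step(kxx, kyy, kxy, sw2, sb2)
    return kxy
```

Evaluating the kernel for a pair of inputs costs one pass through this recursion, which mirrors the abstract's point that the (convolutional) kernel costs about one forward pass with a single filter per layer.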

Speaker Two – Alexander Gaunt

Title – Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks

Abstract – Bayesian neural networks hold great promise as a flexible and principled solution for dealing with uncertainty when learning from finite data. Among approaches to realizing probabilistic inference in deep neural networks, variational Bayes (VB) is principled, generally applicable, and computationally efficient. Given these widely recognized advantages, why has variational Bayes seen so little practical use for neural networks in real applications? We argue that variational inference in neural networks is fragile: getting the approach to work requires careful initialization and tuning of prior variances, as well as controlling the variance of stochastic gradient estimates. We fix VB and turn it into a robust inference tool for Bayesian neural networks. We achieve this through two innovations: first, we introduce a novel deterministic method to approximate moments in neural networks, reducing gradient variance to zero; second, we introduce a hierarchical prior for parameters and a novel Empirical Bayes procedure for automatically selecting prior variances. Combining these two innovations, the resulting method is highly efficient and robust. On heteroscedastic regression tasks, we demonstrate strong predictive performance relative to alternative approaches.
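The "deterministic moments" idea can be illustrated on a single ReLU unit: for a Gaussian pre-activation, the post-activation mean and variance have a closed form, so no Monte Carlo sampling (and hence no gradient noise) is needed for this step. This is a generic sketch of moment matching under stated assumptions, not the paper's full method; all names are illustrative.

```python
import math
import random

def relu_moments(mu, var):
    # Closed-form output mean/variance of ReLU(z) for z ~ N(mu, var):
    #   E[ReLU(z)]   = mu * Phi(a) + sigma * phi(a),  where a = mu / sigma
    #   E[ReLU(z)^2] = (mu^2 + var) * Phi(a) + mu * sigma * phi(a)
    sigma = math.sqrt(var)
    a = mu / sigma
    Phi = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))      # standard normal CDF
    phi = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    mean = mu * Phi + sigma * phi
    second = (mu * mu + var) * Phi + mu * sigma * phi
    return mean, second - mean * mean

# Sanity check against Monte Carlo sampling of the same unit.
random.seed(0)
mu, var = 0.3, 1.5
m, v = relu_moments(mu, var)
draws = [max(0.0, random.gauss(mu, math.sqrt(var))) for _ in range(200_000)]
mc_mean = sum(draws) / len(draws)
print(m, mc_mean)  # the deterministic and sampled estimates agree closely
```

Applying such closed-form moment maps layer by layer replaces sampled forward passes with a single deterministic one, which is the sense in which gradient variance drops to zero.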

This talk is part of the AI+Pizza series.



© 2006-2019, University of Cambridge.