AI + Pizza September 2018
If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

Please note, this event may be recorded. Microsoft will own the copyright of any recording and reserves the right to distribute it as required.

Speaker one – Adria Garriga Alonso
Title – Deep Convolutional Networks as shallow Gaussian Processes
Abstract – We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters. The result extends the theorem for dense networks due to Alex Matthews et al. (2018), also presented in this AI+Pizza series. For a CNN, the equivalent kernel can be computed exactly and, unlike “deep kernels”, has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters. This is joint work with Laurence Aitchison and Carl Rasmussen.

Speaker two – Alexander Gaunt
Title – Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks
Abstract – Bayesian neural networks hold great promise as a flexible and principled way to deal with uncertainty when learning from finite data. Among approaches to probabilistic inference in deep neural networks, variational Bayes (VB) is principled, generally applicable, and computationally efficient. Given these widely recognised advantages, why has variational Bayes seen so little practical use for neural networks in real applications? We argue that variational inference in neural networks is fragile: making the approach work requires careful initialization and tuning of prior variances, as well as controlling the variance of stochastic gradient estimates. We fix VB and turn it into a robust inference tool for Bayesian neural networks through two innovations: first, we introduce a novel deterministic method for approximating moments in neural networks, reducing gradient variance to zero; second, we introduce a hierarchical prior for parameters and a novel Empirical Bayes procedure for automatically selecting prior variances. Combining these two innovations, the resulting method is highly efficient and robust. On heteroscedastic regression applications we demonstrate strong predictive performance over alternative approaches.

This talk is part of the AI+Pizza series.
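To make the first abstract concrete, here is a minimal NumPy sketch of the kind of kernel recursion it describes, assuming a plain (non-residual) stack of stride-1 3x3 conv-ReLU layers on single-channel images followed by a dense readout; the function names (`relu_kernel`, `patch_sum`, `cnn_gp_kernel`) and hyperparameter values are illustrative and not the speakers' code. Each layer only propagates per-pixel (co)variances, which is why the cost resembles a forward pass with one filter per layer.

```python
import numpy as np

def relu_kernel(kxx, kxy, kyy):
    """Arc-cosine kernel of order 1 (Cho & Saul): elementwise
    E[relu(u) relu(v)] for jointly Gaussian (u, v) with the given
    per-pixel variances kxx, kyy and covariance kxy."""
    s = np.sqrt(kxx * kyy)
    cos_t = np.clip(kxy / np.maximum(s, 1e-12), -1.0, 1.0)
    theta = np.arccos(cos_t)
    return s * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def patch_sum(K, size=3):
    """Sum of each size x size neighbourhood with zero padding
    (the covariance analogue of a stride-1 'same' convolution)."""
    p = size // 2
    Kp = np.pad(K, p)
    H, W = K.shape
    out = np.zeros_like(K)
    for di in range(size):
        for dj in range(size):
            out += Kp[di:di + H, dj:dj + W]
    return out

def cnn_gp_kernel(x1, x2, depth=3, size=3, sw2=2.0, sb2=0.1):
    """Kernel k(x1, x2) of the GP equivalent to an infinitely wide CNN with
    `depth` conv-ReLU layers (weight variance sw2/fan_in, bias variance sb2)
    and a dense readout. x1, x2: single-channel images of equal shape."""
    fan_in = size * size
    # Input-layer (co)variances, one value per pixel.
    kxx, kxy, kyy = x1 * x1, x1 * x2, x2 * x2
    for _ in range(depth):
        # Conv layer: scaled patch sums of the previous (co)variances.
        kxx = sb2 + sw2 / fan_in * patch_sum(kxx, size)
        kxy = sb2 + sw2 / fan_in * patch_sum(kxy, size)
        kyy = sb2 + sw2 / fan_in * patch_sum(kyy, size)
        # ReLU: closed-form Gaussian expectation, applied pixelwise.
        kxx, kxy, kyy = (relu_kernel(kxx, kxx, kxx),
                         relu_kernel(kxx, kxy, kyy),
                         relu_kernel(kyy, kyy, kyy))
    # Dense readout over all pixels.
    return sb2 + sw2 * kxy.mean()

# Example: kernel between two random 8x8 "images".
rng = np.random.default_rng(0)
a, b = rng.standard_normal((2, 8, 8))
print(cnn_gp_kernel(a, b))
```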
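For the second abstract, the sketch below illustrates the deterministic moment-propagation idea for one linear layer with a factorised Gaussian variational posterior, followed by a ReLU whose output mean and variance are computed in closed form. It is a simplified mean-field version (cross-covariances between activations are dropped, and the hierarchical prior and Empirical Bayes step are omitted); names like `relu_moments` and `linear_moments` are illustrative assumptions, not the speakers' implementation.

```python
import numpy as np
from scipy.stats import norm

def relu_moments(mu, var):
    """Closed-form mean and variance of relu(z) for z ~ N(mu, var),
    applied elementwise (activations treated as independent Gaussians)."""
    sd = np.sqrt(np.maximum(var, 1e-12))
    a = mu / sd
    m = mu * norm.cdf(a) + sd * norm.pdf(a)
    s2 = (mu**2 + var) * norm.cdf(a) + mu * sd * norm.pdf(a) - m**2
    return m, np.maximum(s2, 0.0)

def linear_moments(mu_x, var_x, w_mean, w_var, b_mean, b_var):
    """Mean and variance of W x + b when the variational posterior over
    weights and biases is factorised Gaussian and the layer input has
    elementwise mean mu_x and variance var_x (independence assumed)."""
    m = w_mean @ mu_x + b_mean
    v = (w_mean**2) @ var_x + w_var @ (var_x + mu_x**2) + b_var
    return m, v

# Example: one hidden layer, deterministic input (zero input variance).
rng = np.random.default_rng(0)
x = rng.standard_normal(4)
w_mean, w_var = rng.standard_normal((3, 4)), 0.1 * np.ones((3, 4))
b_mean, b_var = np.zeros(3), 0.1 * np.ones(3)

m, v = linear_moments(x, np.zeros_like(x), w_mean, w_var, b_mean, b_var)
m, v = relu_moments(m, v)
print(m, v)  # deterministic outputs: no Monte Carlo sampling, hence no gradient noise
```

Because the expected log-likelihood is computed from these propagated moments rather than from weight samples, the gradient estimator has no sampling noise, which is the "zero gradient variance" property the abstract refers to.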