Dropout as a Structured Shrinkage Prior
Dropout regularization of deep neural networks has been a mysterious yet effective tool to prevent overfitting. Explanations for its success range from the prevention of "co-adapted" weights to it being a form of cheap Bayesian inference. We propose a novel framework for understanding multiplicative noise in neural networks, considering continuous distributions as well as Bernoulli noise (i.e. dropout). We show that multiplicative noise induces structured shrinkage priors on a network's weights. We derive the equivalence through reparametrization properties of scale mixtures and without invoking any approximations. We leverage these insights to propose a novel shrinkage framework for ResNets, terming the prior "automatic depth determination" as it is the natural analog of "automatic relevance determination" for network depth.

This talk is part of the AI+Pizza series.