Learning Deep Architectures

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

Abstract

Theoretical work suggests that deep architectures might be computationally and statistically more efficient at representing highly varying functions, yet training them was unsuccessful until the recent advent of algorithms based on unsupervised pre-training of each level of a hierarchically structured model. Several unsupervised criteria and procedures have been proposed for this purpose, starting with the Restricted Boltzmann Machine (RBM), which, when stacked, gives rise to Deep Belief Networks (DBNs). Although the partition function of an RBM is intractable, inference is tractable, and we review several successful learning algorithms that have been proposed, in particular those using weights that change quickly during learning instead of converging. In addition to being impressive as generative models, DBNs have made an impact by being used to initialize deep supervised neural networks. We present surprising empirical results on visualizing the intermediate representations learned, to help understand how these models learn to compose features into a hierarchy. Finally, in an attempt to understand the unsupervised pre-training effect, we describe a large set of simulations exploring two apparently conflicting hypotheses: that unsupervised pre-training acts like a regularizer, or that it helps optimize a difficult non-convex criterion fraught with local minima.
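To make the idea of stacking RBMs more concrete, here is a minimal NumPy sketch of greedy layer-wise pre-training. It is an illustration under stated assumptions, not the speaker's implementation: it assumes binary units, uses one step of contrastive divergence (CD-1) rather than the fast-changing-weights algorithms the abstract highlights, and the names `RBM`, `cd1_step`, and `train_stack` are hypothetical. The method `hidden_probs` shows why inference is tractable: given the visible units, the hidden units are conditionally independent, so their posterior is a single sigmoid per unit and the partition function is never needed.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary RBM with energy E(v, h) = -b.v - c.h - v^T W h (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.rng = rng

    def hidden_probs(self, v):
        # Exact inference: P(h_j = 1 | v) factorizes over hidden units.
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        # Symmetric conditional: P(v_i = 1 | h).
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities and a sample given the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step down to the visibles and back up.
        pv1 = self.visible_probs(h0)
        v1 = (self.rng.random(pv1.shape) < pv1).astype(float)
        ph1 = self.hidden_probs(v1)
        # CD-1 update: data statistics minus one-step reconstruction statistics.
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)


def train_stack(data, layer_sizes, epochs=5, seed=0):
    """Greedy layer-wise pre-training: each RBM is trained on the
    hidden probabilities produced by the layer below it."""
    rng = np.random.default_rng(seed)
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # representation passed to the next level
    return rbms, x
```

Calling `train_stack(data, [4, 3])` on a binary array of shape `(n_examples, n_features)` returns the list of trained RBMs and the top-level representation. In the setting the abstract describes, the learned weights would then serve to initialize a deep supervised network, a step this sketch omits.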

Biography

Yoshua Bengio (PhD 1991, McGill University) is a professor in the Department of Computer Science and Operations Research, Université de Montréal, Canada Research Chair in Statistical Learning Algorithms, NSERC-CGI Chair, and Fellow of the Canadian Institute for Advanced Research. He was program co-chair for NIPS 2008 and is general co-chair for NIPS 2009. His main ambition is to understand how learning can give rise to intelligence. He has been an early proponent of deep architectures and distributed representations as tools to bypass the curse of dimensionality and learn complex tasks. He has contributed to many areas of machine learning: neural networks, recurrent neural networks, graphical models, kernel machines, semi-supervised learning, unsupervised learning and manifold learning, pattern recognition, data mining, natural language processing, machine vision, and time-series models.

This talk is part of the Microsoft Research Cambridge, public talks series.

© 2006-2023, University of Cambridge.