Demystifying Deep Learning

If you have a question about this talk, please contact HoD Secretary, DPMMS.

Neural networks have made a startling comeback during the past decade, rebranded as “deep learning.” The empirical success of neural networks is phenomenal but poorly understood. The state of the art seems to change every few months, prompting some to call it alchemy and others to suggest that wholly new mathematical approaches are required to understand neural networks. Contrary to this, I argue that deep learning can be understood by adapting standard nonparametric statistical theory and methods to the neural network setting. Our main result is this: neural networks are exact solutions to nonparametric estimation problems in “mixed variation” function spaces. These spaces, characterized by notions of total variation in the Radon (transform) domain, include multivariate functions that are very smooth in all but a small number of directions. Spatial inhomogeneity of this sort leads to a fundamental gap between the performance of neural networks and linear methods (which include kernel methods), explaining why neural networks can outperform classical methods for high-dimensional function estimation. Our theory provides new insights into the practices of “weight decay,” “overparameterization,” and adding linear connections and layers to network architectures. It yields a deeper understanding of the role of sparsity and of avoiding the curse of dimensionality. Finally, the theory leads to new and improved neural network architectures and regularization methods.
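
The abstract does not spell out how weight decay relates to sparsity. As an illustration only, and not as the speaker's stated result, the sketch below checks one elementary fact about two-layer ReLU networks: rescaling a neuron (v, w, b) to (v/c, c*w, c*b) with c > 0 leaves the network function unchanged, and choosing c to balance the two layers makes the weight decay penalty equal to the sum of |v_k| * ||w_k||_2, a sparsity-promoting (l1-type) quantity. The function names, the choice to leave biases out of the penalty, and the random test network are all assumptions of this sketch.

```python
import math
import random


def relu(t):
    return max(t, 0.0)


def network(x, neurons):
    """Two-layer ReLU network: sum over k of v_k * relu(w_k . x + b_k)."""
    return sum(
        v * relu(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for v, w, b in neurons
    )


def weight_decay(neurons):
    """Half the sum of squared weights; biases are excluded (an assumption here)."""
    return 0.5 * sum(v * v + sum(wi * wi for wi in w) for v, w, _ in neurons)


def path_norm(neurons):
    """Sum over neurons of |v_k| * ||w_k||_2."""
    return sum(abs(v) * math.sqrt(sum(wi * wi for wi in w)) for v, w, _ in neurons)


def rebalance(neurons):
    """Rescale each neuron to (v/c, c*w, c*b) with c = sqrt(|v| / ||w||_2).

    ReLU is positively homogeneous, so the network function is unchanged,
    while |v/c| and ||c*w||_2 become equal.
    """
    out = []
    for v, w, b in neurons:
        norm_w = math.sqrt(sum(wi * wi for wi in w))
        if v == 0.0 or norm_w == 0.0:
            out.append((v, list(w), b))  # nothing to balance
            continue
        c = math.sqrt(abs(v) / norm_w)
        out.append((v / c, [c * wi for wi in w], c * b))
    return out


# Hypothetical random network with 5 neurons on 3-dimensional inputs.
rng = random.Random(0)
neurons = [
    (rng.uniform(-2, 2), [rng.uniform(-2, 2) for _ in range(3)], rng.uniform(-1, 1))
    for _ in range(5)
]
balanced = rebalance(neurons)
x = [0.3, -1.2, 0.7]

print("same output:", math.isclose(network(x, neurons), network(x, balanced)))
print("weight decay before:", weight_decay(neurons))
print("weight decay after: ", weight_decay(balanced))
print("path norm:          ", path_norm(neurons))
```

By the arithmetic-geometric mean inequality, the penalty before rebalancing is never smaller than the path norm, so among all rescalings of the same function the balanced one minimizes weight decay.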

This talk is based on joint work with Rahul Parhi.

This talk is part of the Peter Whittle Lecture series.


© 2006-2024, University of Cambridge.