Demystifying Deep Learning
If you have a question about this talk, please contact HoD Secretary, DPMMS.

Neural networks have made a startling comeback during the past decade, rebranded as "deep learning." The empirical success of neural networks is phenomenal but poorly understood. The state of the art seems to change every few months, prompting some to call it alchemy and others to suggest that wholly new mathematical approaches are required to understand neural networks. Contrary to this, I argue that deep learning can be understood by adapting standard nonparametric statistical theory and methods to the neural network setting. Our main result is this: neural networks are exact solutions to nonparametric estimation problems in "mixed variation" function spaces. These spaces, characterized by notions of total variation in the Radon (transform) domain, include multivariate functions that are very smooth in all but a small number of directions. Spatial inhomogeneity of this sort leads to a fundamental gap between the performance of neural networks and linear methods (which include kernel methods), explaining why neural networks can outperform classical methods for high-dimensional function estimation. Our theory provides new insights into the practices of "weight decay," "overparameterization," and adding linear connections and layers to network architectures. It yields a deeper understanding of the role of sparsity and of (avoiding) the curse of dimensionality. Finally, the theory leads to new and improved neural network architectures and regularization methods.

This talk is based on joint work with Rahul Parhi.

This talk is part of the Peter Whittle Lecture series.
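To make the "weight decay" practice mentioned in the abstract concrete, the sketch below (assuming PyTorch) trains a two-layer ReLU network with an L2 penalty on its weights, the regularization that the theory connects to total-variation-type norms in the Radon domain. The synthetic target, network width, and hyperparameters are illustrative assumptions, not details from the talk.

    # Illustrative sketch: two-layer ReLU network trained with weight decay.
    # The target function, width, and hyperparameters are assumptions chosen
    # for illustration only.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Synthetic 1-D data: a piecewise-linear target (smooth except at kinks),
    # a simple example of spatial inhomogeneity.
    x = torch.linspace(-1.0, 1.0, 200).unsqueeze(1)
    y = torch.relu(x - 0.3) - torch.relu(-x - 0.3)

    # Shallow (two-layer) ReLU network, the setting of the mixed-variation theory.
    model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))

    # weight_decay adds an L2 penalty on all weights; for two-layer ReLU networks
    # this penalty is closely related to total-variation-type regularization of
    # the learned function.
    opt = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(2000):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

    print(f"final training loss: {loss.item():.4f}")

This is only a sketch of the regularization practice the abstract names; the talk's results concern the function spaces and estimation problems that such training implicitly solves.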