Machine Learning is Linear Algebra
Talks.cam link: https://talks.cam.ac.uk/talk/index/228163

I will talk about how modelling assumptions manifest themselves as algebraic structure in a variety of settings, including optimization, attention, and network parameters, and how we can algorithmically exploit that structure for better scaling laws with transformers. As part of this effort, I will present a unifying framework that enables searching among all linear operators expressible via an Einstein summation. This framework encompasses previously proposed structures, such as low-rank, Kronecker, Tensor-Train, and Monarch, along with many novel structures. We develop a taxonomy of all such operators based on their computational and algebraic properties, which provides insights into their compute-optimal scaling laws. Combining these insights with empirical evaluation, we identify a subset of structures that achieve better performance than dense layers as a function of training compute, which we then develop into a high-performance sparse mixture-of-experts layer.

This talk is part of the Machine Learning is Linear Algebra series.
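To make the framing concrete, here is a minimal sketch of how familiar structured layers can each be written as a single Einstein summation over factored parameter tensors. The shapes, factorizations, and variable names below are illustrative assumptions for this sketch, not details taken from the talk itself.

```python
import numpy as np

# Illustrative dimensions (not from the talk): the input of size
# d_in = d1*d2 and output of size d_out = c1*c2 let us compare a
# dense layer with structured alternatives expressible as einsums.
d1, d2, c1, c2, r = 4, 8, 4, 8, 16
d_in, d_out = d1 * d2, c1 * c2
x = np.random.randn(d_in)

# Dense layer: y_j = sum_i W[j,i] x[i]
W = np.random.randn(d_out, d_in)
y_dense = np.einsum('ji,i->j', W, x)

# Low-rank layer: W = B A, with A of shape (r, d_in) and B of
# shape (d_out, r); the whole map is one three-operand einsum.
A = np.random.randn(r, d_in)
B = np.random.randn(d_out, r)
y_lowrank = np.einsum('jr,ri,i->j', B, A, x)

# Kronecker layer: W = K1 (x) K2; reshape x into a (d1, d2) grid
# and contract each axis with its own small factor.
K1 = np.random.randn(c1, d1)
K2 = np.random.randn(c2, d2)
y_kron = np.einsum('ca,db,ab->cd', K1, K2, x.reshape(d1, d2)).reshape(d_out)

# Sanity check: the einsum agrees with the explicit Kronecker product.
assert np.allclose(y_kron, np.kron(K1, K2) @ x)
```

The point of the sketch is that both structured layers are instances of the same einsum template with different index patterns, which is what makes it possible to search over such operators, and to compare their compute costs, within one framework.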