
Random Features for Kernel Approximation


If you have a question about this talk, please contact James Allingham.

Zoom link available upon request (it is sent out on our mailing list, eng-mlg-rcc [at] lists.cam.ac.uk). Sign up to our mailing list to receive reminders.

Though ubiquitous and mathematically elegant, kernel methods notoriously scale poorly with dataset size because the Gram matrix must be stored and inverted. This has motivated a number of kernel approximation techniques. Chief among them are random features, which construct low-rank decompositions of the Gram matrix via Monte Carlo methods. We begin by discussing Rahimi and Recht's seminal paper on Random Fourier Features, which approximates stationary kernels with a randomised sum of sinusoids. We briefly draw parallels to the celebrated Johnson-Lindenstrauss transform, before discussing how Orthogonal Random Features achieve lower approximation error. We demonstrate the effectiveness of these techniques for approximating attention in Transformers. Finally, if you will humour me, we will briefly discuss how carefully induced correlations between random features can further improve the quality of kernel approximation, describing the recently introduced class of Simplex Random Features.
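As a minimal sketch of the randomised-sum-of-sinusoids construction described above (the function names and the lengthscale parameter are illustrative choices, not the talk's own implementation), the following NumPy snippet builds Random Fourier Features for the RBF kernel and checks them against the exact Gram matrix:

import numpy as np

def rbf_random_features(X, num_features, lengthscale=1.0, seed=0):
    # Random Fourier Features (Rahimi & Recht, 2007) for the RBF kernel
    # k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)).
    # Returns Z such that Z @ Z.T approximates the Gram matrix.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # By Bochner's theorem, frequencies are sampled from the kernel's
    # spectral density, here a Gaussian with scale 1/lengthscale.
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    # The randomised sum of sinusoids: z(x) = sqrt(2/m) * cos(W^T x + b).
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(100, 5))
Z = rbf_random_features(X, num_features=4096)
K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / 2.0)
print(np.abs(Z @ Z.T - K_exact).max())  # error shrinks as num_features grows

Orthogonal Random Features keep the same cosine feature map but draw the frequency matrix in orthogonalised blocks. A sketch of that construction, under the same assumptions:

def orthogonal_frequencies(d, num_features, lengthscale=1.0, seed=0):
    # Frequency matrix for Orthogonal Random Features (Yu et al., 2016):
    # stack d x d blocks of orthonormal directions, rescaling each column
    # by a chi(d)-distributed norm so its marginal distribution matches
    # that of an i.i.d. Gaussian frequency vector.
    rng = np.random.default_rng(seed)
    blocks = []
    while d * len(blocks) < num_features:
        Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # orthonormal columns
        norms = np.sqrt(rng.chisquare(df=d, size=d))  # ||g||, g ~ N(0, I_d)
        blocks.append(Q * norms)  # scale column j of Q by norms[j]
    return np.concatenate(blocks, axis=1)[:, :num_features] / lengthscale

Substituting this frequency matrix into rbf_random_features in place of the i.i.d. Gaussian draw gives the orthogonal estimator. The Performer and Simplex Random Features discussed in the talk go further: the former replaces the cosine map with positive-valued features suited to attention, and the latter couples the directions of the frequency vectors via simplex geometry.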

Papers:

Rahimi, A. and Recht, B. (2007). Random features for large-scale kernel machines. Advances in Neural Information Processing Systems, 20.

Johnson, W. B. and Lindenstrauss, J. (1984). Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189–206.

Yu, F. X. X., Suresh, A. T., Choromanski, K. M., Holtmann-Rice, D. N., and Kumar, S. (2016). Orthogonal random features. Advances in Neural Information Processing Systems, 29.

Choromanski, K., Likhosherstov, V., Dohan, D., Song, X., Gane, A., Sarlos, T., Hawkins, P., Davis, J., Mohiuddin, A., Kaiser, L., et al. (2020). Rethinking attention with performers. International Conference on Learning Representations.

Reid, I., Choromanski, K., Likhosherstov, V., and Weller, A. (2023). Simplex random features. arXiv preprint arXiv:2301.13856.

This talk is part of the Machine Learning Reading Group @ CUED series.
