
Oja's algorithm for sparse PCA


If you have a question about this talk, please contact Dr Sergio Bacallado.

Oja’s algorithm for streaming Principal Component Analysis (PCA) on n data points in a d-dimensional space achieves the same sin-squared error as the offline algorithm in O(d) space and O(nd) time, in a single pass through the data points. Under this computational budget, we consider the problem of sparse PCA, where the principal eigenvector of the population covariance matrix Sigma is s-sparse, and the effective rank of Sigma can be large. In this setting, to our knowledge, there are no known single-pass algorithms that achieve the minimax error bound in O(d) space and O(nd) time without either requiring strong initialisation conditions or assuming further structure of the covariance matrix (e.g., a spiked model). We show that a simple single-pass procedure that thresholds the output of Oja’s algorithm (the Oja vector) can achieve the minimax error bound under some regularity conditions within the aforementioned computational budget.
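The procedure described above can be sketched as follows. This is only an illustrative NumPy implementation of the thresholded-Oja idea, not the paper's exact algorithm: the fixed step size eta and the threshold level tau are assumptions made for the sketch, and the paper's regularity conditions and tuning are omitted.

```python
import numpy as np

def oja_sparse_pca(X, eta=0.01, tau=None, seed=0):
    """Single-pass Oja iteration followed by entrywise thresholding.

    A sketch of the thresholded-Oja procedure from the abstract.
    The step size ``eta`` and the default threshold ``tau`` are
    illustrative choices, not the ones analysed in the paper.
    Runs in O(d) space and O(nd) time over one pass of X (n x d).
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)       # random initial vector
    w /= np.linalg.norm(w)
    for x in X:                      # one pass over the stream
        w = w + eta * x * (x @ w)    # O(d) Oja update per point
        w /= np.linalg.norm(w)       # keep the iterate on the unit sphere
    if tau is None:
        tau = np.sqrt(np.log(d) / n) # illustrative threshold level
    w_thr = np.where(np.abs(w) >= tau, w, 0.0)  # keep large entries only
    nrm = np.linalg.norm(w_thr)
    return w_thr / nrm if nrm > 0 else w
```

On data whose leading population eigenvector is s-sparse, the thresholding step zeroes out the small noisy coordinates of the Oja vector, leaving a sparse estimate of the principal direction.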

Our results are centred around a nontrivial and novel analysis of the entries of the unnormalised Oja vector, which involves analysing the action of a product of independent random matrices on a random initial vector. This is completely different from previous analyses of Oja’s algorithm and matrix products, which have been carried out only when the effective rank is relatively small or bounded.

This is joint work with Syamantak Kumar.

This talk is part of the Statistics series.


© 2006-2024 Talks.cam, University of Cambridge.