University of Cambridge > Talks.cam > Isaac Newton Institute Seminar Series > Information-theoretic perspectives on learning algorithms

Information-theoretic perspectives on learning algorithms

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact info@newton.ac.uk.

STS - Statistical scalability

In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. We overview some recent work [Xu and Raginsky (2017)] that bounds generalization error of empirical risk minimization based on the mutual information I(S;W) between the algorithm input S and the algorithm output W. We leverage these results to derive generalization error bounds for a broad class of iterative algorithms that are characterized by bounded, noisy updates with Markovian structure, such as stochastic gradient Langevin dynamics (SGLD). We describe certain shortcomings of mutual information-based bounds, and propose alternate bounds that employ the Wasserstein metric from optimal transport theory. We compare the Wasserstein metric-based bounds with the mutual information-based bounds and show that for a class of data generating distributions, the former leads to stronger bounds on the generalization error.



This talk is part of the Isaac Newton Institute Seminar Series series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2018 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity