Log in

Cambridge users (raven) details

Other users details

No account? details

Information on

Subscribing to talks details

Finding a talk details

Adding a talk details

Disseminating talks details

Help and Documentation details

Coresets for scalable Bayesian logistic regression

Add to your list(s) Download to your calendar using vCal

Tamara Broderick (Massachusetts Institute of Technology)
Tuesday 04 July 2017, 13:30-14:15
Seminar Room 1, Newton Institute.

If you have a question about this talk, please contact INI IT.

SINW01 - Scalable statistical inference

Co-authors: Jonathan H. Huggins (MIT), Trevor Campbell (MIT)

The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. However, standard Bayesian inference algorithms are computationally expensive, so their direct application to large datasets can be difficult or infeasible. Rather than modify existing algorithms, we instead leverage the insight that data is often redundant via a pre-processing step. In particular, we construct a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We then input this small coreset to existing posterior inference algorithms without modification. To demonstrate the feasibility of this approach, we develop an efficient coreset construction algorithm for Bayesian logistic regression models. We provide theoretical guarantees on the size and approximation quality of the coreset—both for fixed, known datasets, and in expectation for a wide class o f data generative models. Our approach permits efficient construction of the coreset in both streaming and parallel settings, with minimal additional effort. We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size.

This talk is part of the Isaac Newton Institute Seminar Series series.

This talk is included in these lists:

Note that ex-directory lists are not shown.

Log in

Information on

Coresets for scalable Bayesian logistic regression

This talk is included in these lists:

Other lists

Other talks