Learning via Data Compression: Bayesian Coresets and Sparse Variational Inference
If you have a question about this talk, please contact Robert Peharz.

We have reached a point in many fields of science and technology where we create data at a pace that far outstrips our capacity to process it. While a boon from a statistical perspective, this wealth of data presents a computational challenge: how might we design a model-based inference system that learns forever, retains important past information, doesn't get bogged down by a persistent stream of new data, and makes inferences with guaranteed statistical quality? The human nervous system provides inspiration: to handle the astounding amount of perceptual data it constantly receives, the nervous system filters and compresses the data significantly before passing it along to the brain, where learning occurs. Although a seemingly simple solution, it raises interesting questions for the design of a computational inference system: how should we decide what data to retain, how should we compress it, and what degree of compression should we apply before learning from it?

This talk will cover recent work on Bayesian coresets ("core of a dataset"), a methodology for statistical inference via data compression. Coresets achieve compression by forming a small weighted subset of data that replaces the full dataset during inference, leading to significant computational gains with provably minimal loss in inferential quality. In particular, the talk will present numerous methods for Bayesian coreset construction, from previously developed subsampling, greedy, and sparse linear regression-based techniques to novel algorithms based on sparse variational inference (VI). In contrast to past algorithms, the sparse VI-based algorithms are fully automated, requiring only the dataset and probabilistic model specification as inputs. The talk will additionally provide a unifying view and statistical analysis of these methods using the theory of exponential families and Riemannian information geometry. The talk will conclude with empirical results showing that, despite requiring much less user input than past methods, sparse VI coreset construction provides state-of-the-art data summarization for Bayesian inference.

This talk is part of the Machine Learning @ CUED series.
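As a rough illustration of the coreset idea described above (a sketch, not code from the talk): a Bayesian coreset replaces the full-data log-likelihood sum_n log p(x_n | theta) with a sparse weighted sum sum_m w_m log p(x_m | theta) over a small subset of points. The snippet below uses the simplest construction mentioned in the abstract, uniform subsampling with weights N/M, on a conjugate Gaussian-mean model where the exact posterior is available in closed form for comparison; all variable names are illustrative, not from the talk.

# Coreset sketch: approximate the posterior of a Gaussian-mean model
# using a uniformly subsampled, reweighted subset of the data.
import numpy as np

rng = np.random.default_rng(0)

# Full dataset: N observations x_n ~ N(theta, 1), with prior theta ~ N(0, 1).
N = 100_000
x = rng.normal(loc=2.0, scale=1.0, size=N)

# Exact full-data posterior (conjugate Gaussian update):
#   precision = 1 + N,  mean = sum(x) / (1 + N)
full_prec = 1.0 + N
full_mean = x.sum() / full_prec

# Uniform subsampling coreset: keep M points, each weighted N/M, so the
# weighted log-likelihood is an unbiased estimate of the full one.
M = 500
idx = rng.choice(N, size=M, replace=False)
w = np.full(M, N / M)  # coreset weights (sparse over the full dataset)

# Posterior under the weighted (coreset) likelihood:
core_prec = 1.0 + w.sum()                 # w.sum() == N by construction
core_mean = (w * x[idx]).sum() / core_prec

print(f"full-data posterior mean: {full_mean:.4f}")
print(f"coreset posterior mean:   {core_mean:.4f}")

The sparse VI constructions discussed in the talk go further: rather than fixing uniform weights in advance, they treat the weights themselves as variational parameters and optimize a sparse, nonnegative w so that the resulting coreset posterior is close (in KL divergence) to the true posterior.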