Noise-Aware Differentially Private Synthetic Data

If you have a question about this talk, please contact Dr R.E. Turner.

Synthetic data generated under differential privacy (DP) promises to significantly simplify the analysis of sensitive personal data. Existing work has shown that simply analysing DP synthetic data as if it were real does not produce valid inferences about population-level quantities: the resulting confidence intervals are too narrow, which risks false discoveries. We propose using multiple imputation techniques to avoid these problems. This requires simulating multiple synthetic data sets from the Bayesian posterior predictive distribution over data sets. We propose a novel noise-aware Bayesian DP synthetic data generation mechanism for discrete data that makes it possible to generate such a distribution of data sets. Our experiments demonstrate that the method produces accurate confidence intervals from DP synthetic data.
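
As a rough illustration of the multiple imputation idea mentioned in the abstract, the sketch below pools the results of running the same analysis on several synthetic data sets. It assumes the classic Rubin's rules for multiple imputation and a normal approximation for the interval; the abstract does not say which combining rules the speakers use, and variants designed specifically for synthetic data differ in the variance formula. The function names `combine_rubin` and `normal_interval` are hypothetical and chosen only for this example.

```python
import math
from statistics import mean, variance


def combine_rubin(estimates, variances):
    """Pool per-data-set estimates with the classic Rubin's rules.

    estimates: point estimate of the quantity from each synthetic data set
    variances: estimated variance of that estimate from each data set
    Assumption: these are the missing-data combining rules, not necessarily
    the rules used in the talk.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two data sets and matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-data-set variance
    b = variance(estimates)      # between-data-set variance (divides by m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


def normal_interval(q_bar, total, z=1.96):
    """Approximate 95% interval; a t reference distribution is more common."""
    half_width = z * math.sqrt(total)
    return q_bar - half_width, q_bar + half_width


# Hypothetical numbers, for illustration only.
q_bar, total = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
low, high = normal_interval(q_bar, total)
```

The between-data-set term is what widens the interval relative to analysing a single synthetic data set as if it were real, which is the failure mode the abstract describes.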

This talk is part of the Machine Learning @ CUED series.

