Noise-Aware Differentially Private Synthetic Data
If you have a question about this talk, please contact Dr R.E. Turner.
Synthetic data generated under differential privacy (DP) promises to significantly simplify the analysis of sensitive personal data. Existing work has shown that simply analysing DP synthetic data as if it were real does not produce valid inferences about population-level quantities: the resulting confidence intervals are too narrow, which risks false discoveries. We propose using multiple imputation techniques to avoid these problems. This requires simulating multiple synthetic data sets from the Bayesian posterior predictive distribution over data sets. We propose a novel noise-aware Bayesian DP synthetic data generation mechanism for discrete data that makes it possible to sample from such a distribution over data sets. Our experiments demonstrate that the method produces accurate confidence intervals from DP synthetic data.
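To illustrate the multiple imputation idea mentioned in the abstract, the sketch below pools a point estimate and its variance across several synthetic data sets using the classic Rubin's rules. This is an assumption for illustration only: the abstract does not say which combining rules the talk uses, and variants designed specifically for synthetic data differ in the variance formula. The function name `combine_rubin` and the example numbers are hypothetical, and the interval uses a normal approximation rather than the t reference distribution that is usually recommended.

```python
from statistics import NormalDist, mean, variance


def combine_rubin(estimates, variances, level=0.95):
    """Pool per-data-set results with the classic Rubin's rules (illustrative).

    estimates: point estimates, one per synthetic data set
    variances: squared standard errors of those estimates
    Returns (pooled estimate, total variance, (lower, upper) interval).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two data sets and matching lengths")

    q_bar = mean(estimates)   # pooled point estimate
    u_bar = mean(variances)   # average within-data-set variance
    b = variance(estimates)   # between-data-set (sample) variance

    # The between-data-set term widens the interval to reflect the extra
    # uncertainty that analysing a single data set would ignore.
    total = u_bar + (1 + 1 / m) * b

    # Normal approximation (an assumption made to keep the sketch short).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * total ** 0.5
    return q_bar, total, (q_bar - half_width, q_bar + half_width)


# Hypothetical per-data-set results, for illustration only.
pooled, total_var, interval = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In this toy example the total variance exceeds the average within-data-set variance, so the pooled interval is wider than one computed from any single synthetic data set, which is the effect the abstract describes as missing when synthetic data is analysed as if it were real.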
This talk is part of the Machine Learning @ CUED series.