Navigating Privacy Risks in Language Models

If you have a question about this talk, please contact Nic Lane.

The emergence of large language models (LLMs) presents significant opportunities in content generation, question answering, and information retrieval. Nonetheless, training, fine-tuning, and deploying these models entail privacy risks. This talk will address these risks, outlining privacy principles inspired by known LLM vulnerabilities in handling user data. We demonstrate how techniques such as federated learning and user-level differential privacy (DP) can systematically mitigate many of these risks, at the cost of increased computation. In scenarios where only moderate-to-weak user-level DP is achievable, we propose a strong (task- and model-agnostic) membership inference attack that quantifies risk by accurately estimating the actual privacy leakage (empirical epsilon) in a single training run. The talk will conclude with a few projections and compelling research directions.
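
To give a flavour of what an "empirical epsilon" estimate involves, the sketch below shows a standard DP-auditing calculation (an illustrative example, not necessarily the speaker's method): a membership inference attack's true and false positive rates on member and non-member canaries imply the lower bound eps >= ln((TPR - delta) / FPR), which can be made statistically conservative with Clopper-Pearson confidence intervals. The function names, canary counts, and attack outcomes are hypothetical.

    # Illustrative sketch: lower-bounding empirical privacy leakage ("empirical
    # epsilon") from membership inference attack outcomes, via the standard
    # (eps, delta)-DP auditing inequality
    #   TPR <= exp(eps) * FPR + delta   =>   eps >= ln((TPR - delta) / FPR).
    import math
    from scipy.stats import beta

    def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
        """Two-sided (1 - alpha) Clopper-Pearson interval for a binomial rate."""
        lo = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
        hi = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
        return lo, hi

    def empirical_epsilon_lower_bound(tp, n_members, fp, n_non_members,
                                      delta=1e-5, alpha=0.05):
        """Conservative lower bound on epsilon: combine the lower confidence
        bound on the attack's TPR with the upper confidence bound on its FPR."""
        tpr_lo, _ = clopper_pearson(tp, n_members, alpha)
        _, fpr_hi = clopper_pearson(fp, n_non_members, alpha)
        if fpr_hi <= 0.0 or tpr_lo <= delta:
            return 0.0
        return max(0.0, math.log((tpr_lo - delta) / fpr_hi))

    # Hypothetical attack results: 900 of 1000 member canaries flagged,
    # 50 of 1000 non-member canaries flagged.
    eps_hat = empirical_epsilon_lower_bound(tp=900, n_members=1000,
                                            fp=50, n_non_members=1000)
    print(f"empirical epsilon lower bound ~ {eps_hat:.2f}")

A single-run audit in the spirit of the abstract inserts many such canaries into one training run and aggregates the attack's decisions over them, rather than retraining the model many times.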

This talk is part of the Cambridge ML Systems Seminar Series.
