Defending Against Adversarial Attacks
If you have a question about this talk, please contact jg801.
Adversarial examples are inputs that have been maliciously perturbed to induce inappropriate responses from a machine learning system, yet which humans generally cannot distinguish from innocent inputs. They therefore represent a substantial threat to the reliability and practicality of ML applications, since systems vulnerable to this kind of manipulation cannot be trusted with important decisions. Despite this, surprisingly little is understood about the mechanisms by which adversarial examples arise, or about how we might construct systems that are resilient to such attacks. We chart the evolution of the literature on adversarial attacks by considering some of the initially proposed explanations for how they arise. We discuss defence mechanisms such as adversarial training and the less obvious approach of network distillation, and then briefly summarise the current state of the field.
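As a concrete illustration of the ideas in the abstract (this is not material from the talk itself), the sketch below shows how an adversarial example can be generated with the Fast Gradient Sign Method and how adversarial training folds such examples back into the training loop. The model, data, and epsilon value are placeholder assumptions, written in PyTorch.

import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon):
    """Fast Gradient Sign Method: nudge x in the direction that increases
    the loss, keeping the perturbation small (and hence hard to perceive)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One signed-gradient step of size epsilon, clamped to the valid input range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One step of adversarial training: generate adversarial examples on the
    fly and update the model on them instead of the clean batch."""
    model.train()
    x_adv = fgsm_perturb(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Toy demonstration: random data standing in for real images in [0, 1].
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.rand(8, 1, 28, 28)
    y = torch.randint(0, 10, (8,))
    print(adversarial_training_step(model, optimizer, x, y))

The same fgsm_perturb routine, applied to a trained model, produces the kind of imperceptibly perturbed inputs the abstract describes; the defence simply trains against them.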
This talk is part of the Machine Learning Reading Group @ CUED series.