
Provably Safe Certification for Machine Learning Models under Adversarial Attacks


If you have a question about this talk, please contact Prof. Ramji Venkataramanan.

It is widely known that state-of-the-art machine learning models, including vision and language models, can be seriously compromised by adversarial perturbations. It is therefore increasingly important to develop the capability to certify their performance in the presence of the most effective adversarial attacks.

This talk will introduce an approach, inspired by distribution-free risk-controlling procedures, to certify the performance of machine learning models in the presence of adversarial attacks, with population-level risk guarantees. In particular, given a specific attack, we will introduce the notion of an (alpha, zeta)-safety guarantee for a machine learning model. This guarantee is supported by a testing procedure that relies on the availability of a calibration set: it ensures that one declares a model's adversarial (population) risk to be less than alpha (i.e. declares the model safe) when the model's adversarial (population) risk is in fact higher than alpha (i.e. the model is unsafe) with probability less than zeta. We will also introduce Bayesian-optimization-based approaches that determine very efficiently whether or not a machine learning model is (alpha, zeta)-safe in the presence of an adversarial attack, along with their statistical guarantees.
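To make the flavour of such a calibration-set test concrete, here is a minimal illustrative sketch, not the speaker's actual procedure. It tests (alpha, zeta)-safety using a Hoeffding tail bound on 0/1 adversarial losses: declaring "safe" only when the empirical adversarial risk plus a confidence margin falls below alpha ensures that an unsafe model (true risk above alpha) is declared safe with probability at most zeta. The function name and interface are hypothetical.

```python
import math

def is_alpha_zeta_safe(losses, alpha, zeta):
    """Illustrative (alpha, zeta)-safety test via a Hoeffding bound.

    losses: list of 0/1 adversarial losses of the model on a held-out
            calibration set (1 = the attack succeeded on that example).
    Returns True (declare safe) only when the empirical risk plus a
    Hoeffding margin is at most alpha; if the true population risk
    exceeds alpha, this declaration happens with probability < zeta.
    """
    n = len(losses)
    empirical_risk = sum(losses) / n
    # Hoeffding: P(empirical_risk <= true_risk - margin) <= exp(-2 n margin^2),
    # so choosing margin = sqrt(ln(1/zeta) / (2n)) caps the error at zeta.
    margin = math.sqrt(math.log(1.0 / zeta) / (2.0 * n))
    return empirical_risk + margin <= alpha
```

For example, with 2,000 calibration points, zeta = 0.05 gives a margin of roughly 0.027, so a model with empirical adversarial risk 0.025 would be certified at alpha = 0.1, while one with empirical risk 0.09 would not. The Bayesian optimization mentioned in the abstract would instead be used to search for the most damaging attack configuration before running such a test.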

This talk will also illustrate how to apply our framework to a range of machine learning models, including vision Transformer (ViT) and ResNet models of various sizes, impaired by a variety of adversarial attacks.

This talk is part of the Information Theory Seminar series.




© 2006-2024, University of Cambridge.