
Misleading meta-objectives and hidden incentives for distributional shift


If you have a question about this talk, please contact Adrià Garriga Alonso.

This week: “Misleading meta-objectives and hidden incentives for distributional shift.” David Krueger, Tegan Maharaj, Shane Legg and Jan Leike. [Paper] [BibTeX]

The authors aim to show that meta-learning can create hidden incentives for agents to change their task rather than solve the task we set them. For example, consider an agent that predicts when a person wants coffee: having learned that the person drinks coffee in the morning, it wakes them whenever they try to sleep in, so a seemingly suboptimal policy (waking the human) yields better predictions. The paper presents experiments showing that meta-learning agents trained with Population-Based Training (PBT) learn non-myopic behaviour even when their reward is myopic. The authors also demonstrate a method for eliminating this non-myopic behaviour, which they call Environment Swapping.
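To make the swapping idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of the scheduling trick it relies on: each generation, every learner in the population is rotated to a different environment instance, so an agent's past actions cannot shape the data distribution it is later trained and evaluated on. The function name and round-robin rotation are illustrative assumptions.

```python
def environment_swapping_schedule(n_agents, n_generations):
    """Assign each agent a different environment every generation.

    Returns a list of generations; entry g is a list where position i
    gives the environment index agent i plays in generation g.  A simple
    round-robin rotation guarantees no agent revisits its previous
    environment, removing the hidden incentive to manipulate it.
    """
    schedule = []
    for g in range(n_generations):
        # agent i plays environment (i + g) mod n_agents in generation g
        schedule.append([(i + g) % n_agents for i in range(n_agents)])
    return schedule


# Usage: 3 agents over 3 generations; assignments rotate each generation,
# and every generation is still a full permutation of the environments.
schedule = environment_swapping_schedule(3, 3)
```

In a PBT loop, this schedule would decide which environment each population member trains in before the exploit/explore step, breaking the feedback path through which non-myopic behaviour is rewarded.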

As always, there will be free pizza. The first half hour is for stragglers to finish reading.

Invite your friends to join the mailing list, the Facebook group or the page. Details about the next meeting, the week's topic and other events will be advertised in these places.

This talk is part of the Engineering Safe AI series.




© 2006-2023, University of Cambridge.