Motivation for this group, Goodhart's Law
If you have a question about this talk, please contact Adrià Garriga Alonso.

How can we design AI systems that reliably act according to the true intent of their users, even as the capability of the systems increases? Come to this reading group with free pizza!

This week we will get started by motivating why we are doing this. Part of the motivation is Goodhart’s Law [1] and its implications for evaluating AI systems and designing their objectives.

The session will go as follows. At 17:00, we will start reading the material (listed below), mostly individually. At 17:30, the discussion leader will start going through the paper, making sure everyone understands it and encouraging discussion about its contents and implications. If you have already read the material in your own time, feel free to come by at 17:30.

A basic understanding of machine learning is helpful, but detailed knowledge of the latest techniques is not required. Each session will include a brief recap of the immediately necessary background. The goal of this series is to help people learn more about the existing work in AI research and eventually contribute to the field.

Join the mailing list (https://lists.cam.ac.uk/mailman/listinfo/eng-safe-ai), the Facebook group (https://www.facebook.com/groups/1070763633063871) or the talks.cam page (https://talks.cam.ac.uk/show/index/80932). Announcements about the week’s topic and other events will be sent there. Consider also inviting your friends!

READING MATERIAL:

“Building safe artificial intelligence: specification, robustness, and assurance” (2018), by Pedro A. Ortega, Vishal Maini, and the DeepMind safety team
https://medium.com/@deepmindsafetyresearch/building-safe-artificial-intelligence-52f5f75058f1

“On the folly of rewarding A, while hoping for B” (1975), by Steven Kerr
http://web.mit.edu/curhan/www/docs/Articles/15341_Readings/Motivation/Kerr_Folly_of_rewarding_A_while_hoping_for_B.pdf

“Categorizing Variants of Goodhart’s Law” (2018), by David Manheim and Scott Garrabrant
https://arxiv.org/abs/1803.04585

[1] https://en.wikipedia.org/wiki/Goodhart%27s_law

This talk is part of the Engineering Safe AI series.