Vibe checks and red teaming: why ML researchers are increasingly reverting to manual evaluation
- Speaker: Arduin Findeis (University of Cambridge)
- Date & Time: Tuesday 16 January 2024, 13:00 - 14:00
- Venue: Lecture Theatre 2, Computer Laboratory, William Gates Building
Abstract
There is a curious trend in machine learning (ML): researchers developing the most capable large language models (LLMs) increasingly evaluate them using manual methods such as red teaming. In red teaming, researchers hire workers to manually try to break the LLM in some form by interacting with it. Similarly, some users pick their preferred LLM assistant by manually trying out various models, checking each LLM’s “vibe”. Considering that LLM researchers and users both actively seek to automate all sorts of other tasks, red teaming and vibe checks are surprisingly manual evaluation processes. This trend towards manual evaluation hints at fundamental problems that prevent more automatic evaluation methods, such as benchmarks, from being used effectively for LLMs. In this talk, I aim to give an overview of the problems preventing LLM benchmarks from being a fully satisfactory alternative to more manual approaches.
Series
This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.