
Measuring Political Bias in Large Language Models


If you have a question about this talk, please contact Suchir Salhan.

Large language models (LLMs) are helping millions of users learn and write about a wide range of issues. In doing so, LLMs may expose users to new ideas and perspectives, or reinforce existing knowledge and opinions. This raises concerns about political bias in LLMs, and about how these biases might influence LLM users and society. In my talk, I will first discuss why measuring political biases in LLMs is difficult, and why most evidence so far should be approached with skepticism. Using the Political Compass Test as a case study, I will demonstrate critical issues of robustness and ecological validity that arise when applying such tests to LLMs. Second, I will present our approach to building IssueBench, a more meaningful evaluation dataset for measuring bias in how LLMs write about political issues. I will describe the steps we took to make IssueBench realistic and robust. I will then outline our results from testing state-of-the-art LLMs with IssueBench, including clear evidence of issue bias, striking similarities in biases across models, and stronger alignment with Democrat than with Republican voter positions on a subset of issues.

Bio: Paul is a postdoctoral researcher in the MilaNLP Lab at Bocconi University, working on evaluating and improving the alignment and safety of large language models, as well as measuring their societal impacts. For his recent work in this area, he won an Outstanding Paper Award at ACL and a Best Paper Award at the NeurIPS Datasets & Benchmarks track. Before coming to Milan, Paul completed his PhD at the University of Oxford, where he worked on LLMs for hate speech detection. During his PhD, Paul also co-founded Rewire, a start-up building AI for content moderation, which was acquired by a larger online safety company in 2023.

This talk is part of the NLIP Seminar Series.

