Explanations as a Catalyst: Leveraging Large Language Models to Embrace Human Label Variation
If you have a question about this talk, please contact Shun Shao.

Abstract: Human label variation (HLV), the phenomenon where multiple annotators provide different yet valid labels for the same data, is a rich source of information that is often dismissed as noise. Capturing this variation is crucial for building robust NLP systems, but doing so is typically resource-intensive. This talk presents a series of studies on how Large Language Models (LLMs) can serve as a catalyst to embrace and model HLV, moving from scalable approximation to a deeper analysis of the reasoning process itself.

First, I will discuss how LLMs can approximate full Human Judgment Distributions (HJDs) from just a few human-provided explanations. Our work shows that this explanation-based approach significantly improves alignment with human judgments. The investigation also reveals the limitations of traditional, instance-level distribution metrics and highlights the importance of complementing them with global-level measures to evaluate alignment more effectively.

Building on this, the second part of the talk addresses the high cost of collecting human explanations by asking: can LLM-generated explanations serve as a viable proxy? We demonstrate that, when guided by a few human labels, explanations generated by LLMs are indeed effective proxies, achieving performance comparable to human-written ones in approximating HJDs. This finding opens up a scalable and efficient pathway for modeling HLV, especially for datasets where human explanations are not available.

Finally, I will shift from post-hoc explanation (justifying a given answer) to a forward-reasoning paradigm. I will introduce CoT2EL, a novel pipeline that extracts explanation-label pairs directly from an LLM’s Chain-of-Thought (CoT) process before a final answer is selected. This method allows us to analyze the model’s reasoning across multiple plausible options. To better assess these nuanced judgments, I will also present a new rank-based evaluation framework that prioritizes the ordering of answers over exact distributional scores, showing stronger alignment with human decision-making.

Bio: Beiduo Chen is a PhD student at the MaiNLP lab at LMU Munich, supervised by Prof. Barbara Plank. He is also a member of the European Laboratory for Learning and Intelligent Systems (ELLIS) PhD Program, co-supervised by Prof. Anna Korhonen at the University of Cambridge. He received his Master’s and Bachelor’s degrees from the University of Science and Technology of China. His research focuses on human-centered NLP, with a special emphasis on the uncertainty, trustworthiness, and evaluation of Large Language Models. He has published several papers in top-tier NLP conferences, including ACL and EMNLP.

This talk is part of the Language Technology Lab Seminars series.
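As a rough illustration of the distinction the abstract draws between instance-level distribution metrics and rank-based evaluation, the sketch below (not part of the talk; all labels, numbers, and helper names are hypothetical) compares a total variation distance with a Spearman rank correlation on a toy human judgment distribution.

```python
# Illustrative sketch only, not the speaker's method: contrasts an
# instance-level distribution metric (total variation distance) with a
# rank-based alignment measure between a human judgment distribution (HJD)
# and a model-derived label distribution. All values are made up.
import numpy as np
from scipy.stats import spearmanr

def total_variation(p, q):
    """Instance-level distance between two label distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def rank_alignment(p, q):
    """Rank-based agreement: do the two distributions order the labels the same way?"""
    rho, _ = spearmanr(p, q)
    return rho

# Hypothetical NLI-style example with labels [entailment, neutral, contradiction]
human_hjd  = [0.60, 0.30, 0.10]   # aggregated from several annotators
model_dist = [0.45, 0.35, 0.20]   # distribution approximated via an LLM

print(f"Total variation distance: {total_variation(human_hjd, model_dist):.3f}")
print(f"Spearman rank alignment:  {rank_alignment(human_hjd, model_dist):.3f}")
# The exact probabilities differ (nonzero distance), but the label ordering
# matches (rank correlation = 1.0), which a rank-based evaluation rewards.
```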