
Value Reasoning and Test-Time Verification for Trustworthy LLMs


If you have a question about this talk, please contact Shun Shao.

Abstract: Despite their impressive capabilities, large language models (LLMs) continue to face significant limitations in complex real-world settings, particularly when navigating high-stakes moral reasoning or when efficient and trustworthy test-time behavior is required. This talk explores two complementary directions that address these challenges: evaluating limitations in value reasoning and scaling verification through efficient process supervision.

First, I introduce CLASH, a new benchmark that examines how well LLMs reason about dilemmas involving conflicting values. CLASH enables a structured analysis of decision ambivalence, psychological discomfort, and value shifts over time. The benchmark reveals the difficulty LLMs have in representing nuanced human value reasoning, especially in ambiguous or temporally dynamic contexts.

Second, I present ThinkPRM, a generative process reward model that enables step-by-step verification using long chain-of-thought reasoning. Unlike traditional discriminative PRMs that require extensive labeled supervision, ThinkPRM is trained on only a fraction of the process data by leveraging LLMs’ inherent reasoning abilities to generate and verify each step in a solution. This approach supports more scalable and efficient test-time oversight, outperforming strong baselines in various domains.
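To make the idea of generative step-by-step verification concrete, here is a minimal sketch of how a process reward model can drive test-time selection. All names and the toy verifier are hypothetical illustrations, not ThinkPRM's actual implementation: a real generative PRM would produce a chain-of-thought judgment per step with an LLM, whereas this stand-in only checks simple arithmetic claims.

```python
# Sketch of process-level verification for test-time selection.
# A verifier judges each step of a candidate solution; the candidate
# with the highest fraction of verified steps is chosen (best-of-N).
# verify_step is a toy stand-in for an LLM's reasoning-based verdict.

def verify_step(step: str) -> bool:
    """Toy verifier: accept steps of the form '<expr> = <int>'."""
    left, _, right = step.partition("=")
    try:
        # Restricted eval of the arithmetic left-hand side (toy check only).
        return eval(left, {"__builtins__": {}}) == int(right)
    except Exception:
        return False

def score_solution(steps: list[str]) -> float:
    """Process-level score: fraction of steps the verifier accepts."""
    if not steps:
        return 0.0
    return sum(verify_step(s) for s in steps) / len(steps)

def best_of_n(candidates: list[list[str]]) -> list[str]:
    """Test-time oversight: pick the candidate with the best step scores."""
    return max(candidates, key=score_solution)

candidates = [
    ["2 + 2 = 5", "5 + 1 = 6"],  # first step is wrong
    ["2 + 2 = 4", "4 + 1 = 5"],  # fully correct chain
]
print(best_of_n(candidates))  # → ['2 + 2 = 4', '4 + 1 = 5']
```

Scoring each intermediate step, rather than only the final answer, is what distinguishes process supervision from outcome supervision: a solution that stumbles into the right answer through a flawed step is penalized.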

Bio: Lu Wang is an Associate Professor in Computer Science and Engineering at the University of Michigan, Ann Arbor. Previously, she was an Assistant Professor in the Khoury College of Computer Sciences at Northeastern University. She received her Ph.D. in Computer Science from Cornell University. Her research focuses on building trustworthy large language models that produce factual, accurate, and safe content. She works on summarization, reasoning, and evaluation, as well as applications in AI for education and computational social science. Lu has received paper awards at ACL, CHI, and SIGDIAL, and won the NSF CAREER award in 2021.

This talk is part of the Language Technology Lab Seminars series.

