Retrieving and Sampling Diverse Outputs

If you have a question about this talk, please contact Shun Shao.

Abstract: Real-world user queries often include questions that admit a wide range of valid answers, with no single ground truth. However, large language models (LLMs) often struggle to generate diverse and comprehensive responses. In this talk, I will discuss two paths towards more diverse and comprehensive responses: (1) retrieving a diverse set of documents and (2) sampling a large number of responses from LLMs. In the first part of the talk, I will quantify the limitations of existing dense retrievers, which generate a single query vector. Even strong retrievers struggle when the gold document set contains dissimilar targets. To address this, we present a new retriever architecture that autoregressively generates multiple distinct query vectors, each of which is used to retrieve documents from the corpus. In the second part of the talk, I will discuss inference strategies for sampling diverse outputs from LLMs. Prompting LLMs to sequentially generate a diverse set of answers works well for simpler factoid queries, but is less effective for more complex queries. We further explore merging outputs from multiple LLMs, showing both its potential and its challenges. I will conclude by discussing a multi-turn agentic framework that interleaves retrieval and generation from LLMs to craft a comprehensive answer.

Bio: Eunsol Choi is an assistant professor of computer science and data science at New York University. Her research spans natural language processing and machine learning, with a focus on interpreting and reasoning about text in dynamic real-world contexts. Prior to joining NYU, she was an assistant professor at the University of Texas at Austin and a visiting researcher at Google. She holds a Ph.D. in computer science and engineering from the University of Washington. She is a recipient of a Facebook research fellowship, Google faculty research award, Sony faculty award, NSF CAREER award and an outstanding paper award at EMNLP.

This talk is part of the Language Technology Lab Seminars series.
