
Similarity-based Methods for Language Model Analysis and Prediction


If you have a question about this talk, please contact Mateja Jamnik.

In natural language, there are usually many ways to say the same thing: the answer to a question can be phrased in multiple ways, and there are many good translations of the same sentence. As a result, language models (LMs) trained on large corpora often spread probability mass across a vast number of generations that differ only in minor ways. This raises problems for LM applications: in prediction, probability is only loosely correlated with quality, so various heuristics must be added to beam search to achieve adequate results; in uncertainty quantification, commonly used measures like Shannon entropy can overestimate uncertainty when probability is spread across functionally equivalent texts. In this talk, I will present my PhD thesis work, which addresses these shortcomings using methods that incorporate measurements of semantic similarity. In prediction, returning a “prototypical” prediction according to semantic similarity outperforms high-probability predictions. In uncertainty quantification, generalizing the classic Shannon entropy with semantic similarity leads to a more trustworthy measure. Lastly, we apply Bayesian optimization, which uses kernel similarity to efficiently search for high-quality translations, to translation reranking.
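The two ideas sketched in the abstract can be illustrated concretely. Below is a minimal, hypothetical sketch (not the thesis's actual formulation): a similarity-sensitive generalization of Shannon entropy in the style of Leinster–Cobbold diversity, H = -Σᵢ pᵢ log((Zp)ᵢ) for a similarity matrix Z, which reduces to Shannon entropy when Z is the identity; and a prototypical-prediction rule that returns the sampled candidate with the highest average similarity to the others. All function names and the toy similarity matrices are illustrative assumptions.

```python
import math

def shannon_entropy(probs):
    """Classic Shannon entropy over a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def similarity_sensitive_entropy(probs, sim):
    """Entropy H = -sum_i p_i * log((Zp)_i), where (Zp)_i = sum_j sim[i][j] * p_j.

    When sim is the identity matrix this equals Shannon entropy; when mass is
    spread over mutually similar (functionally equivalent) texts, the measured
    uncertainty shrinks accordingly. (Illustrative sketch, not the thesis's
    exact formulation.)
    """
    n = len(probs)
    return -sum(
        probs[i] * math.log(sum(sim[i][j] * probs[j] for j in range(n)))
        for i in range(n) if probs[i] > 0
    )

def prototypical(candidates, sim):
    """Return the candidate with the highest mean similarity to all samples —
    a simple 'prototype' selection rule in the spirit of minimum-Bayes-risk
    decoding."""
    n = len(candidates)
    scores = [sum(sim[i][j] for j in range(n)) / n for i in range(n)]
    return candidates[max(range(n), key=scores.__getitem__)]
```

For example, with three generations where the first two are paraphrases (similarity 1) and probabilities [0.4, 0.4, 0.2], Shannon entropy treats all three as distinct outcomes, while the similarity-sensitive entropy collapses the paraphrase pair and reports lower uncertainty.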

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.


