Similarity-based Methods for Language Model Analysis and Prediction
If you have a question about this talk, please contact Mateja Jamnik.

In natural language, there are usually many ways to say the same thing: the answer to a question can be phrased in multiple ways, and there are many good translations of the same sentence. As a result, language models (LMs) trained on large corpora often spread probability mass across a vast number of generations, most of which are minor variations of one another. This raises problems for LM applications. For prediction, probability is only loosely correlated with quality, so various heuristics must be added to beam search to achieve adequate results. For uncertainty quantification, commonly used measures like Shannon entropy can overestimate uncertainty when probability is spread across functionally equivalent texts.

In this talk, I will present my PhD thesis work, which addresses these shortcomings using methods that incorporate measurements of semantic similarity. In prediction, returning a "prototypical" prediction according to semantic similarity outperforms returning high-probability predictions. In uncertainty quantification, generalizing the classic Shannon entropy with semantic similarity leads to a more trustworthy measure. Lastly, we apply Bayesian optimization to translation reranking, using kernel similarity to search efficiently for high-quality translations.

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.
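
To make the two similarity-based ideas in the abstract concrete, here is a minimal Python sketch. It is not the speaker's implementation: the token-overlap function `jaccard` is a toy stand-in for a real semantic similarity measure, the candidate texts and probabilities are made up, and the function names `prototypical` and `similarity_entropy` are hypothetical. The entropy shown is one common similarity-sensitive generalization of Shannon entropy, chosen here as an assumption about what such a measure could look like.

```python
import math


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for semantic similarity: token-set overlap in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


def prototypical(candidates, probs, sim=jaccard):
    """Pick the candidate with the highest probability-weighted
    similarity to all candidates (the most 'central' generation)."""
    def score(c):
        return sum(p * sim(c, other) for other, p in zip(candidates, probs))
    return max(candidates, key=score)


def shannon_entropy(probs):
    """Classic Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def similarity_entropy(candidates, probs, sim=jaccard):
    """Similarity-sensitive entropy: -sum_i p_i * log(sum_j sim(i, j) * p_j).
    With an identity similarity this reduces to Shannon entropy."""
    total = 0.0
    for ci, pi in zip(candidates, probs):
        if pi > 0:
            mass = sum(pj * sim(ci, cj) for cj, pj in zip(candidates, probs))
            total -= pi * math.log(mass)
    return total


# Made-up example: two near-paraphrases split the mass of one "meaning".
candidates = ["the cat sat", "the cat sat down", "a dog ran"]
probs = [0.35, 0.25, 0.40]

most_probable = candidates[probs.index(max(probs))]
central = prototypical(candidates, probs)
h_shannon = shannon_entropy(probs)
h_similarity = similarity_entropy(candidates, probs)
```

In this toy example the single most probable string is "a dog ran", but the two paraphrases together carry more mass, so the prototypical choice is one of them. The similarity-sensitive entropy is also lower than the Shannon entropy, because probability spread across near-equivalent texts is not counted as additional uncertainty.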