
Evaluating Large Language Models as Model Systems for Language


If you have a question about this talk, please contact Richard Diehl Martinez.

In this talk, we investigate the potential of Large Language Models (LLMs) to serve as model systems for language. A model system for language should first and foremost perform the relevant function, i.e., use language in the right way. The first part of the talk examines this claim in two ways. First, we take a critical look at model evaluation: to assess model performance, it is often useful to evaluate and compare models on controlled sentence-generation benchmarks. Here, we argue that Masked Language Model performance has been systematically underestimated due to a bias in the most commonly used sentence/word scoring method, pseudo-log-likelihood, and we introduce an improved version of the scoring method that mitigates the observed bias. Second, we evaluate whether LLMs use language in a way consistent with humans' generalized knowledge of common events, which is tightly linked to their language behavior. Overall, our results show that important aspects of event knowledge emerge naturally from distributional linguistic patterns, but they also highlight a gap between representations of possible/impossible and likely/unlikely events.

In the second part of the talk we shift gears and investigate LLMs as model systems more directly: we leverage artificial neural network language models as computational hypotheses of language processing in the human brain and measure the degree of alignment between the two systems when processing a variety of language stimuli. We find substantial alignment between the two systems and systematically investigate the features that drive the observed similarity.
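For readers unfamiliar with the scoring method discussed above: pseudo-log-likelihood (PLL) scores a sentence under a masked language model by masking each token in turn and summing the model's log-probabilities for the held-out tokens. Below is a minimal sketch of the basic scheme; `masked_logprob` is a hypothetical stand-in for a real masked language model (the function name and toy model are illustrative, not from the talk).

```python
import math

def pseudo_log_likelihood(tokens, masked_logprob):
    """Pseudo-log-likelihood of a token sequence under a masked LM.

    `masked_logprob(context, position, target)` stands in for a real
    masked language model: it returns log P(target at `position` | context),
    where `context` is the sequence with that position replaced by "[MASK]".
    """
    total = 0.0
    for i in range(len(tokens)):
        # Mask one position at a time and score the held-out token.
        context = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        total += masked_logprob(context, i, tokens[i])
    return total

# Toy "model": uniform over a 10-word vocabulary, so every masked
# prediction contributes log(1/10) regardless of context.
uniform = lambda context, i, target: math.log(1 / 10)

tokens = ["the", "cat", "sat"]
score = pseudo_log_likelihood(tokens, uniform)  # 3 * log(0.1)
```

The talk argues this basic scheme is biased (and proposes a corrected variant); the details of that correction are beyond this sketch.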

This talk is part of the NLIP Seminar Series.


