Evaluating Large Language Models as Model Systems for Language
If you have a question about this talk, please contact Richard Diehl Martinez.

In this talk, we investigate the potential of Large Language Models (LLMs) to serve as model systems for language. A model system for language should first and foremost perform the relevant function, i.e., use language in the right way. In the first part of the talk we investigate this claim in two ways. First, we take a critical look at model evaluation. To assess model performance, it is often useful to evaluate and compare models on controlled sentence generation benchmarks. Here, we argue that Masked Language Model performance has been systematically underestimated due to a bias in the most commonly used sentence/word scoring method, pseudo-log-likelihood, and we introduce an improved version of the scoring method that mitigates the observed bias. Second, we evaluate whether LLMs use language in a way consistent with humans' generalized knowledge of common events, which is tightly linked to their language behavior. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but they also highlight a gap between representations of possible/impossible and likely/unlikely events.

In the second part of the talk we shift gears and investigate LLMs as model systems more directly: we leverage artificial neural network language models as computational hypotheses of language processing in the human brain and measure the degree of alignment between the two systems when processing a variety of language stimuli. We find substantial alignment between the two systems and systematically investigate the features that drive the observed similarity.
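For context on the methods named in the abstract, here is a minimal sketch of conventional pseudo-log-likelihood (PLL) sentence scoring with a masked language model, i.e. the baseline scoring method the talk argues is biased. The improved variant introduced in the talk is not reproduced here, and the model name and example sentences are illustrative choices rather than details taken from the talk.

```python
# Minimal sketch of conventional pseudo-log-likelihood (PLL) scoring with a
# masked language model, using Hugging Face transformers. This illustrates the
# baseline method discussed in the talk; the improved, bias-mitigating variant
# is not shown. Model choice and sentences are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-cased"  # example model, not specified in the talk
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum log P(token_i | all other tokens), masking one position at a time."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    with torch.no_grad():
        # Skip the [CLS] (first) and [SEP] (last) special tokens.
        for i in range(1, input_ids.size(0) - 1):
            masked = input_ids.clone()
            masked[i] = tokenizer.mask_token_id
            logits = model(masked.unsqueeze(0)).logits[0, i]
            log_probs = torch.log_softmax(logits, dim=-1)
            total += log_probs[input_ids[i]].item()
    return total

# Higher PLL means the model finds the sentence more probable; controlled
# benchmarks compare scores of minimally different sentence pairs.
print(pseudo_log_likelihood("The keys to the cabinet are on the table."))
print(pseudo_log_likelihood("The keys to the cabinet is on the table."))
```

For the second part of the talk, the sketch below shows one common way to quantify alignment between language-model activations and brain responses: a cross-validated linear encoding model scored by held-out correlation. The talk does not specify its alignment metric, feature set, or recording modality, so this is an assumed illustration of the general idea rather than the speaker's method.

```python
# Generic sketch of model-brain alignment via a cross-validated ridge-regression
# encoding model. All inputs (model features, brain responses) are assumed,
# illustrative data structures; the talk's actual metric is not specified.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def encoding_alignment(model_feats: np.ndarray, brain_resp: np.ndarray,
                       n_splits: int = 5) -> float:
    """model_feats: (n_stimuli, n_model_units) activations per stimulus.
    brain_resp:  (n_stimuli, n_voxels) brain responses to the same stimuli.
    Returns the mean held-out correlation between predicted and observed responses."""
    scores = []
    for train, test in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(model_feats):
        reg = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(model_feats[train], brain_resp[train])
        pred = reg.predict(model_feats[test])
        # Correlate predicted and observed responses per voxel, then average.
        for v in range(brain_resp.shape[1]):
            scores.append(np.corrcoef(pred[:, v], brain_resp[test, v])[0, 1])
    return float(np.nanmean(scores))
```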
This talk is part of the NLIP Seminar Series.