Evaluating Large Language Models as Model Systems for Language
If you have a question about this talk, please contact Richard Diehl Martinez.

In this talk, we investigate the potential of Large Language Models (LLMs) to serve as model systems for language. A model system for language should first and foremost perform the relevant function, i.e., use language in the right way. In the first part of the talk we investigate this claim in two ways. First, we take a critical look at model evaluation. To assess model performance, it is often useful to evaluate and compare models on controlled sentence generation benchmarks. Here, we argue that Masked Language Model performance has been systematically underestimated due to a bias in the most commonly used sentence/word scoring method, pseudo-log-likelihood, and we introduce an improved version of the scoring method that mitigates the observed bias. Second, we evaluate whether LLMs use language in a way consistent with humans' generalized knowledge of common events, which is tightly linked to their language behavior. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but they also highlight a gap between representations of possible/impossible and likely/unlikely events.

In the second part of the talk we shift gears and investigate LLMs as model systems more directly: we leverage artificial neural network language models as computational hypotheses of language processing in the human brain and measure the degree of alignment between the two systems when processing a variety of language stimuli. We find substantial alignment between the two systems and systematically investigate the features that drive the observed similarity.
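For context on the methods named in the abstract, here is a minimal sketch of conventional pseudo-log-likelihood (PLL) sentence scoring with a masked language model, i.e. the baseline scoring method the talk argues is biased. The improved variant introduced in the talk is not reproduced here, and the model name and example sentences are illustrative choices rather than details taken from the talk.

```python
# Minimal sketch of conventional pseudo-log-likelihood (PLL) scoring with a
# masked language model, using Hugging Face transformers. This illustrates the
# baseline method discussed in the talk; the improved, bias-mitigating variant
# is not shown. Model choice and sentences are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-cased"  # example model, not specified in the talk
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum log P(token_i | all other tokens), masking one position at a time."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    with torch.no_grad():
        # Skip the [CLS] (first) and [SEP] (last) special tokens.
        for i in range(1, input_ids.size(0) - 1):
            masked = input_ids.clone()
            masked[i] = tokenizer.mask_token_id
            logits = model(masked.unsqueeze(0)).logits[0, i]
            log_probs = torch.log_softmax(logits, dim=-1)
            total += log_probs[input_ids[i]].item()
    return total

# Higher PLL means the model finds the sentence more probable; controlled
# benchmarks compare scores of minimally different sentence pairs.
print(pseudo_log_likelihood("The keys to the cabinet are on the table."))
print(pseudo_log_likelihood("The keys to the cabinet is on the table."))
```

For the second part of the talk, the sketch below shows one common way to quantify alignment between language-model activations and brain responses: a cross-validated linear encoding model scored by held-out correlation. The talk does not specify its alignment metric, feature set, or recording modality, so this is an assumed illustration of the general idea rather than the speaker's method.

```python
# Generic sketch of model-brain alignment via a cross-validated ridge-regression
# encoding model. All inputs (model features, brain responses) are assumed,
# illustrative data structures; the talk's actual metric is not specified.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def encoding_alignment(model_feats: np.ndarray, brain_resp: np.ndarray,
                       n_splits: int = 5) -> float:
    """model_feats: (n_stimuli, n_model_units) activations per stimulus.
    brain_resp:  (n_stimuli, n_voxels) brain responses to the same stimuli.
    Returns the mean held-out correlation between predicted and observed responses."""
    scores = []
    for train, test in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(model_feats):
        reg = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(model_feats[train], brain_resp[train])
        pred = reg.predict(model_feats[test])
        # Correlate predicted and observed responses per voxel, then average.
        for v in range(brain_resp.shape[1]):
            scores.append(np.corrcoef(pred[:, v], brain_resp[test, v])[0, 1])
    return float(np.nanmean(scores))
```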
This talk is part of the NLIP Seminar Series.